Customize vSphere 7 with Tanzu Guest Clusters

February 1, 2021 | By Eric Shanks

Kubernetes clusters can come in many shapes and sizes. Over the past 18 months I’ve deployed quite a few Kubernetes clusters for customers, but these clusters all have different requirements. What image registry am I connecting to? Do we need to configure proxies? Will we need to install new certificates on the nodes? Do we need to tweak some containerd configurations? During many of my customer engagements the answer to the above questions is “yes.”

The Problem

One of the trickier tasks that comes up is determining how we might re-configure a new Tanzu Kubernetes cluster on vSphere 7 with NSX-T. In a vSphere 7 with Tanzu Guest Cluster, each namespace created in the supervisor cluster gets a new NSX-T T1 router deployed and connected to a northbound T0 router. This router performs NAT, which prevents us from directly accessing the new control plane and worker nodes. Remember that when we connect to these clusters through the kubectl command, we’re going through a load balancer which has a NAT’d address.

If you look at the diagram below, you’ll see that in many cases the Orchestrator Tool (denoted by Jenkins in the diagram) or a user on the corporate network can’t directly access the Kubernetes nodes that were spun up by Tanzu. This makes automating any non-default configurations, like a custom image registry, a bit more challenging.

The Solution

A solution that we’ve started to use is to have the orchestrator (or admin user) deploy a PodVM into the supervisor namespace and let it do the work for us. Building on some code from the incomparable William Lam, a colleague of mine (Mike Tritabaugh) and I built a PodVM to do some of this work for us. The PodVM consists of an init container and a config container. Let’s look at what they do.

The init container has a very specific task to run before our cluster configuration begins. Based on William’s code, it performs a login that builds a KUBECONFIG file with the appropriate cluster context. It stores this KUBECONFIG file in an ephemeral volume for future use by a second container.
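To make that concrete, here’s a rough sketch of the kind of login step the init container might run. The mount path, variable names, and credential handling are illustrative assumptions rather than the exact contents of our repository, but the kubectl vsphere login flags shown are the standard ones for targeting a Tanzu Kubernetes cluster.

```bash
#!/bin/sh
# Illustrative init container login step (paths and variable names are
# assumptions, not the exact code from the repository).
# The shared ephemeral volume is assumed to be mounted at /shared.
export KUBECONFIG=/shared/kubeconfig

# kubectl vsphere reads the password from this environment variable,
# which lets the login run non-interactively inside the container.
export KUBECTL_VSPHERE_PASSWORD="${VSPHERE_PASSWORD}"

kubectl vsphere login \
  --server "${SUPERVISOR_ENDPOINT}" \
  --vsphere-username "${VSPHERE_USER}" \
  --tanzu-kubernetes-cluster-namespace "${SUPERVISOR_NAMESPACE}" \
  --tanzu-kubernetes-cluster-name "${GUEST_CLUSTER_NAME}" \
  --insecure-skip-tls-verify

# The kubeconfig (and its roughly 10 hour token) now sits on the shared
# volume where the config container can pick it up.
```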

Once the init container has completed its tasks, the config container can run. The config container runs a script to grab the SSH key with access to the guest cluster nodes, but the rest of the code is really up to you: it depends on what tasks need to be executed on the guest clusters. Update containerd, copy certificate files over, or really anything else you need. The important part is that the config container mounts the same ephemeral volume used by the init container. Now the config scripts you build can leverage the KUBECONFIG file, whose token is valid for 10 hours, or SSH into the nodes to make configuration changes. Here’s a view of the PodVM to help explain.
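To give a feel for what a config container script could look like, below is a hedged sketch that pulls the guest cluster’s SSH key from the supervisor namespace and pushes a change to each node. The secret name (<cluster-name>-ssh) and node user (vmware-system-user) follow VMware’s documented conventions for Tanzu Kubernetes clusters; the context names, file paths, and the certificate/containerd change itself are placeholders you’d swap for your own logic.

```bash
#!/bin/sh
# Illustrative config container script (names and paths are placeholders).
# Assumes the init container left a kubeconfig at /shared/kubeconfig with
# contexts for both the supervisor namespace and the guest cluster.
export KUBECONFIG=/shared/kubeconfig

CLUSTER_NAME="tkc-01"            # placeholder guest cluster name
SUPERVISOR_NS="demo-namespace"   # placeholder supervisor namespace

# vSphere with Tanzu stores the guest cluster's SSH private key as a
# secret named <cluster-name>-ssh in the supervisor namespace.
kubectl --context "${SUPERVISOR_NS}" -n "${SUPERVISOR_NS}" \
  get secret "${CLUSTER_NAME}-ssh" \
  -o jsonpath='{.data.ssh-privatekey}' | base64 -d > /tmp/ssh-key
chmod 600 /tmp/ssh-key

# Example change: copy a custom CA certificate to every node and restart
# containerd. Replace this loop body with whatever your clusters need.
for node_ip in $(kubectl --context "${CLUSTER_NAME}" get nodes \
  -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'); do
  scp -i /tmp/ssh-key -o StrictHostKeyChecking=no \
    /config/custom-ca.crt vmware-system-user@"${node_ip}":/tmp/
  ssh -i /tmp/ssh-key -o StrictHostKeyChecking=no \
    vmware-system-user@"${node_ip}" \
    "sudo cp /tmp/custom-ca.crt /etc/ssl/certs/ && sudo systemctl restart containerd"
done
```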

Now that we have a new tool to use, our orchestrator or administrator can indirectly update the clusters by deploying our pod to the supervisor namespace which contains our guest clusters. We can log in to the guest cluster with a temporary (10-hour) token and perform whatever configuration tasks we desire.
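For reference, a minimal sketch of what that PodVM spec might look like is below. The pod, container, image, and namespace names are placeholders for your own builds; the important pieces are the init container, the config container, and the shared emptyDir volume that carries the KUBECONFIG between them.

```bash
# Minimal sketch of the PodVM applied to the supervisor namespace.
# Pod, container, image, and namespace names are placeholders.
cat <<'EOF' | kubectl apply -n demo-namespace -f -
apiVersion: v1
kind: Pod
metadata:
  name: guest-cluster-config
spec:
  restartPolicy: Never
  volumes:
  - name: shared                 # ephemeral volume carrying the KUBECONFIG
    emptyDir: {}
  initContainers:
  - name: tkc-login              # performs the login step sketched earlier
    image: registry.example.com/tkc-login:latest
    # vSphere credentials would be injected here via env vars / secrets
    volumeMounts:
    - name: shared
      mountPath: /shared
  containers:
  - name: tkc-config             # runs the guest cluster configuration script
    image: registry.example.com/tkc-config:latest
    env:
    - name: KUBECONFIG
      value: /shared/kubeconfig
    volumeMounts:
    - name: shared
      mountPath: /shared
EOF

# Follow the configuration run to completion.
kubectl logs -f guest-cluster-config -n demo-namespace -c tkc-config
```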

If you’d like to use this solution to configure your guest clusters, please take a look at the GitHub repository for build instructions. Good luck with your coding.