Highly Available Envoy Proxies for the Kubernetes Control Plane

February 24, 2020 | By Eric Shanks

Recently I was tasked with setting up some virtual machines to be used as a load balancer for a Kubernetes cluster. The environment we were deploying our Kubernetes cluster into didn't have a load balancer available, so we decided to put Envoy proxies on a pair of VMs to do the job. This post will show you how the following tasks were completed:

  1. Deploy Envoy on a pair of CentOS 7 virtual machines.
  2. Configure Envoy with health checks for the Kubernetes control plane.
  3. Install keepalived on both servers to manage failover.
  4. Configure keepalived to fail over if a server goes offline or the Envoy service stops running.

Deploy Envoy

The first step will be to set up a pair of CentOS 7 servers. I've used virtual servers for this post, but bare metal would work the same. Similar steps could also be used if you prefer Debian as your Linux flavor.

Once there is a working pair of servers, it's time to install Envoy.
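Envoy isn't in the default CentOS repositories, so one option is the getenvoy package repository; something along these lines should work, though the repository URL and package name come from the getenvoy project and are worth verifying against the current Envoy docs before you rely on them. Running the official envoyproxy/envoy container image is another option if you'd rather not install a package.

```bash
# Add the getenvoy yum repository and install the Envoy package
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://getenvoy.io/linux/centos/tetrate-getenvoy.repo
sudo yum install -y getenvoy-envoy

# Confirm the binary is available
envoy --version
```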

Once the Envoy bits are installed, we should create a configuration file that tells Envoy how to load balance across our Kubernetes control plane nodes, with health checks so traffic is only routed to healthy nodes. Be sure to update this file with your own ports and server names/IP addresses before deploying.
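Here's a sketch of what that configuration can look like, written against Envoy's v3 API. The listener port, the three control plane addresses (10.10.10.11-13), and the file path /etc/envoy/envoy.yaml are placeholders you'll want to change, and the health check shown is a simple TCP connect check against the API server port rather than anything Kubernetes-specific.

```yaml
# /etc/envoy/envoy.yaml (placeholder path)
# TCP load balancer for the Kubernetes API servers with a basic connect health check.
static_resources:
  listeners:
  - name: k8s_api
    address:
      socket_address: { address: 0.0.0.0, port_value: 6443 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: k8s_api
          cluster: k8s_control_plane
  clusters:
  - name: k8s_control_plane
    connect_timeout: 5s
    type: STRICT_DNS        # lets you use hostnames; STATIC also works for plain IPs
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: k8s_control_plane
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 10.10.10.11, port_value: 6443 }
        - endpoint:
            address:
              socket_address: { address: 10.10.10.12, port_value: 6443 }
        - endpoint:
            address:
              socket_address: { address: 10.10.10.13, port_value: 6443 }
    health_checks:
    - timeout: 5s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 2
      tcp_health_check: {}   # node is healthy if Envoy can open a TCP connection

# Optional admin interface, handy for checking what Envoy thinks is healthy
admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }
```

With this in place, any control plane node that stops accepting connections on its API port gets pulled out of rotation automatically; the admin interface at 127.0.0.1:9901/clusters is a quick way to confirm which endpoints Envoy currently considers healthy.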

Next, let's set up a systemd service so that Envoy starts on boot and restarts if it crashes.
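A minimal unit file along these lines does the job; the binary and config paths below are assumptions, so adjust them to wherever your install put things.

```ini
# /etc/systemd/system/envoy.service (placeholder path)
[Unit]
Description=Envoy proxy for the Kubernetes control plane
After=network-online.target
Wants=network-online.target

[Service]
# Adjust the paths if your package installed Envoy somewhere else
ExecStart=/usr/bin/envoy -c /etc/envoy/envoy.yaml
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```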

Lastly, we can enable and start the service.
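For example:

```bash
sudo systemctl daemon-reload
sudo systemctl enable envoy
sudo systemctl start envoy

# Confirm it came up cleanly
sudo systemctl status envoy
```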

Make Envoy Highly Available

At this point, you should have two virtual machines with Envoy installed, each able to distribute traffic to your Kubernetes control plane nodes. Either one of them should work. But what we'd really like is a single virtual IP address (VIP) that can float between these two Envoy nodes depending on which one is healthy. To do this, we'll use the keepalived project.

The first step will be to install keepalived on both Envoy nodes.
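keepalived ships in the standard CentOS repositories, so this is a one-liner on each node:

```bash
sudo yum install -y keepalived
```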

Keepalived will ensure that a healthy node owns the VIP. But in our case, a healthy node also means the Envoy service we created is running. To check this, we need a small script of our own. The script is very simple: it just looks up the process ID of our Envoy service. If it can't find one, the script exits with an error, and keepalived uses that failure to manage failover.
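A minimal version of that script could look like this; the path /etc/keepalived/check_envoy.sh is just the one I'll reference from the keepalived config below.

```bash
#!/bin/bash
# /etc/keepalived/check_envoy.sh
# pidof exits non-zero when no envoy process is found, so the script's
# exit code tells keepalived whether this node should be considered healthy.
pidof envoy
```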

Keepalived will run that script as root, and for security reasons ONLY the root user should be able to execute or modify it, so we need to change its permissions. NOTE: if anyone other than root has access to the script, keepalived will skip this check, so be sure to set the permissions correctly.
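Something like this on both nodes:

```bash
sudo chown root:root /etc/keepalived/check_envoy.sh
sudo chmod 0700 /etc/keepalived/check_envoy.sh
```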

Now we need to set the keepalived configuration on each of the nodes. Pick a node and deploy the following keepalived configuration to /etc/keepalived/keepalived.conf, overwriting the existing configuration.

Node1
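Here's a sketch of the first node's config; the interface name (eth0), the VIP (10.10.10.10), the virtual_router_id, the priority, and the auth_pass are all placeholders you'll need to adjust for your environment.

```
# /etc/keepalived/keepalived.conf on the first (preferred MASTER) node
vrrp_script check_envoy {
    script "/etc/keepalived/check_envoy.sh"
    interval 5        # run the check every 5 seconds
    fall 2            # mark the node failed after 2 consecutive failures
    rise 2            # mark it healthy again after 2 consecutive successes
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0                  # interface that should carry the VIP
    virtual_router_id 51            # must match on both nodes
    priority 150                    # higher than the backup node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ChangeMe          # must match on both nodes
    }
    virtual_ipaddress {
        10.10.10.10/24              # the VIP clients will use
    }
    track_script {
        check_envoy
    }
}
```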

When you’re done with the first node, create a similar config file on the second node.

Node2
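The second node's config is nearly identical; in this sketch the only differences are the initial state and a lower priority.

```
# /etc/keepalived/keepalived.conf on the second (BACKUP) node
vrrp_script check_envoy {
    script "/etc/keepalived/check_envoy.sh"
    interval 5
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51            # same as node 1
    priority 100                    # lower than node 1
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ChangeMe          # same as node 1
    }
    virtual_ipaddress {
        10.10.10.10/24              # same VIP as node 1
    }
    track_script {
        check_envoy
    }
}
```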

Now we should be ready to go. Start and enable the service for keepalived.
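On both nodes:

```bash
sudo systemctl enable keepalived
sudo systemctl start keepalived
sudo systemctl status keepalived
```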

Test failover

You may not have a Kubernetes cluster set up yet for a full test, but we can at least see whether our Envoy service will fail over to the other node. To do this, you can look at the messages log to see which keepalived node is advertising gratuitous ARPs in order to own the VIP.
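On CentOS 7, keepalived logs to /var/log/messages, so on each node you can watch for the VRRP state transitions:

```bash
# The active node logs that it has entered MASTER state and is sending
# gratuitous ARPs for the VIP; the other node reports BACKUP state.
sudo tail -f /var/log/messages | grep -i keepalived
```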

If you're looking at the standby Envoy node, the messages will show that keepalived has entered the BACKUP state.

If you want to test the failover, stop the Envoy service on the active node and watch whether the node in the BACKUP state starts sending gratuitous ARPs to take over the VIP.
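For example, using the service names from earlier in the post:

```bash
# On the node that currently owns the VIP, simulate an Envoy failure:
sudo systemctl stop envoy

# On the other node, watch keepalived transition to MASTER and claim the VIP:
sudo tail -f /var/log/messages | grep -i keepalived

# When you're done testing, bring Envoy back up on the first node:
sudo systemctl start envoy
```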

Summary

A virtual load balancer can be handy in a lot of situations. This case called for a way to distribute load to my Kubernetes control plane nodes, but the same setup could be used for almost anything. First, deploy Envoy and configure it to distribute load to the upstream services with the appropriate health checks. Then use keepalived to ensure that a VIP floats between the healthy Envoy nodes. What will you use this option to do? Post your configs in the comments.