Measure and Improve the High Availability of a Kubernetes Cluster During Reboot
How do you reboot a Kubernetes cluster, and how highly available is it during the reboot?
My previous article describes the structure of a highly available Kubernetes cluster that I built. Sometimes the cluster needs a reboot to apply security patches to the host system or to the Kubernetes components.
So how can the cluster be rebooted safely? And what is the impact of the reboot on the services running on the cluster?
In this article I describe how I reboot the cluster safely, and how I measure the impact of the reboot on the services running on it, i.e. how highly available the cluster is during the reboot. I also try to improve the availability.
Measure the availability
To do the measurement, I start a simple “whoami” service on the cluster with 5 replicas and use a test Ingress to provide external access.
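One way to turn this setup into a number is to send requests through the Ingress at a fixed interval and count how many succeed. The following is only a minimal sketch of that idea, not necessarily the exact method used in this article. It assumes availability is defined as the fraction of successful requests, and the names `http_probe` and `measure_availability` as well as the URL in the comment are hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts and HTTP error statuses all count as "down".
        return False


def measure_availability(
    probe: Callable[[], bool], attempts: int, interval: float = 1.0
) -> float:
    """Call probe() `attempts` times and return the fraction of successful calls."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    successes = 0
    for i in range(attempts):
        if probe():
            successes += 1
        if i < attempts - 1:
            # Wait between probes, but not after the last one.
            time.sleep(interval)
    return successes / attempts


# Example (hypothetical URL, assumed to point at the test Ingress):
# ratio = measure_availability(
#     lambda: http_probe("http://whoami.example.com/"), attempts=600
# )
# print(f"availability: {ratio:.2%}")
```

Passing the probe as a callable keeps the counting logic independent of HTTP, so it can be checked with a fake probe before pointing it at the real cluster. The probe would need to keep running for the whole duration of the reboot for the result to be meaningful.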