Kubernetes Best Practices. Checking Kubernetes Health with Readiness and Liveness Tests

Distributed systems can be hard to manage because they have many moving, changing parts, all of which must work for the system to function. If one part fails, the system must detect it, route around it, and fix it, all automatically. In this installment of the Kubernetes Best Practices series, we'll learn how to set up readiness and liveness tests to check the health of applications running in a Kubernetes cluster.

A health check is a simple way to tell the system whether an instance of your application is working. If an instance is not working, other services should not access it or send requests to it. Instead, requests should go to another instance that is ready now or will be ready shortly. The system should also bring the failed instance back to a healthy state.

By default, Kubernetes starts sending traffic to a pod once all containers in the pod are running, and restarts containers when they crash. This default behavior may be good enough to start with, but you can make your deployments more reliable by adding custom health checks.

Luckily, Kubernetes makes this pretty easy, so there's no excuse for skipping these checks. Kubernetes provides two types of health checks (called probes in the Kubernetes documentation), and it's important to understand how they differ and when to use each.

The readiness test tells Kubernetes that your application is ready to serve traffic. Before allowing a service to send traffic to the pod, Kubernetes makes sure the readiness check passes. If the readiness test starts failing, Kubernetes stops sending traffic to the pod until the test passes again.

The liveness test tells Kubernetes whether your application is alive or dead. If it is alive, Kubernetes leaves it alone; if it is dead, Kubernetes kills the failed container and restarts it, subject to the pod's restart policy.

Let's imagine a scenario in which your application takes a minute to warm up and start. Your service won't work until the application is fully loaded, even though the process has already started. You will also run into problems when scaling this deployment up to several replicas, because new replicas should not receive traffic until they are completely ready. By default, however, Kubernetes starts sending traffic as soon as the processes inside the container start.

With a readiness test, Kubernetes waits until the application has fully started before allowing the service to send traffic to the new replica.
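The warm-up scenario above can be sketched as a readiness endpoint that reports failure until start-up work has finished. This is a minimal illustration rather than a definitive implementation: the `/ready` path, port 8080, the 60-second timer, and the names `ready`, `readiness_status`, `warm_up`, and `serve` are all assumptions made for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once the (hypothetical) warm-up work has finished.
ready = threading.Event()


def readiness_status():
    """HTTP status for the readiness endpoint: 200 when ready, 503 while warming up."""
    return 200 if ready.is_set() else 503


class ReadyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed /ready path is treated as the health endpoint.
        self.send_response(readiness_status() if self.path == "/ready" else 404)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep frequent health requests out of the log


def warm_up(seconds=60.0):
    """Stand-in for real start-up work (loading caches, opening connections)."""
    timer = threading.Timer(seconds, ready.set)
    timer.daemon = True
    timer.start()


def serve(port=8080):
    """Start warming up and answer readiness requests until interrupted."""
    warm_up()
    HTTPServer(("", port), ReadyHandler).serve_forever()
```

Calling `serve()` starts the server. A readiness test that issues an HTTP GET against `/ready` would then keep the replica out of rotation until the endpoint starts returning 200.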

Let's imagine another scenario in which the application hangs for a long time and stops serving requests. Because the process is still running, Kubernetes by default assumes that everything is fine and keeps sending requests to the broken pod. With a liveness test, Kubernetes detects that the application is no longer serving requests and restarts it.

Now let's look at how readiness and liveness are tested. There are three test types: HTTP, command, and TCP. Either check can use any of them. The most common type of custom test is HTTP.

Even if your application is not an HTTP server, you can still run a lightweight HTTP server inside it to answer the liveness test. Kubernetes pings a path on the pod, and if it gets an HTTP response with a status code in the 200 or 300 range, it considers the pod healthy. Otherwise, the pod is marked as unhealthy.
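For an application that is not itself an HTTP server, one possible approach is a small health server on a background thread that reports whether the main work loop is still making progress. The sketch below assumes a `/healthz` path, port 8081, and a 30-second silence limit; `beat`, `is_alive`, and `start_health_server` are hypothetical names for this example, not part of any Kubernetes API.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

MAX_SILENCE_SECONDS = 30.0  # assumed limit; tune it for your workload

_lock = threading.Lock()
_last_beat = time.monotonic()


def beat():
    """Called by the main work loop each time it makes progress."""
    global _last_beat
    with _lock:
        _last_beat = time.monotonic()


def is_alive(now=None):
    """True if the work loop has reported progress recently enough."""
    now = time.monotonic() if now is None else now
    with _lock:
        return now - _last_beat <= MAX_SILENCE_SECONDS


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            status = 404
        else:
            # 503 is outside the 200-399 success range, so the test fails.
            status = 200 if is_alive() else 503
        self.send_response(status)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep frequent health requests out of the log


def start_health_server(port=8081):
    """Serve health requests on a daemon thread next to the real work."""
    server = HTTPServer(("", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The application would call `start_health_server()` once at start-up and `beat()` after each unit of work. If the work loop hangs, the endpoint starts returning 503 and the liveness test fails.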

For command tests, Kubernetes runs a command inside your container. If the command exits with code 0, the container is marked as healthy; any other exit status (1 to 255) marks it as unhealthy. This kind of test is useful if you can't or don't want to run an HTTP server but can run a command that checks the health of your application.
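A command test only needs a program whose exit code reflects the application's health. As a rough sketch, the script below assumes the application periodically updates a heartbeat file; the path `/tmp/healthy`, the 60-second limit, and the function name `check` are illustrative assumptions.

```python
import os
import sys
import time

HEARTBEAT_FILE = "/tmp/healthy"  # hypothetical file the application touches periodically
MAX_AGE_SECONDS = 60.0           # assumed staleness limit


def check(path=HEARTBEAT_FILE, max_age=MAX_AGE_SECONDS, now=None):
    """Return a process exit code: 0 if healthy, 1 otherwise."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return 1  # file is missing or unreadable
    return 0 if age <= max_age else 1


def main():
    # The exit status is the only thing the command test looks at.
    sys.exit(check())
```

Run as the test command, the script would call `main()`. It exits with 0 while the heartbeat file is fresh and with 1 once the file goes stale or disappears.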

The last test type is TCP. Kubernetes tries to establish a TCP connection on the specified port. If it succeeds, the container is considered healthy; if not, it is considered unhealthy. This method comes in handy when neither an HTTP request nor a command works well. For example, gRPC and FTP services are typical candidates for a TCP test.
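To see what a TCP test amounts to, the following sketch does by hand what the text describes: it tries to open a connection and reports whether that worked. It is an approximation for local experimentation, not the code Kubernetes runs; `tcp_probe` and its one-second default timeout are assumptions.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # Opening the connection is the whole test; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_probe("127.0.0.1", 21)` would report whether something is accepting connections on the local FTP port.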

Tests can be configured in several ways. You can specify how often they run, the thresholds for success and failure, and how long to wait for a response. For more information, see the documentation for readiness and liveness tests. There is one very important setting for the liveness test, though: the initial delay, initialDelaySeconds. As mentioned above, a failing liveness test causes the pod to restart. You therefore need to make sure the test does not start until the application is ready, or the pod will get stuck in a restart loop. I recommend using the p99 startup time, or the average startup time plus a buffer. Remember to adjust this value as your app's startup time gets faster or slower.
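One way to turn the p99 recommendation into a number is to measure a set of start-up times and take their 99th percentile. The sketch below does that and builds a liveness test definition as a plain dictionary using the Kubernetes probe field names. The helper names, the `/healthz` path, and port 8080 are assumptions; the period, timeout, and failure threshold are simply written out at their Kubernetes default values.

```python
import math


def p99(samples):
    """Nearest-rank 99th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one startup-time sample")
    ordered = sorted(samples)
    # ceil(0.99 * n) in integer arithmetic to avoid float rounding.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def liveness_probe(startup_seconds, path="/healthz", port=8080):
    """Build a liveness test whose initial delay covers p99 start-up time."""
    return {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": math.ceil(p99(startup_seconds)),
        "periodSeconds": 10,    # how often the test runs
        "timeoutSeconds": 1,    # how long to wait for a response
        "failureThreshold": 3,  # consecutive failures before a restart
    }
```

Passing measured start-up times, for instance `liveness_probe([20.2, 22.0, 25.5])`, yields a dictionary that can be serialized into the container's `livenessProbe` section. Re-run it whenever start-up times change.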

Most experts will agree that health checks are a must for any distributed system, and Kubernetes is no exception. Using health checks gives your Kubernetes services better reliability and higher uptime, and Kubernetes makes them easy to set up.

To be continued very soon...

Source: habr.com
