Kubernetes Best Practices. Setting resource requests and limits

Kubernetes Best Practices. Creating Small Containers
Kubernetes Best Practices. Organizing Kubernetes with Namespaces
Kubernetes Best Practices. Checking Kubernetes Health with Readiness and Liveness Probes

For each container, Kubernetes lets you configure two kinds of resource settings: requests and limits. A request describes the minimum amount of free node resources required to run a container or pod; a limit is a hard cap on the resources available to the container.

When Kubernetes schedules pods, it is very important that the containers have enough resources to run normally. If you deploy a large application on a resource-constrained node, it may fail to run because the node runs out of memory or CPU. In this article, we will look at how resource requests and limits can solve such problems.

Requests and limits are the mechanisms Kubernetes uses to manage resources such as CPU and memory. A request is what guarantees a container the resources it asked for: if a container requests a resource, Kubernetes will only schedule it on a node that can provide it. A limit ensures that a container's resource usage never exceeds a certain value.


A container can consume resources only up to its limit; beyond that it is throttled or killed. Let's see how this works. There are two resource types: CPU and memory. The Kubernetes scheduler uses them to figure out where to run your pods. A typical resource specification for a pod looks like this.

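A minimal sketch of such a pod specification (the pod name, container name, and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:        # minimum resources needed to schedule the container
        cpu: 250m      # a quarter of a core
        memory: 64Mi
      limits:          # hard cap on what the container may consume
        cpu: 500m
        memory: 128Mi
```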

Each container in a pod can set its own requests and limits, and they are all additive. CPU resources are defined in millicores: if your container needs two full cores to run, set the value to 2000m; if it needs only a quarter of a core, set it to 250m. Be aware that if you request more CPU than the core count of your largest node, your pod will never be scheduled. A similar situation occurs if you have a pod that needs four cores while the Kubernetes cluster consists only of two-core virtual machines.

Unless your application is specifically designed to take advantage of multiple cores (complex scientific calculations and some database workloads come to mind), the best practice is to set the CPU request to 1 or lower and run more replicas for scalability. This gives the system greater flexibility and reliability.

When it comes to CPU limits, things get more interesting, because CPU is considered a compressible resource. If your application starts approaching its CPU limit, Kubernetes will throttle the container. The processor is artificially restricted, potentially giving the application worse performance, but the process is not terminated or evicted.

Memory resources are defined in bytes. The value is usually given in mebibytes (Mi), but it can be anything from bytes to petabytes. As with CPU, if you request more memory than your nodes have, the pod will never be scheduled. But unlike CPU, memory is not compressible: there is no way to throttle its usage, so a container is terminated as soon as it exceeds the memory allocated to it.
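As a quick sanity check on the units (a hedged illustration in plain Python, not Kubernetes code): mebibytes are powers of two, decimal megabytes are powers of ten, and Kubernetes treats Mi and M differently:

```python
MEBIBYTE = 2 ** 20  # 1Mi in a Kubernetes manifest
MEGABYTE = 10 ** 6  # 1M (decimal) in a Kubernetes manifest

# A memory request of "64Mi" therefore reserves:
request_bytes = 64 * MEBIBYTE
print(request_bytes)  # 67108864 bytes, slightly more than 64 * 10**6
```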


It's important to remember that you can't configure requests that exceed the resources your nodes can provide. Specifications for GKE VM types can be found in the links below the original video.

In an ideal world, the default container settings would be enough to keep workloads running smoothly. But the real world is not like that: people can easily forget to configure resource usage, or a team may set requests and limits that exceed the actual capacity of the infrastructure. To prevent such scenarios, you can configure resource quotas (ResourceQuota) and limit ranges (LimitRange).

Once a namespace has been created, you can lock it down with quotas. For example, if you have prod and dev namespaces, a common pattern is to apply no quota to prod and very strict quotas to dev. This allows prod, in the event of a traffic spike, to take all available resources, completely starving dev.

A resource quota might look like this. In this example there are four fields to consider.

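A sketch of such a ResourceQuota, using the values discussed below (the quota name and namespace are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: demo-quota
  namespace: dev
spec:
  hard:
    requests.cpu: 500m      # total CPU all containers may request
    requests.memory: 100Mi  # total memory all containers may request
    limits.cpu: 700m        # total CPU limits across all containers
    limits.memory: 200Mi    # total memory limits across all containers
```

It can be applied with `kubectl apply -f quota.yaml` and inspected with `kubectl describe quota -n dev`.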

Let's look at each of them. requests.cpu is the maximum combined CPU that all containers in the namespace can request. In this example, you could have 50 containers requesting 10m each, five containers requesting 100m each, or just one container requesting 500m. As long as the total requests.cpu in the namespace stays below 500m, everything is fine.

requests.memory is the maximum combined memory that all containers in the namespace can request. As in the previous case, you could have 50 containers requesting 2Mi each, five containers requesting 20Mi each, or a single 100Mi container, as long as the total memory requested in the namespace stays below 100 mebibytes.

limits.cpu is the maximum combined CPU that all containers in the namespace can use; think of it as the limit counterpart of requests.cpu.

Finally, limits.memory is the maximum combined memory that all containers in the namespace can use; it is the limit counterpart of requests.memory.

So, by default, containers in a Kubernetes cluster run with unlimited compute resources. With resource quotas, cluster administrators can limit resource consumption and creation per namespace. Within a namespace, a pod or container can consume as much CPU and memory as the namespace resource quota allows, so there is a concern that a single pod or container could monopolize all available resources. To prevent this, a LimitRange is used: a policy that constrains resource allocation for individual pods or containers in the namespace.

The limit range provides limits that can:

  • enforce minimum and maximum compute resource usage for each pod or container in the namespace;
  • enforce a minimum and maximum storage request for each PersistentVolumeClaim in the namespace;
  • enforce a ratio between the request and limit for a resource in the namespace;
  • set default requests/limits for compute resources in the namespace and automatically inject them into containers at runtime.

So you can create a limit range in your namespace. Unlike a quota, which applies to the whole namespace, a LimitRange applies to individual containers. This can prevent users from creating tiny or giant containers within a namespace. A LimitRange might look like this.

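A sketch of such a LimitRange, with the four sections discussed below (the name, namespace, and values are illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: demo-limit-range
  namespace: dev
spec:
  limits:
  - type: Container
    default:            # default limits injected into containers
      cpu: 600m
      memory: 100Mi
    defaultRequest:     # default requests injected into containers
      cpu: 100m
      memory: 50Mi
    max:                # no container may set limits above this
      cpu: "1"
      memory: 200Mi
    min:                # no container may set requests below this
      cpu: 10m
      memory: 10Mi
```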

As in the previous case, four sections can be distinguished here. Let's take a look at each.

The default section sets the default limits for a container in a pod. If you set these values in the LimitRange, any container that does not explicitly set its own limits receives these values.

In the defaultRequest section, default requests are configured for a container in a pod. Again, if you set these values in the LimitRange, any container that does not explicitly set its own requests will use these values by default.

The max section specifies the maximum limits that can be set for a container in a pod. Neither the values in the default section nor a container's own limits can be set above this value. Note that if max is set and there is no default section, the maximum value also becomes the default limit.

The min section specifies the minimum requests that can be set for a container in a pod. Neither the values in the defaultRequest section nor a container's own requests can be set below this value.

Again, note that if min is set and defaultRequest is not, the minimum value becomes the default request.

Ultimately, these resource requests are used by the Kubernetes scheduler to place your workloads, so understanding how this works is very important for configuring your containers properly. Let's say you want to run several pods on your cluster. Assuming the pod specifications are valid, the Kubernetes scheduler will use round-robin to select a node to run the workload.


Kubernetes checks whether Node 1 has enough resources to fulfill the requests of the pod's containers, and if not, it moves on to the next node. If no node in the cluster can satisfy the requests, the pod goes into the Pending state. With Google Kubernetes Engine features such as node autoscaling, GKE can detect the Pending state automatically and provision additional nodes.

If there is excess node capacity later on, autoscaling will reduce the number of nodes to save you money. This is why Kubernetes schedules pods based on requests. However, limits can be higher than requests, and in some cases a node can actually run out of resources. We call this the overcommitted state.


As I said, when it comes to CPU, Kubernetes starts throttling pods. Each pod receives as much CPU as it requested, but it may not get everything up to its limit: beyond the request, throttling kicks in.

With memory, Kubernetes has to decide which pods to kill and which to keep in order to free system resources; otherwise, the whole system would crash.

Let's imagine a scenario where you have a machine running out of memory - how would Kubernetes handle this?

Kubernetes looks for pods that are using more resources than they requested. If your containers have no requests at all, they are by definition using more than they asked for, simply because they asked for nothing! Such containers become prime candidates for termination. The next candidates are containers that are using more than they requested but are still under their limits.

If Kubernetes finds several pods that have exceeded their requests, it sorts them by priority and evicts the lowest-priority pods first. If all pods have the same priority, Kubernetes terminates the pods that have exceeded their requests by the largest margin.

In very rare cases, Kubernetes may terminate pods that are still within their requests. This can happen when critical system components, such as the kubelet or the container runtime, start consuming more resources than were reserved for them.

So, in the early stages of a small company, a Kubernetes cluster can work fine without resource requests and limits, but as your teams and projects grow, you risk running into problems in this area. Adding requests and limits to your pods and namespaces takes very little extra effort and can save you a lot of trouble.

Kubernetes Best Practices. Terminating with Grace


Source: habr.com
