How to Access Kubernetes Pod Resources

The Reward by Tohad

When getting started with Kubernetes, it's common to forget to configure container resources. At that point it's enough that the Docker image works and can be deployed to the Kubernetes cluster.

But later the application needs to be deployed to a production cluster alongside other applications. To do that, you need to allocate resources for the container and make sure they are sufficient for the application to start and run, without causing problems for the other running applications.

The Kubernetes aaS team at Mail.ru translated an article about container resources (CPU and MEM), resource requests, and resource limits. You will learn what benefits these settings provide and what happens if you don't set them.

Computing resources

We have two resource types with the following units:

  • Central processing unit (CPU) - cores;
  • Memory (MEM) - bytes.

Resources are specified per container. In the following Pod YAML file, you will see a resources section that contains the requests and limits. The Pod-level totals are derived from its containers:

  • Pod resource request = sum of the resource requests of all its containers;
  • Pod resource limit = sum of the resource limits of all its containers.

apiVersion: v1
kind: Pod
metadata:
  name: backend-pod-name
  labels:
    application: backend
spec:
  containers:
    - name: main-container
      image: my-backend:v1
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: 0.2 # REQUESTED CPU: 200m cores
          memory: "1Gi" # REQUESTED MEM: 1Gi
        limits:
          cpu: 1 # MAX CPU USAGE: 1 core
          memory: "1Gi" # MAX MEM USAGE: 1Gi
    - name: other-container
      image: other-app:v1
      ports:
        - containerPort: 8000
      resources:
        requests:
          cpu: "200m" # REQUESTED CPU: 200m cores
          memory: "0.5Gi" # REQUESTED MEM: 0.5Gi
        limits:
          cpu: 1 # MAX CPU USAGE: 1 core
          memory: "1Gi" # MAX MEM USAGE: 1Gi

Example of Requested and Limit Resources
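The Pod-level totals defined earlier (sum over all containers) can be checked against the manifest above with a small Python sketch; the millicore and byte conversions are my own for illustration:

```python
# Per-container resources from the manifest above, normalized to
# millicores (CPU) and bytes (memory) so they can be summed directly.
GI = 1024 ** 3  # bytes in 1Gi

containers = [
    {"cpu_req_m": 200, "mem_req": 1 * GI, "cpu_lim_m": 1000, "mem_lim": 1 * GI},
    {"cpu_req_m": 200, "mem_req": GI // 2, "cpu_lim_m": 1000, "mem_lim": 1 * GI},
]

pod_cpu_request = sum(c["cpu_req_m"] for c in containers)  # 400m
pod_mem_request = sum(c["mem_req"] for c in containers)    # 1.5Gi
pod_cpu_limit = sum(c["cpu_lim_m"] for c in containers)    # 2000m
pod_mem_limit = sum(c["mem_lim"] for c in containers)      # 2Gi

print(pod_cpu_request, pod_mem_request / GI, pod_cpu_limit, pod_mem_limit / GI)
```

These Pod totals are what the scheduler compares against a node's free capacity.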

The resources.requests field of the Pod specification is one of the elements used to find a node on which the Pod can be scheduled. How is a suitable node found?

Kubernetes consists of several components, including the master node (the Kubernetes control plane). The master node runs several processes: kube-apiserver, kube-controller-manager and kube-scheduler.

The kube-scheduler process watches for newly created Pods and looks for worker nodes that satisfy all of a Pod's requirements, including the amount of requested resources. The list of nodes found by kube-scheduler is ranked, and the Pod is scheduled onto the node with the highest score.

Where will the purple Pod be placed?

The picture shows that kube-scheduler needs to schedule a new purple Pod. The Kubernetes cluster contains two nodes: A and B. As you can see, kube-scheduler cannot schedule the Pod onto node A: the available (unrequested) resources do not cover the purple Pod's requests. For example, the 1 GB of memory requested by the purple Pod does not fit on node A, since only 0.5 GB is available. Node B, however, has enough resources, so kube-scheduler decides that the purple Pod's destination is node B.
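The filter-then-rank logic just described can be sketched in a few lines of Python. This is a deliberately simplified model of kube-scheduler, not its real algorithm; the node data and the scoring rule (prefer the node with the most free memory) are assumptions for illustration:

```python
# A node is feasible only if its free (allocatable minus already-requested)
# CPU and memory cover the new Pod's requests.
def feasible(node, pod):
    return (node["free_cpu_m"] >= pod["cpu_req_m"]
            and node["free_mem_mi"] >= pod["mem_req_mi"])

def score(node):
    # One possible ranking: prefer the node with the most free memory.
    return node["free_mem_mi"]

nodes = [
    {"name": "A", "free_cpu_m": 500, "free_mem_mi": 512},    # 0.5 GiB free
    {"name": "B", "free_cpu_m": 1500, "free_mem_mi": 2048},  # 2 GiB free
]
purple_pod = {"cpu_req_m": 300, "mem_req_mi": 1024}  # requests 1 GiB

candidates = [n for n in nodes if feasible(n, purple_pod)]
best = max(candidates, key=score)
print(best["name"])  # B: node A lacks the 1 GiB the Pod requests
```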

Now we know how requested resources affect the choice of a node to run a Pod on. But what is the effect of resource limits?

A resource limit is a boundary that CPU/MEM usage cannot cross. However, the CPU resource is compressible, so a container that hits its CPU limit will not cause the Pod to shut down; instead, CPU throttling kicks in. If the MEM limit is reached, the container is stopped by the OOM killer and restarted if the RestartPolicy setting allows it.

Requested and limit resources in detail

Linking resources between Docker and Kubernetes

The best way to explain how resource requests and limits work is to look at the relationship between Kubernetes and Docker. The figure above shows how the Kubernetes fields and the Docker startup flags are related.

Memory: request and limit

containers:
...
 resources:
   requests:
     memory: "0.5Gi"
   limits:
     memory: "1Gi"

As mentioned above, memory is measured in bytes. According to the Kubernetes documentation, memory can be specified as a plain number, usually an integer, for example 2678, meaning 2678 bytes. You can also use the suffixes G and Gi; just remember that they are not equivalent: the first is decimal, the second binary. As the k8s documentation notes, 128974848, 129e6, 129M and 123Mi are roughly equivalent.
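The decimal/binary difference is easy to see in code. Below is a minimal parser for a few of the suffixes mentioned above; it is a toy for illustration, not the full Kubernetes quantity grammar:

```python
# "G" is a power of 10, "Gi" a power of 2, so 1G and 1Gi are close
# but not equal.
SUFFIXES = {"M": 10**6, "G": 10**9, "Mi": 2**20, "Gi": 2**30}

def to_bytes(quantity: str) -> int:
    """Convert a quantity string like '1Gi' or '129M' to bytes."""
    for suffix, factor in SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # plain integer, already in bytes

print(to_bytes("1G"))   # 1000000000
print(to_bytes("1Gi"))  # 1073741824
print(to_bytes("123Mi"))  # 128974848, matching the k8s docs example
```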

The Kubernetes parameter limits.memory corresponds to the Docker flag --memory. For requests.memory there is no arrow to Docker, because Docker does not use this field. You may ask whether it is needed at all. Yes, it is. As mentioned before, the field matters to Kubernetes: based on it, kube-scheduler decides which node to schedule the Pod onto.

What happens if there is not enough memory for the request?

If a container exceeds its memory request, the Pod is placed in the group of Pods that are terminated first when the node runs low on memory.

What happens if you set the memory limit too low?

If a container exceeds its memory limit, it is terminated (OOMKilled) and restarted if possible, based on the RestartPolicy, whose default value is Always.

What happens if you do not specify the requested memory?

Kubernetes takes the limit value and uses it as the default request.

What can happen if you don't specify a memory limit?

The container has no upper bound and can use as much memory as it wants. If it starts consuming all of the node's available memory, the OOM killer will terminate it. The container will then be restarted if the RestartPolicy allows it.

What happens if you specify neither a memory request nor a memory limit?

This is the worst-case scenario: the scheduler does not know how many resources the container needs, which can cause serious problems on the node. In this case, it helps to have default limits in the namespace (set via LimitRange). With no default limit, the Pod is unbounded and can use as much memory as it wants.

If the requested memory is more than any node can offer, the Pod will not be scheduled. It is important to remember that requests.memory is not a minimum value: it describes an amount of memory sufficient to keep the container running continuously.

It is generally recommended to set requests.memory and limits.memory to the same value. This prevents Kubernetes from scheduling a Pod onto a node that has enough memory to start the Pod but not enough to keep it running. Keep in mind that Pod scheduling only considers requests.memory; limits.memory is not taken into account.

CPU: request and limit

containers:
...
 resources:
   requests:
     cpu: 1
   limits:
     cpu: "1200m"

With CPU, things are a bit more complicated. Returning to the picture of the relationship between Kubernetes and Docker, you can see that requests.cpu corresponds to --cpu-shares, whereas limits.cpu corresponds to the Docker flag --cpus.

Kubernetes multiplies the requested CPU by 1024 to obtain the container's share of CPU cycles. If you want to request one full core, add cpu: 1, as shown above.

Requesting a full core (share = 1024) does not mean that your container will receive it. If your host machine has only one core and more than one container is running, then all containers must share the available CPU. How does this happen? Let's look at the picture.

CPU Request - Single Core System

Let's imagine that you have a single-core host system running containers. Mom (Kubernetes) baked a pie (CPU) and wants to share it among her children (containers). Three children each want a whole pie (share = 1024), and another child wants half a pie (512). Mom wants to be fair and does a simple calculation.

# How much pie do the children want?
# Three children each want a whole pie, and one more wants half a pie
cakesNumberKidsWant = (3 * 1) + (1 * 0.5) = 3.5
# The expression breaks down as:
# 3 (children/containers) * 1 (whole pie/full core) + 1 (child/container) * 0.5 (half a pie/half a core)
# How many pies were baked?
availableCakesNumber = 1
# How much pie (at most) can the children actually get?
newMaxRequest = 1 / 3.5 =~ 28%

Based on this calculation, the three children each receive about 28% of a core, not a whole core, and the fourth child gets about 14%, not half. But things are different on a multi-core system.

CPU Request - Multi-Core (4) System

In the image above, you can see that three children want a whole pie each, and one wants half. Since Mom baked four pies, each child gets as much as they want. In a multi-core system, CPU resources are distributed across all available cores, and even a container limited to less than one full core can use its share at 100%.

The calculations above are simplified to show how CPU is shared between containers. Besides the containers themselves, other processes also use CPU resources; when the processes in one container are idle, others can use its share. cpu: "200m" corresponds to cpu: 0.2, i.e. roughly 20% of one core.

Now let's talk about limits.cpu. Kubernetes converts the CPU limit into a CFS quota: the limit multiplied by the 100 ms cpu-period gives the amount of CPU time the container may use within each period.

limits.cpu corresponds to the Docker flag --cpus, a newer combination of the old --cpu-period and --cpu-quota. By setting it, we specify how much of the available CPU the container may use at most before throttling starts:

  • cpus - a combination of cpu-period and cpu-quota; cpus = 1.5 is equivalent to setting cpu-period = 100000 and cpu-quota = 150000;
  • cpu-period - the CPU CFS scheduler period, 100,000 microseconds (100 ms) by default;
  • cpu-quota - the number of microseconds within each cpu-period that the container is allowed to use.
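The cpus-to-quota mapping from the list above is simple arithmetic; here it is as a Python sketch (the helper name is mine):

```python
CPU_PERIOD_US = 100_000  # default CFS scheduler period: 100 ms

def cfs_quota(cpus: float) -> int:
    """Microseconds of CPU time allowed per period for a given cpus value."""
    return round(cpus * CPU_PERIOD_US)

print(cfs_quota(1.5))  # 150000, matching the cpus = 1.5 example above
print(cfs_quota(1.2))  # 120000, i.e. limits.cpu: "1200m" from earlier
```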

What happens if you set the CPU request too low?

If the container needs more CPU than it requested, it will take CPU time away from other processes.

What happens if you set an insufficient CPU limit?

Since CPU is a compressible resource, throttling will kick in.

What happens if you don't specify a CPU request?

As with memory, the request defaults to the value of the limit.

What happens if you do not specify a CPU limit?

The container will use as much CPU as it needs. If a default CPU policy (LimitRange) is defined in the namespace, that default limit is applied to the container.

What happens if neither request nor CPU limit is specified?

As with memory, this is the worst-case scenario. The scheduler does not know how many resources your container needs, which can cause serious problems on the node. To avoid this, set default limits for the namespace (LimitRange).

Remember: if you request more CPU than any node can provide, the Pod will not be scheduled. requests.cpu is not a minimum value but an amount sufficient for the Pod to start and run without failures. If the application does not perform heavy computation, the best option is to set requests.cpu <= 1 and run as many replicas as needed.

Ideal amount of requested resources or resource limit

We have learned how compute resources are constrained. Now it's time to answer the question: "How many resources does my Pod need to run the application without problems? What is the ideal amount?"

Unfortunately, there are no clear answers to these questions. If you don't know how your application works, how much CPU or memory it needs, the best option is to give the application a lot of memory and CPU, and then run benchmarks.

In addition to performance tests, observe the application's behavior in monitoring for a week or so. If the graphs show that your application consumes fewer resources than you requested, you can reduce the requested CPU or memory.

For an example, see this Grafana dashboard. It displays the difference between the resource requests and limits and the current resource usage.

Conclusion

Resource request and limit help keep the Kubernetes cluster healthy. Proper configuration of limits minimizes costs and keeps applications up and running at all times.

In short, there are a few things to keep in mind:

  1. Resource requests are a configuration taken into account at scheduling time, when Kubernetes plans where to place the application. Resource limits, in contrast, matter at run time, when the application is already running on the node.
  2. Unlike memory, CPU is a compressible resource: if CPU runs short, your Pod will not be shut down; the throttling mechanism kicks in instead.
  3. Requested resources and resource limits are not minimum and maximum values! By setting the requests, you ensure that the application runs smoothly.
  4. It is good practice to set the memory request equal to the memory limit.
  5. It is a good idea to set the requested CPU <= 1 if the application does not perform complex calculations.
  6. If you request more resources than a node has, the Pod will never be scheduled for that node.
  7. Use load testing and monitoring to determine the correct amount of requested resources/resource limits.

I hope this article helps you understand the basic concept of resource requests and limits, and that you can apply this knowledge in your work.

Good Luck!

What else to read:

  1. SRE Observability: Namespaces and Metric Structure.
  2. 90+ Useful Tools for Kubernetes: Deployment, Management, Monitoring, Security and More.
  3. Our channel Around Kubernetes in Telegram.

Source: habr.com
