/etc/resolv.conf for Kubernetes pods, option ndots:5, how this can negatively affect application performance


We recently launched Kubernetes 1.9 on AWS using Kops. Yesterday, while gradually rolling new traffic onto our largest Kubernetes cluster, I started noticing unusual DNS name resolution errors logged by our application.

There is quite a bit about this on GitHub, so I decided to look into it as well. It turned out that in our case the errors were caused by increased load on kube-dns and dnsmasq. The most interesting and new thing for me was the actual reason for the significant increase in DNS query traffic. This post is about that reason and what to do about it.

DNS resolution inside a container, as in any Linux system, is driven by the /etc/resolv.conf configuration file. The default Kubernetes dnsPolicy is ClusterFirst, which means that any DNS query is redirected to dnsmasq running in the kube-dns pod inside the cluster, which in turn forwards the query to the kube-dns application if the name has the cluster suffix, or to an upstream DNS server otherwise.

By default, the /etc/resolv.conf file inside each container looks like this:

nameserver 100.64.0.10
search namespace.svc.cluster.local svc.cluster.local cluster.local eu-west-1.compute.internal
options ndots:5

As you can see, there are three directives:

  1. The nameserver is the IP of the kube-dns service
  2. There are 4 local search domains specified by the search directive
  3. The ndots:5 option

The interesting part of this configuration is how the local search domains and the ndots:5 setting work together. To understand it, you need to understand how DNS resolution works for names that are not fully qualified.

What is a fully qualified name?

A fully qualified name is a name that will not be subject to the local search list and will be treated as absolute during name resolution. By convention, DNS software considers a name fully qualified if it ends with a dot (.) and not fully qualified otherwise. That is, google.com. is fully qualified and google.com is not.

How is an unqualified name handled?

When an application connects to a remote host specified by name, DNS resolution is typically performed via a system call such as getaddrinfo(). But if the name is not fully qualified (does not end with a .), will the call first try to resolve it as an absolute name, or will it go through the local search domains first? That depends on the ndots option.

From the manual for resolv.conf:

ndots:n

Sets a threshold for the number of dots which must appear in a name before an initial absolute query will be made. The default for n is 1, meaning that if there are any dots in a name, the name will be tried first as an absolute name before any search list elements are appended to it.

This means that if ndots is set to 5 and the name contains fewer than 5 dots, the system call will try to resolve it sequentially, going through all the local search domains first and, if none of them succeed, resolving it as an absolute name only at the end.

Why can ndots:5 negatively affect application performance?

As you can imagine, if your application generates a lot of external traffic, then for every TCP connection it establishes (or, more precisely, for every name it resolves) it will issue 5 DNS queries before the name is resolved correctly, because it first goes through the 4 local search domains and only then issues an absolute name resolution query.
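
To make this concrete, here is roughly the sequence of lookups that the resolv.conf shown above produces for a name like google.com, which contains fewer than 5 dots (the exact list depends on the pod's namespace and the node's search domains):

google.com.namespace.svc.cluster.local   -> NXDOMAIN
google.com.svc.cluster.local             -> NXDOMAIN
google.com.cluster.local                 -> NXDOMAIN
google.com.eu-west-1.compute.internal    -> NXDOMAIN
google.com                               -> answer

Only the last, absolute query returns a useful answer; the first four only add load on kube-dns and latency for the application.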

The following chart shows the total traffic on our 3 kube-dns pods before and after we switched several hostnames configured in our application to fully qualified ones.

[Chart: total traffic on the kube-dns pods before and after the change]

The following chart shows application latency before and after we switched the same hostnames to fully qualified ones (the vertical blue line marks the deployment):

[Chart: application latency before and after the deployment]

Solution #1 - Use fully qualified names

If you have only a few static external names (i.e. names defined in your application configuration) to which you open a large number of connections, the easiest solution is probably to switch them to fully qualified names by simply adding a . at the end.
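
For example, in a hypothetical application config (api.example.com is a made-up name here), the change is just a trailing dot:

# before: the resolver walks all 4 search domains before trying the absolute name
API_HOST=api.example.com

# after: the name is fully qualified and is resolved as an absolute name immediately
API_HOST=api.example.com.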

This is not a complete solution, but it helps to improve the situation quickly, if not cleanly. We applied this patch to solve our problem; the results are shown in the charts above.

Solution #2 - Customize ndots in dnsConfig

Kubernetes 1.9 introduced an alpha feature (beta in v1.10) that gives better control over DNS settings via the pod's dnsConfig property. Among other things, it allows customizing the ndots value for a specific pod, e.g.:

apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: dns-example
spec:
  containers:
    - name: test
      image: nginx
  dnsConfig:
    options:
      - name: ndots
        value: "1"

Source: habr.com
