6 entertaining system bugs in the operation of Kubernetes [and their solutions]

Over the years of running Kubernetes in production, we have accumulated many interesting stories of how bugs in various system components led to unpleasant and/or incomprehensible consequences that affected the operation of containers and pods. In this article we have collected some of the most common or most interesting ones. Even if you are never lucky enough to run into such situations yourself, reading short detective stories like these, especially first-hand ones, is always interesting, isn't it?

Story 1. Supercronic and the hanging Docker

On one of the clusters, Docker froze from time to time, which interfered with the normal operation of the cluster. At the same time, the following was observed in the Docker logs:

level=error msg="containerd: start init process" error="exit status 2: "runtime/cgo: pthread_create failed: No space left on device
SIGABRT: abort
PC=0x7f31b811a428 m=0

goroutine 0 [idle]:

goroutine 1 [running]:
runtime.systemstack_switch() /usr/local/go/src/runtime/asm_amd64.s:252 fp=0xc420026768 sp=0xc420026760
runtime.main() /usr/local/go/src/runtime/proc.go:127 +0x6c fp=0xc4200267c0 sp=0xc420026768
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4200267c8 sp=0xc4200267c0

goroutine 17 [syscall, locked to thread]:
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1

…

What interests us most in this error is the message: pthread_create failed: No space left on device. A quick look at the documentation explained that Docker could not fork a process, which is why it froze from time to time.

In monitoring, the following picture matches what was happening:

[Image: monitoring graph for the node]

A similar situation was observed on other nodes:

[Images: monitoring graphs for two other nodes]

On the same nodes we see:

root@kube-node-1 ~ # ps auxfww | grep curl -c
19782
root@kube-node-1 ~ # ps auxfww | grep curl | head
root     16688  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root     17398  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root     16852  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root      9473  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root      4664  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root     30571  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root     24113  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root     16475  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root      7176  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>
root      1090  0.0  0.0      0     0 ?        Z    Feb06   0:00      |       _ [curl] <defunct>

It turned out that this behavior was caused by a pod running supercronic (a Go utility that we use to run cron jobs in pods):

 _ docker-containerd-shim 833b60bb9ff4c669bb413b898a5fd142a57a21695e5dc42684235df907825567 /var/run/docker/libcontainerd/833b60bb9ff4c669bb413b898a5fd142a57a21695e5dc42684235df907825567 docker-runc
|   _ /usr/local/bin/supercronic -json /crontabs/cron
|       _ /usr/bin/newrelic-daemon --agent --pidfile /var/run/newrelic-daemon.pid --logfile /dev/stderr --port /run/newrelic.sock --tls --define utilization.detect_aws=true --define utilization.detect_azure=true --define utilization.detect_gcp=true --define utilization.detect_pcf=true --define utilization.detect_docker=true
|       |   _ /usr/bin/newrelic-daemon --agent --pidfile /var/run/newrelic-daemon.pid --logfile /dev/stderr --port /run/newrelic.sock --tls --define utilization.detect_aws=true --define utilization.detect_azure=true --define utilization.detect_gcp=true --define utilization.detect_pcf=true --define utilization.detect_docker=true -no-pidfile
|       _ [newrelic-daemon] <defunct>
|       _ [curl] <defunct>
|       _ [curl] <defunct>
|       _ [curl] <defunct>
…

The problem is this: when a task is run by supercronic, the process it spawns cannot terminate correctly and turns into a zombie.

Note: To be more precise, the processes are spawned by the cron tasks, but supercronic is not an init system and cannot "adopt" the processes spawned by its children. When SIGHUP or SIGTERM is raised, the signal is not passed on to the child processes, so they do not terminate and remain in the zombie state. The original post links to an article with more detail on all of this.

There are a couple of ways to solve the problem:

  1. As a temporary workaround, increase the number of PIDs available in the system at any one time:
           /proc/sys/kernel/pid_max (since Linux 2.5.34)
                  This file specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID).  PIDs greater than this value are not allocated; thus, the value in this file also acts as a system-wide limit on the total number of processes and threads.  The default value for this file, 32768, results in the same range of PIDs as on earlier kernels
  2. Or launch the tasks in supercronic not directly but through the same tini, which is able to terminate processes correctly and does not leave zombies behind.
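
The diagnosis and both remedies can be sketched in shell. This is a minimal sketch, assuming a procps-style `ps`; the helper name `count_zombies`, the `pid_max` value, and the idea of running tini as the container's PID 1 are illustrative assumptions, not details taken from the setup described above.

```shell
#!/bin/bash
# Sketch: count defunct (zombie) processes per command name.
# Input: "STAT COMMAND" pairs on stdin, e.g. from `ps -eo stat=,comm=`.
# count_zombies is a hypothetical helper name used for illustration.
count_zombies() {
  awk '$1 ~ /^Z/ { n[$2]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# On a live node:
#   ps -eo stat=,comm= | count_zombies | head
# Compare the number of processes and threads with the system-wide limit:
#   ps -eL --no-headers | wc -l
#   cat /proc/sys/kernel/pid_max
# Option 1, temporary workaround (needs root; 65536 is only an example value):
#   sysctl -w kernel.pid_max=65536
# Option 2, one possible arrangement (an assumption): tini as PID 1 of the
# container, so that orphaned children are reparented to it and reaped:
#   tini -- /usr/local/bin/supercronic -json /crontabs/cron
```

A command that dominates the output, like curl in the listing above, points at the workload that leaks zombies.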

Story 2. "Zombies" when a cgroup is deleted

Kubelet started consuming a lot of CPU:

[Image: kubelet CPU usage graph]

Nobody likes that, so we armed ourselves with perf and started looking into the problem. The results of the investigation were as follows:

  • Kubelet spends more than a third of its CPU time pulling memory data from all cgroups:

    [Image: perf output for kubelet]

  • A discussion of the problem can be found on the kernel developers' mailing list. In short, it comes down to this: various tmpfs files and similar objects are not completely removed from the system when a cgroup is deleted, which leaves behind a so-called memcg zombie. Sooner or later they will be evicted from the page cache, but the server has a lot of memory and the kernel sees no point in spending time on removing them, so they keep piling up. Why does this happen at all? This is a server with cron jobs that constantly creates new jobs, and new pods along with them. New cgroups are therefore created for the containers in those pods and are deleted soon afterwards.
  • Why does cAdvisor in the kubelet waste so much time? This is easy to see by simply running time cat /sys/fs/cgroup/memory/memory.stat. On a healthy machine the operation takes 0.01 seconds, while on the problematic cron02 it takes 1.2 seconds. The point is that cAdvisor, which reads the data from sysfs very slowly, tries to account for the memory used by zombie cgroups.
  • To remove the zombies forcibly, we tried dropping the caches as recommended on LKML: sync; echo 3 > /proc/sys/vm/drop_caches, but the kernel turned out to be more complicated than that and brought the machine down.

What to do? The problem is fixed (see the commit, and the release message for a description) by updating the Linux kernel to version 4.16.
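
To check whether a node shows the same symptoms, one rough approach is to compare the number of memory cgroups the kernel still tracks with the number of cgroup directories that actually exist. The sketch below assumes cgroup v1 mounted at /sys/fs/cgroup/memory; `kernel_memcg_count` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch for cgroup v1 nodes. A large gap between the kernel's count of memory
# cgroups and the number of existing cgroup directories hints at "zombie" memcgs.

# Reads the contents of /proc/cgroups on stdin and prints the num_cgroups
# column (third field) of the memory controller.
kernel_memcg_count() {
  awk '$1 == "memory" { print $3 }'
}

# On a live node:
#   kernel_memcg_count < /proc/cgroups
#   find /sys/fs/cgroup/memory -type d | wc -l
# And the timing test mentioned in the list above:
#   time cat /sys/fs/cgroup/memory/memory.stat > /dev/null
```

If the first number is many times larger than the second and the timing test is slow, the node is probably accumulating deleted-but-not-freed memory cgroups.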

Story 3. Systemd and its mount

Once again kubelet consumes too many resources on some nodes, but this time it is memory:

[Image: kubelet memory usage graph]

It turned out that there is a problem in the systemd shipped with Ubuntu 16.04, and it occurs when managing the mounts that are created to attach a subPath from ConfigMaps or Secrets. After the pod finishes its work, the systemd service and its service mount remain in the system. Over time, a huge number of them accumulate. There are even issues on this topic:

  1. #5916;
  2. kubernetes #57345.

...the latter of which refers to a PR in systemd: #7811 (the issue in systemd is #7798).

The problem is no longer present in Ubuntu 18.04, but if you want to keep using Ubuntu 16.04, you may find our workaround useful.

So we made the following DaemonSet:

---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: systemd-slices-cleaner
  name: systemd-slices-cleaner
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: systemd-slices-cleaner
  template:
    metadata:
      labels:
        app: systemd-slices-cleaner
    spec:
      containers:
      - command:
        - /usr/local/bin/supercronic
        - -json
        - /app/crontab
        image: private-registry.org/systemd-slices-cleaner/systemd-slices-cleaner:v0.1.0
        imagePullPolicy: Always
        name: systemd-slices-cleaner
        resources: {}
        securityContext:
          privileged: true
        volumeMounts:
        - name: systemd
          mountPath: /run/systemd/private
        - name: docker
          mountPath: /run/docker.sock
        - name: systemd-etc
          mountPath: /etc/systemd
        - name: systemd-run
          mountPath: /run/systemd/system/
        - name: lsb-release
          mountPath: /etc/lsb-release-host
      imagePullSecrets:
      - name: antiopa-registry
      priorityClassName: cluster-low
      tolerations:
      - operator: Exists
      volumes:
      - name: systemd
        hostPath:
          path: /run/systemd/private
      - name: docker
        hostPath:
          path: /run/docker.sock
      - name: systemd-etc
        hostPath:
          path: /etc/systemd
      - name: systemd-run
        hostPath:
          path: /run/systemd/system/
      - name: lsb-release
        hostPath:
          path: /etc/lsb-release

... and it uses the following script:

#!/bin/bash

# Only act on Ubuntu 16.04 (xenial) hosts.
hostrelease="/etc/lsb-release-host"
test -f "${hostrelease}" && grep -q xenial "${hostrelease}" || exit 0

# Sleep for up to 30 minutes to spread the load across kube nodes.
sleep $((RANDOM % 1800))

stoppedCount=0
# Count the subPath mount units currently known to systemd.
countBefore=$(systemctl list-units | grep subpath | grep -c "run-")
# Check each unit in turn.
for unit in $(systemctl list-units | grep subpath | grep "run-" | awk '{print $1}'); do
  # Find the drop-in directory of the unit (it identifies the Docker container that created the unit).
  DropFile=$(systemctl status "${unit}" | grep Drop | awk -F': ' '{print $2}')
  # Read the Docker container ID from the description file.
  DockerContainerId=$(awk '{print $5}' "${DropFile}/50-Description.conf" | cut -d/ -f6)
  # Check whether the container is still running.
  checkFlag=$(docker ps | grep -c "${DockerContainerId}")
  # If the container is not running, stop the unit.
  if [[ ${checkFlag} -eq 0 ]]; then
    echo "Stopping unit ${unit}"
    systemctl stop "${unit}"
    # Counter for the logs.
    ((stoppedCount++))
    # Log the current progress.
    echo "Stopped ${stoppedCount} systemd units out of ${countBefore}"
  fi
done

... and it is run every 5 minutes by the supercronic mentioned earlier. Its Dockerfile looks like this:

FROM ubuntu:16.04
COPY rootfs /
WORKDIR /app
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y gnupg curl apt-transport-https software-properties-common wget
RUN add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" && \
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
    apt-get update && \
    apt-get install -y docker-ce=17.03.0*
RUN wget https://github.com/aptible/supercronic/releases/download/v0.1.6/supercronic-linux-amd64 -O \
    /usr/local/bin/supercronic && chmod +x /usr/local/bin/supercronic
ENTRYPOINT ["/bin/bash", "-c", "/usr/local/bin/supercronic -json /app/crontab"]
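
Before rolling out a cleaner like this, it can help to see how many leftover units a node has actually accumulated. The sketch below reuses the same two filters as the cleanup script; `count_subpath_units` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: count systemd units that look like leftovers of subPath mounts.
# Input: the output of `systemctl list-units` on stdin.
# The patterns "subpath" and "run-" are the ones the cleanup script filters on.
count_subpath_units() {
  grep subpath | grep -c "run-"
}

# On a node:
#   systemctl list-units | count_subpath_units
```

A count that keeps growing between runs on an Ubuntu 16.04 node suggests the node is affected.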

Story 4. Contention when scheduling pods

We noticed the following: if a pod has been placed on a node and its image takes a very long time to pull, then another pod that "lands" on the same node simply does not start pulling its own image. Instead, it waits until the image of the previous pod has been pulled. As a result, a pod that has already been scheduled and whose image could have been downloaded in about a minute ends up stuck in the ContainerCreating status for a long time.

The events look something like this:

Normal  Pulling    8m    kubelet, ip-10-241-44-128.ap-northeast-1.compute.internal  pulling image "registry.example.com/infra/openvpn/openvpn:master"

It turns out that a single image from a slow registry can block deployments to the node.

Unfortunately, there are not many ways out of this situation:

  1. Try to use your Docker Registry directly in the cluster or right next to it (for example, GitLab Registry, Nexus, and so on);
  2. Use utilities such as kraken.
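
To spot nodes where pods are queuing behind a slow pull, one option is to group the pods waiting in ContainerCreating by node. This is a rough sketch: `stuck_creating_by_node` is a hypothetical helper name, and the kubectl column expressions in the comment are an assumption about how the input could be produced.

```shell
#!/bin/bash
# Sketch: count pods per node that are waiting in ContainerCreating.
# Input: "NODE POD REASON" lines on stdin, one pod per line.
stuck_creating_by_node() {
  awk '$3 ~ /ContainerCreating/ { n[$1]++ } END { for (node in n) print n[node], node }' | sort -rn
}

# On a cluster, the input could come from something like:
#   kubectl get pods --all-namespaces --no-headers \
#     -o custom-columns='NODE:.spec.nodeName,POD:.metadata.name,REASON:.status.containerStatuses[*].state.waiting.reason' \
#     | stuck_creating_by_node
```

A node with several pods in this state and a long-running "pulling image" event is a likely candidate for the blocking described above.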

Story 5. Nodes hang because of a lack of memory

While operating various applications, we also ran into a situation where a node becomes completely unreachable: SSH does not respond, all monitoring daemons drop off, and afterwards there is nothing (or almost nothing) unusual in the logs.

I will tell this story in pictures, using the example of one node that was running MongoDB.

This is what atop looks like before the incident:

[Image: atop output before the incident]

And like this after the incident:

[Image: atop output after the incident]

Monitoring also shows a sharp spike, at which point the node stops being available:

[Image: monitoring graph with the spike]

So, the screenshots make it clear that:

  1. RAM on the machine is close to its limit;
  2. There is a sharp jump in RAM consumption, after which access to the whole machine is abruptly lost;
  3. A large task arrives at Mongo, which forces the DBMS process to use more memory and to read actively from disk.

It turns out that if Linux runs out of free memory (memory pressure sets in) and there is no swap, then before the OOM killer arrives, a balancing act can arise between dropping pages from the page cache and writing them back to disk. This is done by kswapd, which bravely frees as many memory pages as possible for later allocation.

Unfortunately, with a heavy I/O load combined with a small amount of free memory, kswapd becomes the bottleneck of the whole system, because all allocations (page faults) of memory pages in the system are tied to it. This can go on for a very long time if the processes do not ask for more memory but stay right at the edge of the OOM-killer abyss.

The natural question is: why does the OOM killer arrive so late? In its current implementation the OOM killer is extremely simple-minded: it kills a process only when an attempt to allocate a memory page fails, that is, when a page fault fails. This does not happen for quite a long time, because kswapd bravely frees memory pages by flushing the page cache (effectively all the disk I/O in the system) back to disk. The original post links to a more detailed write-up that also describes the steps needed to eliminate such problems in the kernel.

This behavior should improve with Linux kernel 4.6+.
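
The precondition described above, little available memory and no swap, can be checked from /proc/meminfo. The following is a minimal sketch; `mem_headroom` is a hypothetical helper name, and it assumes a kernel that reports MemAvailable.

```shell
#!/bin/bash
# Sketch: report MemAvailable as a percentage of MemTotal and whether swap exists.
# Input: the contents of /proc/meminfo on stdin (values are in kB).
mem_headroom() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    $1 == "SwapTotal:"    { swap = $2 }
    END {
      if (total > 0)
        printf "available=%d%% swap=%s\n", avail * 100 / total, (swap > 0 ? "yes" : "no")
    }'
}

# On a node:
#   mem_headroom < /proc/meminfo
```

A node that regularly reports a low percentage with swap=no under heavy disk I/O matches the conditions under which kswapd became the bottleneck in this story.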

Story 6. Pods get stuck in the Pending state

In some clusters that run a really large number of pods, we began to notice that most of them "hang" for a very long time in the Pending state, even though the Docker containers themselves are already running on the nodes and can be worked with manually.

At the same time, describe shows nothing wrong:

  Type    Reason                  Age                From                     Message
  ----    ------                  ----               ----                     -------
  Normal  Scheduled               1m                 default-scheduler        Successfully assigned sphinx-0 to ss-dev-kub07
  Normal  SuccessfulAttachVolume  1m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
  Normal  SuccessfulMountVolume   1m                 kubelet, ss-dev-kub07    MountVolume.SetUp succeeded for volume "sphinx-config"
  Normal  SuccessfulMountVolume   1m                 kubelet, ss-dev-kub07    MountVolume.SetUp succeeded for volume "default-token-fzcsf"
  Normal  SuccessfulMountVolume   49s (x2 over 51s)  kubelet, ss-dev-kub07    MountVolume.SetUp succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
  Normal  Pulled                  43s                kubelet, ss-dev-kub07    Container image "registry.example.com/infra/sphinx-exporter/sphinx-indexer:v1" already present on machine
  Normal  Created                 43s                kubelet, ss-dev-kub07    Created container
  Normal  Started                 43s                kubelet, ss-dev-kub07    Started container
  Normal  Pulled                  43s                kubelet, ss-dev-kub07    Container image "registry.example.com/infra/sphinx/sphinx:v1" already present on machine
  Normal  Created                 42s                kubelet, ss-dev-kub07    Created container
  Normal  Started                 42s                kubelet, ss-dev-kub07    Started container

After some digging, we came to the assumption that kubelet simply does not have time to send all the information about the state of the pods and about the liveness/readiness probes to the API server.

And after studying the help output, we found the following parameters:

--kube-api-qps - QPS to use while talking with kubernetes apiserver (default 5)
--kube-api-burst  - Burst to use while talking with kubernetes apiserver (default 10) 
--event-qps - If > 0, limit event creations per second to this value. If 0, unlimited. (default 5)
--event-burst - Maximum size of a bursty event records, temporarily allows event records to burst to this number, while still not exceeding event-qps. Only used if --event-qps > 0 (default 10) 
--registry-qps - If > 0, limit registry pull QPS to this value.
--registry-burst - Maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registry-qps. Only used if --registry-qps > 0 (default 10)

As you can see, the default values are quite small, and in 90% of cases they cover all needs... In our case, however, they were not enough. So we set the following values:

--event-qps=30 --event-burst=40 --kube-api-burst=40 --kube-api-qps=30 --registry-qps=30 --registry-burst=40

... and restarted the kubelets, after which we saw the following picture in the graphs of calls to the API server:

[Image: graph of calls to the API server]

... and yes, everything took off!
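
After restarting the kubelets, it is worth confirming which of these limits a running kubelet was actually started with. This is a minimal sketch, assuming the flags are passed on the command line in `--flag=value` form; `kubelet_rate_flags` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list the QPS/burst flags present on a kubelet command line.
# Input: the kubelet arguments on stdin, e.g. from `ps -o args= -C kubelet`.
# Flags written as "--flag value" or set through a config file are not matched.
kubelet_rate_flags() {
  tr ' ' '\n' | grep -E -- '^--(kube-api|event|registry)-(qps|burst)=' | sort
}

# On a node:
#   ps -o args= -C kubelet | kubelet_rate_flags
```

An empty result means the kubelet is running with the defaults listed above, or that the limits are configured somewhere other than the command line.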

PS

For their help in collecting the bugs and preparing this article, I am deeply grateful to the many engineers of our company, and especially to my colleague from our R&D team, Andrey Klimentyev (zuzzas).


Source: www.habr.com
