Raws li kev ua haujlwm ib ntus - nce tus naj npawb ntawm PIDs hauv qhov system ntawm tib lub sijhawm:
/proc/sys/kernel/pid_max (since Linux 2.5.34)
This file specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID). PIDs greater than this value are not alloβ
cated; thus, the value in this file also acts as a system-wide limit on the total number of processes and threads. The default value for this file, 32768, results in the
same range of PIDs as on earlier kernels
Los yog ua kom lub community launch txog cov dej num hauv supercronic tsis ncaj qha, tab sis siv tib yam tsi, uas muaj peev xwm ntawm gracefully txiav cov txheej txheem thiab tsis spawning zombies.
Dab neeg 2. "Zombies" thaum rho tawm ib pawg
Kubelet pib siv ntau CPU:
Tsis muaj leej twg nyiam qhov no, yog li peb armed peb tus kheej zoo tag nrho thiab pib daws qhov teeb meem. Cov txiaj ntsig ntawm kev tshawb nrhiav tau raws li hauv qab no:
Kubelet siv ntau tshaj li ib feem peb ntawm CPU lub sij hawm rub cov ntaub ntawv nco los ntawm txhua pawg:
Nyob rau hauv daim ntawv teev npe tsim tawm kernel, koj tuaj yeem pom kev sib tham txog qhov teeb meem. Hauv luv, lub ntsiab lus yog qhov ntawd ntau yam tmpfs cov ntaub ntawv thiab lwm yam zoo sib xws tsis raug tshem tawm tag nrho ntawm qhov system thaum tshem ib cgroup, lub thiaj li hu memcg zombie. Tsis ntev los sis tom qab lawv tseem yuav raug rho tawm ntawm nplooj ntawv cache, txawm li cas los xij, muaj ntau lub cim xeeb ntawm cov neeg rau zaub mov thiab cov ntsiav tsis pom lub ntsiab lus hauv nkim sijhawm tshem lawv. Yog li ntawd, lawv tseem ua ke. Vim li cas qhov no tseem tshwm sim? Qhov no yog ib tug neeg rau zaub mov nrog cron txoj hauj lwm uas tas li tsim cov hauj lwm tshiab, thiab nrog lawv cov pods tshiab. Yog li, cov cgroups tshiab yog tsim rau cov ntim hauv lawv, uas yuav raug tshem tawm sai sai.
Vim li cas cAdvisor hauv kubelet siv sijhawm ntau? Qhov no yog ib qho yooj yim pom los ntawm kev ua kom yooj yim tshaj plaws time cat /sys/fs/cgroup/memory/memory.stat. Yog tias ntawm lub tshuab noj qab haus huv kev ua haujlwm yuav siv sijhawm 0,01 vib nas this, tom qab ntawd ntawm qhov teeb meem cron02 nws yuav siv sijhawm 1,2 vib nas this. Qhov tshaj plaws yog tias cAdvisor, uas nyeem cov ntaub ntawv los ntawm sysfs qeeb heev, sim coj mus rau hauv tus account siv lub cim xeeb hauv zombie cgroups thiab.
Txhawm rau tshem tawm cov zombies, peb sim tshem cov caches raws li kev pom zoo los ntawm LKML: sync; echo 3 > /proc/sys/vm/drop_caches, - tab sis cov tub ntxhais tau hloov mus ua qhov nyuaj dua thiab dai lub tsheb.
Yuav ua li cas? Qhov teeb meem raug khocog lus, thiab saib cov lus piav qhia hauv tso lus) los ntawm kev hloov kho Linux kernel rau version 4.16.
Dab Neeg 3. Systemd thiab nws mount
Ntxiv dua thiab, kubelet siv ntau cov peev txheej ntawm qee cov nodes, tab sis lub sijhawm no nws yog qhov nco ntau dua:
Nws tau muab tawm tias muaj teeb meem nrog lub systemd siv hauv Ubuntu 16.04, thiab nws tshwm sim thaum tswj cov mounts uas tsim los txuas. subPath los ntawm ConfigMap'ov los yog secret'ov. Tom qab kaw lub pod cov kev pabcuam systemd thiab nws cov kev pabcuam mount tseem nyob hauv qhov system. Nyob rau tib lub sij hawm, lawv sau ntau heev. Tseem muaj teeb meem ntawm lub ncauj lus no:
#!/bin/bash
# we will work only on xenial
hostrelease="/etc/lsb-release-host"
test -f ${hostrelease} && grep xenial ${hostrelease} > /dev/null || exit 0
# sleeping max 30 minutes to dispense load on kube-nodes
sleep $((RANDOM % 1800))
stoppedCount=0
# counting actual subpath units in systemd
countBefore=$(systemctl list-units | grep subpath | grep "run-" | wc -l)
# let's go check each unit
for unit in $(systemctl list-units | grep subpath | grep "run-" | awk '{print $1}'); do
# finding description file for unit (to find out docker container, who born this unit)
DropFile=$(systemctl status ${unit} | grep Drop | awk -F': ' '{print $2}')
# reading uuid for docker container from description file
DockerContainerId=$(cat ${DropFile}/50-Description.conf | awk '{print $5}' | cut -d/ -f6)
# checking container status (running or not)
checkFlag=$(docker ps | grep -c ${DockerContainerId})
# if container not running, we will stop unit
if [[ ${checkFlag} -eq 0 ]]; then
echo "Stopping unit ${unit}"
# stoping unit in action
systemctl stop $unit
# just counter for logs
((stoppedCount++))
# logging current progress
echo "Stopped ${stoppedCount} systemd units out of ${countBefore}"
fi
done
... thiab nws khiav txhua 5 feeb nrog kev pab los ntawm yav tas los hais supercronic. Nws Dockerfile zoo li no:
Thaum lub sijhawm ua haujlwm ntawm ntau daim ntawv thov, peb kuj tau txais qhov xwm txheej thaum lub node kiag li tsis muaj: SSH tsis teb, tag nrho cov saib xyuas daems poob, thiab tom qab ntawd tsis muaj dab tsi (lossis yuav luag tsis muaj dab tsi) txawv txav hauv cov cav.
Zaj Dab Neeg 6. Pods tau daig hauv lub xeev Pending
Nyob rau hauv ib co pawg uas muaj coob heev ntawm cov pods khiav, peb pib pom tias feem ntau ntawm lawv dai nyob rau hauv lub xeev ntev heev. Pending, txawm hais tias Docker ntim lawv tus kheej twb tau khiav ntawm cov nodes thiab koj tuaj yeem ua haujlwm nrog lawv.
Ntxiv mus, hauv describe tsis muaj dab tsi tsis ncaj ncees lawm:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned sphinx-0 to ss-dev-kub07
Normal SuccessfulAttachVolume 1m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "sphinx-config"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "default-token-fzcsf"
Normal SuccessfulMountVolume 49s (x2 over 51s) kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx-exporter/sphinx-indexer:v1" already present on machine
Normal Created 43s kubelet, ss-dev-kub07 Created container
Normal Started 43s kubelet, ss-dev-kub07 Started container
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx/sphinx:v1" already present on machine
Normal Created 42s kubelet, ss-dev-kub07 Created container
Normal Started 42s kubelet, ss-dev-kub07 Started container
Tom qab khawb ib ncig, peb tau ua qhov kev xav tias kubelet tsuas yog tsis muaj sijhawm los xa tag nrho cov ntaub ntawv hais txog lub xeev ntawm cov pods, kev ua neej nyob / kev npaj ua qauv rau API server.
Thiab tau kawm txog kev pab, peb pom cov hauv qab no tsis:
--kube-api-qps - QPS to use while talking with kubernetes apiserver (default 5)
--kube-api-burst - Burst to use while talking with kubernetes apiserver (default 10)
--event-qps - If > 0, limit event creations per second to this value. If 0, unlimited. (default 5)
--event-burst - Maximum size of a bursty event records, temporarily allows event records to burst to this number, while still not exceeding event-qps. Only used if --event-qps > 0 (default 10)
--registry-qps - If > 0, limit registry pull QPS to this value.
--registry-burst - Maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registry-qps. Only used if --registry-qps > 0 (default 10)
Raws li pom, default values ββyog heev me me, thiab hauv 90% lawv them tag nrho cov kev xav tau ... Txawm li cas los xij, hauv peb cov ntaub ntawv, qhov no tsis txaus. Yog li ntawd, peb teev cov nqi hauv qab no: