6 entertaining system bugs in Kubernetes operation [and their solutions]
Over the years of running Kubernetes in production, we have collected many interesting stories of how bugs in various system components led to unpleasant or puzzling consequences for the operation of containers and pods. In this article we have picked out some of the most common or interesting ones. Even if you are never lucky enough to run into situations like these, reading short detective stories, especially first-hand ones, is always entertaining, isn't it?
Story 1. Supercronic and a hanging Docker
On one of the clusters, Docker periodically froze, which disrupted the normal operation of the cluster. At the same time, the following appeared in the Docker logs:
level=error msg="containerd: start init process" error="exit status 2: "runtime/cgo: pthread_create failed: No space left on device
SIGABRT: abort
PC=0x7f31b811a428 m=0
goroutine 0 [idle]:
goroutine 1 [running]:
runtime.systemstack_switch() /usr/local/go/src/runtime/asm_amd64.s:252 fp=0xc420026768 sp=0xc420026760
runtime.main() /usr/local/go/src/runtime/proc.go:127 +0x6c fp=0xc4200267c0 sp=0xc420026768
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4200267c8 sp=0xc4200267c0
goroutine 17 [syscall, locked to thread]:
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1
…
What interests us most in this error is the message: pthread_create failed: No space left on device. A quick look at the documentation explained that Docker could not fork a process, which is why it froze from time to time.
In monitoring, the following picture corresponds to what was happening:
It turned out that this behavior was caused by a pod running supercronic (a Go utility that we use to run cron jobs in pods):
The problem is this: when a job is run by supercronic, the process it spawns cannot terminate correctly and turns into a zombie.
Note: To be more precise, the processes are spawned by the cron jobs, but supercronic is not an init system and cannot "adopt" the processes that its children spawn. When SIGHUP or SIGTERM signals are raised, they are not passed on to the child processes, so the child processes do not terminate and stay in zombie status. You can read more about all of this, for example, in such an article.
There are two ways to solve the problem:
As a temporary workaround, increase the number of PIDs the system allows at any one time:
/proc/sys/kernel/pid_max (since Linux 2.5.34)
This file specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID). PIDs greater than this value are not allocated; thus, the value in this file also acts as a system-wide limit on the total number of processes and threads. The default value for this file, 32768, results in the same range of PIDs as on earlier kernels.
Or launch the jobs in supercronic not directly but through tini, which is able to terminate processes correctly and does not spawn zombies.
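To see whether zombies are piling up on a node before the PID space runs out, a small check can help. The following is a minimal sketch and not part of the original setup: the helper name count_zombies is hypothetical, and it relies only on the standard ps state codes, where a state beginning with Z marks a zombie.

```shell
#!/bin/bash
# Hypothetical helper: counts zombie processes.
# Input (stdin): one process state per line, as printed by `ps -eo stat=`.
count_zombies() {
  # A state string starting with "Z" denotes a defunct (zombie) process.
  awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}
```

On a live node this might be used as `ps -eo stat= | count_zombies`. The current limit can be read with `cat /proc/sys/kernel/pid_max` and raised with `sysctl -w kernel.pid_max=<value>`.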
Story 2. "Zombies" when a cgroup is deleted
Kubelet started consuming a lot of CPU:
Nobody likes that, so we armed ourselves with perf and started digging into the problem. The results of the investigation were as follows:
Kubelet spends more than a third of its CPU time pulling memory data from all cgroups:
On the kernel developers' mailing list you can find a discussion of the problem. In short, it comes down to this: various tmpfs files and other similar things are not completely removed from the system when a cgroup is deleted, leaving behind a so-called memcg zombie. Sooner or later they will be evicted from the page cache, but the server has plenty of memory and the kernel sees no point in spending time on removing them. That is why they keep piling up. Why does this happen at all? This is a server with cron jobs that constantly create new jobs, and with them new pods. So new cgroups are created for the containers in them, and those cgroups are soon deleted.
Why does cAdvisor in kubelet waste so much time? This is easy to see with the simplest run of time cat /sys/fs/cgroup/memory/memory.stat. On a healthy machine the operation takes 0.01 seconds, while on the problematic cron02 it takes 1.2 seconds. The point is that cAdvisor, which reads data from sysfs very slowly, tries to account for the memory used in zombie cgroups.
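One rough way to gauge how many zombie cgroups have accumulated is to compare the kernel's own count of memory cgroups with the number of directories still visible in the cgroup filesystem. The sketch below assumes cgroup v1 mounted at /sys/fs/cgroup/memory (as in the path above) and that the num_cgroups column of /proc/cgroups also counts cgroups that were removed but not yet freed. The helper name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the kernel's count of memory cgroups.
# Reads a file in /proc/cgroups format
# (columns: subsys_name hierarchy num_cgroups enabled).
memcg_kernel_count() {
  awk '$1 == "memory" { print $3 }' "${1:-/proc/cgroups}"
}
```

Comparing the output of `memcg_kernel_count` with `find /sys/fs/cgroup/memory -type d | wc -l` gives an impression of the gap: a kernel count far above the number of visible directories would suggest many lingering cgroups.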
To remove the zombies by force, we tried clearing the caches as recommended on LKML: sync; echo 3 > /proc/sys/vm/drop_caches, but the kernel turned out to be more complicated than that and the machine crashed.
What to do? The problem is fixed (see the commit, and the release message for a description) by updating the Linux kernel to version 4.16.
Story 3. Systemd and its mounts
Once again, kubelet consumes too many resources on some nodes, but this time it is memory:
It turned out that there is a problem in the systemd version used in Ubuntu 16.04, and it shows up when managing the mounts created to attach a subPath from a ConfigMap or a Secret. After the pod finishes its work, the systemd service and its service mount stay in the system. Over time, a huge number of them accumulate. There are even issues on this topic:
...the last of which refers to a PR in systemd: #7811 (the systemd issue is #7798).
The problem no longer exists in Ubuntu 18.04, but if you want to keep using Ubuntu 16.04, our workaround for it may be useful.
#!/bin/bash
# Only run on Ubuntu xenial (16.04) hosts
hostrelease="/etc/lsb-release-host"
test -f "${hostrelease}" && grep -q xenial "${hostrelease}" || exit 0
# Sleep for up to 30 minutes to spread the load across kube nodes
sleep $((RANDOM % 1800))
stoppedCount=0
# Count the subPath mount units currently known to systemd
countBefore=$(systemctl list-units | grep subpath | grep -c "run-")
# Check each unit in turn
for unit in $(systemctl list-units | grep subpath | grep "run-" | awk '{print $1}'); do
  # Find the unit's drop-in directory (it identifies the Docker container that created the unit)
  DropFile=$(systemctl status "${unit}" | grep Drop | awk -F': ' '{print $2}')
  # Read the container identifier from the description file
  DockerContainerId=$(awk '{print $5}' "${DropFile}/50-Description.conf" | cut -d/ -f6)
  # Count matching lines in `docker ps` (0 means the container is not running);
  # an empty identifier matches every line, so such units are left alone
  checkFlag=$(docker ps | grep -c -- "${DockerContainerId}")
  # If the container is not running, stop the unit
  if [[ ${checkFlag} -eq 0 ]]; then
    echo "Stopping unit ${unit}"
    systemctl stop "${unit}"
    # Counter for the log output
    stoppedCount=$((stoppedCount + 1))
    # Log current progress
    echo "Stopped ${stoppedCount} systemd units out of ${countBefore}"
  fi
done
... and it runs every 5 minutes using the supercronic mentioned earlier. Its Dockerfile looks like this:
Story 4. Concurrency when scheduling pods
We noticed the following: if a pod is placed on a node and its image takes a very long time to pull, then another pod that "lands" on the same node simply does not start pulling its own image. Instead, it waits until the image of the previous pod has been pulled. As a result, a pod that has already been scheduled, and whose image could have been downloaded in just a minute, stays in the ContainerCreating status for a long time.
The events look something like this:
Normal Pulling 8m kubelet, ip-10-241-44-128.ap-northeast-1.compute.internal pulling image "registry.example.com/infra/openvpn/openvpn:master"
It turns out that a single image from a slow registry can block deployment for the whole node.
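To spot this situation, it can be useful to list which pods on a given node are still waiting in ContainerCreating. The following is only a sketch: the helper name creating_on_node is hypothetical, and the column positions (namespace, name, status and node in columns 1, 2, 4 and 8) are an assumption about the output of `kubectl get pods --all-namespaces -o wide --no-headers` that may differ between kubectl versions.

```shell
#!/bin/bash
# Hypothetical helper: prints namespace/name of pods that are in
# ContainerCreating on the given node.
# Input (stdin): output of
#   kubectl get pods --all-namespaces -o wide --no-headers
# Assumed columns: 1=NAMESPACE 2=NAME 4=STATUS 8=NODE.
creating_on_node() {
  local node="$1"
  awk -v node="$node" \
    '$4 == "ContainerCreating" && $8 == node { print $1 "/" $2 }'
}
```

For example: `kubectl get pods --all-namespaces -o wide --no-headers | creating_on_node <node-name>`. Several pods listed for the same node while one of them shows a long-running Pulling event would match the behavior described above.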
Unfortunately, there are not many ways out of this situation:
Try to use your Docker Registry directly in the cluster or right next to the cluster (for example, GitLab Registry, Nexus, etc.).
Story 5. Nodes hang due to lack of memory
While running various applications, we also ran into a situation where a node becomes completely unreachable: SSH does not respond, all monitoring daemons drop off, and afterwards there is nothing (or almost nothing) anomalous in the logs.
I will tell the story in pictures, using the example of one node where MongoDB was running.
This is what atop looks like before the accident:
And like this after the accident:
In monitoring there is also a sharp jump, at which point the node stops being available:
So, from the screenshots it is clear that:
The RAM on the machine is close to running out;
There is a sharp jump in RAM consumption, after which access to the whole machine is abruptly lost;
A large task arrives at Mongo, which forces the DBMS to use more memory and read actively from disk.
It turns out that if Linux runs out of free memory (memory pressure sets in) and there is no swap, then before the OOM killer arrives, a balancing act can arise between dropping pages from the page cache and writing them back to disk. This is done by kswapd, which bravely frees as many memory pages as possible for subsequent allocation.
Unfortunately, under a heavy I/O load combined with a small amount of free memory, kswapd becomes the bottleneck of the whole system, because all allocations (page faults) of memory pages in the system are tied to it. This can go on for a very long time if the processes do not want to use any more memory but stay at the very edge of the OOM-killer abyss.
The natural question is: why does the OOM killer arrive so late? In its current iteration the OOM killer is extremely dumb: it kills a process only when an attempt to allocate a memory page fails, that is, when a page fault fails. That does not happen for quite a long time, because kswapd bravely frees memory pages by flushing the page cache (effectively all the disk I/O in the system) back to disk. You can read about this in more detail, with a description of the steps needed to eliminate such problems in the kernel, here.
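Since the node becomes unreachable before the OOM killer acts, it may help to watch how close a machine is to this edge. The sketch below is an illustration rather than the authors' tooling: the helper name mem_available_pct is hypothetical, and it assumes a kernel that exposes MemAvailable in /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper: prints MemAvailable as a whole-number percentage
# of MemTotal. Reads a file in /proc/meminfo format.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }' \
    "${1:-/proc/meminfo}"
}
```

A low value reported by `mem_available_pct` on a node without swap (`grep SwapTotal /proc/meminfo` showing 0 kB) would indicate the conditions described above. What threshold to alert on is a judgment call that depends on the workload.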
Story 6. Pods get stuck in the Pending state
On some clusters where a really large number of pods are running, we began to notice that most of them "hang" for a very long time in the Pending state, although the Docker containers themselves are already running on the nodes and can be worked with manually.
Moreover, describe shows nothing wrong:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned sphinx-0 to ss-dev-kub07
Normal SuccessfulAttachVolume 1m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "sphinx-config"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "default-token-fzcsf"
Normal SuccessfulMountVolume 49s (x2 over 51s) kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx-exporter/sphinx-indexer:v1" already present on machine
Normal Created 43s kubelet, ss-dev-kub07 Created container
Normal Started 43s kubelet, ss-dev-kub07 Started container
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx/sphinx:v1" already present on machine
Normal Created 42s kubelet, ss-dev-kub07 Created container
Normal Started 42s kubelet, ss-dev-kub07 Started container
After some digging, we assumed that kubelet simply does not have time to send all the information about the state of the pods and the liveness/readiness probes to the API server.
And after studying the help, we found the following parameters:
--kube-api-qps - QPS to use while talking with kubernetes apiserver (default 5)
--kube-api-burst - Burst to use while talking with kubernetes apiserver (default 10)
--event-qps - If > 0, limit event creations per second to this value. If 0, unlimited. (default 5)
--event-burst - Maximum size of a bursty event records, temporarily allows event records to burst to this number, while still not exceeding event-qps. Only used if --event-qps > 0 (default 10)
--registry-qps - If > 0, limit registry pull QPS to this value.
--registry-burst - Maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registry-qps. Only used if --registry-qps > 0 (default 10)
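Before changing anything, it can help to check which of these flags a running kubelet was actually started with. The following sketch uses a hypothetical helper, flag_value, and assumes the flags are passed on the command line in --name=value form (not through a configuration file). A flag that is not set prints nothing, which would mean the default applies.

```shell
#!/bin/bash
# Hypothetical helper: extracts the value of a --name=value flag
# from a space-separated command line read on stdin.
flag_value() {
  local flag="--$1"
  tr ' ' '\n' | awk -F= -v f="$flag" '$1 == f { print $2 }'
}
```

On a node this might be used as `tr '\0' ' ' < "/proc/$(pidof kubelet)/cmdline" | flag_value kube-api-qps`.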
As you can see, the default values are quite small, and in 90% of cases they cover all needs... In our case, however, they were not enough. So we set the following values:
... and restarted the kubelets, after which we saw the following picture on the graphs of calls to the API server:
... and yes, everything started to fly!
PS
For their help in collecting the bugs and preparing this article, I am deeply grateful to the many engineers of our company, and especially to my colleague from our R&D team, Andrey Klimentyev (zuzzas).