6 entertaining system bugs in the operation of Kubernetes [and their solutions]
Over the years of using Kubernetes in production, we have accumulated many entertaining stories of how bugs in various system components led to unpleasant and/or incomprehensible consequences for the operation of containers and pods. In this article we have put together a selection of some of the most common or interesting ones. Even if you are never lucky enough to encounter such situations yourself, reading short detective stories like these — especially "first-hand" — is always interesting, isn't it?..
Story 1. Supercronic and frozen Docker
On one of our clusters, we periodically encountered a frozen Docker that blocked the normal operation of the cluster. At the same time, the following was observed in the Docker logs:
level=error msg="containerd: start init process" error="exit status 2: "runtime/cgo: pthread_create failed: No space left on device
SIGABRT: abort
PC=0x7f31b811a428 m=0
goroutine 0 [idle]:
goroutine 1 [running]:
runtime.systemstack_switch() /usr/local/go/src/runtime/asm_amd64.s:252 fp=0xc420026768 sp=0xc420026760
runtime.main() /usr/local/go/src/runtime/proc.go:127 +0x6c fp=0xc4200267c0 sp=0xc420026768
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4200267c8 sp=0xc4200267c0
goroutine 17 [syscall, locked to thread]:
runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1
…
What interests us most in this error is the message pthread_create failed: No space left on device. A quick study of the documentation explained that Docker could not fork a process, which is why it periodically froze.
The situation looks like this: when a container runs with supercronic, the processes it spawns cannot terminate correctly and turn into zombies.
Note: to be precise, the processes are spawned by cron tasks, but supercronic is not an init system and cannot "adopt" the processes its children have spawned. When SIGHUP or SIGTERM signals are raised, they are not passed on to the child processes, so the children do not terminate and remain in the zombie state. You can read more about all this, for example, in this article.
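To check whether a node suffers from this, it is enough to count the processes stuck in the zombie state. A minimal sketch using plain POSIX tools (nothing here is specific to supercronic):

```shell
#!/bin/sh
# Count processes in the zombie ("Z" / defunct) state.
# With the un-reaped children described above, this number keeps
# growing until the system-wide PID limit is exhausted.
ps -eo stat= | awk '$1 ~ /^Z/ { n++ } END { print n+0 }'
```

A steadily growing value here between Docker restarts is the tell-tale sign of un-reaped children.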
There are two ways to solve the problem:
As a temporary workaround, increase the maximum number of PIDs allowed in the system at any one time:
/proc/sys/kernel/pid_max (since Linux 2.5.34)
This file specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID). PIDs greater than this value are not allocated; thus, the value in this file also acts as a system-wide limit on the total number of processes and threads. The default value for this file, 32768, results in the same range of PIDs as on earlier kernels.
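For reference, inspecting and raising this limit looks like the following (a sketch; the concrete value is arbitrary and the write requires root):

```shell
#!/bin/sh
# Show the current system-wide limit on PIDs (and therefore on the
# total number of processes and threads).
cat /proc/sys/kernel/pid_max
# Raising it (until reboot) would look like:
#   sysctl -w kernel.pid_max=4194304
# or, equivalently:
#   echo 4194304 > /proc/sys/kernel/pid_max
```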
Or run the tasks in supercronic not directly, but via the same tini, which is able to terminate processes correctly and not spawn zombies.
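In practice the second option means making tini PID 1 of the container, so that it forwards signals and reaps orphans. A hypothetical Dockerfile sketch (the base image, package availability, and crontab path are assumptions, not the actual file from this setup):

```dockerfile
FROM alpine:3.18
# Assumption: supercronic may need to be installed manually if it is
# not available in the distribution's repositories.
RUN apk add --no-cache tini supercronic
COPY crontab /app/crontab
# tini becomes PID 1: it forwards SIGTERM/SIGHUP to its child and
# reaps orphaned processes, so cron-spawned children no longer
# linger as zombies.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["supercronic", "/app/crontab"]
```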
Story 2. "Zombies" when a cgroup is deleted
Kubelet started consuming a lot of CPU:
Nobody likes that, so we armed ourselves thoroughly and set about tackling the problem. The results of the investigation were as follows:
Kubelet spends more than a third of its CPU time pulling memory data from all cgroups:
In the kernel developers' mailing list you can find a discussion of the problem. In short, it comes down to this: various tmpfs files and other similar things are not completely removed from the system when a cgroup is deleted — the so-called memcg zombies. Sooner or later they will be evicted from the page cache, but there is a lot of memory on the server and the kernel sees no point in wasting time on deleting them. That is why they keep piling up. Why is this even happening? This is a server with cron jobs that constantly creates new jobs, and with them new pods. Thus, new cgroups are created for the containers in them, and those cgroups are soon deleted.
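One way to glimpse these dangling memcgs is to compare the kernel's own cgroup counter with what is actually visible in sysfs. A rough sketch (on cgroup v2 hosts the memory hierarchy lives elsewhere, hence the guard):

```shell
#!/bin/sh
# /proc/cgroups reports how many cgroups the kernel still tracks per
# controller, including "zombie" ones that were rmdir'ed but not freed.
awk '$1 == "memory" { print "kernel-side memory cgroups:", $3 }' /proc/cgroups
# Directories actually visible in the v1 hierarchy (if mounted);
# a large gap versus the number above hints at memcg zombies.
[ -d /sys/fs/cgroup/memory ] \
  && find /sys/fs/cgroup/memory -type d | wc -l \
  || echo "cgroup v1 memory hierarchy not mounted"
```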
Why do cAdvisor and kubelet waste so much time on this? It is easy to see with the simplest run of time cat /sys/fs/cgroup/memory/memory.stat. If on a healthy machine the operation takes 0.01 seconds, then on the problematic cron02 it takes 1.2 seconds. The thing is that cAdvisor, which reads data from sysfs very slowly, tries to account for the memory used in zombie cgroups as well.
To forcibly remove the zombies, we tried clearing the caches as recommended in LKML: sync; echo 3 > /proc/sys/vm/drop_caches — but the kernel turned out to be more complicated and hung the machine.
What can be done? The problem is fixed (commit; for a description, see the release message) by updating the Linux kernel to version 4.16.
Story 3. Systemd and its mounts
Once again, kubelet was consuming too many resources on some nodes, but this time it was eating too much memory:
It turned out that there is a problem in the systemd used in Ubuntu 16.04 that manifests itself when managing mounts created for subPath connections from ConfigMaps or Secrets. After a pod finishes its work, the systemd service and its service mount remain in the system. Over time, a huge number of them accumulate. There are even issues on this topic. We clean up the stale units with a script like this:
#!/bin/bash
# we will work only on xenial
hostrelease="/etc/lsb-release-host"
test -f ${hostrelease} && grep xenial ${hostrelease} > /dev/null || exit 0
# sleep up to 30 minutes to spread the load across kube-nodes
sleep $((RANDOM % 1800))
stoppedCount=0
# counting actual subpath units in systemd
countBefore=$(systemctl list-units | grep subpath | grep "run-" | wc -l)
# let's go check each unit
for unit in $(systemctl list-units | grep subpath | grep "run-" | awk '{print $1}'); do
# finding the drop-in description file for the unit (to learn which docker container spawned it)
DropFile=$(systemctl status ${unit} | grep Drop | awk -F': ' '{print $2}')
# reading uuid for docker container from description file
DockerContainerId=$(cat ${DropFile}/50-Description.conf | awk '{print $5}' | cut -d/ -f6)
# checking container status (running or not)
checkFlag=$(docker ps | grep -c ${DockerContainerId})
# if container not running, we will stop unit
if [[ ${checkFlag} -eq 0 ]]; then
echo "Stopping unit ${unit}"
# stopping the unit
systemctl stop "${unit}"
# just counter for logs
((stoppedCount++))
# logging current progress
echo "Stopped ${stoppedCount} systemd units out of ${countBefore}"
fi
done
... and it also runs every five minutes using the supercronic mentioned earlier. Its Dockerfile looks like this:
Story 4. Pods waiting for an image to be pulled
We noticed that if a pod is placed on a node and its image takes a very long time to pull, then another pod that "lands" on the same node simply does not start pulling its new image. Instead, it waits until the previous pod's image has been pulled. As a result, a pod that has already been scheduled, and whose image could have been downloaded in just a minute, ends up stuck for a long time in the containerCreating status.
The events will look something like this:
Normal Pulling 8m kubelet, ip-10-241-44-128.ap-northeast-1.compute.internal pulling image "registry.example.com/infra/openvpn/openvpn:master"
It turns out that a single image from a slow registry can block pod deployment on a node.
Unfortunately, there are not many ways out of this situation:
Try to use your own Docker Registry directly in the cluster or directly alongside the cluster (for example, GitLab Registry, Nexus, etc.);
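A related knob worth knowing here (added as a note, not part of the original list): the kubelet can be told to pull images for different pods in parallel instead of serializing them. In KubeletConfiguration terms this is the serializeImagePulls field (it defaults to true); the sketch below only illustrates the field, the upstream docs warn against disabling it on nodes running Docker with the aufs storage backend:

```yaml
# KubeletConfiguration fragment (field name from the upstream kubelet API).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serializeImagePulls: false   # pull images for different pods in parallel
```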
Story 5. Nodes hanging due to lack of memory
While operating various applications, we also ran into a situation where a node completely ceased to be accessible: SSH does not respond, all the monitoring daemons drop off, and afterwards there is nothing (or almost nothing) anomalous in the logs.
I will walk you through it with pictures, using the example of one node on which MongoDB was running.
This is what atop looks like before the crash:
And like this, after the crash:
In monitoring there is also a sharp spike, at which the node stops being available:
Thus, from the screenshots it is clear that:
RAM on the machine is close to exhaustion;
there is a spike in RAM consumption, after which access to the entire machine is abruptly cut off;
a large task arrives at Mongo, forcing the DBMS process to use more memory and to read from disk intensively.
It turns out that when Linux runs low on free memory (memory pressure sets in) and there is no swap, then before the OOM killer arrives, a balancing act can emerge between throwing pages into the page cache and writing them back to disk. This is done by kswapd, which valiantly frees as many memory pages as possible for subsequent use.
Unfortunately, under a large I/O load combined with a small amount of free memory, kswapd becomes the bottleneck of the entire system, because all allocations (page faults) of memory pages in the system are tied to it. This can drag on for a very long time if the processes no longer want to use additional memory but stay fixed at the very edge of the OOM-killer abyss.
The natural question is: why does the OOM killer come so late? In its current iteration, the OOM killer is extremely naive: it kills a process only when an attempt to allocate a memory page fails, i.e. when a page fault fails. This does not happen for quite a while, because kswapd valiantly keeps freeing memory pages, dumping the page cache (essentially all of the system's disk I/O) back to disk. You can read about this in more detail, along with the steps needed to eliminate such problems in the kernel, here.
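The thrashing described above can often be spotted from plain /proc counters before the node goes dark. A minimal sketch (the field names are standard Linux ones):

```shell
#!/bin/sh
# When MemAvailable collapses while pgmajfault (major page faults)
# grows rapidly, the node is likely stuck in the kswapd balancing act
# described above rather than doing useful work.
grep -E '^(MemTotal|MemAvailable|SwapTotal)' /proc/meminfo
grep -E '^(pgmajfault|pswpin|pswpout)' /proc/vmstat
```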
Story 6. Pods stuck in the Pending state
In some clusters with many running pods, we began to notice that most of them "hang" for a very long time in the Pending state, even though the Docker containers themselves are already running on the nodes and can be worked with manually.
Moreover, in describe there is nothing wrong:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned sphinx-0 to ss-dev-kub07
Normal SuccessfulAttachVolume 1m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "sphinx-config"
Normal SuccessfulMountVolume 1m kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "default-token-fzcsf"
Normal SuccessfulMountVolume 49s (x2 over 51s) kubelet, ss-dev-kub07 MountVolume.SetUp succeeded for volume "pvc-6aaad34f-ad10-11e8-a44c-52540035a73b"
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx-exporter/sphinx-indexer:v1" already present on machine
Normal Created 43s kubelet, ss-dev-kub07 Created container
Normal Started 43s kubelet, ss-dev-kub07 Started container
Normal Pulled 43s kubelet, ss-dev-kub07 Container image "registry.example.com/infra/sphinx/sphinx:v1" already present on machine
Normal Created 42s kubelet, ss-dev-kub07 Created container
Normal Started 42s kubelet, ss-dev-kub07 Started container
After some digging, we made the assumption that the kubelet simply does not manage to send the API server all the information about the state of the pods and the liveness/readiness probes.
And after studying the help, we found the following parameters:
--kube-api-qps - QPS to use while talking with kubernetes apiserver (default 5)
--kube-api-burst - Burst to use while talking with kubernetes apiserver (default 10)
--event-qps - If > 0, limit event creations per second to this value. If 0, unlimited. (default 5)
--event-burst - Maximum size of a bursty event records, temporarily allows event records to burst to this number, while still not exceeding event-qps. Only used if --event-qps > 0 (default 10)
--registry-qps - If > 0, limit registry pull QPS to this value.
--registry-burst - Maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registry-qps. Only used if --registry-qps > 0 (default 10)
As you can see, the default values are rather small, and in 90% of cases they cover all needs... In our case, however, this was not enough. So we set the following values:
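As an illustration only, raising these limits can also be done declaratively rather than via flags. A hypothetical KubeletConfiguration sketch with placeholder numbers (the field names are the upstream equivalents of the flags above; the values are not the ones actually used here):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Placeholder values for illustration only:
kubeAPIQPS: 50        # --kube-api-qps
kubeAPIBurst: 100     # --kube-api-burst
eventRecordQPS: 50    # --event-qps
eventBurst: 100       # --event-burst
registryPullQPS: 10   # --registry-qps
registryBurst: 20     # --registry-burst
```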
... and restarted the kubelets, after which we saw the following picture on the graphs of calls to the API server:
... and yes, everything started to fly!
PS
For their help in collecting these bugs and preparing this article, I express my deep gratitude to the many engineers of our company, and especially to my colleague from our R&D team, Andrey Klimentyev (zuzzas).