Debugging network latency in Kubernetes

A few years ago, Kubernetes was already discussed on the official GitHub blog. Since then it has become the standard technology for deploying services. Kubernetes now manages a significant portion of internal and public services. As our clusters grew and performance requirements became stricter, we began to notice that some services on Kubernetes sporadically experienced latency that could not be explained by the load of the application itself.

In essence, applications experienced seemingly random network latency of up to 100 ms or more, which led to timeouts or retries. Services were expected to respond to requests much faster than 100 ms, but that is impossible if the connection itself takes that long. Separately, we observed very fast MySQL queries that should have taken milliseconds, and MySQL did complete them in milliseconds, but from the point of view of the requesting application the response took 100 ms or more.

It quickly became clear that the problem occurred only when connecting to a Kubernetes node, even if the call came from outside Kubernetes. The simplest way to reproduce the problem is a Vegeta test that runs from any internal host, tests a Kubernetes service on a specific port, and sporadically registers high latency. In this article, we look at how we tracked down the cause of this problem.

Eliminating unnecessary complexity in the chain leading to failure

By reproducing the same example, we wanted to narrow the focus of the problem and remove unnecessary layers of complexity. Initially, there were too many elements in the flow between Vegeta and the Kubernetes pods. To identify a deeper network problem, you need to rule some of them out.

The client (Vegeta) creates a TCP connection to any node in the cluster. Kubernetes operates as an overlay network (on top of the existing data center network) that uses IPIP, that is, it encapsulates the IP packets of the overlay network inside the IP packets of the data center. When connecting to the first node, stateful Network Address Translation (NAT) is performed to translate the IP address and port of the Kubernetes node to an IP address and port in the overlay network (specifically, the pod with the application). For return packets, the reverse sequence of actions is performed. It is a complex system with a lot of state and many elements that are constantly updated and changed as services are deployed and moved.
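To make the IPIP step concrete, here is a minimal sketch of what "an IP packet inside an IP packet" looks like on the wire. It is only an illustration under stated assumptions: the helper names (`ipv4_packet`, `ipip_encapsulate`) are hypothetical, the addresses are borrowed from the examples in this article and paired arbitrarily, and the inner payload is a placeholder rather than a real TCP segment.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IP-in-IP
IPPROTO_TCP = 6


def checksum(data):
    """Standard 16-bit ones' complement checksum used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_packet(src, dst, proto, payload, ttl=64):
    """Build a minimal 20-byte IPv4 header (no options) followed by payload."""
    version_ihl = (4 << 4) | 5
    total_len = 20 + len(payload)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_len, 0, 0, ttl, proto, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    # The checksum field occupies bytes 10-11 of the header.
    header = header[:10] + struct.pack("!H", checksum(header)) + header[12:]
    return header + payload


def ipip_encapsulate(inner_packet, node_src, node_dst):
    """Wrap an overlay packet in an outer header addressed between nodes."""
    return ipv4_packet(node_src, node_dst, IPPROTO_IPIP, inner_packet)


# Hypothetical overlay packet (pod to pod) carried between two hosts.
inner = ipv4_packet("10.125.20.10", "10.125.20.64", IPPROTO_TCP, b"placeholder")
outer = ipip_encapsulate(inner, "172.16.33.44", "172.16.47.27")
print("outer protocol:", outer[9], "inner protocol:", outer[20 + 9])
```

The receiving node strips the outer 20 bytes and then routes the inner packet as usual, which is one more layer where a delay could in principle hide.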

Running tcpdump during the Vegeta test shows a delay during the TCP handshake (between SYN and SYN-ACK). To remove this unnecessary complexity, you can use hping3 for simple "pings" with SYN packets. We check whether there is a delay in the response packet, and then reset the connection. We can filter the data to include only packets slower than 100 ms and get a simpler way to reproduce the problem than the full network layer 7 test in Vegeta. Here are "pings" of a Kubernetes node using TCP SYN/SYN-ACK on the service "node port" (30927) at 10 ms intervals, filtered by the slowest responses:

theojulienne@shell ~ $ sudo hping3 172.16.47.27 -S -p 30927 -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1485 win=29200 rtt=127.1 ms

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1486 win=29200 rtt=117.0 ms

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1487 win=29200 rtt=106.2 ms

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1488 win=29200 rtt=104.1 ms

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=5024 win=29200 rtt=109.2 ms

len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=5231 win=29200 rtt=109.2 ms

This immediately gives us a first observation. Judging by the sequence numbers and timings, these are clearly not one-off stalls. The delay often accumulates across consecutive packets, as if a backlog builds up and is eventually processed.
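The grouping can be made explicit with a few lines of Python. This is a rough sketch rather than part of the original investigation: it parses the hping3 lines shown above, groups replies with consecutive sequence numbers, and prints their RTTs. With probes sent every 10 ms, a backlog that is drained all at once should show RTTs falling by roughly 10 ms per packet, which is close to what the first group shows.

```python
import re

# Sample lines copied from the hping3 output above.
SAMPLE = """\
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1485 win=29200 rtt=127.1 ms
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1486 win=29200 rtt=117.0 ms
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1487 win=29200 rtt=106.2 ms
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=1488 win=29200 rtt=104.1 ms
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=5024 win=29200 rtt=109.2 ms
len=46 ip=172.16.47.27 ttl=59 DF id=0 sport=30927 flags=SA seq=5231 win=29200 rtt=109.2 ms
"""

LINE_RE = re.compile(r"seq=(\d+).*?rtt=([0-9.]+) ms")


def parse(text):
    """Return (seq, rtt_ms) pairs for every hping3 reply line in text."""
    return [(int(m.group(1)), float(m.group(2))) for m in LINE_RE.finditer(text)]


def group_bursts(samples):
    """Group replies whose sequence numbers are consecutive."""
    bursts = []
    for seq, rtt in samples:
        if bursts and seq == bursts[-1][-1][0] + 1:
            bursts[-1].append((seq, rtt))
        else:
            bursts.append([(seq, rtt)])
    return bursts


bursts = group_bursts(parse(SAMPLE))
for burst in bursts:
    seqs = [seq for seq, _ in burst]
    rtts = [rtt for _, rtt in burst]
    print(f"seq {seqs[0]}..{seqs[-1]}: {len(burst)} slow replies, rtt {rtts}")
```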

Next, we want to find out which components may be involved in the stall. Maybe it is some of the hundreds of iptables rules in NAT? Or is there a problem with IPIP tunneling on the network? One way to check is to test each step of the system by eliminating it. What happens if you remove the NAT and firewall logic, leaving only the IPIP part?

Fortunately, Linux makes it easy to reach the IP overlay layer directly if the machine is on the same network:

theojulienne@kube-node-client ~ $ sudo hping3 10.125.20.64 -S -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'

len=40 ip=10.125.20.64 ttl=64 DF id=0 sport=0 flags=RA seq=7346 win=0 rtt=127.3 ms

len=40 ip=10.125.20.64 ttl=64 DF id=0 sport=0 flags=RA seq=7347 win=0 rtt=117.3 ms

len=40 ip=10.125.20.64 ttl=64 DF id=0 sport=0 flags=RA seq=7348 win=0 rtt=107.2 ms

Judging by the results, the problem still remains! This rules out iptables and NAT. So is the problem TCP? Let's see how a regular ICMP ping behaves:

theojulienne@kube-node-client ~ $ sudo hping3 10.125.20.64 --icmp -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'

len=28 ip=10.125.20.64 ttl=64 id=42594 icmp_seq=104 rtt=110.0 ms

len=28 ip=10.125.20.64 ttl=64 id=49448 icmp_seq=4022 rtt=141.3 ms

len=28 ip=10.125.20.64 ttl=64 id=49449 icmp_seq=4023 rtt=131.3 ms

len=28 ip=10.125.20.64 ttl=64 id=49450 icmp_seq=4024 rtt=121.2 ms

len=28 ip=10.125.20.64 ttl=64 id=49451 icmp_seq=4025 rtt=111.2 ms

len=28 ip=10.125.20.64 ttl=64 id=49452 icmp_seq=4026 rtt=101.1 ms

len=28 ip=10.125.20.64 ttl=64 id=50023 icmp_seq=4343 rtt=126.8 ms

len=28 ip=10.125.20.64 ttl=64 id=50024 icmp_seq=4344 rtt=116.8 ms

len=28 ip=10.125.20.64 ttl=64 id=50025 icmp_seq=4345 rtt=106.8 ms

len=28 ip=10.125.20.64 ttl=64 id=59727 icmp_seq=9836 rtt=106.1 ms

The results show that the problem has not gone away. Perhaps it is the IPIP tunnel? Let's simplify the test further.

Is it perhaps every packet sent between these two hosts?

theojulienne@kube-node-client ~ $ sudo hping3 172.16.47.27 --icmp -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'

len=46 ip=172.16.47.27 ttl=61 id=41127 icmp_seq=12564 rtt=140.9 ms

len=46 ip=172.16.47.27 ttl=61 id=41128 icmp_seq=12565 rtt=130.9 ms

len=46 ip=172.16.47.27 ttl=61 id=41129 icmp_seq=12566 rtt=120.8 ms

len=46 ip=172.16.47.27 ttl=61 id=41130 icmp_seq=12567 rtt=110.8 ms

len=46 ip=172.16.47.27 ttl=61 id=41131 icmp_seq=12568 rtt=100.7 ms

len=46 ip=172.16.47.27 ttl=61 id=9062 icmp_seq=31443 rtt=134.2 ms

len=46 ip=172.16.47.27 ttl=61 id=9063 icmp_seq=31444 rtt=124.2 ms

len=46 ip=172.16.47.27 ttl=61 id=9064 icmp_seq=31445 rtt=114.2 ms

len=46 ip=172.16.47.27 ttl=61 id=9065 icmp_seq=31446 rtt=104.2 ms

We have simplified the situation to two Kubernetes nodes sending each other any packet, even an ICMP ping. They still see latency if the target host is a "bad" one (some are worse than others).

Now the last question: why does the delay occur only on kube-node servers? And does it happen when the kube-node is the sender or the receiver? Fortunately, this is also easy to figure out by sending a packet from a host outside Kubernetes, but to the same "known bad" recipient. As you can see, the problem has not gone away:

theojulienne@shell ~ $ sudo hping3 172.16.47.27 -p 9876 -S -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'

len=46 ip=172.16.47.27 ttl=61 DF id=0 sport=9876 flags=RA seq=312 win=0 rtt=108.5 ms

len=46 ip=172.16.47.27 ttl=61 DF id=0 sport=9876 flags=RA seq=5903 win=0 rtt=119.4 ms

len=46 ip=172.16.47.27 ttl=61 DF id=0 sport=9876 flags=RA seq=6227 win=0 rtt=139.9 ms

len=46 ip=172.16.47.27 ttl=61 DF id=0 sport=9876 flags=RA seq=7929 win=0 rtt=131.2 ms

We then run the same requests from the previous source kube-node to the external host (which rules out the source host, since a ping includes both an RX and a TX component):

theojulienne@kube-node-client ~ $ sudo hping3 172.16.33.44 -p 9876 -S -i u10000 | egrep --line-buffered 'rtt=[0-9]{3}.'
^C
--- 172.16.33.44 hping statistic ---
22352 packets transmitted, 22350 packets received, 1% packet loss
round-trip min/avg/max = 0.2/7.6/1010.6 ms

Examining packet captures of the latency gave us some additional information. Specifically, the sender (bottom of the capture) sees the timeout, but the receiver (top) does not, as shown by the Delta column of the capture (in seconds).

In addition, if you look at the difference in the ordering of TCP and ICMP packets (by sequence numbers) on the receiver side, ICMP packets always arrive in the same order in which they were sent, but with different timing. TCP packets, on the other hand, are sometimes interleaved, and some of them stall. In particular, if you examine the ports of the SYN packets, they are in order on the sender side, but not on the receiver side.

There is a subtle difference in how the network cards of modern servers (like those in our data center) process packets containing TCP or ICMP. When a packet arrives, the network adapter "hashes it per connection", that is, it tries to split connections across queues and send each queue to a separate processor core. For TCP, this hash includes both the source and destination IP address and port. In other words, each connection is hashed (potentially) differently. For ICMP, only the IP addresses are hashed, since there are no ports.
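The effect of the different hash inputs can be sketched with a toy model. Note the assumptions: real adapters compute the hash in hardware (commonly a Toeplitz hash), whereas this sketch substitutes SHA-256, and the queue count of 16 and the port range are invented for illustration. Only the choice of input fields matters here.

```python
import hashlib

NUM_RX_QUEUES = 16  # hypothetical queue count; real NICs vary


def rx_queue(fields):
    """Map a tuple of header fields to an RX queue index.

    Stand-in for the hardware hash: any deterministic hash shows the
    same effect, because only the hashed fields differ per protocol.
    """
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_RX_QUEUES


def tcp_queue(src_ip, src_port, dst_ip, dst_port):
    """TCP: addresses and ports are all part of the hash input."""
    return rx_queue((src_ip, src_port, dst_ip, dst_port))


def icmp_queue(src_ip, dst_ip):
    """ICMP: there are no ports, so only the addresses are hashed."""
    return rx_queue((src_ip, dst_ip))


SRC, DST = "172.16.33.44", "172.16.47.27"
tcp_queues = {tcp_queue(SRC, port, DST, 30927) for port in range(40000, 40200)}
icmp_queues = {icmp_queue(SRC, DST) for _ in range(200)}
print("queues hit by 200 TCP connections:", len(tcp_queues))
print("queues hit by 200 ICMP pings:", len(icmp_queues))
```

Every ICMP ping between the same two hosts lands on one queue, and therefore on one core, while TCP connections with different source ports spread out. If a single core is the one stalling, ICMP between an affected pair is always delayed and TCP only sometimes.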

Another new observation: during this period we see ICMP delays on all communication between the two hosts, but TCP does not show this. This tells us that the cause is probably related to RX queue hashing: the congestion is almost certainly in the processing of RX packets, not in the sending of responses.

This removes packet sending from the list of possible causes. We now know that the packet processing problem is on the receive side on some kube-node servers.

Understanding packet processing in the Linux kernel

To understand why the problem occurs on the receiver on some kube-node servers, let's look at how the Linux kernel processes packets.

Going back to the simplest traditional implementation, the network card receives a packet and sends an interrupt to the Linux kernel to signal that there is a packet to be processed. The kernel stops other work, switches context to the interrupt handler, processes the packet, and then returns to the current tasks.

This context switching is slow: the latency may not have been noticeable on 10 Mbps network cards in the '90s, but on modern 10G cards with a maximum throughput of 15 million packets per second, each core of a small eight-core server can be interrupted millions of times per second.

To avoid constantly handling interrupts, many years ago Linux added NAPI: a network API that all modern drivers use to improve performance at high speeds. At low speeds the kernel still receives interrupts from the network card in the old way. Once enough packets arrive to exceed a threshold, the kernel disables interrupts and instead begins polling the network adapter and picking up packets in batches. Processing is performed in softirq, that is, in the context of software interrupts, after system calls and hardware interrupts, when the kernel (as opposed to user space) is already running.

This is much faster, but it causes a different problem. If there are too many packets, all the time is spent processing packets from the network card, and user-space processes do not have time to actually drain these queues (reading from TCP connections, and so on). Eventually the queues fill up and we start dropping packets. In an attempt to find a balance, the kernel sets a budget for the maximum number of packets processed in the softirq context. Once this budget is exceeded, a separate thread, ksoftirqd, is woken up (you will see one of them per core in ps), which handles these softirqs outside the normal syscall/interrupt path. This thread is scheduled using the standard process scheduler, which tries to allocate resources fairly.
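The budget mechanism can be pictured with a toy model. This is only a sketch of the idea and not kernel code: the function name `drain_rx_queue` is hypothetical, and the budget of 300 packets is an illustrative value chosen for the example.

```python
from collections import deque


def drain_rx_queue(rx_queue, budget):
    """Process at most `budget` packets inline; report whether work remains.

    Toy model only. Returns (processed_inline, deferred), where `deferred`
    means the remaining packets must wait for the ksoftirqd thread, which
    runs only when the scheduler gives it CPU time on that core.
    """
    processed = []
    while rx_queue and len(processed) < budget:
        processed.append(rx_queue.popleft())
    deferred = bool(rx_queue)
    return processed, deferred


queue = deque(range(500))  # 500 packets waiting on one RX queue
inline, needs_ksoftirqd = drain_rx_queue(queue, budget=300)
print(len(inline), "packets handled in softirq context")
print("wake ksoftirqd:", needs_ksoftirqd, "with", len(queue), "packets still queued")
```

The point of the model is the second line of output: once work is deferred, how soon the remaining packets are handled depends on how soon the scheduler can run ksoftirqd on that core.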

Having studied how the kernel processes packets, you can see that there is a real possibility of stalls. If softirq calls are run less frequently, packets have to wait for some time in the RX queue on the network card before being processed. This may be due to some task blocking the processor core, or something else preventing the core from running softirq.

Narrowing processing down to a core or method

Softirq delays are just a guess for now. But the guess makes sense, and we know that we are seeing something very similar. So the next step is to confirm this theory and, if it is confirmed, to find the reason for the delays.

Let's return to our slow packets:

len=46 ip=172.16.53.32 ttl=61 id=29573 icmp_seq=1953 rtt=99.3 ms

len=46 ip=172.16.53.32 ttl=61 id=29574 icmp_seq=1954 rtt=89.3 ms

len=46 ip=172.16.53.32 ttl=61 id=29575 icmp_seq=1955 rtt=79.2 ms

len=46 ip=172.16.53.32 ttl=61 id=29576 icmp_seq=1956 rtt=69.1 ms

len=46 ip=172.16.53.32 ttl=61 id=29577 icmp_seq=1957 rtt=59.1 ms

len=46 ip=172.16.53.32 ttl=61 id=29790 icmp_seq=2070 rtt=75.7 ms

len=46 ip=172.16.53.32 ttl=61 id=29791 icmp_seq=2071 rtt=65.6 ms

len=46 ip=172.16.53.32 ttl=61 id=29792 icmp_seq=2072 rtt=55.5 ms

As discussed earlier, these ICMP packets are hashed into a single RX NIC queue and processed by a single CPU core. If we want to understand what Linux is doing, it is useful to know where (on which CPU core) and how (softirq, ksoftirqd) these packets are processed in order to track the process.

Now it is time to use tools that let you monitor the Linux kernel in real time. Here we used bcc. This toolset lets you write small C programs that hook arbitrary functions in the kernel and buffer the events into a user-space Python program that can process them and return the result to you. Hooking arbitrary functions in the kernel is tricky business, but the toolkit is designed for maximum safety and for tracking down exactly the kind of production issues that are not easily reproduced in a test or development environment.

The plan here is simple: we know that the kernel processes these ICMP pings, so we put a hook on the kernel function icmp_echo, which accepts an incoming ICMP echo request packet and initiates sending an ICMP echo response. We can identify the packet by the incrementing icmp_seq number, which hping3 shows above.

The code of the bcc script looks complicated, but it is not as scary as it seems. The icmp_echo function receives a struct sk_buff *skb: this is the packet with the "echo request". We can track it, pull out the sequence echo.sequence (which corresponds to the icmp_seq from hping3 above), and send it to user space. It is also convenient to capture the current process name/id. Below are the results that we see directly while the kernel processes packets:
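For readers who have not looked at the ICMP header, the sequence field the tracer extracts sits in the last two bytes of the 8-byte echo header. The sketch below is a plain user-space illustration of that layout, not the bcc script itself; `build_echo_request` and `echo_sequence` are hypothetical helper names, and the checksum is left at zero for brevity.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for "echo request"


def build_echo_request(identifier, sequence, payload=b""):
    """Build an ICMP echo request header (checksum left as 0 for brevity)."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    return header + payload


def echo_sequence(icmp_bytes):
    """Return the echo sequence field from the first 8 bytes of the packet."""
    icmp_type, _code, _checksum, _ident, sequence = struct.unpack(
        "!BBHHH", icmp_bytes[:8]
    )
    if icmp_type != ICMP_ECHO_REQUEST:
        raise ValueError("not an ICMP echo request")
    return sequence


packet = build_echo_request(identifier=0x1234, sequence=1953)
print(echo_sequence(packet))
```

The kernel-side hook does the equivalent read on the packet buffer and reports the value together with the name and id of whatever task is current on that core.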

TGID PID PROCESS NAME ICMP_SEQ
0 0 swapper/11 770
0 0 swapper/11 771
0 0 swapper/11 772
0 0 swapper/11 773
0 0 swapper/11 774
20041 20086 prometheus 775
0 0 swapper/11 776
... ... spokes-report-s ...

Note that in the softirq context, the processes that made system calls show up as the "process", when in fact it is the kernel that is safely processing packets within the kernel context.

With this tool we can correlate specific processes with specific packets that show a delay in hping3. Let's run a simple grep on this capture for certain icmp_seq values. Packets matching the icmp_seq values above are marked together with the RTT we observed above (in parentheses are the expected RTT values for packets that we filtered out because their RTT was less than 50 ms):

TGID PID PROCESS NAME ICMP_SEQ ** RTT
--
10137 10436 cadvisor 1951
10137 10436 cadvisor 1952
76 76 ksoftirqd/11 1953 ** 99ms
76 76 ksoftirqd/11 1954 ** 89ms
76 76 ksoftirqd/11 1955 ** 79ms
76 76 ksoftirqd/11 1956 ** 69ms
76 76 ksoftirqd/11 1957 ** 59ms
76 76 ksoftirqd/11 1958 ** (49ms)
76 76 ksoftirqd/11 1959 ** (39ms)
76 76 ksoftirqd/11 1960 ** (29ms)
76 76 ksoftirqd/11 1961 ** (19ms)
76 76 ksoftirqd/11 1962 ** (9ms)
--
10137 10436 cadvisor 2068
10137 10436 cadvisor 2069
76 76 ksoftirqd/11 2070 ** 75ms
76 76 ksoftirqd/11 2071 ** 65ms
76 76 ksoftirqd/11 2072 ** 55ms
76 76 ksoftirqd/11 2073 ** (45ms)
76 76 ksoftirqd/11 2074 ** (35ms)
76 76 ksoftirqd/11 2075 ** (25ms)
76 76 ksoftirqd/11 2076 ** (15ms)
76 76 ksoftirqd/11 2077 ** (5ms)

The results tell us several things. First, all of these packets are processed by the ksoftirqd/11 context. This means that for this particular pair of machines, ICMP packets were hashed to core 11 on the receiving end. We also see that every time there is a stall, some packets are processed in the context of a cadvisor system call. Then ksoftirqd takes over and processes the accumulated queue: exactly the number of packets that piled up after cadvisor.
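The same reading of the table can be automated. The following sketch (hypothetical helper names, sample rows copied from the grep output above with the RTT annotations removed) reports which process was seen immediately before each run of ksoftirqd rows and how many packets that run handled.

```python
TRACE = """\
10137 10436 cadvisor 1951
10137 10436 cadvisor 1952
76 76 ksoftirqd/11 1953
76 76 ksoftirqd/11 1954
76 76 ksoftirqd/11 1955
76 76 ksoftirqd/11 1956
76 76 ksoftirqd/11 1957
76 76 ksoftirqd/11 1958
76 76 ksoftirqd/11 1959
76 76 ksoftirqd/11 1960
76 76 ksoftirqd/11 1961
76 76 ksoftirqd/11 1962
--
10137 10436 cadvisor 2068
10137 10436 cadvisor 2069
76 76 ksoftirqd/11 2070
76 76 ksoftirqd/11 2071
76 76 ksoftirqd/11 2072
76 76 ksoftirqd/11 2073
76 76 ksoftirqd/11 2074
76 76 ksoftirqd/11 2075
76 76 ksoftirqd/11 2076
76 76 ksoftirqd/11 2077
"""


def parse_rows(text):
    """Return (tgid, pid, name, icmp_seq) tuples, skipping headers and '--'."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        tgid, pid, name, seq = parts[:4]
        rows.append((int(tgid), int(pid), name, int(seq)))
    return rows


def bursts_after(rows, worker_prefix="ksoftirqd"):
    """Return (preceding_process, first_seq, burst_length) per worker burst."""
    found = []
    previous = None
    for row in rows:
        _tgid, _pid, name, seq = row
        is_worker = name.startswith(worker_prefix)
        prev_is_worker = previous is not None and previous[2].startswith(worker_prefix)
        if is_worker and prev_is_worker and seq == previous[3] + 1:
            found[-1][2] += 1  # same burst continues
        elif is_worker:
            # New burst; the predecessor is unknown if the previous row
            # was also a worker row with a non-consecutive sequence.
            before = previous[2] if previous and not prev_is_worker else None
            found.append([before, seq, 1])
        previous = row
    return [tuple(entry) for entry in found]


for before, first_seq, length in bursts_after(parse_rows(TRACE)):
    print(f"{length} packets drained by ksoftirqd from seq {first_seq}, preceded by {before}")
```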

The fact that cadvisor is always running immediately beforehand implies its involvement in the problem. Ironically, the purpose of cadvisor is to "analyze resource usage and performance characteristics of running containers", rather than to cause this performance problem.

As with other aspects of containers, these are all cutting-edge tools, and they can be expected to run into performance problems under some unforeseen circumstances.

What does cadvisor do to stall the packet queue?

We now have a fairly good understanding of how the stall occurs, which process causes it, and on which CPU. We see that due to a hard block, the Linux kernel does not have time to schedule ksoftirqd. And we see that packets are processed in the context of cadvisor. It is logical to assume that cadvisor launches a slow syscall, after which all the packets accumulated during that time are processed.

This is a theory, but how do we test it? What we can do is trace the CPU core throughout this process, find the point where the number of packets exceeds the budget and ksoftirqd is called, and then look a little earlier to see what exactly was running on the CPU core just before that point. It is like x-raying the CPU every few milliseconds.

Conveniently, all of this can be done with existing tools. For example, perf record samples a given CPU core at a specified frequency and can generate a call graph of the running system, including both user space and the Linux kernel. You can take this recording and process it using a small fork of Brendan Gregg's FlameGraph program that preserves the order of the stack traces. We can save single-line stack traces every 1 ms, and then highlight and save the 100 milliseconds of samples before ksoftirqd appears in the trace:

# record 999 times a second, or every 1ms with some offset so not to align exactly with timers
sudo perf record -C 11 -g -F 999
# take that recording and make a simpler stack trace.
sudo perf script 2>/dev/null | ./FlameGraph/stackcollapse-perf-ordered.pl | grep ksoftir -B 100

Here are the results:

(hundreds of traces that look similar)

cadvisor;[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages
cadvisor;[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages
cadvisor;[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;vfs_read;seq_read;memcg_stat_show;mem_cgroup_iter
cadvisor;[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages
cadvisor;[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages
ksoftirqd/11;ret_from_fork;kthread;kthread;smpboot_thread_fn;smpboot_thread_fn;run_ksoftirqd;__do_softirq;net_rx_action;ixgbe_poll;ixgbe_clean_rx_irq;napi_gro_receive;netif_receive_skb_internal;inet_gro_receive;bond_handle_frame;__netif_receive_skb_core;ip_rcv_finish;ip_rcv;ip_forward_finish;ip_forward;ip_finish_output;nf_iterate;ip_output;ip_finish_output2;__dev_queue_xmit;dev_hard_start_xmit;ipip_tunnel_xmit;ip_tunnel_xmit;iptunnel_xmit;ip_local_out;dst_output;__ip_local_out;nf_hook_slow;nf_iterate;nf_conntrack_in;generic_packet;ipt_do_table;set_match_v4;ip_set_test;hash_net4_kadt;ixgbe_xmit_frame_ring;swiotlb_dma_mapping_error;hash_net4_test
ksoftirqd/11;ret_from_fork;kthread;kthread;smpboot_thread_fn;smpboot_thread_fn;run_ksoftirqd;__do_softirq;net_rx_action;gro_cell_poll;napi_gro_receive;netif_receive_skb_internal;inet_gro_receive;__netif_receive_skb_core;ip_rcv_finish;ip_rcv;ip_forward_finish;ip_forward;ip_finish_output;nf_iterate;ip_output;ip_finish_output2;__dev_queue_xmit;dev_hard_start_xmit;dev_queue_xmit_nit;packet_rcv;tpacket_rcv;sch_direct_xmit;validate_xmit_skb_list;validate_xmit_skb;netif_skb_features;ixgbe_xmit_frame_ring;swiotlb_dma_mapping_error;__dev_queue_xmit;dev_hard_start_xmit;__bpf_prog_run;__bpf_prog_run

There is a lot here, but the main thing is that we find the "cadvisor before ksoftirqd" pattern that we saw earlier in the ICMP tracer. What does this mean?

Each line is a CPU trace at a specific point in time. Each call down the stack on a line is separated by a semicolon. In the middle of the lines we see the syscall being called, read(): .... ;do_syscall_64;sys_read; .... So cadvisor spends a lot of time in the read() system call, in the mem_cgroup_* functions (top of the call stack / end of the line).
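With hundreds of such lines, counting is easier than reading. The sketch below summarizes collapsed traces by syscall and by leaf function. It is an illustration only: the helper names are hypothetical, and the sample traces are abbreviated versions of the ones above (the repeated [cadvisor] frames and most of the ksoftirqd stack are trimmed).

```python
from collections import Counter

# Abbreviated traces: one sample per line, frames separated by semicolons.
TRACES = [
    "cadvisor;[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;"
    "vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages",
    "cadvisor;[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;"
    "vfs_read;seq_read;memcg_stat_show;mem_cgroup_nr_lru_pages;mem_cgroup_node_nr_lru_pages",
    "cadvisor;[cadvisor];entry_SYSCALL_64_after_swapgs;do_syscall_64;sys_read;"
    "vfs_read;seq_read;memcg_stat_show;mem_cgroup_iter",
    "ksoftirqd/11;ret_from_fork;kthread;smpboot_thread_fn;run_ksoftirqd;"
    "__do_softirq;net_rx_action",
]


def split_trace(line):
    """Split one collapsed trace into (process_name, frames)."""
    frames = line.strip().split(";")
    return frames[0], frames[1:]


def syscall_of(frames):
    """Return the first sys_* frame, or None if the sample is not in a syscall."""
    for frame in frames:
        if frame.startswith("sys_"):
            return frame
    return None


def summarize(lines):
    """Count samples per (process, syscall) and per (process, leaf function)."""
    syscalls = Counter()
    leaves = Counter()
    for line in lines:
        comm, frames = split_trace(line)
        if not frames:
            continue
        leaves[(comm, frames[-1])] += 1
        syscall = syscall_of(frames)
        if syscall:
            syscalls[(comm, syscall)] += 1
    return syscalls, leaves


syscalls, leaves = summarize(TRACES)
print("syscalls:", syscalls.most_common())
print("leaf functions:", leaves.most_common())
```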

It is inconvenient to work out from a call trace what exactly is being read, so let's run strace, see what cadvisor is doing, and find system calls that take longer than 100 ms:

theojulienne@kube-node-bad ~ $ sudo strace -p 10137 -T -ff 2>&1 | egrep '<0.[1-9]'
[pid 10436] <... futex resumed> ) = 0 <0.156784>
[pid 10432] <... futex resumed> ) = 0 <0.258285>
[pid 10137] <... futex resumed> ) = 0 <0.678382>
[pid 10384] <... futex resumed> ) = 0 <0.762328>
[pid 10436] <... read resumed> "cache 154234880\nrss 507904\nrss_h"..., 4096) = 658 <0.179438>
[pid 10384] <... futex resumed> ) = 0 <0.104614>
[pid 10436] <... futex resumed> ) = 0 <0.175936>
[pid 10436] <... read resumed> "cache 0\nrss 0\nrss_huge 0\nmapped_"..., 4096) = 577 <0.228091>
[pid 10427] <... read resumed> "cache 0\nrss 0\nrss_huge 0\nmapped_"..., 4096) = 577 <0.207334>
[pid 10411] <... epoll_ctl resumed> ) = 0 <0.118113>
[pid 10382] <... pselect6 resumed> ) = 0 (Timeout) <0.117717>
[pid 10436] <... read resumed> "cache 154234880\nrss 507904\nrss_h"..., 4096) = 660 <0.159891>
[pid 10417] <... futex resumed> ) = 0 <0.917495>
[pid 10436] <... futex resumed> ) = 0 <0.208172>
[pid 10417] <... futex resumed> ) = 0 <0.190763>
[pid 10417] <... read resumed> "cache 0\nrss 0\nrss_huge 0\nmapped_"..., 4096) = 576 <0.154442>

As you might expect, we see slow read() calls here. From the contents of the read operations and the mem_cgroup context, it is clear that these slow read() calls refer to the memory.stat file, which shows memory usage and cgroup limits (cgroups are Docker's resource isolation technology). The cadvisor tool polls this file to obtain resource usage information for containers. Let's check whether it is the kernel or cadvisor doing something unexpected:

theojulienne@kube-node-bad ~ $ time cat /sys/fs/cgroup/memory/memory.stat >/dev/null

real 0m0.153s
user 0m0.000s
sys 0m0.152s
theojulienne@kube-node-bad ~ $

Now we can reproduce the bug and understand that the Linux kernel is hitting a pathological case.
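The same measurement is easy to script, for example to sample the read time repeatedly or across several files. A minimal sketch, assuming the cgroup v1 path used in the command above; the function name `time_read` is hypothetical and nothing here touches the system until it is called.

```python
import time

# Path from the command above; it exists only on hosts using cgroup v1.
MEMORY_STAT = "/sys/fs/cgroup/memory/memory.stat"


def time_read(path, chunk_size=4096):
    """Read the whole file once and return (elapsed_seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def is_slow(path, threshold_s=0.1):
    """True if a single full read takes longer than threshold_s seconds."""
    elapsed, _size = time_read(path)
    return elapsed > threshold_s
```

Calling `time_read(MEMORY_STAT)` on an affected node should report a duration in the same range as the `time cat` run above, almost all of it spent in the kernel.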

Why is the read operation so slow?

At this point, it is much easier to find reports from others about similar problems. As it turned out, this bug had been reported in the cadvisor tracker as a problem of excessive CPU usage; it is just that no one had noticed that latency was also randomly showing up in the network stack. It had indeed been noticed that cadvisor was consuming more CPU time than expected, but this was not given much importance, since our servers have plenty of CPU resources, so the problem was not studied carefully.

The problem is that cgroups account for memory usage within a namespace (container). When all processes in a cgroup exit, Docker releases the memory cgroup. However, "memory" is not just process memory. Even though the process memory itself is no longer used, it turns out that the kernel still charges cached content, such as dentries and inodes (directory and file metadata), to the memory cgroup. From the problem description:

zombie cgroups: cgroups that have no processes and have been deleted, but still have memory charged to them (in my case, from the dentry cache, but it can also be allocated from the page cache or tmpfs).

Having the kernel check all the pages in the cache when releasing a cgroup could be very slow, so a lazy approach is chosen: wait until those pages are requested again, and then finally clean up the cgroup when the memory is actually needed. Until then, the cgroup is still taken into account when collecting statistics.

From a performance standpoint, this is a trade-off: faster initial cleanup at the cost of leaving some cached memory around. That is fine. When the kernel reclaims the last of the cached memory, the cgroup is eventually cleaned up, so it cannot really be called a "leak". Unfortunately, the specific implementation of the memory.stat lookup in this kernel version (4.9), combined with the large amount of memory on our servers, means that it takes much longer for the last of the cached data to be reclaimed and for zombie cgroups to be cleaned up.

It turned out that some of our nodes had so many zombie cgroups that the read latency exceeded one second.
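One heuristic for estimating how many zombie cgroups a node carries, which is not described in this article and should be treated as an assumption, is to compare the number of memory cgroups the kernel reports in /proc/cgroups (the num_cgroups column) with the number of directories actually visible under the cgroup v1 memory hierarchy. The helper names below are hypothetical.

```python
import os


def kernel_cgroup_count(proc_cgroups_text, subsystem="memory"):
    """Return num_cgroups for a subsystem from the text of /proc/cgroups.

    Expected columns: subsys_name, hierarchy, num_cgroups, enabled.
    """
    for line in proc_cgroups_text.splitlines():
        if line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[0] == subsystem:
            return int(fields[2])
    raise KeyError(subsystem)


def visible_cgroup_count(root):
    """Count cgroup directories under root, including root itself."""
    return sum(1 for _dirpath, _dirs, _files in os.walk(root))


def estimate_zombies(proc_cgroups_text, root, subsystem="memory"):
    """Cgroups the kernel still tracks that no longer have a directory."""
    tracked = kernel_cgroup_count(proc_cgroups_text, subsystem)
    return max(0, tracked - visible_cgroup_count(root))
```

On a cgroup v1 host this could be called as `estimate_zombies(open("/proc/cgroups").read(), "/sys/fs/cgroup/memory")`. A large positive result would be consistent with the zombie cgroups described above, though it is an estimate rather than an exact count.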

The workaround for the cadvisor issue is to immediately free the dentries/inodes caches across the system, which promptly eliminates the read latency as well as the network latency on the host, since dropping the cache also frees the cached pages held by zombie cgroups. This is not a solution, but it confirms the cause of the problem.
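On Linux, the usual way to free these caches is to write the value 2 to /proc/sys/vm/drop_caches, which asks the kernel to release reclaimable dentries and inodes. The article does not show the exact command that was used, so the sketch below is an assumption about how the workaround could be scripted; the function name is hypothetical, it requires root, and it does nothing until it is called.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_dentries_and_inodes(path=DROP_CACHES):
    """Ask the kernel to free reclaimable dentries and inodes (value 2).

    Requires root. Dropping caches is non-destructive, but it can
    temporarily slow down file access while the caches refill.
    """
    if hasattr(os, "sync"):
        os.sync()  # flush dirty data first so more objects are reclaimable
    with open(path, "w") as handle:
        handle.write("2\n")
```

Timing a read of memory.stat before and after calling `drop_dentries_and_inodes()` on an affected node is a quick way to check whether the cached metadata is what makes the read slow.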

It turned out that in newer kernel versions (4.19+) the performance of the memory.stat call was improved, so switching to such a kernel fixed the problem. In the meantime, we had tooling to detect problematic nodes in Kubernetes clusters, gracefully drain them, and reboot them. We combed through all the clusters, found the nodes with sufficiently high latency, and rebooted them. This gave us time to update the OS on the remaining servers.

Summing up

Because this bug stalled RX NIC queue processing for hundreds of milliseconds, it simultaneously caused high latency on short connections and latency in the middle of connections, such as between MySQL requests and response packets.

Understanding and maintaining the performance of the most fundamental systems, such as Kubernetes, is critical to the reliability and speed of all services built on top of them. Every system you run benefits from Kubernetes performance improvements.

Source: www.habr.com
