Translator's note: This article, written by Galo Navarro, Principal Software Engineer at the European company Adevinta, is a fascinating and instructive "investigation" in the field of infrastructure operations. Its original title was slightly expanded in translation for a reason the author explains at the beginning.
Author's note: It looks like this post
A few weeks ago, my team was migrating a single microservice to a core platform that includes CI/CD, a Kubernetes-based runtime, metrics, and other goodies. The move was a trial: we planned to use it as a template and migrate roughly 150 more services in the coming months. All of them are responsible for running some of the largest online platforms in Spain (Infojobs, Fotocasa, etc.).
After we deployed the application to Kubernetes and redirected some traffic to it, an alarming surprise awaited us. Request latency on Kubernetes was 10 times higher than on EC2. Clearly, we had to either find a solution to this problem or abandon the migration of the microservice (and, possibly, the entire project).
Why is latency so much higher on Kubernetes than on EC2?
To find the bottleneck, we collected metrics along the entire request path. Our architecture is simple: an API gateway (Zuul) proxies requests to microservice instances on EC2 or on Kubernetes. On Kubernetes we use the NGINX Ingress Controller, and the backends are ordinary objects such as
                                  EC2
                            +---------------+
                            |  +---------+  |
                            |  |         |  |
                       +-------> BACKEND |  |
                       |    |  |         |  |
                       |    |  +---------+  |
                       |    +---------------+
             +------+  |
Public       |      |  |
      -------> ZUUL +--+
traffic      |      |  |              Kubernetes
             +------+  |    +-----------------------------+
                       |    |  +-------+      +---------+ |
                       |    |  |       |  xx  |         | |
                       +-------> NGINX +------> BACKEND | |
                            |  |       |  xx  |         | |
                            |  +-------+      +---------+ |
                            +-----------------------------+
The problem appeared to be related to upstream latency at the backend (I marked the problem area on the diagram with "xx"). On EC2, the application responded in about 20 ms. On Kubernetes, latency rose to 100-200 ms.
We quickly ruled out the likely suspects related to the change of runtime. The JVM version stayed the same. Containerization had nothing to do with it either: the application was already running successfully in containers on EC2. Load? We saw high latency even at 1 request per second. Garbage collection pauses could also be ruled out.
One of our Kubernetes administrators wondered whether the application had external dependencies, because DNS queries had caused similar problems in the past.
Hypothesis 1: DNS name resolution
On each request, our application accesses an AWS Elasticsearch instance one to three times at a domain like elastic.spain.adevinta.com.
DNS queries from inside one of our containers:
[root@be-851c76f696-alf8z /]# while true; do dig "elastic.spain.adevinta.com" | grep time; sleep 2; done
;; Query time: 22 msec
;; Query time: 22 msec
;; Query time: 29 msec
;; Query time: 21 msec
;; Query time: 28 msec
;; Query time: 43 msec
;; Query time: 39 msec
The same queries from one of the EC2 instances where the application runs:
bash-4.4# while true; do dig "elastic.spain.adevinta.com" | grep time; sleep 2; done
;; Query time: 77 msec
;; Query time: 0 msec
;; Query time: 0 msec
;; Query time: 0 msec
;; Query time: 0 msec
Given that a lookup took about 30 ms, it became clear that DNS resolution when accessing Elasticsearch really was contributing to the increased latency.
However, this was strange for two reasons:
- We already have plenty of applications on Kubernetes that talk to AWS resources without suffering from high latency. Whatever the cause, it is specific to this particular case.
- We know that the JVM caches DNS results in memory. In our images, the TTL is set in $JAVA_HOME/jre/lib/security/java.security to 10 seconds: networkaddress.cache.ttl = 10. In other words, the JVM should cache every DNS lookup for 10 seconds.
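To make the second point concrete, here is a minimal sketch of a TTL-based lookup cache. It is not the JVM's implementation; the class name, the injectable resolver, and the fake clock are all assumptions made for illustration. It only shows why, with a 10-second TTL, most requests should never touch DNS at all.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Caches lookups for ttl_seconds, loosely mirroring networkaddress.cache.ttl."""

    def __init__(
        self,
        resolver: Callable[[str], str],
        ttl_seconds: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolver = resolver      # the real (slow) lookup, injected
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.lookups = 0               # how many times the resolver was called

    def resolve(self, host: str) -> str:
        now = self._clock()
        cached = self._entries.get(host)
        if cached is not None and now < cached[1]:
            return cached[0]           # still fresh: no DNS round trip
        self.lookups += 1
        address = self._resolver(host)
        self._entries[host] = (address, now + self._ttl)
        return address
```

At one request per second, a cache like this would call the resolver only once every ten seconds, so a 30 ms lookup alone could not explain a slowdown on every single request.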
To test the first hypothesis, we decided to stop calling DNS for a while and see whether the problem went away. Our first idea was to reconfigure the application to talk to Elasticsearch directly by IP address rather than by domain name. That would have required code changes and a new deployment, so we simply mapped the domain to its IP address in /etc/hosts:
34.55.5.111 elastic.spain.adevinta.com
Now the container obtained the IP almost instantly. This brought some improvement, but we were only slightly closer to the expected latency. Although DNS resolution was slow, the real cause still eluded us.
Network diagnostics
We decided to analyze traffic from the container with tcpdump to see exactly what was happening on the network:
[root@be-851c76f696-alf8z /]# tcpdump -leni any -w capture.pcap
We then sent a few requests and downloaded the capture (kubectl cp my-service:/capture.pcap capture.pcap) for further analysis.
There was nothing suspicious about the DNS queries (except for one small thing I will come back to later). But there were certain oddities in how our service handled each request. Below is a screenshot of the capture showing a request being received before the response begins:
Packet numbers are shown in the first column. For clarity, I color-coded the different TCP streams.
The green stream starting with packet 328 shows how the client (172.17.22.150) established a TCP connection to the container (172.17.36.147). After the initial handshake (328-330), packet 331 carried HTTP GET /v1/.. - an incoming request to our service. The whole process took 1 ms.
The gray stream (from packet 339) shows our service sending an HTTP request to the Elasticsearch instance (there is no TCP handshake because it reuses an existing connection). This took 18 ms.
So far everything is fine, and the timings roughly match the expected latency (20-30 ms when measured from the client).
The blue section, however, takes 86 ms. What is going on there? With packet 333, our service sent an HTTP GET request to /latest/meta-data/iam/security-credentials, and immediately afterwards, over the same TCP connection, another GET request to /latest/meta-data/iam/security-credentials/arn:..
We found that this was repeated for every request throughout the trace. DNS resolution is indeed a bit slower in our containers (the explanation for this is quite interesting, but I will save it for a separate article). It turned out that the cause of the long delays was the calls to the AWS Instance Metadata service on every request.
Hypothesis 2: unnecessary calls to AWS
Both endpoints belong to the AWS Instance Metadata service. The first request returns the IAM role associated with the instance:
/ # curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
arn:aws:iam::<account_id>:role/some_role
The second request asks the second endpoint for temporary credentials for this instance:
/ # curl http://169.254.169.254/latest/meta-data/iam/security-credentials/arn:aws:iam::<account_id>:role/some_role
{
"Code" : "Success",
"LastUpdated" : "2012-04-26T16:39:16Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "ASIAIOSFODNN7EXAMPLE",
"SecretAccessKey" : "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"Token" : "token",
"Expiration" : "2017-05-17T15:09:54Z"
}
The client can use them for a short period and must periodically obtain new credentials (before they reach Expiration). The model is simple: AWS rotates temporary keys frequently for security reasons, but clients can cache them for a few minutes to offset the performance cost of obtaining new credentials.
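As a rough illustration of how a client could tell how long such a response remains usable, the sketch below parses the Expiration field and subtracts the current time. The function name and the trimmed sample document are assumptions made for this example; only the field name and timestamp format come from the response shown above.

```python
import json
from datetime import datetime, timedelta, timezone


def remaining_validity(credentials_json: str, now: datetime) -> timedelta:
    """Return how long the credentials stay valid, based on their Expiration field."""
    doc = json.loads(credentials_json)
    # Timestamps in the response look like 2017-05-17T15:09:54Z (UTC).
    expiration = datetime.strptime(doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return expiration.replace(tzinfo=timezone.utc) - now


# Trimmed-down sample with the same Expiration as the response above.
SAMPLE = '{"Code": "Success", "Expiration": "2017-05-17T15:09:54Z"}'

if __name__ == "__main__":
    print(remaining_validity(SAMPLE, datetime.now(timezone.utc)))
```

Running a check like this against the live endpoint from both environments is one way to compare how long the credentials issued to each actually last.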
The AWS Java SDK is supposed to take care of organizing this process, but for some reason that was not happening.
After searching the issues on GitHub, we came across a relevant issue.
The AWS SDK refreshes credentials when one of the following conditions is met:
- The expiration date (Expiration) falls within EXPIRATION_THRESHOLD, which is hardcoded to 15 minutes.
- More time has passed since the last attempt to refresh the credentials than REFRESH_THRESHOLD, which is hardcoded to 60 minutes.
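The two conditions above can be sketched as a single predicate. This is an illustration of the rule as described here, not the SDK's actual code: the function name, its signature, and the exact comparison operators are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

# Values as described in the text; in the SDK they are hardcoded constants.
EXPIRATION_THRESHOLD = timedelta(minutes=15)
REFRESH_THRESHOLD = timedelta(minutes=60)


def should_refresh(
    now: datetime,
    expiration: Optional[datetime],
    last_refresh_attempt: Optional[datetime],
) -> bool:
    """Decide whether cached credentials need to be fetched again."""
    if expiration is None:
        return True  # nothing cached yet
    if expiration - now < EXPIRATION_THRESHOLD:
        return True  # credentials expire too soon
    if last_refresh_attempt is not None and now - last_refresh_attempt > REFRESH_THRESHOLD:
        return True  # too long since the last refresh attempt
    return False
```

Note that the first condition depends only on how much lifetime the credentials have left, not on how recently they were fetched.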
To see the actual expiration date of the credentials we were receiving, we ran the cURL commands above from both the container and the EC2 instance. The credential obtained from the container turned out to have a much shorter validity period: exactly 15 minutes.
Now everything was clear: on the first request, our service obtained temporary credentials. Since they were valid for no more than 15 minutes, the AWS SDK decided to refresh them on the next request. And this happened on every request.
Why was the credential validity period shorter?
AWS Instance Metadata is designed to work with EC2 instances, not Kubernetes. At the same time, we did not want to change the application's interface. For this we use KIAM.
KIAM provides short-lived credentials to pods. This makes sense given that a pod's lifetime is shorter than that of an EC2 instance. The default validity period for these credentials is 15 minutes.
As a result, when the two defaults are layered on top of each other, a problem arises. Each credential issued to the application expires after 15 minutes. However, the AWS Java SDK forces a refresh of any credential that has less than 15 minutes left before its expiration date.
Consequently, the temporary credential is refreshed on every request, which involves a couple of calls to the AWS API and causes a significant increase in latency. In the AWS Java SDK we found
The solution was simple. We just reconfigured KIAM to request credentials with a longer validity period. Once that was done, requests began to flow without involving the AWS Metadata service, and latency dropped to levels even lower than on EC2.
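The interaction of the two defaults, and the effect of the fix, can be reproduced with a small simulation. It is only a sketch under stated assumptions: the function name is hypothetical, only the 15-minute expiration rule is modeled, and the 60-minute lifetime is an arbitrary example of "longer" rather than the value we configured in KIAM.

```python
from datetime import timedelta

# Refresh rule from the AWS Java SDK as described above (15-minute threshold).
EXPIRATION_THRESHOLD = timedelta(minutes=15)


def count_refreshes(session_ttl: timedelta, requests: int, interval: timedelta) -> int:
    """Count credential fetches over a series of evenly spaced requests."""
    now = timedelta(0)   # time elapsed since the start of the simulation
    expiration = None    # no credentials cached yet
    refreshes = 0
    for _ in range(requests):
        if expiration is None or expiration - now < EXPIRATION_THRESHOLD:
            refreshes += 1
            expiration = now + session_ttl  # the issuer hands out a fresh credential
        now += interval
    return refreshes


if __name__ == "__main__":
    one_second = timedelta(seconds=1)
    # Ten minutes of traffic at one request per second.
    print(count_refreshes(timedelta(minutes=15), 600, one_second))
    print(count_refreshes(timedelta(minutes=60), 600, one_second))
```

With a 15-minute lifetime, every one of the 600 requests triggers a fetch, because a fresh credential is already inside the threshold by the time the next request arrives. With the longer lifetime, only the first request does.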
Conclusions
In our experience with migrations, one of the most common sources of problems is not bugs in Kubernetes or other parts of the platform. Nor is it any fundamental flaw in the microservices we are migrating. Problems often arise simply because we put different components together.
We wire together complex systems that have never interacted before, expecting them to form a single, larger system. Alas, the more components there are, the more room there is for errors, and the higher the entropy.
In our case, the high latency was not the result of bugs or bad decisions in Kubernetes, KIAM, the AWS Java SDK, or our microservice. It was the result of combining two independent defaults: one in KIAM, the other in the AWS Java SDK. Taken separately, both parameters make sense: a proactive credential renewal policy in the AWS Java SDK, and a short credential validity period in KIAM. But put them together and the results become unpredictable. Two independent, sensible decisions do not have to make sense when combined.
P.S. from the translator
You can read more about the architecture of KIAM, a tool for integrating AWS IAM with Kubernetes, in
Source: www.habr.com