Architecture of the network load balancer in Yandex.Cloud

Hello, I'm Sergey Elantsev, and I develop the network load balancer in Yandex.Cloud. Before that, I led the development of the L7 balancer for the Yandex portal; colleagues joke that whatever I build turns out to be a balancer. I will tell Habr readers how load is managed in a cloud platform, what we see as the ideal tool for achieving this goal, and how we are moving toward building that tool.

First, let's introduce a few terms:

  • VIP (Virtual IP) - the IP address of the balancer
  • Server, backend, instance - a virtual machine running the application
  • RIP (Real IP) - the IP address of the server
  • Healthcheck - a check of server readiness
  • Availability Zone, AZ - isolated infrastructure in a data center
  • Region - a union of different AZs

Load balancers solve three main tasks: they perform the balancing itself, improve the fault tolerance of the service, and simplify its scaling. Fault tolerance is ensured by automatic traffic management: the balancer monitors the state of the application and excludes from balancing the instances that fail the liveness check. Scaling is ensured by distributing the load evenly across instances and by updating the list of instances on the fly. If the balancing is not even enough, some instances will receive more load than they can handle, and the service will become less reliable.

Load balancers are often classified by the layer of the OSI model at which they operate. The Cloud balancer works at the TCP level, which corresponds to the fourth layer, L4.

Let's move on to an overview of the Cloud balancer architecture. We will increase the level of detail gradually. We divide the balancer components into three classes. The config plane class is responsible for interaction with the user and stores the target state of the system. The control plane stores the current state of the system and manages the systems of the data plane class, which are directly responsible for delivering traffic from clients to your instances.

Data plane

Traffic arrives at expensive devices called border routers. To increase fault tolerance, several such devices work simultaneously in one data center. Next, the traffic goes to the balancers, which announce anycast IP addresses to all AZs via BGP for clients.

Traffic is distributed via ECMP, a routing strategy in which there can be several equally good routes to a target (in our case, the target is the destination IP address) and packets can be sent along any of them. We also support operation in several availability zones according to the following scheme: we announce the address in each zone, and traffic goes to the nearest zone and does not leave it. Later in the post we will look in more detail at what happens to the traffic.

Config plane

 
The key component of the config plane is the API, through which the basic operations on balancers are performed: creating, deleting, changing the set of instances, obtaining healthcheck results, and so on. On the one hand, this is a REST API; on the other, we in the Cloud very often use the gRPC framework, so we "translate" REST into gRPC and from then on use only gRPC. Any request leads to the creation of a series of asynchronous idempotent tasks that are executed on a common pool of Yandex.Cloud workers. The tasks are written so that they can be suspended at any moment and then restarted. This ensures scalability, repeatability and logging of operations.

As a result, a task from the API makes a request to the balancer service controller, which is written in Go. It can add and remove balancers and change the composition of backends and the settings.

The service stores its state in Yandex Database, a distributed managed database that you will soon be able to use as well. In Yandex.Cloud, as we have already said, the dogfooding concept applies: if we use our own services ourselves, our clients will be happy to use them too. Yandex Database is an example of this concept in action. We store all our data in YDB, and we do not have to think about maintaining and scaling the database: these problems are solved for us, and we use the database as a service.

Let's return to the balancer controller. Its job is to store information about the balancer and to send the task of checking the readiness of the virtual machine to the healthcheck controller.

Healthcheck controller

It receives requests to change check rules, saves them in YDB, distributes tasks among the healthcheck nodes and aggregates the results, which are then saved to the database and sent to the loadbalancer controller. That controller, in turn, sends a request to change the composition of the cluster in the data plane to the loadbalancer-node, which I will discuss below.

Let's talk about healthchecks in more detail. They can be divided into several classes. Checks have different success criteria. TCP checks need to establish a connection successfully within a fixed amount of time. HTTP checks require both a successful connection and a response with status code 200.

Checks also differ by class of action: they can be active or passive. Passive checks simply observe what happens to the traffic without taking any special action. This does not work very well at L4 because it depends on the logic of higher-level protocols: at L4 there is no information about how long an operation took or whether a connection ended well or badly. Active checks require the balancer to send requests to each server instance.

Most load balancers perform liveness checks themselves. In the Cloud, we decided to separate these parts of the system to improve scalability. This approach lets us increase the number of balancers while keeping the number of healthcheck requests to the service unchanged. Checks are performed by separate healthcheck nodes, across which the check targets are sharded and replicated. You cannot run checks from a single host, because it may fail, and then we would not know the state of the instances it was checking. We check every instance from several healthcheck nodes. We shard check targets between nodes using consistent hashing algorithms.
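To make the "sharded and replicated" idea concrete, here is a minimal sketch in C. The article does not say which algorithm is used, so this sketch uses rendezvous (highest-random-weight) hashing, a relative of consistent hashing that is short to write. The function `pick_check_nodes`, the integer ids, and the replica count are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Integer mixer used as a stand-in hash function (an assumption). */
static uint32_t mix32(uint32_t x) {
    x ^= x >> 16; x *= 0x7feb352dU;
    x ^= x >> 15; x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Rendezvous hashing: every (target, node) pair gets a score, and the
 * `replicas` nodes with the highest scores check this target.
 * Writes indices into node_ids[] to out[0..replicas-1]. */
void pick_check_nodes(uint32_t target_id, const uint32_t *node_ids,
                      int n_nodes, int replicas, int *out) {
    assert(replicas > 0 && replicas <= n_nodes);
    for (int r = 0; r < replicas; r++) {
        int best = -1;
        uint32_t best_score = 0;
        for (int i = 0; i < n_nodes; i++) {
            int taken = 0;
            for (int k = 0; k < r; k++)      /* skip nodes already chosen */
                if (out[k] == i) taken = 1;
            if (taken) continue;
            uint32_t score = mix32(target_id ^ mix32(node_ids[i]));
            if (best < 0 || score > best_score) {
                best = i;
                best_score = score;
            }
        }
        out[r] = best;
    }
}
```

With this scheme, losing a healthcheck node only reassigns the targets that node was checking; every other target keeps its set of checkers.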

Separating balancing and healthchecking can lead to problems. If a healthcheck node makes requests to an instance bypassing the balancer (which is not currently serving traffic), a strange situation arises: the resource appears to be alive, but traffic will not reach it. We solve this problem as follows: healthcheck traffic is guaranteed to be routed through the balancers. In other words, the path of packets carrying client traffic and the path of healthcheck packets differ minimally: in both cases the packets reach the balancers, which deliver them to the target resources.

The difference is that clients make requests to the VIP, while healthchecks make requests to each individual RIP. An interesting problem arises here: we let our users create resources in gray (private) IP networks. Imagine two different cloud owners who have hidden their services behind balancers. Each of them has resources in the 10.0.0.1/24 subnet, with the same addresses. You need to be able to tell them apart somehow, and for that you need to dive into the structure of the Yandex.Cloud virtual network. More details are in the video from the about:cloud event; what matters to us now is that the network is multi-layered and has tunnels that can be distinguished by subnet id.

Healthcheck nodes contact the balancers using so-called quasi-IPv6 addresses. A quasi-address is an IPv6 address with the user's IPv4 address and subnet id embedded in it. The traffic reaches the balancer, which extracts the IPv4 address of the resource from it, replaces IPv6 with IPv4 and sends the packet into the user's network.

Return traffic goes the same way: the balancer sees that the destination is the gray network of the healthcheckers, and converts IPv4 to IPv6.
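The idea of a quasi-address can be sketched in a few lines of C. The bit layout below (a fixed 64-bit prefix, then a 32-bit subnet id, then the IPv4 address) is purely an assumption for illustration; the article does not describe the real layout, and the names `quasi_encode` and `quasi_decode` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout (not the real Yandex.Cloud one):
 *   bytes 0..7   fixed prefix
 *   bytes 8..11  subnet id, big-endian
 *   bytes 12..15 IPv4 address, big-endian */
typedef struct { uint8_t b[16]; } quasi_ip6_t;

static const uint8_t QUASI_PREFIX[8] = {0xfd, 0, 0, 0, 0, 0, 0, 0};

static void put_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Healthcheck side: pack subnet id and IPv4 into one IPv6 address. */
quasi_ip6_t quasi_encode(uint32_t subnet_id, uint32_t ip4) {
    quasi_ip6_t a;
    memcpy(a.b, QUASI_PREFIX, sizeof QUASI_PREFIX);
    put_be32(a.b + 8, subnet_id);
    put_be32(a.b + 12, ip4);
    return a;
}

/* Balancer side: recover subnet id and IPv4; -1 if not a quasi-address. */
int quasi_decode(const quasi_ip6_t *a, uint32_t *subnet_id, uint32_t *ip4) {
    assert(a != NULL && subnet_id != NULL && ip4 != NULL);
    if (memcmp(a->b, QUASI_PREFIX, sizeof QUASI_PREFIX) != 0)
        return -1;
    *subnet_id = get_be32(a->b + 8);
    *ip4 = get_be32(a->b + 12);
    return 0;
}
```

Two users with the same gray address, say 10.0.0.1, end up with different quasi-addresses because their subnet ids differ, which is exactly what lets the balancer tell them apart.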

VPP - the heart of the data plane

The balancer is implemented using Vector Packet Processing (VPP), a framework from Cisco for batch processing of network traffic. In our case, the framework runs on top of a user-space library for managing network devices, the Data Plane Development Kit (DPDK). This ensures high packet processing performance: far fewer interrupts occur in the kernel, and there are no context switches between kernel space and user space.

VPP goes even further and squeezes more performance out of the system by combining packets into batches. The gain comes from aggressive use of the caches of modern processors. Both the data caches (packets are processed in "vectors", so the data lies close together) and the instruction caches are used: in VPP, packet processing follows a graph whose nodes contain functions that perform one and the same task.

For example, IP packets in VPP are processed in the following order: first the packet headers are parsed in the parsing node, and then the packets are sent to the node that forwards them according to the routing tables.

A bit of hardcore. The authors of VPP do not tolerate compromises in the use of processor caches, so typical code for processing a vector of packets contains manual vectorization: there is a processing loop that handles a situation like "we have four packets in the queue", then the same for two, then for one. Prefetch instructions are often used to load data into the caches and speed up access to it in subsequent iterations.

n_left_from = frame->n_vectors;
while (n_left_from > 0)
{
    vlib_get_next_frame (vm, node, next_index, to_next, n_left_to_next);
    // ...
    while (n_left_from >= 4 && n_left_to_next >= 2)
    {
        // processing multiple packets at once
        u32 next0 = SAMPLE_NEXT_INTERFACE_OUTPUT;
        u32 next1 = SAMPLE_NEXT_INTERFACE_OUTPUT;
        // ...
        /* Prefetch next iteration. */
        {
            vlib_buffer_t *p2, *p3;

            p2 = vlib_get_buffer (vm, from[2]);
            p3 = vlib_get_buffer (vm, from[3]);

            vlib_prefetch_buffer_header (p2, LOAD);
            vlib_prefetch_buffer_header (p3, LOAD);

            CLIB_PREFETCH (p2->data, CLIB_CACHE_LINE_BYTES, STORE);
            CLIB_PREFETCH (p3->data, CLIB_CACHE_LINE_BYTES, STORE);
        }
        // actually process data
        /* verify speculative enqueues, maybe switch current next frame */
        vlib_validate_buffer_enqueue_x2 (vm, node, next_index,
                to_next, n_left_to_next,
                bi0, bi1, next0, next1);
    }

    while (n_left_from > 0 && n_left_to_next > 0)
    {
        // processing packets by one
    }

    // processed batch
    vlib_put_next_frame (vm, node, next_index, n_left_to_next);
}

So, healthchecks talk to VPP over IPv6, and VPP turns them into IPv4. This is done by a node in the graph that we call algorithmic NAT. For return traffic (and the conversion from IPv4 back to IPv6) there is the same kind of algorithmic NAT node.

Direct traffic from the balancer's clients passes through the graph nodes that perform the balancing itself.

The first node is sticky sessions. It stores the hash of the 5-tuple for established sessions. The 5-tuple includes the address and port of the client from which data is transmitted, the address and port of the resource available to receive traffic, and the network protocol.
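As a rough illustration of what a sticky-sessions node might keep, here is a small C sketch: a 5-tuple, a hash over its fields, and a fixed-size table keyed by that hash. The FNV-1a hash, the table size, and all names are assumptions chosen for brevity, not the actual VPP node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
} five_tuple_t;

/* FNV-1a, fed field by field so struct padding is never hashed. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len) {
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

uint32_t tuple_hash(const five_tuple_t *t) {
    uint32_t h = 2166136261u;
    h = fnv1a(h, &t->src_ip, sizeof t->src_ip);
    h = fnv1a(h, &t->dst_ip, sizeof t->dst_ip);
    h = fnv1a(h, &t->src_port, sizeof t->src_port);
    h = fnv1a(h, &t->dst_port, sizeof t->dst_port);
    h = fnv1a(h, &t->proto, sizeof t->proto);
    return h;
}

#define SESSION_SLOTS 1024u   /* power of two, hypothetical size */

typedef struct { uint32_t hash; int backend; uint8_t used; } session_t;
typedef struct { session_t slot[SESSION_SLOTS]; } session_table_t;

/* Returns the backend index of an established session, or -1. */
int session_lookup(const session_table_t *tab, const five_tuple_t *t) {
    uint32_t h = tuple_hash(t);
    const session_t *s = &tab->slot[h & (SESSION_SLOTS - 1)];
    return (s->used && s->hash == h) ? s->backend : -1;
}

/* Remembers the backend for a flow; a colliding slot is overwritten. */
void session_store(session_table_t *tab, const five_tuple_t *t, int backend) {
    assert(backend >= 0);
    uint32_t h = tuple_hash(t);
    session_t *s = &tab->slot[h & (SESSION_SLOTS - 1)];
    s->hash = h;
    s->backend = backend;
    s->used = 1;
}
```

A packet whose lookup returns -1 has no session yet and would be handed to the next node to choose a backend; the result is then stored so later packets of the same flow skip that step.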

The 5-tuple hash helps us do less computation in the subsequent consistent hashing node, and also handle changes in the list of resources behind the balancer better. When a packet for which there is no session arrives at the balancer, it is sent to the consistent hashing node. This is where balancing happens using consistent hashing: we select a resource from the list of available "live" resources. Next, the packets are sent to the NAT node, which actually replaces the destination address and recalculates the checksums. As you can see, we follow the rules of VPP: like to like, grouping similar computations to increase the efficiency of processor caches.
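The "replace the destination address and recalculate the checksums" step can be sketched for the IPv4 header alone. The sketch below patches the header checksum incrementally, using the update formula from RFC 1624, instead of summing the whole header again. It is an illustration under the assumption of a plain 20-byte IPv4 header in network byte order, and the function names are hypothetical. The TCP checksum, which also covers the addresses through the pseudo-header, would need the same kind of patch and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t csum_fold(uint32_t sum) {
    while (sum >> 16) sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full checksum over a header of even length (checksum field included). */
uint16_t ip_checksum(const uint8_t *hdr, size_t len) {
    uint32_t sum = 0;
    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    return (uint16_t)~csum_fold(sum);
}

/* RFC 1624, equation 3: HC' = ~(~HC + ~m + m'). */
static uint16_t csum_update16(uint16_t hc, uint16_t old_w, uint16_t new_w) {
    uint32_t sum = (uint16_t)~hc;
    sum += (uint16_t)~old_w;
    sum += new_w;
    return (uint16_t)~csum_fold(sum);
}

/* Rewrite the destination address (bytes 16..19) and patch the
 * header checksum (bytes 10..11) without re-summing the header. */
void nat_rewrite_dst(uint8_t *hdr, uint32_t new_dst) {
    uint16_t hc     = (uint16_t)((hdr[10] << 8) | hdr[11]);
    uint16_t old_hi = (uint16_t)((hdr[16] << 8) | hdr[17]);
    uint16_t old_lo = (uint16_t)((hdr[18] << 8) | hdr[19]);
    uint16_t new_hi = (uint16_t)(new_dst >> 16);
    uint16_t new_lo = (uint16_t)(new_dst & 0xffffu);

    hc = csum_update16(hc, old_hi, new_hi);
    hc = csum_update16(hc, old_lo, new_lo);

    hdr[16] = (uint8_t)(new_dst >> 24);
    hdr[17] = (uint8_t)(new_dst >> 16);
    hdr[18] = (uint8_t)(new_dst >> 8);
    hdr[19] = (uint8_t)new_dst;
    hdr[10] = (uint8_t)(hc >> 8);
    hdr[11] = (uint8_t)hc;
}
```

A header with a valid checksum sums to zero under `ip_checksum`, which gives a cheap way to verify the patched header after the rewrite.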

Consistent hashing

Why did we choose it, and what is it anyway? First, let's consider the earlier task: selecting a resource from a list.

With inconsistent hashing, a hash of the incoming packet is computed, and a resource is selected from the list by the remainder of dividing this hash by the number of resources. As long as the list stays unchanged, this scheme works well: we always send packets with the same 5-tuple to the same instance. But if, for example, some resource stops responding to healthchecks, the choice will change for a significant portion of the hashes. The client's TCP connections will break: a packet that previously reached instance A may start reaching instance B, which knows nothing about the session this packet belongs to.
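How large is that "significant portion"? The C sketch below counts it for a synthetic set of flows. The mixer `mix32` stands in for a real 5-tuple hash, and the removal of a backend is modeled simply as the list shrinking by one; both are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Integer mixer used as a stand-in for the hash of a 5-tuple. */
static uint32_t mix32(uint32_t x) {
    x ^= x >> 16; x *= 0x7feb352dU;
    x ^= x >> 15; x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Inconsistent hashing: index = hash mod number of backends. */
int pick_by_modulo(uint32_t flow_hash, int n_backends) {
    assert(n_backends > 0);
    return (int)(flow_hash % (uint32_t)n_backends);
}

/* Counts flows whose backend changes when the list size goes
 * from old_n to new_n. */
int count_remapped(int n_flows, int old_n, int new_n) {
    int moved = 0;
    for (int i = 0; i < n_flows; i++) {
        uint32_t h = mix32((uint32_t)i);
        if (pick_by_modulo(h, old_n) != pick_by_modulo(h, new_n))
            moved++;
    }
    return moved;
}
```

For uniformly distributed hashes, going from five backends to four keeps a flow in place only when `h % 5 == h % 4`, which holds for about one hash in five. So about four flows out of five change backend, although only the one fifth that was on the failed backend actually had to move.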

Consistent hashing solves the described problem. The easiest way to explain the concept is this: imagine a ring on which you place resources by hash (for example, by IP:port). Selecting a resource means turning the wheel by an angle determined by the hash of the packet.

This minimizes the redistribution of traffic when the composition of resources changes. Removing a resource affects only the part of the consistent hashing ring where that resource was located. Adding a resource also changes the distribution, but we have the sticky sessions node, which lets us avoid switching already established sessions to new resources.
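A minimal hash ring in C shows why removal touches only part of the ring. This is a sketch under several assumptions: backends are identified by small integers rather than by IP:port, each backend is placed on the ring at a hypothetical number of virtual points, and `mix32` stands in for a real hash function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_POINTS 1024
#define VNODES 64            /* virtual points per backend (assumption) */

typedef struct { uint32_t pos; int backend; } ring_point_t;
typedef struct { ring_point_t pt[MAX_POINTS]; int n; } ring_t;

static uint32_t mix32(uint32_t x) {
    x ^= x >> 16; x *= 0x7feb352dU;
    x ^= x >> 15; x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

static int cmp_point(const void *a, const void *b) {
    const ring_point_t *x = a, *y = b;
    if (x->pos != y->pos) return x->pos < y->pos ? -1 : 1;
    return (x->backend > y->backend) - (x->backend < y->backend);
}

/* Places every backend on the ring; a point's position depends only
 * on the backend id, never on which other backends are present. */
void ring_build(ring_t *r, const int *backends, int n_backends) {
    assert(n_backends > 0 && n_backends * VNODES <= MAX_POINTS);
    r->n = 0;
    for (int i = 0; i < n_backends; i++)
        for (int v = 0; v < VNODES; v++) {
            r->pt[r->n].pos =
                mix32((uint32_t)backends[i] * 1000u + (uint32_t)v);
            r->pt[r->n].backend = backends[i];
            r->n++;
        }
    qsort(r->pt, (size_t)r->n, sizeof r->pt[0], cmp_point);
}

/* Picks the first point at or after the flow hash, wrapping around. */
int ring_pick(const ring_t *r, uint32_t flow_hash) {
    int lo = 0, hi = r->n;
    assert(r->n > 0);
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (r->pt[mid].pos < flow_hash) lo = mid + 1;
        else hi = mid;
    }
    return r->pt[lo == r->n ? 0 : lo].backend;
}
```

If one backend is left out of `ring_build`, every flow that was mapped to a different backend still lands on the same point as before; only the flows of the missing backend slide to their next neighbors on the ring.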

We have looked at what happens to direct traffic between the balancer and the resources. Now let's look at return traffic. It follows the same pattern as check traffic: through algorithmic NAT, that is, through reverse NAT 44 for client traffic and through NAT 46 for healthcheck traffic. We stick to our own scheme: we unify healthcheck traffic and real user traffic.

Loadbalancer-node and the assembled components

The composition of balancers and resources is reported to VPP by a local service, loadbalancer-node. It subscribes to the stream of events from the loadbalancer-controller and can compute the difference between the current state of VPP and the target state received from the controller. We get a closed system: events from the API come to the balancer controller, which assigns tasks to the healthcheck controller to check the "liveness" of resources. That, in turn, assigns tasks to the healthcheck-node and aggregates the results, after which it sends them back to the balancer controller. Loadbalancer-node subscribes to events from the controller and changes the state of VPP. In such a system, each service knows only what it needs to know about its neighboring services. The number of connections is limited, and we can operate and scale the different segments independently.
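The "compute the difference between the current and the target state" step can be sketched as a merge of two sorted lists. In this C sketch the state is reduced to a sorted array of backend addresses; the types `op_t` and `op_kind_t` and the function `diff_backends` are hypothetical and say nothing about how loadbalancer-node actually represents VPP state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_ADD, OP_DEL } op_kind_t;
typedef struct { op_kind_t kind; uint32_t rip; } op_t;

/* Both inputs must be sorted ascending without duplicates.
 * Writes at most `cap` operations to `out` and returns how many
 * operations are needed to turn `current` into `desired`. */
size_t diff_backends(const uint32_t *current, size_t n_cur,
                     const uint32_t *desired, size_t n_des,
                     op_t *out, size_t cap) {
    size_t i = 0, j = 0, n = 0;
    assert(out != NULL || cap == 0);
    while (i < n_cur || j < n_des) {
        op_t op;
        if (j == n_des || (i < n_cur && current[i] < desired[j])) {
            op.kind = OP_DEL;           /* present now, not wanted */
            op.rip = current[i++];
        } else if (i == n_cur || desired[j] < current[i]) {
            op.kind = OP_ADD;           /* wanted, not present yet */
            op.rip = desired[j++];
        } else {
            i++; j++;                   /* same on both sides */
            continue;
        }
        if (n < cap) out[n] = op;
        n++;
    }
    return n;
}
```

Applying only the resulting operations, rather than rebuilding the whole state, keeps entries that did not change untouched, which matters when those entries are serving live traffic.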

What problems did we avoid?

All our services in the control plane are written in Go and have good scaling and reliability characteristics. Go has many open source libraries for building distributed systems. We actively use gRPC, and all components contain an open source implementation of service discovery: our services monitor each other's health, can change their composition dynamically, and we tied this to gRPC balancing. For metrics, we also use an open source solution. In the data plane, we got decent performance and a large resource reserve: it turned out to be very difficult to assemble a test stand on which we would hit the performance limit of VPP rather than that of the hardware network card.

Problems and solutions

What did not work so well? Go has automatic memory management, but memory leaks still happen. The easiest way to cause one is to start goroutines and forget to stop them. Takeaway: watch the memory consumption of your Go programs. The number of goroutines is often a good indicator. There is an upside to this story: in Go it is easy to get runtime data, such as memory consumption, the number of running goroutines, and many other parameters.

Also, Go may not be the best choice for functional tests. They are quite verbose, and the standard approach of "run everything in CI in a batch" does not suit them very well. The fact is that functional tests are more resource-demanding and have real timeouts. Because of this, tests may fail simply because the CPU is busy with unit tests. Conclusion: if possible, run "heavy" tests separately from unit tests.

A microservice architecture is more complex than a monolith: collecting logs from dozens of different machines is not very convenient. Conclusion: if you build microservices, think about tracing right away.

Our plans

We will launch an internal balancer and an IPv6 balancer, add support for Kubernetes scenarios, continue to shard our services (currently only healthcheck-node and healthcheck-ctrl are sharded), add new healthchecks, and implement smart aggregation of checks. We are considering making our services even more independent, so that they communicate not directly with each other but through a message queue. An SQS-compatible service, Yandex Message Queue, recently appeared in the Cloud.

The public release of Yandex Load Balancer took place recently. Explore the documentation for the service, manage balancers in whatever way is convenient for you, and increase the fault tolerance of your projects!

source: www.habr.com
