Delta: A Data Synchronization and Enrichment Platform

Ahead of the launch of a new cohort of the Data Engineer course, we have prepared a translation of an interesting article.


Overview

It is a well-known pattern for applications to use multiple datastores, each serving a specific purpose: storing the canonical form of the data (MySQL, etc.), providing advanced search capabilities (ElasticSearch, etc.), caching (Memcached, etc.), and so on. Typically, when multiple datastores are used, one of them acts as the primary store and the others as derived stores. The challenge is how to keep these datastores in sync.

We have looked at a number of patterns that try to solve multi-datastore synchronization, such as dual writes, distributed transactions, etc. However, these approaches have significant limitations in terms of real-world feasibility, robustness, and maintenance. Beyond data synchronization, some applications also need to enrich their data by calling external services.

Delta was developed to solve these problems. Delta is an eventually consistent, event-driven platform for data synchronization and enrichment.

Existing solutions

Dual writes

To keep two datastores in sync, you can perform a dual write: write to one store and then write to the other immediately afterwards. The first write can be retried, and the second can be aborted if the first still fails after the retries are exhausted. However, the two datastores can get out of sync if the write to the second store fails. This problem is usually addressed with a repair routine that periodically re-applies data from the first store to the second, or does so only when differences are detected.
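
The dual-write flow and its repair routine can be sketched as follows. This is a minimal illustration, not code from Delta or Netflix: `FlakyStore`, `dual_write`, and `repair` are hypothetical names, and the stores are in-memory dictionaries whose writes can be made to fail on purpose.

```python
class FlakyStore(dict):
    """In-memory stand-in for a datastore whose writes can fail."""

    def __init__(self, failures=0):
        super().__init__()
        self.failures = failures  # number of upcoming writes that will fail

    def write(self, key, value):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("write failed")
        self[key] = value


def dual_write(primary, secondary, key, value, retries=3):
    """Write to the primary (with retries), then to the secondary.

    Returns True only if both writes succeeded.
    """
    for attempt in range(retries):
        try:
            primary.write(key, value)
            break
        except ConnectionError:
            if attempt == retries - 1:
                return False  # primary never succeeded: skip the second write
    try:
        secondary.write(key, value)
        return True
    except ConnectionError:
        return False  # the two stores are now out of sync


def repair(primary, secondary):
    """Re-apply every primary entry that is missing or different in the secondary."""
    fixed = 0
    for key, value in primary.items():
        if secondary.get(key) != value:
            secondary.write(key, value)
            fixed += 1
    return fixed


primary, secondary = FlakyStore(), FlakyStore(failures=1)
ok = dual_write(primary, secondary, "movie:1", "Roma")
out_of_sync = "movie:1" in primary and "movie:1" not in secondary
repaired = repair(primary, secondary)
```

In this run the second write fails, so the stores disagree until `repair` runs. The sketch only re-applies existing keys; propagating deletes would need extra bookkeeping.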

Issues:

Building the repair routine is bespoke work that may not be reusable. In addition, the data in the stores remains out of sync until the repair routine runs. The solution becomes more complicated when more than two datastores are involved. Finally, the repair routine can put additional load on the primary data source.

Change log table

When changes (such as inserts, updates, and deletes of a record) occur on a set of tables, change records are added to a log table as part of the same transaction. Another thread or process constantly polls events from the log table and writes them to one or more datastores, optionally removing the events from the log table once all stores have acknowledged the write.
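
A minimal sketch of this pattern, using Python's built-in `sqlite3` module as the primary store and a plain dictionary as the derived store. The table names, the `upsert_movie` and `poll_once` helpers, and the single-pass poller are assumptions made for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE change_log (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        op TEXT, movie_id INTEGER, title TEXT
    );
""")


def upsert_movie(movie_id, title):
    """Apply the change and record it in the log in ONE transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT OR REPLACE INTO movies (id, title) VALUES (?, ?)",
            (movie_id, title),
        )
        db.execute(
            "INSERT INTO change_log (op, movie_id, title) VALUES ('upsert', ?, ?)",
            (movie_id, title),
        )


def poll_once(derived_store):
    """Read pending log entries in order, apply them, then delete them."""
    rows = db.execute(
        "SELECT seq, op, movie_id, title FROM change_log ORDER BY seq"
    ).fetchall()
    for seq, op, movie_id, title in rows:
        if op == "upsert":
            derived_store[movie_id] = title
        # Delete the entry only after the derived store has accepted it.
        with db:
            db.execute("DELETE FROM change_log WHERE seq = ?", (seq,))
    return len(rows)


search_index = {}
upsert_movie(1, "Roma")
upsert_movie(1, "Roma (2018)")
applied = poll_once(search_index)
pending = db.execute("SELECT COUNT(*) FROM change_log").fetchone()[0]
```

If the poller crashes between applying an entry and deleting it, the entry is applied again on the next pass, so writes to the derived store should be idempotent.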

Issues:

This pattern has to be implemented as a library, ideally without requiring changes to the code of the application that uses it. In a polyglot environment, an implementation of the library has to exist for every language in use, and it is very hard to guarantee consistent features and behavior across languages.

Another issue is capturing schema changes in systems that do not support transactional schema changes [1][2], such as MySQL. As a result, the pattern of making a change (for example, a schema change) and transactionally recording it in the change log table does not always work.

Distributed transactions

Distributed transactions can be used to span a transaction across multiple heterogeneous datastores, so that an operation is either committed to all of the datastores involved or to none of them.

Issues:

Distributed transactions have proven to be very problematic across heterogeneous datastores. By their nature, they can rely only on the lowest common denominator of the participating systems. For example, XA transactions block execution if the application process fails during the prepare phase. In addition, XA provides no deadlock detection and does not support optimistic concurrency control schemes. Furthermore, some systems, such as ElasticSearch, support neither XA nor any other heterogeneous transaction model. Ensuring the atomicity of writes across different storage technologies therefore remains a very hard problem for applications [3].
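
The blocking behavior mentioned above can be illustrated with a toy two-phase commit. This is a simplified model, not an XA implementation: `Participant` and `two_phase_commit` are hypothetical names, and the `crash_after_prepare` flag simulates the coordinating application process failing between the prepare and commit phases.

```python
class Participant:
    """Toy resource manager with XA-style prepare/commit/rollback."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"  # idle -> prepared -> committed / rolled_back

    def prepare(self):
        if not self.can_prepare:
            return False
        self.state = "prepared"  # resources stay locked until the outcome is known
        return True

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants, crash_after_prepare=False):
    """Return 'committed', 'rolled_back', or 'in_doubt' (coordinator died)."""
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
        else:
            # One "no" vote aborts the whole transaction.
            for q in prepared:
                q.rollback()
            return "rolled_back"
    if crash_after_prepare:
        # The coordinator failed between the two phases: nobody tells the
        # participants the outcome, so they remain prepared (blocked).
        return "in_doubt"
    for p in participants:
        p.commit()
    return "committed"
```

For example, `two_phase_commit([Participant("a"), Participant("b")], crash_after_prepare=True)` leaves both participants in the `prepared` state, which is the situation in which a real resource manager would keep holding its locks.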

Delta

Delta was designed to address the limitations of existing data synchronization solutions, and it also enables on-the-fly data enrichment. Our goal was to abstract all of this complexity away from application developers so that they can focus entirely on implementing business functionality. Below we describe "Movie Search", a real-world Delta use case at Netflix.

Netflix makes extensive use of a microservice architecture, and each microservice typically handles one type of data. The core information about a movie lives in a microservice called Movie Service, while related data such as deals, talent, vendors, and so on is managed by several other microservices (namely Deal Service, Talent Service, and Vendor Service). Business users at Netflix Studios often need to search by a variety of movie criteria, which is why it is so important for them to be able to search across all movie-related data.

Before Delta, the movie search team had to pull data from several microservices before indexing the movie data. In addition, the team had to build a system that periodically updated the search index by querying the other microservices for changes, even when nothing had changed at all. This system quickly became complex and hard to maintain.

Figure 1. Polling system before Delta

After onboarding to Delta, the system was simplified into an event-driven one, as shown in the following figure. CDC (Change-Data-Capture) events are sent to Keystone Kafka topics by the Delta-Connector. A Delta application built with the Delta Stream Processing Framework (based on Flink) consumes the CDC events from a topic, enriches them by calling other microservices, and finally sinks the enriched data into the search index in Elasticsearch. The whole process happens in near real time: as soon as changes are committed to the datastore, the search indexes are updated.

Figure 2. Data pipeline using Delta
In the following sections, we describe the Delta-Connector, which connects to a datastore and publishes CDC events to the transport layer, a real-time data transport infrastructure that routes CDC events to Kafka topics. Finally, we cover the Delta stream processing framework, which application developers can use to build their data processing and enrichment logic.
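
The event-driven flow of the Movie Search use case (consume CDC events, enrich them, sink them to the search index) can be sketched in a few lines. Everything here is a stand-in chosen for illustration: the event shape, the `enrich` and `run_pipeline` functions, and the dictionaries that play the roles of a Kafka topic, the other microservices, and the Elasticsearch index are assumptions, not Delta's actual API.

```python
from collections import deque

# Hypothetical stand-ins: a Kafka topic, two microservices, a search index.
cdc_topic = deque([
    {"op": "update", "table": "movies", "key": 1, "row": {"title": "Roma"}},
    {"op": "delete", "table": "movies", "key": 2, "row": None},
])
talent_service = {1: ["talent-42"]}
deal_service = {1: "worldwide"}
search_index = {2: {"title": "stale entry"}}


def enrich(event):
    """Join the changed movie row with data owned by other services."""
    doc = dict(event["row"])
    doc["talent"] = talent_service.get(event["key"], [])
    doc["deal"] = deal_service.get(event["key"])
    return doc


def run_pipeline(topic, index):
    """Consume CDC events in order, enrich them, and sink them to the index."""
    processed = 0
    while topic:
        event = topic.popleft()
        if event["op"] == "delete":
            index.pop(event["key"], None)
        else:
            index[event["key"]] = enrich(event)
        processed += 1
    return processed


processed = run_pipeline(cdc_topic, search_index)
```

Unlike the polling system, nothing runs when there is no change: work is triggered only by events arriving on the topic.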

CDC (Change-Data-Capture)

We developed a CDC service called Delta-Connector, which can capture committed changes from a datastore in real time and write them to a stream. Real-time changes are captured from the datastore's transaction log and from dumps. Dumps are used because transaction logs usually do not retain the full history of changes. Changes are generally serialized as Delta events, so a consumer does not have to care whether a change came from the transaction log or from a dump.

Delta-Connector supports a number of advanced features, such as:

  • The ability to write to custom outputs beyond Kafka.
  • The ability to trigger manual dumps at any time for all tables, a specific table, or specific primary keys.
  • Dumps can be taken in chunks, so there is no need to start over from scratch after a failure.
  • No need to acquire locks on tables, which is essential to ensure that write traffic to the database is never blocked by our service.
  • High availability through standby instances across AWS Availability Zones.
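
One way to picture chunked, resumable dumps is keyset pagination over the primary key with a saved checkpoint. The sketch below is an assumption about how such a dump could work, not a description of Delta-Connector's implementation; the `dump_in_chunks` function, the checkpoint dictionary, and the event envelope are hypothetical, and `sqlite3` stands in for the source database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO movies (id, title) VALUES (?, ?)",
    [(i, f"movie-{i}") for i in range(1, 8)],
)
db.commit()


def dump_in_chunks(conn, emit, checkpoint, chunk_size=3):
    """Dump `movies` in primary-key order, one short read per chunk.

    `checkpoint` holds the last fully emitted primary key, so a restarted
    dump continues from there instead of starting from scratch.
    """
    while True:
        rows = conn.execute(
            "SELECT id, title FROM movies WHERE id > ? ORDER BY id LIMIT ?",
            (checkpoint["last_id"], chunk_size),
        ).fetchall()
        if not rows:
            return
        for row_id, title in rows:
            # Same envelope a log-based change would use (hypothetical shape).
            emit({"source": "dump", "key": row_id, "row": {"title": title}})
        checkpoint["last_id"] = rows[-1][0]


events = []
checkpoint = {"last_id": 0}


def failing_emit(event):
    if event["key"] == 5:
        raise RuntimeError("simulated crash")
    events.append(event)


try:
    dump_in_chunks(db, failing_emit, checkpoint)
except RuntimeError:
    pass  # the process "crashed" in the middle of the second chunk

dump_in_chunks(db, events.append, checkpoint)  # resume from the checkpoint
keys = [e["key"] for e in events]
```

In this sketch the interrupted chunk is emitted again after the restart (key 4 appears twice), so delivery is at-least-once and consumers need to tolerate duplicates.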

We currently support MySQL and Postgres, including deployments on AWS RDS and Aurora. We also support Cassandra (multi-master). More details on the Delta-Connector are available in a separate blog post.

Kafka and the transport layer

Delta's event transport layer is built on the messaging service of the Keystone platform.

Historically, message publishing at Netflix has been optimized for availability rather than durability (see the previous article). The trade-off was potential broker data inconsistency in various edge cases. For example, unclean leader election can cause a consumer to see duplicated or lost events.

With Delta, we wanted stronger durability guarantees to ensure that CDC events are delivered to the derived stores. To that end, we offered a purpose-built Kafka cluster as a first-class citizen. Some of the broker settings are described below:


In Keystone Kafka clusters, unclean leader election is usually enabled to favor producer availability. This can lead to message loss if an out-of-sync replica is elected leader. For the new high-durability Kafka cluster, unclean leader election is disabled to prevent message loss.

We also increased the replication factor from 2 to 3 and the minimum number of in-sync replicas from 1 to 2. Producers writing to this cluster require acks from all in-sync replicas, which guarantees that 2 out of 3 replicas have the latest messages sent by the producer.
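
The settings described above map onto standard Apache Kafka configuration keys. The sketch below lists them side by side and models the rule a leader applies to a produce request; the values come from the text, while the dictionary layout and the `write_accepted` and `tolerated_broker_losses` helpers are illustrative assumptions, not Netflix's actual configuration files.

```python
# Keys are real Apache Kafka configuration names; values follow the text.
KEYSTONE = {
    "unclean.leader.election.enable": True,
    "default.replication.factor": 2,
    "min.insync.replicas": 1,
}
HIGH_DURABILITY = {
    "unclean.leader.election.enable": False,
    "default.replication.factor": 3,
    "min.insync.replicas": 2,
}
PRODUCER = {"acks": "all"}  # producer-side setting


def write_accepted(cluster, in_sync_replicas, producer=PRODUCER):
    """Model the leader's check for a produce request.

    With acks=all the write is rejected when fewer than
    min.insync.replicas replicas are in sync; otherwise it is
    acknowledged once every in-sync replica has the message.
    """
    if producer["acks"] == "all":
        return in_sync_replicas >= cluster["min.insync.replicas"]
    return in_sync_replicas >= 1  # acks=0/1: the leader alone is enough


def tolerated_broker_losses(cluster):
    """Replicas that can drop out while acks=all writes are still accepted."""
    return cluster["default.replication.factor"] - cluster["min.insync.replicas"]
```

With these values, `write_accepted(HIGH_DURABILITY, 2)` is true and `write_accepted(HIGH_DURABILITY, 1)` is false: the cluster prefers rejecting a write over acknowledging one that exists on a single replica.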

When a broker instance is terminated, a new instance replaces it. However, the new broker has to catch up on its out-of-sync replicas, which can take several hours. To reduce the recovery time in this scenario, we started using block storage volumes (Amazon Elastic Block Store) instead of local disks on the brokers. When a new instance replaces a terminated broker instance, it attaches the EBS volume that the terminated instance had and starts catching up on new messages. This reduces the catch-up time from hours to minutes, because the new instance no longer has to replicate from a blank state. Overall, the separate life cycles of storage and broker greatly reduce the impact of replacing a broker.

To further strengthen the delivery guarantee, we used a message tracing system to detect any message loss under extreme conditions (for example, clock drift on the partition leader).

Stream processing framework

Delta's processing layer is built on top of the Netflix SPaaS platform, which integrates Apache Flink with the Netflix ecosystem. The platform provides a user interface that manages Flink job deployments and the orchestration of Flink clusters on top of our container management platform, Titus. The interface also manages job configuration and lets users change the configuration dynamically without having to recompile their Flink jobs.

Delta provides a stream processing framework on top of Flink and SPaaS that uses an annotation-based DSL (Domain Specific Language) to abstract away technical details. For example, to define a step that enriches events by calling external services, users only need to write the following DSL, and the framework creates a model from it that is executed by Flink.

Figure 3. Enrichment DSL example in Delta

The processing framework not only shortens the learning curve but also provides common stream processing features such as deduplication and schematization, as well as resilience and fault tolerance, to address common operational concerns.

The Delta Stream Processing Framework consists of two main modules, the DSL & API module and the Runtime module. The DSL & API module provides the DSL and UDF (User-Defined-Function) APIs with which users write their own processing logic (such as filtering or transformations). The Runtime module provides a DSL parser implementation that builds an internal representation of the processing steps as DAG models. The Execution component interprets the DAG models to initialize the actual Flink operators and ultimately run the Flink application. The architecture of the framework is shown in the following figure.

Figure 4. Delta Stream Processing Framework architecture
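
The split between annotated user functions, a parser that builds a DAG, and a runtime that executes it can be illustrated with a small Python sketch. It is not Delta's DSL, which is not reproduced in this text: the `step` decorator, `build_dag`, and `execute` are hypothetical, decorators stand in for annotations, and plain function calls stand in for Flink operators. The runtime below assumes the steps form a simple chain.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

STEPS = {}  # step name -> (function, names of upstream steps)


def step(name, after=()):
    """Decorator playing the role of a DSL annotation on a UDF."""
    def register(func):
        STEPS[name] = (func, tuple(after))
        return func
    return register


@step("filter_movies")
def filter_movies(event):
    return event if event["table"] == "movies" else None


@step("enrich", after=["filter_movies"])
def enrich(event):
    return {**event, "talent": ["talent-42"]}


@step("to_document", after=["enrich"])
def to_document(event):
    return {"id": event["key"], "talent": event["talent"]}


def build_dag(steps):
    """'Parser': turn the registered annotations into a dependency graph."""
    return {name: set(after) for name, (_, after) in steps.items()}


def execute(dag, steps, event):
    """'Runtime': run the chained steps in topological order on one event."""
    for name in TopologicalSorter(dag).static_order():
        event = steps[name][0](event)
        if event is None:  # dropped by a filter step
            return None
    return event


dag = build_dag(STEPS)
doc = execute(dag, STEPS, {"table": "movies", "key": 1})
dropped = execute(dag, STEPS, {"table": "deals", "key": 7})
```

Because user code only declares steps and their order, the runtime can change how steps are scheduled or optimized without touching the user-defined functions.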

This approach has several benefits:

  • Users can focus on their business logic without having to dive into the specifics of Flink or the SPaaS platform.
  • Optimizations can be made in a way that is transparent to users, and bugs can be fixed without requiring any changes to user code (UDFs).
  • Operating a Delta application is simple for users, because the platform provides resilience and fault tolerance out of the box and collects a wide range of detailed metrics that can be used for alerting.

Production usage

Delta has been running in production for more than a year and plays a key role in many Netflix Studio applications. It has helped teams implement use cases such as search indexing, data warehousing, and event-driven workflows. Below is a high-level view of the architecture of the Delta platform.

Figure 5. High-level architecture of Delta.

Acknowledgments

We would like to thank the following people who were involved in building and developing Delta at Netflix: Allen Wang, Charles Zhao, Jaebin Yoon, Josh Snyder, Kasturi Chatterjee, Mark Cho, Olof Johansson, Piyush Goyal, Prashanth Ramdas, Raghuram Onti Srinivasan, Sandeep Gupta, Steven Wu, Tharanga Gamaethige, Yun Wang, and Zhenzhong Xu.

References

  1. dev.mysql.com/doc/refman/5.7/en/implicit-commit.html
  2. dev.mysql.com/doc/refman/5.7/en/cannot-roll-back.html
  3. Martin Kleppmann, Alastair R. Beresford, Boerge Svingen: Online event processing. Commun. ACM 62(5): 43–49 (2019). DOI: doi.org/10.1145/3312527


Source: www.habr.com
