How to look Cassandra in the eye without losing data, stability, and faith in NoSQL


They say everything in life is worth trying at least once. If you are used to working with relational DBMSs, it is worth getting hands-on with NoSQL, if only for general development. The technology is evolving rapidly, so there are many conflicting opinions and heated debates on the subject, which only fuels the interest.
If you dig into the substance of these debates, you can see that they arise from the wrong approach. Those who use NoSQL databases exactly where they are needed are satisfied and get all the benefits of the solution. Experimenters who rely on the technology as a panacea where it does not apply at all end up disappointed: they lose the strengths of relational databases without gaining anything significant in return.

I will describe our experience of implementing a solution based on the Cassandra DBMS: what we had to face, how we got out of difficult situations, whether we managed to benefit from NoSQL, and where we had to invest additional effort or money.
The initial task was to build a system that records calls into some kind of storage.

The system works as follows. The input consists of files with a defined structure that describes a call. The application then makes sure this structure is saved into the appropriate columns. Later, the stored calls are used to show subscribers information about their usage (charges, calls, balance history).


Why Cassandra was chosen is clear enough: it writes like a machine gun, scales easily, and is fault tolerant.

So, here is what the experience gave us

Yes, a failed node is not a tragedy. That is the essence of Cassandra's fault tolerance. But a node can be alive and still start to sag in performance. As it turned out, this immediately affects the performance of the entire cluster.

Cassandra will not protect you where Oracle used to save you with its constraints. If the application author did not understand this in advance, then a duplicate that arrives in Cassandra is no worse than the original. If it came in, it gets inserted.
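To make the duplicate problem concrete: a Cassandra INSERT behaves as an upsert on the primary key, so a re-delivered record only collides with the original if both map to the same key. The sketch below is a toy model in plain Python (a dictionary stands in for a table), not driver code. The field names `subscriber`, `started_at`, `duration` and the helper `call_key` are hypothetical and are not taken from the system described here.

```python
import hashlib

def call_key(record: dict) -> str:
    """Derive a deterministic key from the fields that identify a call.

    Hypothetical field names; a real schema would choose its own natural key.
    """
    raw = f"{record['subscriber']}|{record['started_at']}|{record['duration']}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def upsert(table: dict, key: str, record: dict) -> None:
    # Same key overwrites the row; a new key adds a row. Nothing is checked.
    table[key] = record

call = {"subscriber": "A-001", "started_at": "2019-03-01T10:00:00", "duration": 42}

# Surrogate keys (a counter, a random UUID): the duplicate is stored twice.
naive = {}
upsert(naive, "id-1", call)
upsert(naive, "id-2", dict(call))

# Deterministic keys: the duplicate lands on the same row and is absorbed.
dedup = {}
upsert(dedup, call_key(call), call)
upsert(dedup, call_key(dict(call)), dict(call))

rows_naive, rows_dedup = len(naive), len(dedup)
```

Whether a deterministic key is acceptable depends on the data: two genuinely different calls must never produce the same key, so the choice of identifying fields is a design decision for the application.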

Information security strongly disliked free Cassandra out of the box: there is no logging of user actions and no separation of privileges. Call information counts as personal data, which means every attempt to query or change it in any way must be logged so it can be audited later. You also need to recognize that different users need different privileges at different levels. An ordinary operations engineer and a super-admin who can freely drop an entire keyspace are different roles with different responsibilities and competence. Without such separation of access rights, the value and integrity of the data will come into question faster than with consistency level ANY.

We did not take into account that calls require both serious analytics and periodic selections by all sorts of conditions. Since the selected records are then supposed to be deleted and rewritten (as part of the task, we must support a process for correcting data that entered our system incorrectly in the first place), Cassandra is no friend of ours here. Cassandra is like a piggy bank: it is easy to put things in, but you cannot count what is inside.

We ran into the problem of transferring data to test environments (5 nodes in test versus 20 in production). A dump cannot be used in this case.

The problem of updating the data schema of an application that writes to Cassandra. A rollback will generate a great many tombstones, which can degrade performance in unpredictable ways. Cassandra is optimized for writing and does not think much before it writes. Any operation on existing data is also a write. That is, by deleting what we do not need, we simply produce even more records, and only some of them will be marked with tombstones.
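The idea that a delete is itself a write can be shown with a toy append-only store. This is a deliberately simplified Python model, not Cassandra's storage engine: real Cassandra keeps tombstones for a grace period (`gc_grace_seconds`) before compaction may purge them, whereas the `compact` method here drops them immediately. The class and key names are made up for the illustration.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"

class AppendOnlyStore:
    """Toy model of a write-optimized store: every operation appends."""

    def __init__(self):
        self.log = []  # (key, value) entries, oldest first

    def write(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # A delete erases nothing; it appends a tombstone entry.
        self.log.append((key, TOMBSTONE))

    def read(self, key):
        # Newest entry wins; the scan has to step over stale versions.
        for k, v in reversed(self.log):
            if k == key:
                return None if v is TOMBSTONE else v
        return None

    def compact(self):
        # Keep only the newest live value per key and drop tombstones.
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = [(k, v) for k, v in latest.items() if v is not TOMBSTONE]

store = AppendOnlyStore()
for i in range(3):
    store.write(f"call-{i}", {"duration": i})
for i in range(3):
    store.delete(f"call-{i}")  # "rolling back" the three writes

entries_after_rollback = len(store.log)
value_after_rollback = store.read("call-0")
store.compact()
entries_after_compaction = len(store.log)
```

After the rollback the store holds twice as many entries as before, even though every read returns nothing. That is the extra work reads and compaction have to absorb.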

Timeouts on insert. Cassandra is wonderful at writing, but sometimes the incoming stream can seriously puzzle it. This happens when the application starts cycling through a few records that cannot be inserted for some reason. We will also need a real DBA who watches gc.log, the system and debug logs for slow queries, and the pending compactions metric.
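One common way to stop an application from cycling the same failing records forever is to cap the number of attempts and park whatever still fails. The sketch below is a generic Python illustration of that idea, not the approach the team describes. `drain`, `fake_insert`, and the `bad` flag are hypothetical names, and the timeout is simulated rather than raised by a real driver.

```python
from collections import deque

def drain(records, insert, max_attempts=3):
    """Try each record at most max_attempts times, then park it."""
    pending = deque((record, 0) for record in records)
    parked = []
    while pending:
        record, attempts = pending.popleft()
        try:
            insert(record)
        except TimeoutError:
            attempts += 1
            if attempts >= max_attempts:
                parked.append(record)  # give up; leave it for inspection
            else:
                pending.append((record, attempts))  # retry after the others
    return parked

stored = []

def fake_insert(record):
    # Stand-in for a database write; records flagged "bad" always time out.
    if record.get("bad"):
        raise TimeoutError("simulated write timeout")
    stored.append(record)

parked = drain([{"id": 1}, {"id": 2, "bad": True}, {"id": 3}], fake_insert)
stored_ids = [r["id"] for r in stored]
```

The parked records can then be logged or written to a separate queue, so the healthy part of the stream keeps flowing while someone investigates the failures.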

Several data centers in one cluster. Where do you read from and where do you write to?
Should reads and writes perhaps be separated? If so, should the DC closest to the application be the one for writing or the one for reading? And will we end up with a real split brain if we choose the wrong consistency level? There are many questions, many unexplored settings, and many options we would really like to play with.

How we solved it

To keep a node from sagging, we disabled swap. Now, if memory runs short, the node should go down rather than produce long GC pauses.

So we no longer rely on logic in the database. Application developers are retraining and starting to build safeguards into their own code. The result is a clean separation of data storage from data processing.

We bought support from DataStax. Development of boxed Cassandra has already stopped (the last commit was in February 2018). At the same time, DataStax offers excellent service and a large number of solutions refined and adapted to existing setups.

I also want to note that Cassandra is not very convenient for selection queries. Of course, CQL is a big step toward users (compared with Thrift). But if you have entire departments used to convenient joins, free filtering on any field, and query optimization, and those departments work on resolving complaints and incidents, then a solution on Cassandra looks hostile and stupid to them. So we started deciding how our colleagues should make their selections.

We considered two options. In the first, we write calls not only to C* but also to an archive Oracle database. Unlike C*, that database stores calls only for the current month (a sufficient storage depth for re-rating cases). Here we immediately saw the following problem: if we write synchronously, we lose all the advantages of C* related to fast inserts; if we write asynchronously, there is no guarantee that all the required calls made it into Oracle at all. There was one advantage, but a big one: operations keeps the same familiar PL/SQL Developer, so in practice we implement the “Facade” pattern.

The alternative: we implement a mechanism that unloads calls from C*, pulls some enrichment data from the corresponding tables in Oracle, joins the resulting selections, and gives us a result that we then use in some way (roll back, repeat, analyze, admire). The downsides: the process has quite a few steps, and there is no interface for operations staff.

In the end, we settled on the second option. We used Apache Spark to make selections from the different stores. The mechanism boils down to Java code that, given the specified keys (subscriber and call time, which are the partition keys), pulls data from C* along with the necessary enrichment data from any other database. It then joins them in its own memory and writes the result to a result table. We drew a web front end on top of Spark, and it turned out to be quite usable.
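The enrichment step described above is, at its core, an in-memory join on a shared key. The following is a plain-Python stand-in for that idea, written without Spark so it runs anywhere. It is not the team's Java code, and the field names (`subscriber`, `tariff`, `duration`) and the function `enrich_calls` are assumptions made for the illustration.

```python
def enrich_calls(calls, subscribers):
    """Hash join: index the enrichment side by key, then probe it per call."""
    by_subscriber = {s["subscriber"]: s for s in subscribers}
    result = []
    for call in calls:
        extra = by_subscriber.get(call["subscriber"])
        if extra is not None:  # inner join: calls without a match are dropped
            result.append({**call, "tariff": extra["tariff"]})
    return result

# Rows as they might come from the call store (hypothetical sample data).
calls = [
    {"subscriber": "A", "started_at": "2019-03-01T10:00:00", "duration": 10},
    {"subscriber": "B", "started_at": "2019-03-01T11:00:00", "duration": 20},
    {"subscriber": "C", "started_at": "2019-03-01T12:00:00", "duration": 30},
]

# Rows as they might come from the relational side (hypothetical sample data).
subscribers = [
    {"subscriber": "A", "tariff": "basic"},
    {"subscriber": "B", "tariff": "premium"},
]

enriched = enrich_calls(calls, subscribers)
```

In Spark the same operation would be expressed as a join between two datasets and distributed across executors; the sketch only shows the logic on a single machine.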


When solving the problem of refreshing test data from production, we again considered several approaches. One was a transfer via sstableloader. Another was to split the test cluster into two parts, each of which takes turns joining the same cluster as production and is fed from it. When refreshing the test environment, the plan was to swap them: the part that had been working in test is cleaned and joined to production, while the other starts working with the data separately. However, after thinking again, we assessed more rationally which data was worth transferring. We realized that the calls themselves are an inconsistent entity for testing, quickly generated when needed, and that the production data set has no value for transfer to test. There are a few accumulating objects worth moving, but these are literally a couple of tables, and not very heavy ones. So Spark came to the rescue again: with it we wrote, and began actively using, a script that transfers data between production and test tables.

Our current deployment policy lets us work without rollbacks. Before production there is a mandatory rollout to test, where a mistake is not so expensive. If something fails, you can always drop the keyspace and roll out the entire schema from scratch.

To ensure continuous availability of Cassandra, you need a DBA, and not only a DBA. Everyone who works with the application must understand where and how to look at the current situation and how to diagnose problems in time. To do this, we actively use DataStax OpsCenter (administration and monitoring of workloads) and the Cassandra driver's system metrics (the number of write timeouts to C*, the number of read timeouts from C*, maximum latency, and so on), and we monitor the application that interacts with Cassandra.

When we thought about the previous question, we realized where our main risk might lie: the data display forms that show data from several independent queries to the storage. That way we can end up with rather inconsistent information. But this problem would be just as relevant if we worked with only one data center. So the most reasonable thing here is, of course, to build a batch read function in a third-party application that ensures the data is obtained within a single time window. As for separating reads and writes for performance, what stopped us was the risk that, with some loss of connectivity between DCs, we could end up with two clusters completely inconsistent with each other.

As a result, for now we have settled on consistency level EACH_QUORUM for writes and LOCAL_QUORUM for reads.
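To show what this choice means in terms of replica acknowledgements, here is a small Python sketch of the quorum arithmetic. A quorum is `floor(RF / 2) + 1`; LOCAL_QUORUM needs a quorum only in the coordinator's data center, while EACH_QUORUM needs one in every data center. The replication factors (3 per DC) and the DC names `dc1` and `dc2` are assumptions for the example; the article does not state the actual values.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def acks_required(level: str, rf_by_dc: dict, local_dc: str) -> dict:
    """Replica acknowledgements needed per data center for a given level."""
    if level == "LOCAL_QUORUM":
        return {local_dc: quorum(rf_by_dc[local_dc])}
    if level == "EACH_QUORUM":
        return {dc: quorum(rf) for dc, rf in rf_by_dc.items()}
    raise ValueError(f"level not covered by this sketch: {level}")

# Assumed topology: two data centers with replication factor 3 each.
rf_by_dc = {"dc1": 3, "dc2": 3}

write_acks = acks_required("EACH_QUORUM", rf_by_dc, local_dc="dc1")
read_acks = acks_required("LOCAL_QUORUM", rf_by_dc, local_dc="dc1")

# A read overlaps the latest write when read + write acks exceed RF locally.
overlaps_locally = read_acks["dc1"] + write_acks["dc1"] > rf_by_dc["dc1"]
```

With these numbers a write waits for 2 replicas in each DC and a read for 2 replicas in the local DC, so a local read always touches at least one replica that saw the write. The price is that a write fails when any one DC cannot reach its quorum.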

Brief impressions and conclusions

To evaluate the resulting solution in terms of operational support and prospects for further development, we decided to think about where else such a design could be applied.

Offhand: data scoring for programs such as “Pay when convenient” (we load the information into C* and calculate with Spark scripts), tracking of claims with aggregation by area, and storing roles and calculating user access rights based on a role matrix.

As you can see, the repertoire is wide and varied. If we have to choose between the camps of NoSQL supporters and opponents, we will join the supporters, because we got our benefits, and exactly where we expected them.

Even Cassandra out of the box allows horizontal scaling in real time, painlessly solving the problem of data growth in the system. We managed to move a very heavily loaded mechanism for calculating call aggregates into a separate circuit, and to separate the application's schema from its logic, getting rid of the bad practice of writing custom jobs and objects in the database itself. We gained the ability to choose and tune, for speed, which DCs we run calculations on and which we write data to, and we insured ourselves against the failure of both individual nodes and an entire DC.

When applying our architecture to new projects, and already having some experience, we would like to take the nuances described above into account from the start, avoid some of the mistakes, and smooth out some of the sharp corners we could not avoid the first time.

For example, keep track of Cassandra's own updates in time, because quite a few of the problems we ran into were already known and fixed.

Do not put the database and Spark on the same nodes (or strictly limit the resources each is allowed to use). Spark can eat more RAM than expected, and we will quickly get the first problem described above: a live node that sags in performance.

Work out monitoring and operational competence while the project is still in the testing phase. From the beginning, take into account as many potential consumers of the solution as possible, because that is what the database structure will ultimately depend on.

Go over the resulting schema several times looking for possible optimizations. Decide which fields can be serialized. Work out which additional tables are needed to record the information most correctly and efficiently and then return it on request. For example, if we accept that the same data can be stored in different tables, partitioned by different criteria, we can save significant CPU time on read requests.
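The idea of storing the same data in several tables with different partitioning can be sketched with two in-memory indexes. This is a toy Python model of query-driven denormalization, not a Cassandra schema. The table names `calls_by_subscriber` and `calls_by_day` and the record fields are hypothetical.

```python
from collections import defaultdict

# Two "tables" holding the same calls, partitioned by different keys.
calls_by_subscriber = defaultdict(list)
calls_by_day = defaultdict(list)

def save_call(call: dict) -> None:
    """Write the call once per table; each read then hits one partition."""
    calls_by_subscriber[call["subscriber"]].append(call)
    calls_by_day[call["started_at"][:10]].append(call)  # YYYY-MM-DD prefix

for call in [
    {"subscriber": "A", "started_at": "2019-03-01T10:00:00", "duration": 10},
    {"subscriber": "A", "started_at": "2019-03-02T09:30:00", "duration": 5},
    {"subscriber": "B", "started_at": "2019-03-01T18:45:00", "duration": 7},
]:
    save_call(call)

# Each question is answered by a single lookup instead of a full scan.
calls_of_a = len(calls_by_subscriber["A"])
calls_on_first = len(calls_by_day["2019-03-01"])
```

The trade-off is more writes and more storage in exchange for cheap reads, which fits a store that is optimized for writing.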

It is a good idea to provide for TTLs and cleanup of stale data right from the start.

When unloading data from Cassandra, the application logic should work on the FETCH principle, so that the rows are not all loaded into memory at once but are retrieved in batches.
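The batch-by-batch principle can be illustrated with a Python generator that pulls rows from a lazy source one page at a time. This is a generic sketch: `fetch_in_pages` and `row_source` are made-up names, and the row source is simulated. Cassandra drivers expose a page or fetch size setting that serves the same purpose against a real cluster.

```python
from itertools import islice

def fetch_in_pages(rows, page_size):
    """Yield lists of at most page_size rows without materializing the rest."""
    iterator = iter(rows)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page

def row_source(n):
    # Stand-in for a result set: rows are produced lazily, one at a time.
    for i in range(n):
        yield {"id": i}

page_sizes = [len(page) for page in fetch_in_pages(row_source(10), page_size=4)]
```

Only one page is held in memory at a time, so memory use depends on the page size rather than on the size of the result.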

Before moving a project to the described solution, it is advisable to check the system's fault tolerance by running a series of crash tests, such as loss of data in one data center, restoration of damaged data for a certain period, and network degradation between data centers. Such tests not only let you assess the pros and cons of the proposed architecture, but also give the engineers who run them good warm-up practice. The skill gained will be far from superfluous if the failures are reproduced in production.

If we work with critical information (such as billing data or calculation of subscriber debt), it is also worth paying attention to tools that reduce the risks arising from the DBMS's characteristics. For example, use the nodesync utility (DataStax) and work out a sound strategy for it, so that you do not overload Cassandra for the sake of consistency and run it only for certain tables during a certain period.

So what is the outcome with Cassandra after six months of life? In general, there are no unsolved problems. We also had no serious incidents or data loss. Yes, we had to think about compensating for some problems that had never come up before, but in the end this did not cloud our architectural decision much. If you want to try something new and are not afraid to, and at the same time do not want to be badly disappointed, be prepared for the fact that nothing comes for free. You will have to dig in, study the documentation, and step on your own personal rakes more than you would with an old legacy solution, and no theory will tell you in advance which rakes are waiting for you.

Source: www.habr.com
