Effective use of ClickHouse. Alexey Milovidov (Yandex)


Since ClickHouse is a specialized system, it is important to take the specifics of its architecture into account when using it. In this talk, Alexey covers examples of typical mistakes in using ClickHouse that can lead to inefficient operation. Practical examples show how choosing one data processing scheme over another can change performance by orders of magnitude.

Hello everyone! My name is Alexey, and I make ClickHouse.


First, I hasten to reassure you right away: today I am not going to tell you what ClickHouse is. To be honest, I'm tired of it. I explain what it is every time. And probably everyone already knows.


Instead, I will tell you what pitfalls there are, that is, how ClickHouse can be used incorrectly. In fact, there is no need to be afraid, because we develop ClickHouse as a system that is simple, convenient, and works out of the box. You install it, and there are no problems.

But you still have to keep in mind that this system is specialized, and you can easily stumble upon an unusual use case that takes it out of its comfort zone.

So, what kinds of rakes are there to step on? Mostly I will be talking about obvious things. Everything is obvious to everyone, everyone understands everything and can be glad that they are so smart, and those who don't understand will learn something new.


The first and simplest example, which unfortunately occurs often, is a large number of inserts with small batches, that is, a large number of small inserts.

If we look at how ClickHouse performs an insert, you can send at least a terabyte of data in a single request. That is not a problem.

Let's see what typical performance looks like. For example, we have a table with Yandex.Metrica data. Hits. 105 or so columns. 700 bytes uncompressed. And we will insert the right way, in batches of one million rows.

We insert into a MergeTree table and get half a million rows per second. Excellent. Into a replicated table it is a bit less, about 400,000 rows per second.

And if you enable quorum inserts, you get slightly lower but still decent performance: 250,000 rows per second. Quorum insert is an undocumented feature in ClickHouse*.

* As of 2020, it is documented.


What happens if you do it the wrong way? We insert one row at a time into a MergeTree table and get 59 rows per second. That is 10,000 times slower. Into ReplicatedMergeTree it is 6 rows per second. And if quorum is enabled, it comes out to 2 rows per second. In my opinion, this is complete rubbish. How can you be that slow? I even have it written on my T-shirt that ClickHouse should not be slow. Yet sometimes it happens anyway.


In fact, this is our shortcoming. We could have made everything work fine, but we didn't. And we didn't because our scenario did not need it. We already had batches. We simply received batches at our input, and there were no problems. We insert them and everything works fine. But, of course, all sorts of scenarios are possible. For example, you have a bunch of servers where data is generated. Each of them inserts data not that often, but overall you still get frequent inserts. And this has to be avoided somehow.

From a technical point of view, the point is that when you do an insert into ClickHouse, the data does not go into any memtable. We don't even have a real log-structured MergeTree, just a MergeTree, because there is neither a log nor a memTable. We simply write the data to the file system right away, already laid out by columns. If you have 100 columns, more than 200 files have to be written into a separate directory. All of this is very cumbersome.
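To get a feel for why a single-row insert is so expensive, here is a rough back-of-the-envelope sketch. It assumes two files per column (one for data, one for marks) plus a handful of per-part metadata files; the exact counts and the function names are assumptions made for illustration, not a description of ClickHouse internals.

```python
def files_per_insert(columns: int, files_per_column: int = 2, extra_files: int = 5) -> int:
    """Estimate how many files one INSERT creates for a column-oriented part.

    files_per_column and extra_files are illustrative assumptions.
    """
    return columns * files_per_column + extra_files


def files_per_row(columns: int, rows_per_batch: int) -> float:
    """Average number of files written per inserted row for a given batch size."""
    return files_per_insert(columns) / rows_per_batch


if __name__ == "__main__":
    # One row per INSERT: every row pays for the whole set of files.
    print(files_per_row(columns=100, rows_per_batch=1))
    # One million rows per INSERT: the same cost is spread over the batch.
    print(files_per_row(columns=100, rows_per_batch=1_000_000))
```

The fixed per-insert cost stays the same regardless of batch size, which is why batching amortizes it so well.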


And the question arises: "How do you do it right?" Suppose the situation is such that you still need to write data into ClickHouse somehow.

Method 1. This is the simplest way. Use some kind of distributed queue, for example Kafka. You just pull data out of Kafka and batch it once per second. And everything will be fine: you write it, and everything works well.

The downside is that Kafka is yet another cumbersome distributed system. I understand if you already have Kafka in your company. That's good, that's convenient. But if you don't, you should think three times before dragging another distributed system into your project. So it is worth considering alternatives.


Method 2. This is an old-school alternative and at the same time a very simple one. You have some server that generates your logs, and it simply writes your logs to a file. Once per second, for example, we rename this file and start a new one. A separate script, run by cron or as some daemon, takes the oldest file and writes it into ClickHouse. If you write logs once per second, everything will be fine.

But the downside of this method is that if the server where the logs are generated disappears, the data disappears with it.


Method 3. There is another interesting way that needs no temporary files at all. For example, you have some kind of ad-serving engine or another interesting daemon that generates data. You can accumulate a batch of data right in RAM, in a buffer. When enough time has passed, you set this buffer aside, create a new one, and in a separate thread insert what has already accumulated into ClickHouse.

On the other hand, the data disappears on kill -9. If your server crashes, you lose this data. Another problem is that if you cannot write to the database, the data will pile up in RAM. Then either the RAM runs out or you simply lose data.
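The in-memory buffering idea can be sketched roughly as follows. This is a minimal illustration, not production code: `send_batch` is a hypothetical callable that performs one INSERT per call, and the one-second age threshold is an assumption. As the text warns, anything still sitting in the buffer is lost if the process dies, and nothing here bounds memory if the database is unavailable.

```python
import threading
import time
from typing import Callable, List


class BatchBuffer:
    """Collects rows in memory and hands whole batches to a sender."""

    def __init__(self, send_batch: Callable[[List[str]], None], max_age_seconds: float = 1.0):
        self._send_batch = send_batch  # hypothetical: does one INSERT per call
        self._max_age = max_age_seconds
        self._rows: List[str] = []
        self._started = time.monotonic()
        self._lock = threading.Lock()

    def add(self, row: str) -> None:
        """Append a row; once the buffer is old enough, swap it out and send it."""
        with self._lock:
            self._rows.append(row)
            if time.monotonic() - self._started < self._max_age:
                return
            batch = self._swap()
        # Send in a separate thread so producers are not blocked by the INSERT.
        threading.Thread(target=self._send_batch, args=(batch,)).start()

    def flush(self) -> None:
        """Synchronously send whatever has accumulated (e.g. on shutdown)."""
        with self._lock:
            batch = self._swap()
        if batch:
            self._send_batch(batch)

    def _swap(self) -> List[str]:
        # Put the current buffer aside and start a new one. Caller holds the lock.
        batch, self._rows = self._rows, []
        self._started = time.monotonic()
        return batch
```

Swapping the list under the lock keeps the critical section tiny; the slow network call happens outside it.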


Method 4. Another interesting way. You have some server process, and it can send data to ClickHouse right away, but does so over a single connection. For example, I send an HTTP request with Transfer-Encoding: chunked containing an INSERT. It produces chunks fairly often; you can send every row separately, although there will be overhead for framing the data.

However, in this case the data is sent to ClickHouse immediately, and ClickHouse does the buffering itself.

But problems arise here too. Now you lose data both when your process is killed and when the ClickHouse process is killed, because the insert will be incomplete. And in ClickHouse, inserts are atomic only up to a certain configured threshold on the number of rows. In principle, this is an interesting way. It can be used too.
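To make the framing overhead mentioned above concrete, here is a small sketch of HTTP/1.1 chunked transfer encoding applied to one row per chunk. It only builds the bytes of the request body and does not open a connection; the tab-separated row format is an assumption for illustration.

```python
from typing import Iterable, Iterator


def chunked_frames(rows: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each row as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    for row in rows:
        if not row:
            continue  # a zero-length chunk would terminate the body early
        yield b"%x\r\n" % len(row) + row + b"\r\n"
    yield b"0\r\n\r\n"  # terminating chunk with no trailers


def framing_overhead(row: bytes) -> int:
    """Extra bytes added around one row by the chunk framing."""
    return len(b"%x" % len(row)) + 4  # hex length digits plus two CRLF pairs


if __name__ == "__main__":
    rows = [b"1\tfoo\n", b"2\tbar\n"]  # hypothetical TSV rows
    for frame in chunked_frames(rows):
        print(frame)
```

For short rows the few framing bytes per chunk are a noticeable fraction of the payload, which is the overhead the talk refers to.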


Method 5. Here is another interesting way. It is a community-developed server for batching data. I have not looked at it myself, so I cannot guarantee anything. Then again, no guarantees are given for ClickHouse itself either. It is also open source, but on the other hand, you may have gotten used to a certain standard of quality that we try to provide. But for this thing, I don't know: go to GitHub and look at the code. Maybe they wrote something decent.

* As of 2020, KittenHouse should also be added to the list for consideration.


Method 6. Another way is to use Buffer tables. The advantage of this method is that it is very easy to start using. Create a Buffer table and insert into it.

The downside is that the problem is not completely solved. If with MergeTree you need to group data into one batch per second, then with a Buffer table you need to group at least down to a few thousand inserts per second. If it is more than 10,000 per second, it will still be bad. And if you insert in batches, you have seen that you get a hundred thousand rows per second, and that is on fairly heavy data.

Also, Buffer tables have no log. If something goes wrong with your server, the data will be lost.


And as a bonus, we recently got the ability in ClickHouse to pull data from Kafka. There is a table engine, Kafka. You simply create it, and you can attach materialized views to it. In that case, it will pull data out of Kafka by itself and insert it into the tables you need.

What is especially pleasing about this feature is that we did not build it. It is a community feature. And when I say "community feature", I say it without any contempt. We read the code and did a review; it should work fine.

* As of 2020, similar support has appeared for RabbitMQ.


What else can be inconvenient or unexpected when inserting data? Suppose you make an INSERT VALUES query and write computed expressions in VALUES. For example, now() is also a computed expression. In this case ClickHouse is forced to run an interpreter for these expressions on every row, and performance drops by orders of magnitude. It is better to avoid this.

* At present the problem is completely solved; there is no longer any performance regression when using expressions in VALUES.

Another example where problems are possible is when the data in one batch belongs to a bunch of partitions. By default, ClickHouse partitions by month. If you insert a batch of a million rows that contains data for several years, you will have several dozen partitions in it. This is equivalent to having batches several dozen times smaller, because internally they are always split by partition first.

* Recently, in experimental mode, ClickHouse added support for a compact format of parts and for in-memory parts with a write-ahead log, which almost completely solves the problem.
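The effect of one batch spanning many partitions can be checked before inserting. The sketch below assumes monthly partitioning keyed as YYYYMM and counts how many rows of a batch would land in each partition; the function name and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def rows_per_partition(event_dates: Iterable[date]) -> Counter:
    """Count rows per monthly partition, keyed as YYYYMM (assumed partition key)."""
    return Counter(d.year * 100 + d.month for d in event_dates)


if __name__ == "__main__":
    # Hypothetical batch: three rows for every month of three years.
    batch = [date(2015 + i // 12, i % 12 + 1, 1) for i in range(36)] * 3
    parts = rows_per_partition(batch)
    print(len(parts), "partitions,", min(parts.values()), "rows in the smallest")
```

If the number of partitions is large relative to the batch size, each partition effectively receives its own small insert.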


Now let's look at the second kind of problem: data typing.

Data typing can be strict, or it can be stringly. Stringly is when you just go ahead and declare that all your fields are of type String. That's awful. Don't do that.

Let's figure out how to do it right in those cases where you are tempted to say: we have some field, a string, let ClickHouse figure it out, and I won't bother. It is still worth making some effort.


For example, we have an IP address. In one case we store it as a string, for example 192.168.1.1. In the other case it is a number of type UInt32*. 32 bits are enough for an IPv4 address.

First, strangely enough, the data compresses to about the same size. There will be a difference, of course, but not that large. So there are no particular problems with disk I/O.

But there is a serious difference in CPU time and query execution time.

Let's count the number of unique IP addresses when they are stored as numbers. That runs at 137 million rows per second. The same thing with strings runs at 37 million rows per second. I don't know why this coincidence happened. I ran these queries myself. Still, it is about 4 times slower.

And if you compare the space on disk, there is a difference too. It is about a quarter, because there are quite a lot of unique IP addresses. If these were strings with a small number of distinct values, they would compress by dictionary to roughly the same volume without trouble.

And a fourfold difference in time does not grow on trees. Maybe you don't care, but when I see such a difference, it makes me sad.
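If the conversion is done on the application side before inserting, it can look like the sketch below, which uses Python's standard `ipaddress` module. The helper names are illustrative.

```python
import ipaddress


def ipv4_to_uint32(text: str) -> int:
    """Convert dotted-quad text to the 32-bit integer stored in a UInt32 column."""
    return int(ipaddress.IPv4Address(text))


def uint32_to_ipv4(value: int) -> str:
    """Convert the stored integer back to dotted-quad text for display."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    number = ipv4_to_uint32("192.168.1.1")
    print(number, uint32_to_ipv4(number))
    # 11 bytes of text versus a fixed 4 bytes per value before compression.
    print(len("192.168.1.1"), "bytes as text, 4 bytes as UInt32")
```

Fixed-width integers are also cheaper to hash and compare than variable-length strings, which is consistent with the CPU-time difference described above.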


Let's look at different cases.

1. One case is when you have few distinct values. Here we use a simple practice that you probably know and can use with any DBMS. It makes sense not only for ClickHouse. Just write numeric identifiers into the database, and convert to strings and back on the application side.

For example, you have a region, and you try to store it as a string. It will say: Moscow and Moscow Region. When I see that it says "Moscow", that's still okay, but when it is "Moscow Region", it gets really sad. So many bytes.

Instead, we simply write a UInt32 number, say 250. We have 250 at Yandex, but yours may differ. Just in case, I will mention that ClickHouse has built-in support for working with a geobase. You simply write a directory of regions, including a hierarchical one, so there will be Moscow, Moscow Region, and everything you need. And you can do the conversion at query level.


The second option is roughly the same, but with support inside ClickHouse. It is the Enum data type. You simply list all the values you need inside the Enum. For example, the device type, and you write: desktop, mobile, tablet, TV. Only 4 options.

The downside is that you need to alter it from time to time. Just one more option got added, and we do an ALTER TABLE. In fact, ALTER TABLE in ClickHouse is free. It is especially free for Enum, because the data on disk does not change. Nevertheless, ALTER acquires a lock* on the table and has to wait until all running SELECTs finish. Only then is the ALTER executed, i.e. there is still some inconvenience.

* In recent versions of ClickHouse, ALTER has been made completely non-blocking.


Another option, quite unique to ClickHouse, is to plug in external dictionaries. You can write numbers into ClickHouse and keep your directories in any system that is convenient for you. For example, you can use MySQL, Mongo, Postgres. You can even write your own microservice that serves this data over HTTP. And at the ClickHouse level, you write a function that converts this data from numbers to strings.

This is a specialized but very efficient way to perform a join with an external table. There are two options. In one, the data is fully cached, fully present in RAM, and refreshed periodically. In the other, if the data does not fit in RAM, you can cache it partially.

Here is an example. There is Yandex.Direct, and there are ad campaigns and banners. There are probably about tens of millions of ad campaigns, and they roughly fit in RAM. And there are billions of banners; they do not fit. So we use a cached dictionary backed by MySQL.

The only problem is that a cached dictionary works well only if the hit rate is close to 100%. If it is lower, then for every batch of data processed by a query you will actually have to take the missing keys and go fetch the data from MySQL. As for ClickHouse, I can still guarantee that it does not slow down; I will not speak for other systems.
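Why the hit rate matters can be illustrated with a toy cache. The sketch below uses a simple LRU cache as a stand-in; it is not how ClickHouse implements cached dictionaries, and `fetch` is a hypothetical function representing a round trip to the external source such as MySQL.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class CachedDictionary:
    """Toy LRU cache in front of a slow external lookup."""

    def __init__(self, fetch: Callable[[Hashable], str], capacity: int):
        self._fetch = fetch  # hypothetical: one query to the external source
        self._capacity = capacity
        self._cache: "OrderedDict[Hashable, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> str:
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._cache[key]
        self.misses += 1  # every miss costs a trip to the external source
        value = self._fetch(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used key
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

When the working set is larger than the cache, most lookups turn into external fetches, and query time is dominated by the external system rather than by ClickHouse.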

And as a bonus, dictionaries are a very simple way to retroactively update data in ClickHouse. That is, you had a report by ad campaigns, a user simply changed an ad campaign, and in all the old data, in all reports, this data changed as well. If you write strings directly into the table, you will not be able to update them.


Another way, for when you don't know where to get identifiers for your strings: you can simply hash them. The simplest option is to take a 64-bit hash.

The only problem is that with a 64-bit hash you will almost certainly have collisions. If there are a billion strings, the probability is already noticeable.

And it would not be very good to hash ad campaign names this way. If ad campaigns of different companies get mixed up, something incomprehensible will happen.

There is a simple trick. True, it is also not quite suitable for serious data, but if something is not too critical, just add the client identifier to the dictionary key. Then you will still have collisions, but only within a single client. We use this method for the link map in Yandex.Metrica. We have URLs there, and we store hashes. We know that collisions exist, of course. But when a page is displayed, the probability that some URLs collided on one page of one user, and that this will be noticed, can be neglected.
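How noticeable the collision probability is can be estimated with the standard birthday-bound approximation. The sketch assumes an ideal, uniformly distributed hash; the per-client figure of one million strings is a made-up example.

```python
import math


def collision_probability(n_keys: int, hash_bits: int = 64) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes an ideal hash with uniformly distributed values.
    """
    space = 2.0 ** hash_bits
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


if __name__ == "__main__":
    # A billion strings hashed into one shared 64-bit space.
    print(collision_probability(10**9))
    # A million strings belonging to a single client (hypothetical size).
    print(collision_probability(10**6))
```

Scoping the key by client does not remove collisions, but it confines each one to a single client's data, where the chance of any given client seeing one is far smaller.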

As a bonus, for many operations the hashes alone are enough, and the strings themselves need not be stored anywhere.


Another example is when the strings are short, for example website domains. They can be stored as is. Or, for example, the browser language "ru" is 2 bytes. Of course, I really do feel sorry for the bytes, but don't worry, 2 bytes is nothing to regret. Please store it as is and don't worry.


Another case is the opposite one: there are many strings, many of them are unique, and the set is potentially unbounded. A typical example is search phrases or URLs. Search phrases, including those with typos. Let's see how many unique search phrases there are per day. It turns out to be almost half of all events. In this case you might think you need to normalize the data, compute identifiers, and put them into a separate table. But you don't have to. Just store these strings as is.

It is better not to invent anything, because if you store them separately, you will need to do a join. And that join is, at best, random access to memory, if it still fits in memory. If it does not fit, there will be problems.

Whereas if the data is stored in place, it is simply read in the right order from the file system and everything is fine.


If you have URLs or some other complex long strings, it is worth considering computing some kind of extract in advance and writing it into a separate column.

For URLs, for example, you can store the domain separately. If you really need only the domain, just use this column; the URLs will sit there and you won't even touch them.

Let's see what the difference is. ClickHouse has a specialized function that computes the domain. It is very fast; we have optimized it. To be honest, it doesn't even comply with the RFC, but it still covers everything we need.

In one case we simply fetch the URLs and compute the domain. That takes 166 milliseconds. If you take the ready-made domain, it takes only 67 milliseconds, almost three times faster. And it is faster not because we do less computation, but because we read less data.

That is why the slower query has the higher speed in gigabytes per second: it reads more gigabytes. That is completely unnecessary data. The query appears to run faster but takes longer to complete.

If you look at the amount of data on disk, the URL column is 126 megabytes and the domain column is only 5 megabytes. That is 25 times less. Nevertheless, the query runs only 4 times faster. But that is because the data is hot. If it were cold, it would probably be 25 times faster because of disk I/O.

By the way, if you estimate how much shorter a domain is than a URL, it comes out to about 4 times. Yet for some reason the data takes 25 times less space on disk. Why? Because of compression. The URL is compressed, and the domain is compressed. But a URL often contains a lot of garbage.
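Computing the extract at ingestion time could look like the following sketch. It uses Python's standard `urllib.parse` rather than ClickHouse's own domain function, so edge cases may be handled differently; the helper names and the row shape are assumptions.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def extract_domain(url: str) -> str:
    """Return the lowercased host of a URL, or an empty string if there is none."""
    try:
        host = urlsplit(url).hostname
    except ValueError:  # raised for malformed input such as a broken IPv6 literal
        return ""
    return host or ""


def with_domain(urls: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair every URL with its precomputed domain, ready to insert as two columns."""
    return [(url, extract_domain(url)) for url in urls]


if __name__ == "__main__":
    for row in with_domain(["https://example.com/search?q=clickhouse", "not a url"]):
        print(row)
```

Queries that only need the domain can then read the small column and never touch the URL column at all.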


And, of course, it pays to use the right data types, ones designed specifically for the values in question or that simply fit. If it is IPv4, store UInt32*. If it is IPv6, use FixedString(16), because an IPv6 address is 128 bits, i.e. it is stored directly in binary form.

But what if you sometimes have IPv4 addresses and sometimes IPv6? Yes, you can store both: one column for IPv4, another for IPv6. Of course, there is the option of mapping IPv4 into IPv6. That will work too, but if you often need the IPv4 address in queries, it is nice to put it in a separate column.

* ClickHouse now has dedicated IPv4 and IPv6 data types, which store data as efficiently as numbers but represent it as conveniently as strings.
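For the option of mapping IPv4 into IPv6, here is a small sketch that produces the 16 raw bytes suitable for a FixedString(16) column. It uses the standard IPv4-mapped form (::ffff:a.b.c.d); the function name is illustrative.

```python
import ipaddress


def to_fixed16(text: str) -> bytes:
    """Return the 16-byte binary form of an address, mapping IPv4 to ::ffff:a.b.c.d."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # IPv4-mapped IPv6: 80 zero bits, 16 one bits, then the 32-bit IPv4 address.
        addr = ipaddress.IPv6Address(b"\x00" * 10 + b"\xff\xff" + addr.packed)
    return addr.packed


if __name__ == "__main__":
    print(to_fixed16("2001:db8::1").hex())
    print(to_fixed16("192.168.1.1").hex())
```

Both kinds of address end up with the same fixed width, at the cost of 12 extra bytes per IPv4 value before compression.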


It is also important to note that it is worth preprocessing the data in advance. For example, you receive some raw logs. Perhaps you should not put them into ClickHouse right away, although it is very tempting to do nothing and have everything work. It is still worth doing whatever calculations are possible.

For example, the browser version. In a certain neighboring department, which I don't want to point a finger at, the browser version is stored like this, that is, as a string: 12.3. Then, to build a report, they take this string, split it into an array, and take the first element of the array. Naturally, everything is slow. I asked why they do this. They told me they don't like premature optimization. And I don't like premature pessimization.

So in this case it would be more correct to split it into 4 columns. Don't be afraid here, because this is ClickHouse. ClickHouse is a columnar database, and the more neat little columns, the better. If there are 5 BrowserVersion parts, make 5 columns. That is fine.
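Splitting the version once at ingestion, instead of on every query, might look like this sketch. The choice of four numeric parts follows the text; treating missing or non-numeric parts as 0 is an assumption.

```python
from typing import Tuple


def split_version(version: str, parts: int = 4) -> Tuple[int, ...]:
    """Split a dotted version string into a fixed number of integer columns.

    Missing or non-numeric components become 0 (an illustrative convention).
    """
    numbers = []
    for piece in version.split(".")[:parts]:
        is_number = piece.isascii() and piece.isdigit()
        numbers.append(int(piece) if is_number else 0)
    numbers.extend([0] * (parts - len(numbers)))  # pad short versions
    return tuple(numbers)


if __name__ == "__main__":
    major, minor, build, patch = split_version("12.3")
    print(major, minor, build, patch)
```

A report by major version then reads one small integer column rather than parsing a string for every row.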


Now let's look at what to do if you have lots of very long strings or very long arrays. They don't need to be stored in ClickHouse at all. Instead, you can store only an identifier in ClickHouse and put these long strings into some other system.

For example, one of our analytics services has event parameters. If an event arrives with a lot of parameters, we simply keep the first 512 that come along, because we can spare 512.


And if you cannot decide on your data types, you can still write the data into ClickHouse, but into a temporary table of the Log type, which is designed for temporary data. After that you can analyze what distribution of values you have there, what is in there in general, and work out the right types.

* ClickHouse now has the LowCardinality data type, which lets you store strings efficiently with less effort.


Now let's look at another interesting case. Sometimes things work strangely for people. I come in and see this. And it immediately seems that it was done by some very experienced, wise admin with extensive experience in setting up MySQL version 3.23.

Here we see a thousand tables, each of which holds the data for one remainder of dividing who-knows-what by a thousand.

In principle, I respect other people's experience, including the understanding of the suffering through which this experience may have been gained.


And the reasons are more or less clear. These are old stereotypes that may have accumulated while working with other systems. For example, MyISAM tables have no clustered primary key. So this way of splitting the data may be a desperate attempt to get the same functionality.

Another reason is that it is difficult to perform any ALTER operations on large tables. Everything gets locked. Although in modern versions of MySQL this problem is no longer so serious.

Or, for example, microsharding, but more on that later.


There is no need to do this in ClickHouse because, first, the primary key is clustered: the data is ordered by primary key.

Sometimes people ask me: "How does the performance of range queries in ClickHouse depend on table size?" I say it doesn't depend on it at all. For example, you have a table with a billion rows and you read a range of one million rows. Everything is fine. If the table has a trillion rows and you read one million rows, it will be almost the same.

And, second, all sorts of things like manual partitions are not needed. If you go and look at what is in the file system, you will see that a table is quite a serious thing. There is something like partitions inside. That is, ClickHouse does everything for you and you don't have to suffer.


ALTER in ClickHouse is free if it is ALTER ADD/DROP COLUMN.

And you should not create small tables, because whether a table has 10 rows or 10,000 rows does not matter at all. ClickHouse is a system that optimizes throughput, not latency, so there is no point in processing 10 rows.


The right thing is to use one big table. Get rid of the old stereotypes and everything will be fine.

And as a bonus, in the latest version we have the ability to define an arbitrary partitioning key in order to perform all kinds of maintenance operations on individual partitions.

If you do need many small tables, for example when you have to process some intermediate data, you receive chunks and need to transform them before writing to the final table, there is a wonderful table engine for that case: StripeLog. It is like TinyLog, only better.

* ClickHouse now also has the input table function.


Another antipattern is microsharding. For example, you need to shard data and you have 5 servers, and tomorrow there will be 6. You think about how to rebalance this data. So instead you split it not into 5 shards but into 1,000 shards, and then map each of these microshards to a separate server. You end up with, for example, 200 ClickHouse instances on one server. Separate instances on separate ports or in separate databases.


But this is not very good in ClickHouse, because even a single ClickHouse instance tries to use all available server resources to process one query. That is, you have some server with, say, 56 CPU cores. You run a query that takes one second, and it will use all 56 cores. And if you have placed 200 ClickHouse instances on that one server, it turns out that 10,000 threads will start. In general, everything will be very bad.

Another reason is that the distribution of work across these instances will be uneven. Some will finish earlier, some later. If all of this happened within one instance, ClickHouse itself would figure out how to distribute the data among threads properly.

And another reason is that you will have inter-process communication over TCP. The data will have to be serialized and deserialized, and there is a huge number of microshards. It will simply not work efficiently.


Another antipattern, although it can hardly be called an antipattern. It is a large amount of pre-aggregation.

In general, pre-aggregation is good. You had a billion rows, you aggregated them, and you got 1,000 rows, and now the query runs instantly. Everything is great. You can do it this way. For this, ClickHouse even has a special table type, AggregatingMergeTree, which performs incremental aggregation as data is inserted.

But there are times when you think: we will aggregate the data this way, and also aggregate it that way. In one neighboring department, and again I don't want to say which one, they use SummingMergeTree tables to sum by primary key, and about 20 columns are used as the primary key. Just in case, I changed the names of some columns for secrecy, but it is roughly like this.


And such problems arise. First, your data volume does not shrink very much. For example, it shrinks threefold. Threefold would be a fair price for the unlimited analytical capabilities that you get when your data is not aggregated. If the data is aggregated, then instead of analytics you get only pitiful statistics.

And what is especially nice? These people from the neighboring department sometimes come and ask to add one more column to the primary key. That is, we aggregated the data this way, and now we want a bit more. But ClickHouse has no ALTER for the primary key. So we have to write some scripts in C++. And I don't like scripts, even if they are in C++.

And if you look at what ClickHouse was created for, non-aggregated data is exactly the scenario it was born for. If you use ClickHouse for non-aggregated data, you are doing everything right. If you aggregate, that is sometimes forgivable.


Another interesting case is queries in an infinite loop. Sometimes I go to some production server and look at SHOW PROCESSLIST there. And every time I discover that something terrible is going on.

For example, like this. It is immediately clear that all of this could be done in a single query. Just write url IN and the list there.
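Collapsing a loop of per-URL queries into one query could be sketched like this. The table name `hits`, the column `url`, and the aggregate are hypothetical, since the original queries are not shown; the quoting helper escapes backslashes and single quotes and is meant for trusted input in an illustration, not as a general defense against injection.

```python
from typing import Iterable


def quote_literal(value: str) -> str:
    """Quote a string as a single-quoted SQL literal with backslash escaping."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_in_query(urls: Iterable[str]) -> str:
    """Build one query with url IN (...) instead of one query per URL."""
    values = ", ".join(quote_literal(u) for u in urls)
    # Table and column names are assumptions made for this example.
    return f"SELECT url, count() FROM hits WHERE url IN ({values}) GROUP BY url"


if __name__ == "__main__":
    print(build_in_query(["https://example.com/a", "https://example.com/b"]))
```

One pass over the data then answers for every URL in the list, instead of one pass per URL.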


Why are many such queries in an infinite loop bad? If the index is not used, you will make many passes over the same data. But suppose the index is used: for example, you have a primary key on url and you write url = something. You think that only one URL will be read from the table and everything will be fine. But actually no, because ClickHouse does everything in batches.

When it needs to read some range of data, it reads a bit more, because the index in ClickHouse is sparse. This index does not let you find one individual row in the table, only some range. And the data is compressed in blocks. To read one row, you have to take a whole block and decompress it. If you run a bunch of queries, these reads will overlap a lot, and a lot of work will be done over and over again.
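The behavior of a sparse index can be imitated with a toy model: keep only the first key of every granule of rows and use binary search to find which granules might contain a key. The class below is a simplified sketch under that assumption (a single sorted integer key, a granule of 8192 rows), not ClickHouse's actual implementation.

```python
import bisect
from typing import List, Sequence


class SparseIndex:
    """Toy sparse index: one mark (the first key) per granule of sorted rows."""

    def __init__(self, sorted_keys: Sequence[int], granularity: int = 8192):
        self._total = len(sorted_keys)
        self._granularity = granularity
        self._marks: List[int] = list(sorted_keys[::granularity])

    def rows_to_read(self, key: int) -> range:
        """Row positions that must be read to find every row equal to `key`."""
        # The last granule whose mark is below `key` may still end with `key`.
        first = max(bisect.bisect_left(self._marks, key) - 1, 0)
        # Granules whose mark is above `key` cannot contain it.
        last = bisect.bisect_right(self._marks, key)
        start = first * self._granularity
        stop = min(last * self._granularity, self._total)
        return range(start, stop)


if __name__ == "__main__":
    index = SparseIndex(list(range(100_000)))
    needed = index.rows_to_read(20_000)
    print(f"looking for 1 row, reading {len(needed)} rows")
```

Each point lookup pays for at least a whole granule, so a thousand separate lookups repeat that work a thousand times, while a single query over the same keys can share it.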

And as a bonus, note that in ClickHouse you should not be afraid to pass megabytes or even hundreds of megabytes into the IN section. I remember from our own practice that if we pass a bunch of values into the IN section in MySQL, say 100 megabytes of some numbers, then MySQL eats 10 gigabytes of memory and nothing else happens to it; everything works badly.

And the second thing is that in ClickHouse, if your queries use the index, it is never slower than a full scan. That is, if you need to read almost the whole table, it will go sequentially and read the whole table. In general, it will figure that out by itself.

But nevertheless there are some difficulties. For example, the fact that IN with a subquery does not use the index. But this is our problem and we need to fix it. There is nothing fundamental here. We will fix it*.

And another interesting thing: if you have a very long query and distributed query processing is going on, this very long query will be sent to every server without compression. For example, 100 megabytes and 500 servers. And, accordingly, you will have 50 gigabytes transferred over the network. It will be transferred, and then everything will complete successfully.

* It already does use the index; everything was fixed as promised.

And a fairly common case is when queries come from an API. For example, you have built some service of your own. And if someone needs your service, you open up the API, and literally two days later you see that something incomprehensible is going on. Everything is overloaded, and some terrible queries are coming in that should never have happened.

And there is only one solution. If you have opened an API, you will have to restrict it. For example, introduce some kind of quotas. There are no other reasonable options. Otherwise someone will immediately write a script and there will be trouble.

And ClickHouse has a special feature for this: quota accounting. Moreover, you can pass your own quota key. This can be, for example, an internal user ID. And the quotas will be counted independently for each of them.
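The idea of counting quotas independently per key can be illustrated with a toy counter. This is only a sketch of the concept, not how ClickHouse implements it: the class name KeyedQuota, the method try_acquire and the limit of two queries are all assumptions, and real quotas are configured on the server and counted per time interval.

```python
from collections import defaultdict


class KeyedQuota:
    """Toy per-key query quota; an illustration, not ClickHouse's implementation."""

    def __init__(self, max_queries):
        self.max_queries = max_queries
        self.used = defaultdict(int)  # quota key -> queries counted so far

    def try_acquire(self, quota_key):
        """Count one query for the key; return False once its limit is spent."""
        if self.used[quota_key] >= self.max_queries:
            return False
        self.used[quota_key] += 1
        return True


quota = KeyedQuota(max_queries=2)
results = [quota.try_acquire("user_42") for _ in range(3)]
print(results)                      # the third query of user_42 is rejected
print(quota.try_acquire("user_7"))  # another key has its own counter
```

The point of the key is visible in the last line: one user of the API exhausting their limit does not affect anyone else.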

Now another interesting thing: manual replication.

I know of many cases where, even though ClickHouse has built-in replication support, people replicate ClickHouse by hand.

What is the principle? You have a data processing pipeline. And it runs independently, for example, in different data centers. You write the same data in the same way to ClickHouse in each of them. True, practice shows that the data will still diverge because of some peculiarities in your code. I hope it is in your code.

And from time to time you will still have to synchronize by hand. For example, once a month the admins run rsync.

In fact, it is much easier to use the replication built into ClickHouse. But there may be some contraindications, because for that you need to use ZooKeeper. I won't say anything bad about ZooKeeper; in principle, the system works. But it happens that people don't use it because of java-phobia, because ClickHouse is such a nice system, written in C++, which you can just use and everything will be fine. And ZooKeeper is in java. And somehow you don't even want to look at it, and in that case you can use manual replication.

ClickHouse is a practical system. It takes your needs into account. If you replicate by hand, you can create a Distributed table that looks at your manual replicas and does failover between them. And there is even a special option that lets you avoid flapping, even if your replicas systematically diverge.

Further problems can arise if you use primitive table engines. ClickHouse is a construction kit with a bunch of different table engines. For all serious cases, as the documentation says, use tables from the MergeTree family. All the rest are for special cases or for tests.

In a MergeTree table you are not required to have a date and time. You can still use it. If there is no date and time, write that the default is 2000. It will work and will not require any resources.

And in the new server version you can even specify custom partitioning without a partition key. It will be the same thing.

On the other hand, you can use primitive table engines. For example, to load data once, look at it, play around and delete it. For that you can use Log.

Or, for storing small volumes for intermediate processing, there is StripeLog or TinyLog.

Memory can be used if the amount of data is small and you just want to play around with something in RAM.

ClickHouse does not really like over-normalized data.

Here is a typical example. This is a huge number of URLs. You put them into a neighboring table. And then you decide to JOIN with them, but as a rule this will not work, because ClickHouse supports only Hash JOIN. If there is not enough RAM for the large amount of data that has to be joined, the JOIN will not work*.

If the data has high cardinality, don't worry: store it in denormalized form, with the URLs right in place in the main table.

* ClickHouse now also has a merge join, and it works when the intermediate data does not fit in RAM. But it is inefficient, and the recommendation remains in force.

A couple more examples, though I already doubt whether they are anti-patterns or not.

ClickHouse has one well-known flaw. It does not know how to update*. In a sense, that is even good. If you have some important data, for example accounting, nobody will be able to alter it, because there are no updates.

* Support for updates and deletes in batch mode was added long ago.

But there are some special ways that allow updates to happen as if in the background. For example, tables of the ReplacingMergeTree type. They perform the updates during background merges. You can force this with optimize table. But do not do it too often, because it will rewrite the partition completely.
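The replace-on-merge behaviour can be pictured with a small simulation. This is a sketch of the idea only, not the engine's code: the function merge_replacing and the sample rows are made up, and it assumes each row carries an explicit version number, with the highest version per key surviving a merge.

```python
def merge_replacing(rows):
    """Collapse rows that share a key, keeping the one with the highest version.

    rows: iterable of (key, version, value) tuples, in insertion order.
    Ties on version keep the row inserted last.
    """
    latest = {}
    for key, version, value in rows:
        current = latest.get(key)
        if current is None or version >= current[0]:
            latest[key] = (version, value)
    return sorted((key, version, value) for key, (version, value) in latest.items())


inserted = [
    ("user_1", 1, "old@example.com"),
    ("user_2", 1, "bob@example.com"),
    ("user_1", 2, "new@example.com"),  # the "update" is just another insert
]
print(merge_replacing(inserted))
```

Until a merge actually runs, both versions of user_1 are still stored, which is why the update only looks like it happens in the background.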

Distributed JOINs in ClickHouse are also handled poorly by the query planner.

Bad, but sometimes OK.

Using ClickHouse only to read the data back with select *.

I would not recommend using ClickHouse for heavy computations. But that is not entirely true, because we are already moving away from this recommendation. We recently added the ability to apply machine learning models in ClickHouse: Catboost. And it worries me, because I think: "What a horror. How many cycles per byte that comes out to!" I really hate spending clock cycles on bytes.

But don't be afraid, install ClickHouse, everything will be fine. If anything happens, we have a community. By the way, the community is you. And if you have any problems, you can at least go to our chat, and hopefully someone will help you.

Questions

Thanks for the talk! Where can I complain about ClickHouse crashing?

You can complain to me personally right now.

I recently started using ClickHouse. I immediately crashed the cli interface.

What a score.

A little later I crashed the server with a small select.

You have a talent.

I opened a GitHub bug, but it was ignored.

We'll see.

Alexey tricked me into coming to the talk by promising to tell how you squeeze the data inside.

Very simply.

I understood that much yesterday. More specifics, please.

There are no terrible tricks there. There is just block-by-block compression. The default is LZ4, and you can enable ZSTD*. Blocks range from 64 kilobytes to 1 megabyte.

* There is also support for specialized compression codecs that can be used in a chain with other algorithms.
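Block-by-block compression can be sketched in a few lines of Python. The sketch uses zlib from the standard library purely as a stand-in for LZ4 or ZSTD, fixes the block size at 64 KiB (the lower bound mentioned above), and uses made-up sample data; it shows why reading a single value means decompressing the whole block that holds it.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # lower bound of the block size mentioned in the talk


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and compress each one independently."""
    return [zlib.compress(data[i:i + block_size]) for i in range(0, len(data), block_size)]


def read_byte(blocks, offset, block_size=BLOCK_SIZE):
    """Read one byte by decompressing only the block that contains it."""
    block = zlib.decompress(blocks[offset // block_size])
    return block[offset % block_size]


column = bytes(i % 251 for i in range(300_000))  # a made-up "column" of bytes
blocks = compress_blocks(column)
print(len(blocks))                                    # number of compressed blocks
print(read_byte(blocks, 200_000) == column[200_000])  # value survives the round trip
```

Because every block is compressed on its own, a reader can skip straight to the block it needs, but it can never decompress less than one full block.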

Are the blocks just raw data?

Not entirely raw. There are arrays. If you have a numeric column, the numbers are laid out one after another in an array.

Got it.

Alexey, about the example with uniqExact over IPs, i.e. the fact that uniqExact takes longer to compute over strings than over numbers, and so on. What if we pull a clever trick and cast at read time? That is, you seemed to say that on disk it does not differ much. If we read strings from disk and cast them, will our aggregates be faster or not? Or will we still gain only marginally here? It seems to me that you tested this, but for some reason did not show it in the benchmark.

I think it will be slower than without the cast. In that case the IP address has to be parsed from a string. Of course, in ClickHouse our IP address parsing is optimized too. We tried very hard, but there you have numbers written in decimal form. Very inconvenient. On the other hand, the uniqExact function will work slower on strings, not only because they are strings, but also because a different specialization of the algorithm is chosen. Strings are simply processed differently.

And what if we take a more primitive data type? For example, we took the user id that we have, wrote it down as a string, and then cast it: will that be more fun or not?

I doubt it. I think it will be even sadder, because, after all, parsing numbers is a serious problem. It seems to me this colleague even gave a talk on how hard it is to parse numbers in decimal form, but maybe not.

Alexey, thank you very much for the talk! And thank you very much for ClickHouse! I have a question about plans. Is there a planned feature for incomplete updates of dictionaries?

That is, a partial reload?

Yes, yes. Like being able to specify a MySQL field there, i.e. update everything after a certain point, so that only that data is loaded if the dictionary is very large.

A very interesting feature. And I think someone suggested it in our chat. Maybe it was even you.

I don't think so.

Great, so now it turns out there are two requests. And we can slowly start working on it. But I want to warn you right away that this feature is quite simple to implement. That is, in theory, you just need to write a version number in the table and then write: version less than such-and-such. Which means that, most likely, we will offer it to enthusiasts. Are you an enthusiast?

Yes, but, unfortunately, not in C++.

Do your colleagues know how to write in C++?

I'll find someone.

Great*.

* The feature was added two months after the talk: the author of the question developed it and sent his own pull request.

Thank you!

Hello! Thanks for the talk! You mentioned that ClickHouse is very good at consuming all the resources available to it. And the speaker next door, from Luxoft, talked about his solution for Russian Post. He said they liked ClickHouse very much, but they did not use it in place of their main competitor precisely because it ate all the CPU. And they could not plug it into their architecture, into their ZooKeeper with dockers. Is it possible to somehow limit ClickHouse so that it does not consume everything that is available to it?

Yes, it is possible and very easy. If you want it to consume fewer cores, just write set max_threads = 1. And that's it, the query will be executed on one core. Moreover, you can specify different settings for different users. So there is no problem. And tell your colleagues from Luxoft that it is not good that they did not find this setting in the documentation.
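As a small illustration, a setting such as max_threads can also travel with an individual query. The sketch below only builds a URL for the ClickHouse HTTP interface, which accepts settings as URL parameters; the helper name query_url and the table hits are assumptions, and localhost:8123 is simply the default HTTP port. Nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def query_url(sql, base="http://localhost:8123/", **settings):
    """Build an HTTP-interface URL that carries per-query settings."""
    params = {"query": sql}
    params.update({name: str(value) for name, value in settings.items()})
    return base + "?" + urlencode(params)


url = query_url("SELECT count() FROM hits", max_threads=1)
print(url)
print(parse_qs(urlsplit(url).query)["max_threads"])  # the setting as the server would see it
```

For a limit that should apply to every query of a user, a settings profile on the server is the more natural place than each request.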

Alexey, hello! I would like to ask about this. It is not the first time I have heard that many people are starting to use ClickHouse as a log store. In the talk you said not to do that, i.e. not to store long strings. What do you think about it?

First, logs are, as a rule, not long strings. There are, of course, exceptions. For example, some service written in java throws an exception, and it gets logged. And so on in an infinite loop, and the hard drive runs out of space. The solution is very simple. If the lines are very long, cut them. What does long mean? Tens of kilobytes is bad*.

* In recent versions of ClickHouse, "adaptive index granularity" is enabled, which for the most part eliminates the problem of storing long rows.

And is a kilobyte normal?

Normal.
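Cutting overly long lines before they are inserted takes only a few lines of code. In this sketch the 10 KiB cap is a hypothetical choice (the answer above only says that tens of kilobytes is too long), and truncate_line is a made-up helper that cuts on a UTF-8 character boundary.

```python
MAX_BYTES = 10 * 1024  # hypothetical cap; the talk only says tens of kilobytes is too long


def truncate_line(line, max_bytes=MAX_BYTES):
    """Cut a log line to at most max_bytes of UTF-8 without splitting a character."""
    encoded = line.encode("utf-8")
    if len(encoded) <= max_bytes:
        return line
    # errors="ignore" drops a trailing partial multi-byte character, if any.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


stack_trace = "java.lang.IllegalStateException: " + "x" * 50_000
short = truncate_line(stack_trace)
print(len(stack_trace), "->", len(short))
```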

Hello! Thanks for the talk! I already asked about this in the chat, but I don't remember whether I got an answer. Are there plans to somehow extend the WITH section in the manner of CTE?

Not yet. Our WITH section is somewhat frivolous. For us it is more like a small feature.

I see. Thanks!

Thanks for the talk! Very interesting! A global question. Are there any plans to rework data deletion, perhaps in the form of some kind of stubs?

Definitely. This is the first task in our queue. Right now we are actively thinking about how to do everything properly. And then we should start pressing the keys on the keyboard*.

* The keys were pressed and everything was done.

Will this somehow affect system performance or not? Will inserts be as fast as they are now?

Perhaps the deletes themselves and the updates themselves will be very heavy, but this will not affect the performance of selects or the performance of inserts.

And one more small question. In the presentation you talked about the primary key. Accordingly, we have partitioning, which is by month by default, right? And when we specify a date range that fits within a month, only that partition is read, right?

Yes.

A question. If we cannot pick any primary key, is it right to use the "Date" field specifically, so that there is less re-sorting of the data in the background and it is laid out in a more orderly way? If you have no range queries and cannot even choose any primary key, is it worth putting the date into the primary key?

Yes.

Maybe it makes sense to put into the primary key a field by which the data will compress better when sorted by it. For example, the user ID. A user, for example, keeps going to the same site. In that case, put the user id and the time. Then your data will compress better. As for the date, if you really do not have and never will have range queries by date, you do not need to put the date into the primary key.
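The effect of sort order on compression is easy to check on synthetic data. The sketch below is an illustration under stated assumptions: zlib stands in for LZ4 or ZSTD, the column is 100,000 made-up user IDs drawn from 1000 users, and the only thing compared is the compressed size of that one column in arrival order versus sorted order.

```python
import random
import zlib


def column_bytes(values):
    """Serialize integers as fixed-width 4-byte values, like a UInt32 column."""
    return b"".join(v.to_bytes(4, "little") for v in values)


rng = random.Random(0)  # fixed seed so the run is repeatable
user_ids = [rng.randrange(1000) for _ in range(100_000)]  # arrival order

arrival_size = len(zlib.compress(column_bytes(user_ids)))
sorted_size = len(zlib.compress(column_bytes(sorted(user_ids))))
print(arrival_size, sorted_size)  # the sorted column compresses to far fewer bytes
```

Sorting by user ID turns the column into long runs of identical values, and columns that correlate with the user (such as the site they keep visiting) would benefit in the same way.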

Okay, thank you very much!

Source: www.habr.com
