Effective use of ClickHouse. Alexey Milovidov (Yandex)

Because ClickHouse is a specialized system, it is important to take the specifics of its architecture into account when using it. In this talk, Alexey covers examples of common mistakes in using ClickHouse that can lead to inefficient operation. Practical examples show how choosing one data processing scheme or another can change performance by orders of magnitude.

Hello everyone! My name is Alexey, and I make ClickHouse.

First, let me reassure you right away: today I am not going to tell you what ClickHouse is. To be honest, I am tired of it. I explain what it is every time, and probably everyone knows already.

Instead, I will tell you what rakes you can step on, that is, how to use ClickHouse incorrectly. In fact, there is nothing to be afraid of, because we develop ClickHouse as a system that is simple, convenient, and works out of the box. You install it and there are no problems.

But you still have to keep in mind that this system is specialized, and you can easily run into an unusual use case that takes it out of its comfort zone.

So, what kinds of rakes are there? Mostly I will talk about obvious things. Those for whom everything is obvious can enjoy being so smart, and those who do not understand will learn something new.

The first and simplest example, which unfortunately comes up often, is a large number of inserts in small batches, that is, a large number of small inserts.

Given how ClickHouse performs an insert, you can send at least a terabyte of data in a single request. That is not a problem.

Let's see what typical performance looks like. For example, we have a table with Yandex.Metrica data: hits, 105 columns, 700 bytes per row uncompressed. And we will insert the proper way, in batches of one million rows.

We insert into a MergeTree table and get half a million rows per second. Excellent. Into a replicated table it is a bit less, about 400,000 rows per second.

And if you enable quorum inserts, you get a bit less again but still decent performance, 250,000 rows per second. Quorum insert is an undocumented feature in ClickHouse*.

* as of 2020, it is documented.

What happens if you do it badly? We insert one row at a time into a MergeTree table and get 59 rows per second. That is 10,000 times slower. In ReplicatedMergeTree it is 6 rows per second. And with quorum enabled it is slower still. In my opinion, this is absolute garbage. How can anything slow down that much? It even says on my T-shirt that ClickHouse must not slow down. And yet it sometimes happens.

In fact, this is our shortcoming. We could easily have made it all work well, but we did not. And we did not because our own scenario did not need it. We already had batchers. We simply received batches at our input, and there were no problems. We insert them and everything works fine. But, of course, all sorts of scenarios are possible. For example, you have a bunch of servers where data is generated. They do not insert data that often, but the inserts still end up frequent. And somehow you need to avoid that.

From a technical point of view, the point is that when you insert into ClickHouse, the data does not go into any memtable. We do not even have a true log-structured MergeTree, just a MergeTree, because there is neither a log nor a memtable. We simply write the data straight to the file system, already laid out by columns. And if you have 100 columns, more than 200 files have to be written into a separate directory. All of this is very heavy.

And the question arises: "How do you do it right?" Suppose the situation is such that you still need to write data into ClickHouse somehow.

Method 1. This is the simplest way. Use some kind of distributed queue, for example Kafka. You just pull data out of Kafka and batch it once per second. And everything will be fine: you write it and everything works well.

The downside is that Kafka is yet another heavyweight distributed system. I also understand if you already have Kafka at your company. That is good, that is convenient. But if you do not, think three times before dragging another distributed system into your project. So it is worth considering the alternatives.

Method 2. This is an old-school alternative and at the same time a very simple one. You have some server that generates your logs, and it simply writes them to a file. Once per second, for example, we rename this file and start a new one. A separate script, run from cron or as some daemon, takes the oldest file and writes it into ClickHouse. If you write logs once per second, everything will be fine.

But the downside of this method is that if the server where the logs are generated disappears somewhere, the data disappears too.
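
The rename-and-load scheme of Method 2 can be sketched roughly as follows. This is only an illustration under assumptions: the function names `rotate` and `load_oldest`, the spool directory layout, and the `insert_batch` callback (which stands in for whatever actually sends one INSERT to ClickHouse) are hypothetical and do not come from the talk.

```python
import os
import time
from typing import Callable, List, Optional


def rotate(active_path: str, spool_dir: str) -> Optional[str]:
    """Move the active log file into the spool directory and return its new path."""
    if not os.path.exists(active_path) or os.path.getsize(active_path) == 0:
        return None  # nothing was written since the last rotation
    # A zero-padded nanosecond timestamp keeps rotated files sortable by age.
    rotated = os.path.join(spool_dir, f"{time.time_ns():020d}.log")
    os.rename(active_path, rotated)  # atomic if both paths are on one file system
    return rotated


def load_oldest(spool_dir: str, insert_batch: Callable[[List[str]], None]) -> int:
    """Send the oldest rotated file as one batch; return the number of rows sent."""
    files = sorted(os.listdir(spool_dir))
    if not files:
        return 0
    path = os.path.join(spool_dir, files[0])
    with open(path, encoding="utf-8") as f:
        rows = f.read().splitlines()
    insert_batch(rows)  # hypothetical callback: one INSERT for the whole file
    os.remove(path)     # delete only after the insert did not raise
    return len(rows)
```

The writer is assumed to reopen the active file in append mode after each rotation; `rotate` would run once per second, and `load_oldest` from cron or a small daemon.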

Method 3. There is another interesting way that needs no temporary files at all. For example, you have some kind of ad server or another interesting daemon that generates data. You can accumulate a batch of data directly in RAM, in a buffer. When enough time has passed, you set this buffer aside, create a new one, and in a separate thread insert what has already accumulated into ClickHouse.

On the other hand, the data is also lost on kill -9. If your server crashes, you lose this data. Another problem is that if you could not write to the database, your data keeps accumulating in RAM. Either RAM runs out or you simply lose data.
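
A minimal sketch of the buffer swap from Method 3, assuming a hypothetical `flush` callback that performs the actual insert. The class name `BatchBuffer` is made up for illustration. The sketch has the weaknesses just described: rows live only in RAM and nothing limits the buffer if inserts keep failing.

```python
import threading
from typing import Callable, List


class BatchBuffer:
    """Accumulate rows in memory and hand full buffers to a background thread."""

    def __init__(self, flush: Callable[[List[str]], None]) -> None:
        self._flush = flush  # hypothetical callback that inserts one batch
        self._rows: List[str] = []
        self._lock = threading.Lock()

    def add(self, row: str) -> None:
        with self._lock:
            self._rows.append(row)

    def swap_and_flush(self) -> threading.Thread:
        # Set the accumulated buffer aside and start a fresh one under the lock,
        # so producers are blocked only for the duration of the swap.
        with self._lock:
            full, self._rows = self._rows, []
        # Insert what has accumulated in a separate thread.
        worker = threading.Thread(target=self._flush_if_any, args=(full,))
        worker.start()
        return worker

    def _flush_if_any(self, rows: List[str]) -> None:
        if rows:
            self._flush(rows)
```

A timer would call `swap_and_flush()` about once per second, so that ClickHouse sees one insert per second however many rows arrive.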

Method 4. Another interesting way. You have some server process, and it can send data to ClickHouse right away, but do it over a single connection. For example, I send an HTTP request with Transfer-Encoding: chunked that contains an INSERT. It produces chunks fairly frequently; you can even send every row separately, although there will be overhead for framing this data.

However, in this case the data is sent to ClickHouse right away, and ClickHouse buffers it itself.

But problems arise here too. Now you lose data both when your process is killed and when the ClickHouse process is killed, because the insert will be incomplete. And in ClickHouse inserts are atomic up to a certain threshold specified as a number of rows. In principle, this is an interesting method. It can also be used.

Method 5. Here is another interesting way. It is a community-developed server for batching data. I have not looked at it myself, so I cannot guarantee anything. Then again, no guarantees are given for ClickHouse itself either. It is also open source, but on the other hand you may have gotten used to a certain level of quality that we try to provide. As for this thing, I do not know; go to GitHub and look at the code. Maybe they wrote something decent.

* as of 2020, KittenHouse should also be added to the list of options.

Method 6. Another way is to use Buffer tables. The advantage of this method is that it is very easy to start using. Create a Buffer table and insert into it.

The downside is that the problem is not fully solved. If, with MergeTree, you should group data into about one batch per second, then with a Buffer table you need to group it into at least up to a few thousand inserts per second. If it is more than 10,000 per second, it will still be bad. And if you insert in batches, you saw that you get a hundred thousand rows per second, and that is on fairly heavy data.

Also, Buffer tables have no log. If something goes wrong with your server, the data is lost.

And as a bonus, we recently got the ability in ClickHouse to pull data from Kafka. There is a table engine, Kafka. You simply create a table, and you can attach materialized views to it. In that case it pulls data out of Kafka itself and inserts it into the tables you need.

What is especially pleasing about this feature is that we did not build it. It is a community feature. And when I say "community feature", I mean it without any contempt. We read the code and did a review; it should work fine.

* as of 2020, similar support has appeared for RabbitMQ.

What else can be inconvenient or unexpected when inserting data? Suppose you run an INSERT ... VALUES query and write computed expressions in VALUES. For example, now() is also a computed expression. In that case ClickHouse is forced to run an interpreter for these expressions on every row, and performance drops by orders of magnitude. It is better to avoid this.

* at the moment, the problem is completely solved; there is no longer any performance regression when using expressions in VALUES.

Another example where problems can arise is when the data in one batch belongs to many partitions. By default, ClickHouse partitions by month. If you insert a batch of a million rows that contains data for several years, you end up with several dozen partitions in it. That is equivalent to having batches several dozen times smaller, because internally they are always split by partition first.
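
To see how much a batch shrinks once it is split by partition, here is a small sketch assuming the monthly partitioning described above. The function name `rows_per_partition` and the synthetic batch are made up for illustration.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def rows_per_partition(dates: Iterable[date]) -> Counter:
    """Count rows per monthly partition, keyed as YYYYMM."""
    return Counter(d.year * 100 + d.month for d in dates)


# A synthetic batch of one million rows spread evenly over three years.
batch = [date(2017 + i % 3, 1 + (i // 3) % 12, 1) for i in range(1_000_000)]
parts = rows_per_partition(batch)

# Each partition receives only a small slice of the original batch.
print(len(parts), "partitions, largest slice:", max(parts.values()), "rows")
```

With data this spread out, one insert of a million rows behaves like dozens of much smaller inserts, which is the effect described above.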

* Recently, in experimental mode, ClickHouse added support for a compact format of parts and for in-memory parts with a write-ahead log, which almost completely solves the problem.

Now let's look at the second kind of problem: data typing.

Data typing can be strict, and it can be string-based. String-based is when you simply declare all your fields to be of type String. That sucks. Do not do this.

Let's figure out how to do it properly in cases where you want to say: we have some field, a string, let ClickHouse sort it out by itself and I will not bother. It is still worth making some effort.

For example, we have an IP address. In one case we store it as a string, for example 192.168.1.1. In the other case it is a number of type UInt32*. 32 bits are enough for an IPv4 address.

First, oddly enough, the data compresses roughly equally well. There will be a difference, of course, but not that large. So there are no particular problems with disk I/O.

But there is a serious difference in CPU time and query execution time.

Let's count the number of unique IP addresses when they are stored as numbers. That comes out at 137 million rows per second. If the same data is stored as strings, it is 37 million rows per second. I do not know why the numbers coincide like that; I ran these queries myself. But it is still about 4 times slower.

And if you measure the difference in disk space, there is a difference too. It is about a quarter, because there are quite a lot of unique IP addresses. If these were strings with a small number of distinct values, dictionary-style compression would easily bring them down to roughly the same size.

And a fourfold difference in time is nothing to sneeze at. Maybe you do not care, but when I see such a difference, it makes me sad.
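
The conversion between the string form and the UInt32 form discussed above is simple to do on the application side. A minimal sketch using Python's standard `ipaddress` module; the helper names are made up for illustration.

```python
import ipaddress


def ipv4_to_uint32(text: str) -> int:
    """Convert dotted-quad text to the 32-bit number that would be stored."""
    return int(ipaddress.IPv4Address(text))


def uint32_to_ipv4(value: int) -> str:
    """Convert the stored number back to text for display."""
    return str(ipaddress.IPv4Address(value))


number = ipv4_to_uint32("192.168.1.1")
print(number, uint32_to_ipv4(number))
```

The numeric form always takes 4 bytes, while the text form here takes 11 and varies in length from row to row.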

Let's look at different cases.

1. One case is when you have few distinct unique values. Here we use a simple practice that you probably know and can use with any DBMS. It makes sense not only for ClickHouse. Just store numeric identifiers in the database, and convert to strings and back on the side of your application.

For example, you have a region, and you try to store it as a string. It will say: Moscow and Moscow Region. When I see "Moscow" written there, that is still tolerable, but when it is "Moscow Region", it gets really sad. That is so many bytes.

Instead, we simply write a UInt32 number, 250. At Yandex it is 250, but yours may differ. Just in case, I will mention that ClickHouse has built-in support for working with a geobase. You simply put in place a directory of regions, including a hierarchical one, so it will contain Moscow, Moscow Region, and everything you need. And you can convert at the query level.

The second option is about the same, but with support inside ClickHouse. This is the Enum data type. You simply list all the values you need inside the Enum. For example, a device type, and you write: desktop, mobile, tablet, TV. Four options in total.

The downside is that you have to alter it from time to time. Just one more option was added, so we do an ALTER TABLE. In fact, ALTER TABLE in ClickHouse is free. It is especially free for Enum, because the data on disk does not change. Nevertheless, ALTER takes a lock* on the table and has to wait until all running SELECTs finish. Only after that is the ALTER executed, so there is still some inconvenience.

* in recent versions of ClickHouse, ALTER has been made completely non-blocking.

Another option, quite unique to ClickHouse, is to connect external dictionaries. You can store numbers in ClickHouse and keep your reference data in any system that is convenient for you. For example, you can use MySQL, Mongo, Postgres. You can even build your own microservice that serves this data over HTTP. And at the ClickHouse level you write a function that converts this data from numbers to strings.

This is a specialized but very efficient way of doing a join with an external table. There are two options. In one, the data is fully cached, fully present in RAM, and refreshed at some interval. In the other, if the data does not fit in RAM, you can cache it partially.

Here is an example. There is Yandex.Direct, and there are advertising campaigns and banners. There are on the order of tens of millions of advertising campaigns, and they fit in RAM reasonably well. And there are billions of banners; they do not fit. So we use a cached dictionary backed by MySQL.

The only problem is that a cached dictionary works well only if the hit rate is close to 100%. If it is lower, then while processing queries, for every batch of data the missing keys actually have to be collected and fetched from MySQL. As for ClickHouse, I can still guarantee that it does not slow down; I will not speak for other systems.
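
Why the hit rate matters so much can be seen in a toy model of a cached dictionary. This is a sketch of the general idea only, not ClickHouse's implementation: the class `CachedDictionary`, the LRU eviction policy, and the `fetch_many` callback (standing in for a query to the external source) are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Callable, Dict, Iterable, List


class CachedDictionary:
    """Toy key-to-string cache that fetches missing keys once per batch."""

    def __init__(self, fetch_many: Callable[[List[int]], Dict[int, str]],
                 capacity: int) -> None:
        self._fetch_many = fetch_many  # hypothetical call to the external source
        self._capacity = capacity
        self._cache: "OrderedDict[int, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup_batch(self, keys: Iterable[int]) -> List[str]:
        keys = list(keys)
        values: Dict[int, str] = {}
        missing: List[int] = []
        for key in dict.fromkeys(keys):  # unique keys in first-seen order
            if key in self._cache:
                self._cache.move_to_end(key)  # mark as recently used
                values[key] = self._cache[key]
                self.hits += 1
            else:
                missing.append(key)
                self.misses += 1
        if missing:
            fetched = self._fetch_many(missing)  # one slow round trip per batch
            for key in missing:
                values[key] = fetched[key]  # assumes the source knows every key
                self._cache[key] = fetched[key]
                if len(self._cache) > self._capacity:
                    self._cache.popitem(last=False)  # evict least recently used
        return [values[key] for key in keys]
```

As long as every key is already cached, `lookup_batch` never leaves memory. A single missing key per batch is enough to add a round trip to the external source for that batch.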

And as a bonus, dictionaries are a very simple way to retroactively update data in ClickHouse. That is, you had a report on advertising campaigns, a user simply changed a campaign, and in all the old data, in all the reports, this data changed as well. If you write rows directly into the table, you cannot update them.

Another way, when you do not know where to get identifiers for your strings: you can simply hash them. The simplest option is to take a 64-bit hash.

The only problem is that with a 64-bit hash you will most likely have collisions. If there are a billion rows, the probability is already noticeable.

And it would not be very good to hash advertising campaign names this way. If the campaigns of different companies get mixed up, something incomprehensible will come out.

There is a simple trick. True, it is also not well suited to critical data, but if things are not too serious, just add the client identifier to the dictionary key. Then you will still have collisions, but only within a single client. We use this method for the link map in Yandex.Metrica. We have URLs there and we store hashes. And we know that there are collisions, of course. But when a page is displayed, the probability that on one page of one user some URLs have collided, and that this will be noticed, can be neglected.

As a bonus, for many operations hashes alone are enough, and the strings themselves need not be stored anywhere.
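
Both points above, the collision risk and the client-identifier trick, can be illustrated with a short sketch. The choice of BLAKE2 as the hash function and the names `hash64` and `collision_probability` are assumptions for illustration; the talk does not say which hash is used. The probability uses the standard birthday-bound approximation.

```python
import hashlib
import math


def hash64(client_id: int, text: str) -> int:
    """64-bit hash of a string, with the client identifier mixed into the key."""
    key = client_id.to_bytes(8, "little") + text.encode("utf-8")
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")


def collision_probability(distinct_values: int, bits: int = 64) -> float:
    """Birthday-bound estimate: 1 - exp(-n * (n - 1) / 2^(bits + 1))."""
    n = distinct_values
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


for n in (10**6, 10**9, 5 * 10**9):
    print(f"{n:>13,} distinct strings: {collision_probability(n):.2e}")
```

Under this estimate the chance of at least one collision is negligible for a million distinct strings, a few percent for a billion, and it grows roughly quadratically from there. Within one client the number of distinct strings is far smaller, which is why scoping the key by client keeps the risk low.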

Another example is when the strings are short, for example website domains. They can be stored as is. Or, for example, the browser language "ru" is 2 bytes. Of course, I really do begrudge bytes, but do not worry, 2 bytes are not a pity. Please store it as is.

Another case is the opposite: there are very many strings with very many unique values among them, and the set is potentially unbounded. A typical example is search phrases or URLs. Search phrases, typos included. Let's see how many unique search phrases there are per day. It turns out they are almost half of all events. In this case you might think that you need to normalize the data, assign identifiers, and put it into a separate table. But you do not need to do that. Just store these strings as is.

It is better not to invent anything, because if you store them separately, you will need a join. And that join is, at best, random memory access, if it still fits in memory. If it does not fit, there will be problems.

Whereas if the data is stored in place, it is simply read in the right order from the file system and everything is fine.

If you have URLs or some other complex long strings, consider computing some kind of extract in advance and writing it to a separate column.

For URLs, for example, you can store the domain separately. If you really need the domain, just use that column; the URLs will sit there and you will not even touch them.
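
Computing the extract at ingestion time might look like the sketch below. It uses Python's standard `urllib.parse`, which is an assumption made for illustration and does not necessarily behave like the ClickHouse function mentioned in this talk; the helper name `with_domain` is made up.

```python
from typing import Tuple
from urllib.parse import urlsplit


def with_domain(url: str) -> Tuple[str, str]:
    """Return (url, domain) so the domain can be written to its own column."""
    return url, (urlsplit(url).hostname or "")


row = with_domain("https://example.com/search?q=clickhouse&page=2")
print(row)
```

Queries that only need the domain can then read the short column and never touch the long one.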

Let's see what the difference is. ClickHouse has a specialized function that computes the domain. It is very fast; we optimized it. To be honest, it does not even comply with the RFC, but it nevertheless handles everything we need.

In one case we simply take the URLs and compute the domain. That takes 166 milliseconds. If you take the precomputed domain, it takes only 67 milliseconds, almost three times faster. And it is faster not because we need to do some computation, but because we read less data.

That is why the one query, the slower one, shows a higher speed in gigabytes per second: it reads more gigabytes. That is completely unnecessary data. The query seems to run faster, but it takes longer to complete.

If you look at the amount of data on disk, the URL column is 126 megabytes and the domain column only 5 megabytes. That is 25 times less. Yet the query runs only 4 times faster. But that is because the data is hot. If it were cold, it would probably be 25 times faster because of disk I/O.

By the way, if you estimate how much shorter a domain is than a URL, it comes out at about 4 times shorter. Yet for some reason the data takes 25 times less space on disk. Why? Because of compression. Both the URL and the domain are compressed, but a URL often contains a pile of garbage.

And, of course, it pays to use the right data types, those designed for the values in question or that fit them. If you have IPv4, store UInt32*. If IPv6, then FixedString(16), because an IPv6 address is 128 bits, that is, stored directly in binary format.

But what if you sometimes have IPv4 addresses and sometimes IPv6? You can store both: one column for IPv4, another for IPv6. Of course, there is the option of mapping IPv4 into IPv6. That will work too, but if you often need the IPv4 address in queries, it is nicer to put it in a separate column.
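
A minimal sketch of producing the 16-byte binary form, including the option of mapping IPv4 into IPv6. It assumes the standard IPv4-mapped layout (`::ffff:a.b.c.d`); the helper name `to_fixed16` is made up for illustration.

```python
import ipaddress


def to_fixed16(text: str) -> bytes:
    """Return the 16-byte binary form of an IPv4 or IPv6 address."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # IPv4-mapped IPv6 address: 80 zero bits, 16 one bits, then the IPv4 bytes.
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed


print(to_fixed16("2001:db8::1").hex())
print(to_fixed16("192.168.1.1").hex())
```

Every value is exactly 16 bytes, so it fits a fixed-width column, and the original IPv4 address can still be recovered from the last 4 bytes of a mapped value.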

* ClickHouse now has separate IPv4 and IPv6 data types that store data as efficiently as numbers but display it as conveniently as strings.

It is also important to note that it is worth preprocessing data in advance. For example, raw logs arrive. Perhaps you should not shove them into ClickHouse right away, although it is very tempting to do nothing and have everything work. It is still worth doing whatever computations you can.

For example, the browser version. In one neighboring department, which I do not want to point a finger at, the browser version is stored like this, as a string: 12.3. Then, to build a report, they take this string, split it into an array, and take the first element of the array. Naturally, everything is slow. I asked why they do that. They told me they do not like premature optimization. And I do not like premature pessimization.

So in this case it is more correct to split it into 4 columns. Do not be afraid here, because this is ClickHouse. ClickHouse is a columnar database, and the more neat little columns, the better. If there are 5 BrowserVersion components, make 5 columns. That is fine.
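
Splitting the version string once, before the insert, could look like the sketch below. The function name `split_version`, the default of four components, and the choice to store 0 for missing or non-numeric components are assumptions for illustration.

```python
from typing import Tuple


def split_version(version: str, parts: int = 4) -> Tuple[int, ...]:
    """Split a dotted version string into a fixed number of integer components."""
    numbers = []
    for piece in version.split(".")[:parts]:
        # Anything that is not a plain ASCII number becomes 0.
        numbers.append(int(piece) if piece.isascii() and piece.isdigit() else 0)
    numbers += [0] * (parts - len(numbers))  # pad short versions with zeros
    return tuple(numbers)


print(split_version("12.3"))
```

Each component then goes into its own numeric column, and a report on the major version reads one small column instead of splitting strings at query time.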

Now let's look at what to do if you have a lot of very long strings or very long arrays. They do not need to be stored in ClickHouse at all. Instead, you can store only some identifier in ClickHouse and push those long strings into some other system.

For example, one of our analytics services has some event parameters. If an event arrives with a lot of parameters, we simply keep the first 512 that come along. Because 512 we do not begrudge.

And if you cannot decide on your data types, you can also write the data into ClickHouse, but into a temporary table of the Log type, which is made specifically for temporary data. After that you can analyze what distribution of values you have there, what is in there at all, and create the proper types.

* ClickHouse now has the LowCardinality data type, which lets you store strings efficiently with less effort.

Now let's look at another interesting case. Sometimes things work strangely for people. I come in and see this. And it immediately looks as though it was done by some very experienced, smart admin with extensive experience in tuning MySQL version 3.23.

Here we see a thousand tables, each of which holds the remainder of dividing who knows what by a thousand.

In principle, I respect other people's experience, including an understanding of the suffering through which that experience may have been gained.

The reasons are more or less clear. These are old stereotypes that may have accumulated while working with other systems. For example, MyISAM tables do not have a clustered primary key. And this way of splitting data may be a desperate attempt to get the same functionality.

Another reason is that it is hard to do any ALTER operations on large tables. Everything gets locked. Although in modern versions of MySQL this problem is no longer that serious.

Or, for example, microsharding, but more on that later.

You do not need to do this in ClickHouse because, first, the primary key is clustered: the data is ordered by the primary key.

Sometimes people ask me: "How does the performance of range queries in ClickHouse change depending on table size?" I say it does not change at all. For example, you have a table with a billion rows and you read a range of a million rows. Everything is fine. If the table has a trillion rows and you read a million rows, it will be almost the same.

And second, all sorts of things like manual partitions are not needed. If you go and look at what is in the file system, you will see that a table is quite a serious thing. There is something like partitions inside. That is, ClickHouse does everything for you and you do not need to suffer.

ALTER in ClickHouse is free if it is ALTER ADD/DROP COLUMN.

And you should not make small tables, because if you have 10 rows or 10,000 rows in a table, it makes no difference at all. ClickHouse is a system that optimizes throughput, not latency, so there is no point in processing 10 rows at a time.

The right thing is to use one big table. Get rid of the old stereotypes and everything will be fine.

And as a bonus, in the latest version we now have the ability to define an arbitrary partitioning key in order to perform all sorts of maintenance operations on individual partitions.

For example, you need many small tables, say, when there is a need to process some intermediate data: you receive chunks and need to transform them before writing to the final table. For this case there is a wonderful table engine, StripeLog. It is like TinyLog, only better.

* ClickHouse now also has the input table function.

Another antipattern is microsharding. For example, you need to shard data and you have 5 servers, and tomorrow there will be 6. And you wonder how to rebalance this data. So instead you split it not into 5 shards but into 1,000 shards. Then you map each of these microshards to a separate server. And you end up with, for example, 200 ClickHouse instances on one server: separate instances on separate ports or in separate databases.

But this is not very good in ClickHouse, because even a single ClickHouse instance tries to use all available server resources to process a single query. That is, you have some server with, say, 56 CPU cores. You run a query that takes one second, and it uses 56 cores. If you put 200 ClickHouse instances on that one server, it turns out that 10,000 threads will start. In general, everything will be very bad.
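
The arithmetic behind that figure is worth spelling out. This back-of-the-envelope sketch assumes, as a simplification, that every instance starts one query thread per core; the function names are made up for illustration.

```python
def threads_started(instances: int, cores: int) -> int:
    """Threads running if every instance uses one thread per core."""
    return instances * cores


def oversubscription(instances: int, cores: int) -> float:
    """How many runnable threads compete for each physical core."""
    return threads_started(instances, cores) / cores


print(threads_started(200, 56), "threads,", oversubscription(200, 56), "per core")
```

With 200 instances on a 56-core server this gives 11,200 threads, in line with the round figure of 10,000 quoted above, and 200 threads competing for every core.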

Another reason is that the distribution of work across these instances will be uneven. Some will finish earlier, some later. If all of this happened in a single instance, ClickHouse would figure out by itself how to distribute the data properly among threads.

And one more reason is that you will have inter-process communication over TCP. Data has to be serialized and deserialized, and this is a huge number of microshards. It simply will not work efficiently.

Another antipattern, though it can hardly be called an antipattern: a large amount of pre-aggregation.

In general, pre-aggregation is good. You had a billion rows, you aggregated them and got 1,000 rows, and now the query runs instantly. Everything is great. You can do that. ClickHouse even has a special table type for this, AggregatingMergeTree, which performs incremental aggregation as data is inserted.
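
The idea of incremental aggregation, folding each new batch into running totals per key, can be shown with a toy sketch. This is only an analogy in plain Python, not how the table engine works internally; the key layout `(day, counter)` and the name `merge_sums` are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # hypothetical aggregation key: (day, counter name)


def merge_sums(state: Dict[Key, int], rows: Iterable[Tuple[str, str, int]]) -> None:
    """Fold a new batch of (day, counter, value) rows into the running totals."""
    for day, counter, value in rows:
        state[(day, counter)] += value


totals: Dict[Key, int] = defaultdict(int)
merge_sums(totals, [("2020-01-01", "hits", 3), ("2020-01-01", "hits", 4)])
merge_sums(totals, [("2020-01-01", "hits", 5), ("2020-01-02", "hits", 1)])
print(dict(totals))
```

Four input rows collapse into two stored rows. The reduction is only large when the key has few distinct values, which is why the number of key columns matters.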

But there are times when you think: we will aggregate the data this way, and also this way. And in one neighboring department, again I would rather not say which, they use SummingMergeTree tables to sum by primary key, and some 20 columns are used as the primary key. Just in case, I changed the names of some columns for confidentiality, but that is pretty much it.

And such problems arise. First, your data volume does not shrink that much. For example, it shrinks threefold. Three times would be a good price to pay for the unlimited analytics capabilities that you get when your data is not aggregated. If the data is aggregated, then instead of analytics you get only miserable statistics.

And what is especially good about it? These people from the neighboring department sometimes come and ask to add one more column to the primary key. That is, we aggregated the data this way, and now we want a bit more. But ClickHouse has no ALTER PRIMARY KEY. So we have to write some scripts in C++. And I do not like scripts, even if they are in C++.

And if you look at what ClickHouse was created for, non-aggregated data is exactly the scenario it was born for. If you use ClickHouse for non-aggregated data, you are doing it right. If you aggregate, that is sometimes forgivable.

Another interesting case is queries in an infinite loop. Sometimes I go to some production server and look at SHOW PROCESSLIST there. And every time I discover that something terrible is going on.

For example, like this. It is immediately clear that all of it could be done in a single query. Just write url IN and a list.

Kungani imibuzo eminingi enjalo ku-loop engapheli mibi? Uma inkomba ingasetshenziswa, khona-ke uzoba namaphasi amaningi kudatha efanayo. Kodwa uma inkomba isetshenziswa, isibonelo, unokhiye oyinhloko we-ru futhi ubhala url = okuthile lapho. Futhi ucabanga ukuthi uma i-URL eyodwa kuphela ifundwa etafuleni, konke kuzolunga. Kodwa empeleni cha. Ngoba i-ClickHouse yenza yonke into ngamaqoqo.

When it needs to read some range of data, it reads a bit more than that, because the index in ClickHouse is sparse. This index does not let you find one individual row in the table, only a range of some kind. And the data is compressed in blocks. To read one row, you have to take the whole block and decompress it. And if you run a bunch of queries, many of these ranges will overlap, and a lot of work will be done over and over again.
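
A rough way to see the cost is to count how many rows are touched. The Python sketch below is a simplified model, not ClickHouse internals: it assumes a table sorted by an integer key, a granule of 8192 rows (ClickHouse's default `index_granularity`), and that every lookup reads its whole granule. The keys are invented.

```python
import bisect

GRANULARITY = 8192  # rows per index granule (ClickHouse's default index_granularity)

def granule_of(sorted_keys, key, granularity=GRANULARITY):
    """Index of the granule that would hold `key` in a table sorted by key."""
    return bisect.bisect_left(sorted_keys, key) // granularity

def rows_read_separately(wanted, granularity=GRANULARITY):
    """One query per key: every query reads a whole granule."""
    return len(wanted) * granularity

def rows_read_batched(sorted_keys, wanted, granularity=GRANULARITY):
    """One query for all keys: each touched granule is read only once."""
    touched = {granule_of(sorted_keys, k, granularity) for k in wanted}
    return len(touched) * granularity

sorted_keys = list(range(100_000))  # stand-in for a table sorted by its primary key
wanted = [10, 20, 30, 9000, 9001]   # keys looked up by the hypothetical loop

print(rows_read_separately(wanted))            # 40960
print(rows_read_batched(sorted_keys, wanted))  # 16384
```

In this toy case the five separate lookups read the first granule three times and the second one twice, while the batched query reads each of them once.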

And as a bonus, note that in ClickHouse you should not be afraid to pass even megabytes, or even hundreds of megabytes, into the IN clause. I remember from our practice that if in MySQL we pass a bunch of values into the IN clause, for example 100 megabytes of some numbers, MySQL eats up 10 gigabytes of memory and nothing else happens; everything works badly.

And the second thing is that in ClickHouse, if your queries use an index, it is never slower than a full scan. That is, if you need to read almost the whole table, it will go sequentially and read the whole table. In general, it will figure it out on its own.

But there are still some difficulties. For example, the fact that IN with a subquery does not use the index. But that is our problem and we need to fix it. There is nothing fundamental here. We will fix it*.

And another interesting thing: if you have a very long query and distributed query processing is under way, this very long query will be sent to each server without compression. For example, 100 megabytes and 500 servers. Accordingly, you will have 50 gigabytes transferred over the network. It will be transferred, and then everything will complete successfully.

* this already works; everything was fixed as promised.
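
The 50 gigabyte figure for the long distributed query follows directly from the two numbers given in the talk. As a quick check in Python, using decimal units:

```python
query_size_mb = 100  # size of the query text, from the example in the talk
servers = 500        # number of servers the query is sent to

total_mb = query_size_mb * servers
total_gb = total_mb / 1000  # decimal gigabytes

print(total_mb, total_gb)  # 50000 50.0
```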

And a fairly common case is when queries come from an API. For example, you have built some service of your own. And if someone needs your service, you open up the API, and two days later you see that something incomprehensible is going on. Everything is overloaded, and some terrible queries are coming in that should never have happened.

And there is only one solution. If you have opened an API, you will have to restrict it. For example, introduce some kind of quotas. There are no other reasonable options. Otherwise someone will immediately write a script and there will be trouble.

And ClickHouse has a special feature for this: quota accounting. Moreover, you can pass your own quota key. This is, for example, an internal user ID. And the quotas will be counted independently for each of them.
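
The idea of counting quotas independently per key can be sketched as a toy model in Python. This is not ClickHouse's quota implementation: real quotas are configured on the server and counted per time interval, which this sketch leaves out, and the class and key names are invented.

```python
from collections import defaultdict

class KeyedQuota:
    """Toy quota counter: each quota key gets its own independent budget."""

    def __init__(self, max_queries):
        self.max_queries = max_queries
        self.used = defaultdict(int)

    def allow(self, quota_key):
        """Return True and count the query if the key still has budget left."""
        if self.used[quota_key] >= self.max_queries:
            return False
        self.used[quota_key] += 1
        return True

quota = KeyedQuota(max_queries=2)
# "user-1" and "user-2" are hypothetical internal user IDs used as quota keys.
print([quota.allow("user-1") for _ in range(3)])  # [True, True, False]
print(quota.allow("user-2"))                      # True
```

One API user exhausting their budget does not affect the others, which is the point of passing a quota key per internal user.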

Now another interesting thing. This is manual replication.

I know of many cases where, even though ClickHouse has built-in replication support, people replicate ClickHouse by hand.

What is the principle? You have a data processing pipeline. And it runs independently, for example, in different data centers. You write the same data in the same way into ClickHouse. True, practice shows that the data will still diverge because of some peculiarities in your code. I hope the peculiarities are in yours.

And from time to time you will still have to synchronize manually. For example, once a month the admins run rsync.

In fact, it is much easier to use the replication built into ClickHouse. But there may be some contraindications, because for this you need to use ZooKeeper. I will not say anything bad about ZooKeeper; in principle, the system works. But it happens that people do not use it because of Java-phobia, because ClickHouse is such a nice system, written in C++, which you can just use and everything will be fine. And ZooKeeper is in Java. And somehow you do not even want to look at it, and in that case you can use manual replication.

ClickHouse is a practical system. It takes your needs into account. If you have manual replication, you can create a Distributed table that looks at your manual replicas and does failover between them. And there is even a special option that lets you avoid flapping even if your rows diverge systematically.

Further problems may arise if you use primitive table engines. ClickHouse is a construction kit with a bunch of different table engines. For all serious cases, as the documentation says, use tables from the MergeTree family. Everything else is for isolated cases or for tests.

In a MergeTree table you do not have to have any date and time. You can still use it. If there is no date and time, write that the default is 2000. This will work and will not require resources.

And in the new version of the server you can even specify custom partitioning without a partition key. It will be the same thing.

On the other hand, you can use primitive table engines. For example, load data once, look at it, play with it, and delete it. You can use Log.

Or, for storing small volumes for intermediate processing, there is StripeLog or TinyLog.

Memory can be used if the amount of data is small and you can simply churn through something in RAM.

ClickHouse does not really like over-normalized data.

Here is a typical example. It is a huge number of URLs. You put them into an adjacent table. And then you decide to do a JOIN with them, but that will not work, as a rule, because ClickHouse supports only Hash JOIN. If there is not enough RAM for the large amount of data that has to be joined, the JOIN will not work*.

If the data is of high cardinality, do not worry: store it in denormalized form, with the URLs placed right in the main table.

* ClickHouse now also has a merge join, and it works when the intermediate data does not fit in RAM. But it is inefficient, and the recommendation still stands.

A couple more examples, but here I am already unsure whether they are anti-patterns or not.

ClickHouse has one well-known shortcoming. It does not know how to update*. In a sense, this is even good. If you have some important data, for example accounting data, nobody will be able to tamper with it, because there are no updates.

* Support for update and delete in batch mode was added long ago.

But there are some special ways that allow updates to happen as if in the background. For example, tables of the ReplacingMergeTree type. They perform the updates during background merges. You can force this with OPTIMIZE TABLE. But do not do this too often, because it will completely rewrite the partition.
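
To illustrate what "updates during background merges" means, here is a toy Python model of a replacing merge. It is a sketch of the idea only, not ClickHouse code: each part is a list of (key, version, value) rows, and the merge keeps the row with the highest version for each key. All data is invented.

```python
def replacing_merge(parts):
    """Merge parts, keeping for each key the row with the highest version.

    On equal versions the row seen later wins.
    """
    latest = {}
    for part in parts:
        for key, version, value in part:
            if key not in latest or version >= latest[key][0]:
                latest[key] = (version, value)
    return sorted((key, version, value) for key, (version, value) in latest.items())

# Two hypothetical parts: the second one carries a newer version of key "a".
parts = [
    [("a", 1, "old"), ("b", 1, "x")],
    [("a", 2, "new")],
]
print(replacing_merge(parts))  # [('a', 2, 'new'), ('b', 1, 'x')]
```

Until such a merge has actually run, both versions of a row can be visible to queries, which is why forcing the merge is tempting and why doing it too often is expensive.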

Distributed JOINs in ClickHouse are also handled poorly by the query planner.

Bad, but sometimes OK.

Using ClickHouse only to read data back with SELECT *.

I would not recommend using ClickHouse for heavy computations. But that is not entirely true, because we are already moving away from this recommendation. We have recently added the ability to apply machine learning models in ClickHouse: CatBoost. And it worries me, because I think: "How awful. How many cycles per byte does that come to!" I really hate wasting clock cycles on bytes.

But do not be afraid, install ClickHouse, everything will be fine. If anything goes wrong, we have a community. By the way, the community is you. And if you have any problems, you can at least come to our chat, and I hope you will get help there.

Questions

Thanks for the talk! Where can I complain about ClickHouse crashing?

You can complain to me personally right now.

I recently started using ClickHouse. I immediately crashed the CLI interface.

That is quite a score.

A little later I crashed the server with a small SELECT.

You have talent.

I opened a bug on GitHub, but it was ignored.

We will take a look.

Alexey tricked me into attending the talk by promising to tell me how you access the data inside.

Very simply.

I realized that yesterday. More details, please.

There are no terrible tricks there. There is just block-by-block compression. The default is LZ4, and you can enable ZSTD*. Blocks are from 64 kilobytes to 1 megabyte.

* there is also support for specialized compression codecs that can be chained with other algorithms.
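
Block-by-block compression can be sketched in a few lines of Python. This is an illustration only: it uses `zlib` from the standard library as a stand-in for LZ4 or ZSTD, a fixed 64 KB block size (the lower bound mentioned above), and made-up data. The point it shows is that reading a single byte means decompressing the whole block that contains it.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # assumed fixed block size for this sketch

def compress_blocks(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and compress each block independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]

def read_byte(blocks, offset, block_size=BLOCK_SIZE):
    """Read one byte: the whole containing block has to be decompressed first."""
    block = zlib.decompress(blocks[offset // block_size])
    return block[offset % block_size]

data = bytes(i % 251 for i in range(300_000))  # invented column data
blocks = compress_blocks(data)

print(len(blocks))                                   # 5
print(read_byte(blocks, 123_456) == data[123_456])   # True
```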

Are the blocks just raw data?

Not entirely raw. There are arrays. If you have a numeric column, the numbers are laid out one after another in an array.

I see.

Alexey, about the example with uniqExact over IPs, that is, the fact that uniqExact takes longer to compute over strings than over numbers, and so on. What if we pull a trick and cast at read time? That is, you seem to have said that on disk it does not differ much. If we read strings from disk and cast them, will our aggregates be faster or not? Or will we still gain only marginally here? It seems to me you tested this but for some reason did not show it in the benchmark.

I think it will be slower than without the cast. In this case the IP address has to be parsed from a string. Of course, in ClickHouse our IP address parsing is also optimized. We tried very hard, but there you have numbers written in decimal form. Very inconvenient. On the other hand, the uniqExact function will work more slowly on strings, not only because they are strings, but also because a different specialization of the algorithm is chosen. Strings are simply processed differently.
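
As an illustration of the two representations being compared, here is a small Python sketch that parses IPv4 strings into integers once and then counts exact uniques over the numbers. It uses the standard `ipaddress` module; the sample addresses are invented, and this says nothing about how ClickHouse parses IPs internally.

```python
import ipaddress

def ip_to_int(text):
    """Parse a dotted-decimal IPv4 string into a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))

ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]  # hypothetical log values

as_ints = [ip_to_int(ip) for ip in ips]  # parsing cost is paid once, here
unique_count = len(set(as_ints))         # exact unique count over fixed-size numbers

print(as_ints[0], unique_count)  # 167772161 2
```

Doing the conversion once when loading the data avoids paying the parsing cost on every query, which is the cost the answer above is pointing at.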

And what if we take a more primitive data type? For example, we took the user id that we have, wrote it down as a string, and then cast it; will that be more fun or not?

I doubt it. I think it will be even sadder, because parsing numbers is, after all, a serious problem. It seems to me that this colleague even gave a talk on how hard it is to parse numbers in decimal form, but maybe not.

Alexey, thank you very much for the talk! And thank you very much for ClickHouse! I have a question about plans. Are there plans for a feature to update dictionaries incompletely?

That is, a partial reload?

Yes, yes. Like the ability to set a MySQL field there, that is, an "updated after" field, so that only this data is loaded if the dictionary is very large.

A very interesting feature. And I think someone suggested it in our chat. Maybe it was even you.

I do not think so.

Great, so now it turns out there are two requests. And we can slowly start doing it. But I want to warn you right away that this feature is quite simple to implement. That is, in theory, you just need to write the version number in the table and then write: version less than such and such. Which means that this is, most likely, something we will offer to enthusiasts. Are you an enthusiast?

Yes, but unfortunately not in C++.

Do your colleagues know how to write in C++?

I will find someone.

Great*.

* the feature was added two months after the talk: the author of the question developed it and submitted his own pull request.

Thank you!

Hello! Thanks for the talk! You mentioned that ClickHouse is very good at consuming all the resources available to it. And the speaker next door, from Luxoft, talked about his solution for Russian Post. He said they really liked ClickHouse, but they did not use it in place of their main competitor precisely because it ate all the CPU. And they could not fit it into their architecture, into their ZooKeeper with Docker containers. Is it possible to limit ClickHouse somehow so that it does not consume everything available to it?

Yes, it is possible and very easy. If you want to use fewer cores, just write SET max_threads = 1. And that is it, it will execute the query on one core. Moreover, you can specify different settings for different users. So there is no problem. And tell your colleagues from Luxoft that it is not good that they did not find this setting in the documentation.
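
Besides SET, a setting such as max_threads can be passed per request as a URL parameter of ClickHouse's HTTP interface. The Python sketch below only builds such a URL and does not send anything; the host, the port 8123, the `hits` table, and the helper name are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

def query_url(sql, base="http://localhost:8123/", **settings):
    """Build an HTTP interface URL that carries the query and per-query settings."""
    params = {"query": sql, **settings}
    return base + "?" + urlencode(params, quote_via=quote)

# `hits` is a hypothetical table; max_threads=1 limits the query to one thread.
url = query_url("SELECT count() FROM hits", max_threads=1)
print(url)
```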

Alexey, hello! I would like to ask this question. This is not the first time I have heard that many people are starting to use ClickHouse as storage for logs. In the talk you said not to do this, that is, that you should not store long strings. What do you think about it?

First, logs are, as a rule, not long strings. There are, of course, exceptions. For example, some service written in Java throws an exception, and it gets logged. And so on in an infinite loop, until the disk space runs out. The solution is very simple. If the strings are very long, cut them. What does long mean? Tens of kilobytes is bad*.

* in recent versions of ClickHouse, "adaptive index granularity" is enabled, which for the most part removes the problem of storing long strings.
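
"Cut them" can be done before the data is inserted. A minimal Python sketch, assuming the limit is expressed in bytes of UTF-8 and that silently dropping a partially cut character at the end is acceptable; the function name and the limit are invented:

```python
def truncate_utf8(text, max_bytes):
    """Cut a string to at most `max_bytes` of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete multi-byte sequence left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

message = "java.lang.IllegalStateException: " + "x" * 100_000  # hypothetical log line
short = truncate_utf8(message, 1024)
print(len(short.encode("utf-8")))  # 1024
```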

Is a kilobyte normal?

That is fine.

Hello! Thanks for the talk! I already asked about this in the chat, but I do not remember whether I got an answer. Are there any plans to extend the WITH clause in the manner of CTEs?

Not yet. Our WITH clause is somewhat frivolous. It is more of a small feature for us.

I see. Thank you!

Thanks for the talk! Very interesting! A global question. Are there plans to rework data deletion, perhaps in the form of some kind of stubs?

Definitely. This is the first task in our queue. We are now actively thinking about how to do everything properly. And then we should start pressing keys on the keyboard*.

* the keys were pressed and everything was done.

Will this somehow affect system performance or not? Will inserts be as fast as they are now?

Perhaps the deletes themselves and the updates themselves will be very heavy, but this will not affect the performance of SELECTs or the performance of INSERTs.

And one more small question. In the presentation you talked about the primary key. Accordingly, we have partitioning, which is by month by default, right? And when we specify a date range that fits within a month, only that partition is read, right?

Yes.

A question. If we cannot choose any primary key, is it right to build it specifically on the "Date" field, so that there is less re-sorting of this data in the background and it is laid out in a more orderly way? If you have no range queries and cannot even choose any primary key, is it worth putting the date into the primary key?

Yes.

Perhaps it makes sense to put into the primary key a field by which the data will compress better once sorted by that field. For example, a user ID. A user, for example, keeps going to the same site. In that case, put in the user id and the time. Then your data will compress better. As for the date, if you really do not have, and never will have, range queries on dates, you do not have to put the date into the primary key.
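
The effect of sort order on compression is easy to try on synthetic data. The Python sketch below is an illustration under strong assumptions: every user always visits one fixed site, the data is random and invented, and `zlib` stands in for the real codecs. It compresses the same "site" column twice, once in arrival order and once sorted by user id.

```python
import random
import zlib

random.seed(0)

# Invented data: 20,000 visits by 1,000 users, each user tied to one site.
rows = [
    (user_id, f"site-{user_id % 50}.example")
    for user_id in random.choices(range(1000), k=20_000)
]

def compressed_size(rows):
    """Compressed size in bytes of the site column, in the given row order."""
    column = "\n".join(site for _, site in rows).encode("utf-8")
    return len(zlib.compress(column))

unsorted_size = compressed_size(rows)
sorted_size = compressed_size(sorted(rows))  # sorted by user id

print(sorted_size < unsorted_size)  # True
```

Sorting by user id groups identical site values into long runs, which a general-purpose compressor handles much better than the same values in random order.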

OK, thank you very much!

Source: www.habr.com
