ClickHouse for advanced users in questions and answers

In April, Avito engineers met online with Alexey Milovidov, the lead developer of ClickHouse, and Kirill Shvakov, a Golang developer from Integros. We discussed how we use the database management system and what difficulties we run into.

Based on the meetup, we put together an article with the experts' answers to our questions and the audience's about backups, data resharding, external dictionaries, the Golang driver, and upgrading ClickHouse versions. It may be useful to developers who already work actively with the Yandex DBMS and are interested in its present and future. Unless stated otherwise, the answers are by Alexey Milovidov.

Heads up: there is a lot of text under the cut. We hope the list of questions helps you navigate.

If you would rather not read the text, you can watch the recording of the meetup on our YouTube channel. Timecodes are in the first comment under the video.

ClickHouse is updated constantly, but our data is not. What should we do about it?

ClickHouse is updated constantly, while our data, which was processed with OPTIMIZE FINAL, is no longer updated and sits in a backup copy.

Suppose we had an incident and lost data. We decided to restore it, and it turned out that the old partitions stored on the backup servers differ greatly from the ClickHouse version currently in use. What should we do in such a situation, and is it even possible?

A situation where you restore data from a backup in an old format and it fails to attach to a new version is impossible. We make sure the data format in ClickHouse always stays backward compatible. This matters far more than backward compatibility of functionality, where the behavior of some rarely used function may have changed. A new version of ClickHouse must always be able to read the data stored on disk. That is the law.

What are the current best practices for backing up data from ClickHouse?

How should we make backups, given that we run OPTIMIZE FINAL operations, have a huge multi-terabyte database, and have data that is updated for, say, the last three days, after which nothing happens to it?

We could build our own solution and write it in bash: collect these backups in such-and-such a way. Or maybe there is no need to hack anything together, and the wheel was invented long ago?

Let's start with best practices. Whenever backups come up, my colleagues always advise reminding people about the Yandex.Cloud service, where this problem is already solved. So use it if you can.

There is no complete backup solution built into ClickHouse one hundred percent. There are some building blocks you can use. To get a complete solution, you will have to either tinker a little by hand or write wrappers in the form of scripts.

I will start with the simplest solutions and finish with the most sophisticated ones, depending on the data volume and the size of the cluster. The larger the cluster, the more complex the solution becomes.

If the table with the data takes up only a few gigabytes, a backup can be made like this:

  1. Save the table definition, that is, the metadata: SHOW CREATE TABLE.
  2. Make a dump with the ClickHouse client: SELECT * FROM table, written to a file. By default you get a file in the TabSeparated format. If you want it to be more efficient, use the Native format.
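
The two steps above can be sketched as a small Python helper that only builds the shell commands. This is an illustration under assumptions, not a finished backup tool: the database, table, and output names are hypothetical, and the TabSeparatedRaw format for the table definition is my choice so that the DDL is written without escaped newlines.

```python
import shlex


def logical_backup_commands(database: str, table: str, out_prefix: str) -> list[str]:
    """Return shell commands for a logical backup of one small table.

    The first command saves the table definition, the second dumps the rows
    in the Native format. Nothing is executed here; the caller decides how
    and where to run the commands.
    """
    full_name = f"{database}.{table}"
    schema_query = f"SHOW CREATE TABLE {full_name} FORMAT TabSeparatedRaw"
    dump_query = f"SELECT * FROM {full_name} FORMAT Native"
    return [
        f"clickhouse-client --query {shlex.quote(schema_query)}"
        f" > {shlex.quote(out_prefix + '.sql')}",
        f"clickhouse-client --query {shlex.quote(dump_query)}"
        f" > {shlex.quote(out_prefix + '.native')}",
    ]


# Hypothetical names, used only for illustration.
for command in logical_backup_commands("default", "hits", "backup/hits"):
    print(command)
```

Dropping the FORMAT Native clause from the second query gives the default TabSeparated output mentioned above.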

If the amount of data is large, the backup will take more time and a lot of space. This is called a logical backup; it is not tied to the ClickHouse data format. If it comes to that, as a last resort you can take such a backup and load it into MySQL to recover.

For more advanced cases, ClickHouse has a built-in ability to create a snapshot of partitions on the local file system. This feature is available as the ALTER TABLE ... FREEZE PARTITION query. Or simply ALTER TABLE ... FREEZE, which is a snapshot of the whole table.

The snapshot is created consistently for one table on one shard, which means it is impossible to create a consistent snapshot of the whole cluster this way. But for most tasks there is no such need, and it is enough to run the query on each shard and get a consistent snapshot there. It is created as hardlinks and therefore takes no extra space. Next, you copy this snapshot to a backup server or to the storage you use for backups.

Restoring such a backup is quite easy. First, create the tables from the saved table definitions. Next, copy the saved partition snapshots into the detached directory of these tables and run the ATTACH PARTITION query. This solution is suitable even for the most serious data volumes.
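
As a rough sketch of the snapshot-and-restore flow above, the helper below only generates the SQL text for each step. The table name and partition ID are hypothetical, and addressing partitions through PARTITION ID is an assumption made here to keep the quoting simple; copying the frozen files to backup storage and back into the detached directory happens outside of SQL and is not shown.

```python
from typing import Optional


def quote_literal(value: str) -> str:
    """Quote a value as a single-quoted SQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def freeze_statement(table: str, partition_id: Optional[str] = None) -> str:
    """Snapshot one partition, or the whole table when no partition is given."""
    if partition_id is None:
        return f"ALTER TABLE {table} FREEZE"
    return f"ALTER TABLE {table} FREEZE PARTITION ID {quote_literal(partition_id)}"


def attach_statement(table: str, partition_id: str) -> str:
    """Attach a partition whose parts were copied into the detached directory."""
    return f"ALTER TABLE {table} ATTACH PARTITION ID {quote_literal(partition_id)}"


# Hypothetical table and partition, used only for illustration.
print(freeze_statement("default.hits", "201701"))
print(freeze_statement("default.hits"))
print(attach_statement("default.hits", "201701"))
```

Because a snapshot is consistent only within one shard, the freeze statement would be run on every shard separately.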

Sometimes you need something even cooler, for cases where you have tens or even hundreds of terabytes per server and hundreds of servers. There is a solution here that I picked up from colleagues at Yandex.Metrica. I would not recommend it to everyone: read it and decide for yourself whether it suits you.

First you need to set up several servers with large disk shelves. Next, on these servers, bring up several ClickHouse servers and configure them to work as one more replica for the same shards. Then use a file system or some tool on these servers that lets you create snapshots. There are two options here. The first is LVM snapshots, the second is ZFS on Linux.

After that, you need to create a snapshot every day; it will sit there and take up some space. Naturally, if the data changes, the amount of space will grow over time. This snapshot can be pulled out at any moment to restore the data. A strange solution, but there it is. In addition, you need to restrict these replicas in the config so that they do not try to become leaders.

Will it be possible to configure a controlled replica lag in shadows?

This year you plan to make shadows in ClickHouse. Will it be possible to configure a controlled replica lag in them? We would like to use it to protect ourselves from bad scenarios with ALTERs and other changes.

Is it possible to do some kind of rollback for ALTERs? For example, in an existing shadow, to say that changes are applied up to this moment, and from this moment on you stop applying changes?

If a command reached our cluster and broke it, then we have a conditional replica with an hour of lag, where we can say: let's use it for now, but we will not apply the changes from the last ten minutes to it?

First, about controlled replica lag. There was such a request from users, and we created an issue on GitHub asking: "If someone needs this, put a heart on it." Nobody did, and the issue was closed. However, you can already get this capability by configuring ClickHouse. Admittedly, only starting with version 20.3.

ClickHouse constantly merges data in the background. When a merge completes, a certain set of data parts is replaced by one larger part. At the same time, the data parts that existed before stay on disk for some time.

First, they are kept as long as there are SELECT queries that use them, to provide non-blocking operation. SELECT queries read calmly from the old parts.

Second, there is also a time threshold: old data parts stay on disk for eight minutes. These eight minutes can be configured and even turned into one day. This will cost disk space: depending on the data flow, over the last day the data may not just double but grow five times. But if there is a serious problem, you can stop the ClickHouse server and sort everything out.

Now the question is how this protects against ALTERs. It is worth looking deeper here, because in older versions of ClickHouse, ALTER worked by changing the parts directly. There is a data part with some files, and we run, for example, ALTER ... DROP COLUMN. Then this column is physically removed from all parts.

But starting with version 20.3, the ALTER mechanism was completely changed, and now data parts are always immutable. They do not change at all: ALTERs now work in much the same way as merges. Instead of changing a part in place, we create a new one. In the new part, files that have not changed become hardlinks, and if we dropped a column, it is simply absent from the new part. The old part is deleted by default after eight minutes, and here you can tweak the settings mentioned above.
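
The hardlink idea can be shown with a toy model in Python. This is not ClickHouse code and does not reproduce its real on-disk layout: the directory names, the one-file-per-column naming, and the helper names are all assumptions made for illustration.

```python
import os
import tempfile


def rewrite_part_without_column(old_part: str, new_part: str, dropped: str) -> None:
    """Build new_part from old_part, hardlinking every file except the dropped column's."""
    os.mkdir(new_part)
    for name in sorted(os.listdir(old_part)):
        column = name.split(".", 1)[0]
        if column == dropped:
            continue  # the dropped column is simply absent from the new part
        # Unchanged files are shared with the old part, so no data is copied.
        os.link(os.path.join(old_part, name), os.path.join(new_part, name))


def demo() -> dict:
    """Create a toy part with two column files, drop one column, report the result."""
    with tempfile.TemporaryDirectory() as root:
        old_part = os.path.join(root, "part_old")
        new_part = os.path.join(root, "part_new")
        os.mkdir(old_part)
        for name in ("user_id.bin", "url.bin"):
            with open(os.path.join(old_part, name), "w") as handle:
                handle.write("column data")
        rewrite_part_without_column(old_part, new_part, dropped="url")
        return {
            "new_files": sorted(os.listdir(new_part)),
            "old_files": sorted(os.listdir(old_part)),
            "shared": os.path.samefile(
                os.path.join(old_part, "user_id.bin"),
                os.path.join(new_part, "user_id.bin"),
            ),
        }


print(demo())
```

The old directory stays intact after the rewrite, which is what makes it possible to go back to it until it is cleaned up.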

The same applies to ALTERs such as mutations. When you run ALTER ... DELETE or ALTER ... UPDATE, it does not change the part but creates a new one. And then it deletes the old one.

What if the table structure has changed?

How do you restore a backup that was made with the old schema? And the second question is about the case with snapshots and file system tools. Is Btrfs suitable here instead of ZFS on Linux or LVM?

If you run ATTACH PARTITION on partitions with a different structure, ClickHouse will tell you that this is not possible. The solution is this. First, create a temporary MergeTree table with the old structure, attach the data there with ATTACH, and run an ALTER query. Then you can either copy or move this data and attach it again, or use the ALTER TABLE ... MOVE PARTITION query.

Now the second question, whether Btrfs can be used. To begin with, if you have LVM, LVM snapshots are enough, and the file system can be ext4, it does not matter. With Btrfs, everything depends on your experience of operating it. It is a mature file system, but there are still some doubts about how everything will work out in practice in a particular scenario. I would not recommend using it unless you already have Btrfs in production.

What are the current best practices for data resharding?

The question of resharding is complex and multifaceted. There are several possible answers here. You can come at it from one side and say that ClickHouse has no built-in resharding feature. But I am afraid this answer will not suit anyone. So you can come at it from the other side and say that ClickHouse has many ways to reshard data.

If the cluster runs out of space or cannot cope with the load, you add new servers. But these servers are empty by default: there is no data on them and no load. You need to rearrange the data so that it is spread evenly across the new, larger cluster.

The first way to do this is to copy some of the partitions to the new servers with the ALTER TABLE ... FETCH PARTITION query. For example, you had partitions by month, and you take the first month of 2017 and copy it to a new server, then copy the third month to another new server. And you keep doing this until it becomes more or less even.

The transfer can be done only for partitions that do not change while data is being written. For fresh partitions, writes will have to be disabled, because their transfer is not atomic. Otherwise you will end up with duplicates or gaps in the data. Still, this method is practical and works quite efficiently. Ready-made compressed partitions are sent over the network, that is, the data is not recompressed or re-encoded.

This method has one drawback, and it depends on the sharding scheme: whether you were tied to this sharding scheme, and what sharding key you had. In your example with the metrics case, the sharding key is a hash of the path. When you select from a Distributed table, it goes to all the shards in the cluster at once and collects the data from there.

This means that it does not really matter to you which data ended up on which shard. The main thing is that the data for one path ends up on one shard, and which one is not important. In this case, transferring ready-made partitions is perfectly fine, because with SELECT queries you will still get the complete data, whether before or after resharding; the scheme does not really matter.

But there are more complex cases. Suppose at the application logic level you rely on a special sharding scheme, where this client lives on such-and-such a shard and a query can be sent there directly rather than to the Distributed table. Or you use a fairly recent version of ClickHouse and have enabled the optimize_skip_unused_shards setting. In this case, during a SELECT query, the expression in the WHERE clause is analyzed and ClickHouse calculates which shards need to be queried according to the sharding scheme. This works provided the data is laid out exactly according to this sharding scheme. If you rearranged it by hand, the correspondence may change.

So that is way number one. And I am waiting for your answer: does the method suit you, or shall we move on?

Vladimir Kolobaev, lead system administrator at Avito: Alexey, the method you mentioned does not work very well when you need to spread the load, including reads. We can take a monthly partition and move the previous month to another node, but when a query comes for this data, we will load only that node. But we would like to load the whole cluster, because otherwise, for some time, the entire read load will be handled by two shards.

Alexey Milovidov: The answer here is strange: yes, it is bad, but it might work. I will explain exactly how. It is worth looking at the load scenario that follows your data. If this is monitoring data, then we can say almost for sure that the vast majority of queries are for fresh data.

You installed new servers and migrated old partitions, but you also changed how fresh data is written. And fresh data will be spread across the whole cluster. So after just five minutes, queries for the last five minutes will load the cluster evenly; after a day, queries for the last 24 hours will load the cluster evenly. And queries for the previous month, unfortunately, will go only to part of the cluster's servers.

But often you will not have queries specifically for February 2019. Most likely, if queries go into 2019, they will cover the whole of 2019: a long time interval rather than some narrow range. And such queries will also be able to load the cluster evenly. But in general, your remark is quite correct that this is an ad hoc solution that does not spread the data completely evenly.

I have a few more points in answer to the question. One of them is about how to design the sharding scheme up front so that resharding causes less pain. This is not always possible.

For example, you have monitoring data. Monitoring data grows for three reasons. The first is the accumulation of historical data. The second is traffic growth. And the third is an increase in the number of things that are monitored. New microservices and metrics appear that need to be stored.

It is possible that of these, the largest growth is tied to the third reason, the increased use of monitoring. And in this case, it is worth looking at the nature of the load and at what the main SELECT queries are. The main SELECT queries will most likely be based on some subset of metrics.

For example, CPU usage on some servers by some service. It turns out that there is a certain subset of keys by which you fetch this data. And the query for this data is most likely fairly simple and completes in tens of milliseconds. It is used for monitoring services and dashboards. I hope I understand this correctly.

Vladimir Kolobaev: The fact is that we query historical data very often, because we compare the current state with the historical one in real time. And it is important for us to have fast access to a large amount of data, and ClickHouse does an excellent job with this.

You are absolutely right, most of our read queries hit the last day, as in any monitoring system. But at the same time, the load on historical data is also quite large. It mainly comes from the alerting system, which goes to ClickHouse every thirty seconds and says: "Give me the data for the last six weeks. Now build me some moving average from it, and let's compare the current value with the historical one."

I should say that for such very fresh queries we have another small table where we store only two days of data, and the main queries go to it. We send only large historical queries to the big sharded table.

Alexey Milovidov: Unfortunately, it turns out to fit your scenario poorly, but I will describe two bad and complicated sharding schemes that you do not need to use, but which are used in my friends' service.

There is a main cluster with Yandex.Metrica events. Events are page views, clicks, and conversions. Most queries go to a specific website. You open the Yandex.Metrica service, you have a website, avito.ru, you go to a report, and a query is made for your website.

But there are other queries, analytical and global, which are made by internal analysts. Just in case, I will note that internal analysts make queries only for Yandex services. Nevertheless, even Yandex services take up a significant share of all the data. These are queries not for specific counters but for broader filtering.

How do you organize the data so that everything works efficiently both for a single counter and for global queries? Another difficulty is that the number of queries to ClickHouse for the Metrica cluster is several thousand per second. At the same time, a single ClickHouse server cannot handle non-trivial queries at a rate of, say, several thousand per second.

The cluster size is six hundred-odd servers. If you simply stretch a Distributed table over this cluster and send several thousand queries there, it will be even worse than sending them to a single server. On the other hand, the option where the data is spread evenly and we go and query all the servers is dismissed right away.

There is a diametrically opposite option. Imagine that we shard the data by site, and a query for one site goes to one shard. Now the cluster can handle ten thousand queries per second, but on a single shard any one query will run too slowly. It will no longer scale in throughput. Especially if the site is avito.ru. I will not be giving away a secret if I say that Avito is one of the most visited sites in RuNet. And processing it on a single shard would be madness.

So the sharding scheme is arranged in a more cunning way. The whole cluster is split into a number of clusters, which we call layers. Each of these clusters contains from about a dozen to a few dozen shards. There are thirty-nine such clusters in total.

How does all this scale? The number of clusters does not change: it was thirty-nine a few years ago, and it still is. But inside each of them, we gradually increase the number of shards as data accumulates. And the sharding scheme as a whole looks like this: the clusters are divided by website, and to find out which website is on which cluster, a separate metabase in MySQL is used. One site lives on one cluster. And inside the cluster, sharding is done by visitor ID.

When writing, we split the data by the remainder of dividing the visitor ID. But when a new shard is added, the sharding scheme changes; we keep splitting, but by the remainder of dividing by a different number. This means that one visitor already ends up on several servers, and you cannot rely on visitor locality. This is done solely so that the data compresses better. And when we run queries, we go to the Distributed table, which looks at the cluster and hits dozens of servers. It is a rather silly scheme.
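
A small Python sketch shows why one visitor ends up on several servers after a shard is added. It assumes the simplest possible rule, the visitor ID modulo the number of shards; the real scheme may hash the ID first, and the shard counts below are hypothetical.

```python
def shard_for(visitor_id: int, shard_count: int) -> int:
    """Pick a shard by the remainder of dividing the visitor ID by the shard count."""
    return visitor_id % shard_count


def moved_fraction(visitor_ids, old_count: int, new_count: int) -> float:
    """Share of visitors whose new data lands on a different shard after the change."""
    ids = list(visitor_ids)
    moved = sum(
        1 for v in ids if shard_for(v, old_count) != shard_for(v, new_count)
    )
    return moved / len(ids)


# Visitor 25 was written to shard 5 of 10 and goes to shard 3 once there are 11.
print(shard_for(25, 10), shard_for(25, 11))
# Going from 10 to 11 shards redirects most visitors to a different shard.
print(moved_fraction(range(1000), 10, 11))
```

Old rows stay where they were written, so after the change a visitor's history is split between the old shard and the new one.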

But my story would be incomplete if I did not say that we abandoned this scheme. In the new scheme, we changed everything and copied all the data using clickhouse-copier.

In the new scheme, all sites are divided into two categories, large and small. I do not know how the threshold was chosen, but the result is that large sites are written to one cluster, which has 120 shards with three replicas each, that is, 360 servers. And the sharding scheme is such that any query goes to all the shards at once. If you now open any report page for avito.ru in Yandex.Metrica, the query will go to 120 servers. There are few large sites in RuNet. And the queries number not a thousand per second but fewer than a hundred. All of this is quietly chewed through by the Distributed table, with each query processed by 120 servers.

And the second cluster is for small sites. Here the sharding scheme is based on the site ID, and each query goes to exactly one shard.

ClickHouse has a clickhouse-copier utility. Can you tell us about it?

I will say right away that this solution is more cumbersome and somewhat less efficient. The advantage is that it spreads the data completely according to the scheme you specify. But the drawback of the utility is that it does not reshard in place at all. It copies data from one cluster schema to another cluster schema.

This means that for it to work you need two clusters. They can be located on the same servers, but even so, the data will not be moved incrementally; it will be copied.

For example, there were four servers, and now there are eight. You create a new Distributed table on all the servers and new local tables, and you launch clickhouse-copier, giving it a work scheme that tells it to read from there, apply the new sharding scheme, and transfer the data there. On the old servers you will need one and a half times more space than you have now, because the old data must remain on them, and half of that same old data will arrive on top of it. If you thought in advance that the data would need to be resharded and the space is there, then this method is suitable.
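
The space estimate for the four-to-eight example can be checked with a few lines of Python. The sketch assumes that the old servers hold equal amounts of data, that the new scheme spreads the data evenly over all servers, and that the old copy is kept until the copy finishes; the function name is hypothetical.

```python
from fractions import Fraction


def peak_space_factor(old_servers: int, total_servers: int) -> Fraction:
    """Peak disk usage on an old server relative to what it stores today.

    Each old server keeps its current data (a factor of 1) and also receives
    its even share of the re-sharded copy: old_servers / total_servers.
    """
    incoming_share = Fraction(old_servers, total_servers)
    return 1 + incoming_share


# Four old servers, eight servers in the new scheme.
print(peak_space_factor(4, 8))  # 3/2
```

Under the same assumptions the new, empty servers only need room for their share of the copy, which is half of what an old server stores today.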

How does clickhouse-copier work inside? It splits all the work into a set of tasks, each processing one partition of one table on one shard. All these tasks can run in parallel, and clickhouse-copier can be run on different machines in several instances, but what it does for a single partition is nothing more than an INSERT SELECT. The data is read, decompressed, repartitioned, then compressed again, written somewhere, and re-sorted. It is a heavier solution.

You had a pilot feature called resharding. What happened to it?

Back in 2017, you had a pilot feature called resharding. There is even an option for it in ClickHouse. As I understand it, it never took off. Can you tell us why this happened? It seems very relevant.

The whole problem is that if data has to be resharded in place, very complex synchronization is needed to do it atomically. When we started looking at how this synchronization works, it became clear that there were fundamental problems. And these fundamental problems were not just theoretical; they immediately started showing up in practice in a form that can be explained very simply: nothing works.

Is it possible to merge all data parts together before moving them to slow disks?

This is a question about TTL with the move-to-slow-disk option in the context of merges. Is there a way, other than cron, to merge all parts into one before moving them to slow disks?

The answer to the question of whether you can somehow automatically glue all the parts into one before moving them is no. I do not think it is needed. You do not have to merge all the parts into one; just rely on the fact that they will be moved to slow disks automatically.

We have two criteria for move rules. The first is as the storage fills up. If the current storage tier has less than a certain percentage of free space, we pick one part and move it to slower storage. Or rather, not slower but the next one, however you configure it.

The second criterion is size. It concerns moving large parts. You can adjust the threshold based on the free space on the fast disk, and the data will be moved automatically.

How do you migrate to new versions of ClickHouse if there is no way to check compatibility in advance?

This topic is regularly discussed in the ClickHouse Telegram chat with respect to different versions, and still is. How safe is it to upgrade from version 19.11 to 19.16 and, for example, from 19.16 to 20.3? What is the best way to move to new versions without being able to check compatibility in a sandbox in advance?

There are several "golden" rules here. The first is to read the changelog. It is large, but it has separate sections about backward-incompatible changes. Do not treat these items as a red flag. They are usually minor incompatibilities involving some edge functionality that you most likely do not use.

Second, if there is no way to check compatibility in a sandbox and you want to upgrade straight in production, the recommendation is not to do that. First create a sandbox and test. If you have no test environment, you most likely do not have a very large company, which means you can copy some of the data to your laptop and make sure everything works correctly on it. You can even bring up several replicas locally on your machine. Or you can bring up the new version somewhere nearby and load some of the data there, that is, create an improvised test environment.

Another rule is not to upgrade within a week after a version is released, because bugs get caught in production and quick fixes follow. Let's go through the numbering of ClickHouse versions so that we do not get confused.

Take version 20.3.4. The number 20 is the year of release, 2020. From the standpoint of what is inside, this does not matter, so we will not pay attention to it. Next comes 20.3. We increase the second number, in this case 3, every time we ship a release with some new functionality. If we want to add a feature to ClickHouse, we must increase this number. That is, in version 20.4 ClickHouse will work even better. The third number is the 4 in 20.3.4. Here 4 is the number of patch releases in which we did not add new features but fixed some bugs. And 4 means that we have done this four times.
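
The numbering described above can be captured in a small Python helper. This is a sketch under assumptions: the type and function names are hypothetical, and only the first three components are read, which is enough for the year, feature, and patch meaning explained here.

```python
from typing import NamedTuple


class ClickHouseVersion(NamedTuple):
    year: int     # release year, for example 20 for 2020
    feature: int  # increased with every release that adds functionality
    patch: int    # increased with every bug-fix release of that feature line


def parse_version(text: str) -> ClickHouseVersion:
    """Parse the first three dot-separated components of a version string."""
    year, feature, patch = (int(part) for part in text.split(".")[:3])
    return ClickHouseVersion(year, feature, patch)


def is_patch_only_upgrade(current: str, candidate: str) -> bool:
    """True when the candidate only adds bug fixes on the same feature release."""
    old, new = parse_version(current), parse_version(candidate)
    return (old.year, old.feature) == (new.year, new.feature) and new.patch > old.patch


print(parse_version("20.3.4"))
print(is_patch_only_upgrade("20.3.4", "20.3.5"))  # True
print(is_patch_only_upgrade("20.3.4", "20.4.1"))  # False
```

A check like this could help separate low-risk patch upgrades from feature upgrades that deserve a run in the test environment first.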

Do not think that this is something terrible. Usually a user can install the latest version and it will run without any problems with a year of uptime. But imagine that in some function for processing bitmaps, added by our Chinese colleagues, the server crashes when invalid arguments are passed. We are obliged to fix this. We will release a new patch version, and ClickHouse will become more stable.

If you have ClickHouse running in production and a new version of ClickHouse comes out with additional features, for example 20.4.1, the very first one, do not rush to put it into production on the first day. Why would you need it at all? If you are not using ClickHouse yet, you can install it, and most likely everything will be fine. But if ClickHouse is already running stably, keep an eye on the patches and updates to see which problems we are fixing.

Kirill Shvakov: I would like to add a little about test environments. Everyone is very afraid of test environments and for some reason believes that if you have a very large ClickHouse cluster, the test environment must be no smaller, or at most ten times smaller. That is not the case at all.

I can tell you from my own example. I have a project, and it uses ClickHouse. Our test environment for it is a small virtual machine at Hetzner for twenty euros, where absolutely everything is deployed. To do this, we have full automation in Ansible, so in principle it makes no difference where we roll out: to hardware servers or simply to virtual machines.

What can be done? It would be nice to have an example in the ClickHouse documentation of how to deploy a small cluster on your own: in Docker, in LXC, perhaps with an Ansible playbook, because different people have different deployments. This would simplify a lot. When you can deploy a cluster in five minutes, it is much easier to try to figure something out. It is much more convenient this way, because rolling a version you have not tested into production is a road to nowhere. Sometimes it works and sometimes it does not. So hoping for success is a bad plan.

Maxim Kotyakov, senior engineer at Avito: I will add a little about test environments from the series of problems that large companies face. We have a full-fledged ClickHouse acceptance cluster; in terms of data schemas and settings, it is an exact copy of what is in production. This cluster is deployed in fairly modest containers with a minimum of resources. We write a certain percentage of production data there; fortunately, it is possible to replicate the stream in Kafka. Everything there is synchronized and scaled, both in capacity and in flow, and in theory, all other things being equal, it should behave like production in terms of metrics. Everything potentially explosive is first rolled out to this environment and left to soak there for several days until it is ready. But naturally, this solution is expensive, heavy, and has non-zero support costs.

Alexey Milovidov: I will tell you what the test environment of our friends from Yandex.Metrica looks like. One cluster had 600-odd servers, another had 360, and there is a third cluster and several more. The test environment for one of them is simply two shards with two replicas each. Why two shards? So that there is more than one. And there should be replicas as well. It is just the minimum amount that you can afford.

This test environment lets you check that your queries work and that nothing major is broken. But problems often arise of a completely different nature, when everything works but there are small changes in the load.

Ake ngenze isibonelo. Sinqume ukufaka inguqulo entsha ye-ClickHouse. Ithunyelwe endaweni yokuhlola, ukuhlola okuzenzakalelayo kuqediwe ku-Yandex.Metrica ngokwayo, eqhathanisa idatha yenguqulo endala kanye nentsha, esebenzisa lonke umzila. Futhi-ke, izivivinyo eziluhlaza zeCI yethu. Ngaphandle kwalokho besingeke size siyiphakamise le nguqulo.

Everything is fine. We start rolling out to production. I get a message that the load on the graphs has grown several times. We roll the version back. I look at the graph and see that the load really did grow several times during the rollout and dropped back once the rollout was finished. Then we started rolling the version back. And the load grew in exactly the same way and dropped back in exactly the same way. So the conclusion is this: the load grew because of the deployment itself, nothing surprising.

After that it was hard to convince colleagues to install the new version after all. I said: "It's fine, roll it out. Keep your fingers crossed, everything will work. The load on the graphs has gone up now, but everything is fine. Hang in there." In the end we did just that, and the version was released to production. But similar problems come up with almost every rollout.

Kill query is supposed to kill queries, but it doesn't. Why?

A user, some kind of analyst, came to me and ran a query that brought down my ClickHouse cluster. A single node or the whole cluster, depending on which replica or shard the query landed on. I see that all CPU resources on this server are maxed out, everything is red. At the same time, ClickHouse itself responds to queries. So I write: "Please show me the process list, which query caused this madness."

I find this query and issue a kill for it. And I see that nothing happens. My server is maxed out, ClickHouse still answers my commands and shows that the server is alive and everything is great. But all user queries are degraded, writes to ClickHouse start to degrade too, and my kill query does not work. Why? I thought kill query was supposed to kill queries, but it doesn't.

Now there will be a rather strange answer. The point is that kill query does not kill queries.

Kill query sets a small flag that says "I want this query to be killed." The query itself looks at this flag while processing each block. If the flag is set, the query stops working. So nobody actually kills the query; it has to check the flag itself and stop. And this should work in all cases where the query is in the state of processing data blocks. It will process the next block of data, check the flag, and stop.
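
The flag mechanism described above can be sketched in a few lines of Python. This is only an illustration of cooperative cancellation, not ClickHouse code; the names `run_query`, `cancel_flag` and `blocks_with_kill` are hypothetical, and the "work" on a block is a stand-in.

```python
import threading

def run_query(blocks, cancel_flag):
    """Process blocks one by one and check the cancel flag after each block."""
    processed = 0
    for block in blocks:
        sum(block)  # stand-in for the real work done on one block
        processed += 1
        if cancel_flag.is_set():
            break  # the query stops itself; nobody kills it from outside
    return processed

cancel_flag = threading.Event()  # hypothetical stand-in for the "kill me" flag
blocks = [[1, 2, 3]] * 5

def blocks_with_kill():
    for i, block in enumerate(blocks):
        if i == 2:
            cancel_flag.set()  # the kill request arrives while block 2 is handled
        yield block

processed = run_query(blocks_with_kill(), cancel_flag)
print(processed)
```

The query finishes the block it is working on and only then notices the flag, which is why a query stuck inside one long operation never gets to the check.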

This does not work in cases where the query is blocked on some operation. True, that is most likely not your case, because, according to you, it uses a ton of server resources. It is possible that this does not work in the case of external sorting and in a few other details. But in general this should not happen; it is a bug. And the only thing I can recommend is to update ClickHouse.

How do you estimate response time under read load?

There is a table that stores item aggregates, that is, various counters. The number of rows is about one hundred million. Can we count on a predictable response time if we send 1K RPS for 1K items?

Judging by the context, we are talking about read load, because there are no problems with writes: a thousand, a hundred thousand, and sometimes several million rows can be inserted.

Read queries differ a lot. On select 1, ClickHouse can perform on the order of tens of thousands of queries per second, so even queries on a single key will already need some resources. And such point queries will be harder than in some key-value databases, because every read requires reading a block of data by index. Our index addresses not each record but each range. That is, you have to read the whole range, which is 8192 rows by default. And you have to decompress a compressed block of data of 64 KB to 1 MB. Typically such point queries take a few milliseconds. But this is the simplest case.

Let's try some simple arithmetic. If you multiply a few milliseconds by a thousand, you get a few seconds. It looks as if it is impossible to keep up with a thousand queries per second, but in fact it is possible, because we have several processor cores. So, in principle, ClickHouse can sometimes hold 1000 RPS, but only on short point queries.

If you need to scale a ClickHouse cluster by the number of simple queries, I recommend the simplest thing: increase the number of replicas and send queries to a random replica. If one replica holds five hundred queries per second, which is entirely realistic, then three replicas will hold one and a half thousand.
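
The arithmetic above can be written down as a small back-of-the-envelope calculation. It is a rough sketch under stated assumptions: 3 ms stands in for "a few milliseconds", and the function names are made up for illustration.

```python
import math

def busy_cores(rps, latency_ms):
    """CPU-seconds of work arriving per second, i.e. cores kept busy on average."""
    return rps * latency_ms / 1000.0

def replicas_needed(target_rps, rps_per_replica):
    """Smallest number of replicas that covers the target request rate."""
    return math.ceil(target_rps / rps_per_replica)

# 3 ms per point query is an assumed value for "a few milliseconds".
cores = busy_cores(rps=1000, latency_ms=3)
replicas = replicas_needed(target_rps=1000, rps_per_replica=500)
three_replica_rps = 3 * 500  # three replicas at 500 RPS each

print(cores, replicas, three_replica_rps)
```

With these numbers a thousand point queries per second keep about three cores busy, which is why the load is feasible on a multi-core server even though the latencies add up to seconds.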

Sometimes, of course, you can tune ClickHouse for the maximum number of point reads. What is needed for that? First, reduce the index granularity. It should not be reduced to one; instead, aim for the number of entries in the index to be several million or tens of millions per server. If the table has a hundred million rows, the granularity can be set to 64.
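
To see where the figure of 64 comes from, here is a minimal calculation of how many index entries a table gets for a given granularity. It assumes one index entry per range of `granularity` rows, as described above; the function name is illustrative.

```python
import math

def index_entries(rows, granularity):
    """One index entry per range of `granularity` rows."""
    return math.ceil(rows / granularity)

rows = 100_000_000
default_entries = index_entries(rows, 8192)  # default granularity from the text
fine_entries = index_entries(rows, 64)       # granularity suggested for point reads

print(default_entries, fine_entries)
```

For a hundred million rows the default granularity gives about twelve thousand entries, while 64 gives about one and a half million, which is at the lower end of the "several million" target.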

You can also reduce the compressed block size. There are settings for this: min_compress_block_size and max_compress_block_size. They can be reduced, the data reloaded, and then point queries will be faster. But still, ClickHouse is not a key-value database. A large number of small queries is a load antipattern.

Kirill Shvakov: I'll give some advice in case those are ordinary counters. It is a fairly standard situation when ClickHouse stores some kind of counter. I have a user, he is from such-and-such a country, plus some third field, and I need to increment something. Take MySQL, make a unique key (in MySQL it is ON DUPLICATE KEY, in PostgreSQL it is ON CONFLICT) and add a plus sign. That will work much better.

When you do not have much data, there is little point in using ClickHouse. There are ordinary databases, and they do this well.

What can I tune in ClickHouse so that more data is in the cache?

Let's imagine a situation: the servers have 256 GB of RAM, in its daily routine ClickHouse takes about 60-80 GB, and at peak up to 130. What can be enabled and tuned so that more data is in the cache and, accordingly, there are fewer trips to disk?

As a rule, the operating system's page cache does a good job here. If you simply open top and look at cached or free (it also says how much is cached), you will see that all free memory is used for the cache. And when this data is read, it will be read not from disk but from RAM. I can also say that the cache is used efficiently, because it is the compressed data that is cached.

However, if you want to speed up some simple queries even more, you can enable a cache of decompressed data inside ClickHouse. It is called the uncompressed cache. In the config.xml configuration file, set the uncompressed cache size to the value you need. I recommend no more than half of the free RAM, because the rest will go to the page cache.

In addition, there are two query-level settings. The first one, use uncompressed cache, turns its use on. It is recommended to enable it for all queries except heavy ones, which can read all the data and flush this cache. And the second setting is something like the maximum number of rows for using the cache. It automatically restricts large queries so that they bypass the cache.
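
The sizing advice and the row threshold described above can be sketched roughly as follows. This is not ClickHouse code: both function names, the threshold value and the exact comparison are assumptions made for illustration, and the RAM figures are the ones from the question.

```python
def recommended_uncompressed_cache_gb(total_ram_gb, peak_usage_gb):
    """Half of the RAM that stays free at peak; the rest is left to the page cache."""
    free_gb = total_ram_gb - peak_usage_gb
    return free_gb // 2

def query_uses_cache(cache_enabled, rows_to_read, max_rows_to_use_cache):
    """Large reads bypass the cache so that they do not flush it (assumed rule)."""
    return cache_enabled and rows_to_read <= max_rows_to_use_cache

cache_gb = recommended_uncompressed_cache_gb(total_ram_gb=256, peak_usage_gb=130)

# 1_000_000 is a hypothetical threshold, not a documented default.
small_query = query_uses_cache(True, rows_to_read=50_000, max_rows_to_use_cache=1_000_000)
heavy_query = query_uses_cache(True, rows_to_read=5_000_000_000, max_rows_to_use_cache=1_000_000)

print(cache_gb, small_query, heavy_query)
```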

How can I configure storage_configuration for storage in RAM?

In the new ClickHouse documentation I read the section on data storage. The description contains an example with a fast SSD.

I wonder how the same thing can be configured with a hot volume in memory. And one more question. How does select work with this data organization: will it read the whole set or only the part that is on disk, and is this data compressed in memory? And how does the prewhere section work with such a data organization?

This setting affects where data parts are stored, and their format does not change in any way.
Let's take a closer look.

You can configure data storage in RAM. All that is configured for a disk is its path. You create a tmpfs partition that is mounted at some path in the file system. You specify this path as the data path for the hottest partition, data parts start arriving and being written there, and everything is fine.

But I do not recommend doing this because of low reliability, although if you have at least three replicas in different data centers, then it is possible. If something happens, the data will be restored. Imagine that the server was suddenly powered off and turned back on. The partition was mounted again, but there is nothing on it. When the ClickHouse server starts, it sees that it does not have these parts, although according to the ZooKeeper metadata they should be there. It looks at which replicas have them, requests them, and downloads them. This way the data will be restored.
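
The recovery step described above boils down to a set difference plus a lookup. The sketch below is a simplified model and not the actual replication code; the function name, the replica names and the part names are hypothetical.

```python
def parts_to_fetch(expected_parts, local_parts, replica_parts):
    """Map each part missing locally to the first replica that still has it."""
    plan = {}
    for part in sorted(expected_parts - local_parts):
        for replica, parts in replica_parts.items():
            if part in parts:
                plan[part] = replica
                break
    return plan

# Hypothetical part names; `expected` plays the role of the ZooKeeper metadata.
expected = {"202004_1_1_0", "202004_2_2_0", "202004_3_3_0"}
local = {"202004_1_1_0"}  # what survived locally after the restart
replicas = {
    "replica_b": {"202004_1_1_0", "202004_2_2_0"},
    "replica_c": {"202004_2_2_0", "202004_3_3_0"},
}

plan = parts_to_fetch(expected, local, replicas)
print(plan)
```

If no replica has a part, it simply does not appear in the plan, which is the reason for the advice about having at least three replicas.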

In this sense, storing data in RAM is not fundamentally different from storing it on disk, because when data is written to disk, it also first lands in the page cache and is physically written later. This depends on the file system mount options. But just in case, I will say that ClickHouse does not fsync on insert.

In this case, the data in RAM is stored in exactly the same format as on disk. The select query chooses the parts to read in the same way, chooses the required data ranges within the parts, and reads them. And prewhere works exactly the same, regardless of whether the data is in RAM or on disk.

Up to what number of unique values is Low Cardinality effective?

Low Cardinality is cleverly designed. It builds data dictionaries, but they are local. First, the dictionaries are different for each part, and second, even within one part they can differ for each range. When the number of unique values reaches a threshold (one million, I think), the dictionary is simply set aside and a new one is created.

The general answer is this: for each local range (say, for each day), Low Cardinality is effective up to about a million unique values. After that there is simply a fallback in which many different dictionaries are used instead of one. It will work about the same as a regular string column, perhaps a little less efficiently, but there will be no serious performance degradation.
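
The fallback described above can be modelled with a toy dictionary encoder. This is a deliberately simplified sketch, not the real on-disk format: the function name is hypothetical and the threshold is shrunk to 2 so that the effect is visible on six values.

```python
def encode_low_cardinality(values, max_dictionary_size):
    """Dictionary-encode values; start a fresh dictionary when the current one is full."""
    dictionaries = [{}]
    encoded = []
    for value in values:
        current = dictionaries[-1]
        if value not in current:
            if len(current) >= max_dictionary_size:
                current = {}  # the full dictionary is set aside, a new one begins
                dictionaries.append(current)
            current[value] = len(current)
        encoded.append((len(dictionaries) - 1, current[value]))
    return dictionaries, encoded

values = ["ru", "us", "ru", "de", "fr", "us"]
dictionaries, encoded = encode_low_cardinality(values, max_dictionary_size=2)
print(len(dictionaries), encoded)
```

Note that "us" ends up in two dictionaries: once the threshold is crossed, repeated values are stored again, so the column gradually loses the space advantage while still being readable.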

What are the best practices for full-text search on a table with five billion rows?

There are different answers. The first is to say that ClickHouse is not a full-text search engine. There are special systems for this, for example, Elasticsearch and Sphinx. However, I increasingly see people saying that they are moving from Elasticsearch to ClickHouse.

Why does this happen? They explain it by the fact that Elasticsearch stops coping with the load at some volumes, starting with the building of indexes. The indexes become too cumbersome, and if you simply move the data to ClickHouse, it turns out to be stored several times more efficiently in terms of volume. At the same time, the search queries were often not of the kind where you need to find some phrase in the entire data set with morphology taken into account, but completely different ones. For example, find some subsequence of bytes in the logs for the last few hours.

In this case, you create an index in ClickHouse whose first field is the date and time. The biggest cut of the data will then come from the date range. Within the selected date range it is, as a rule, already possible to do a full-text search, even by brute force using like. The like operator in ClickHouse is the most efficient like operator you can find. If you find something better, tell me.

But still, like is a full scan. And a full scan can be slow not only on the CPU but also on disk. If you suddenly have one terabyte of data per day and you search for a word within a day, you will have to scan a terabyte. And it is probably on ordinary hard drives, and in the end they will be so loaded that you will not be able to reach this server over SSH.

In this case, I can offer one more small trick. It is experimental: it may work, it may not. ClickHouse has full-text indexes in the form of trigram Bloom filters. Our colleagues at Arenadata have already tried these indexes, and they often work exactly as intended.

To use them correctly, you should have a good understanding of how exactly they work: what a trigram Bloom filter is and how to choose its size. I can say that they help with queries on rare phrases, substrings that seldom occur in the data. In that case, the indexes select subranges and less data is read.
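
For readers who have not met the idea before, here is a minimal sketch of a trigram Bloom filter used to decide whether a range of rows can be skipped. It illustrates the general technique only; the class, the helper names, the filter size and the hashing scheme are assumptions and do not reflect how ClickHouse implements its index.

```python
import hashlib

def trigrams(text):
    """All substrings of length 3."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

class BloomFilter:
    def __init__(self, size_bits, num_hashes):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an int used as a bit array

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

def build_range_filter(rows, size_bits=4096, num_hashes=3):
    """One filter per range of rows, filled with every trigram of every row."""
    bloom = BloomFilter(size_bits, num_hashes)
    for row in rows:
        for gram in trigrams(row):
            bloom.add(gram)
    return bloom

def range_may_match(bloom, needle):
    """False means the range can be skipped; True means it has to be read."""
    return all(bloom.might_contain(gram) for gram in trigrams(needle))

rows = ["GET /index.html 200", "connection timeout to shard 3"]
bloom = build_range_filter(rows)
must_read = range_may_match(bloom, "timeout")
print(must_read)
```

A Bloom filter never gives a false "no", so a range that really contains the substring is always read. It can give a false "yes", and the smaller the filter relative to the number of distinct trigrams, the more often that happens, which is why the size matters and why the index helps mostly with rare substrings.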

Recently, ClickHouse has added even more advanced functions for full-text search. First, there is a search for a whole set of substrings at once in a single pass, including case-sensitive and case-insensitive variants, with UTF-8 support or ASCII only. Choose the most efficient one that fits your needs.

Second, a search for several regular expressions in one pass has also appeared. You do not need to write X like one substring or X like another substring. You write it all at once, and everything is done as efficiently as possible.

Third, there is now approximate search for regexps and approximate search for substrings. If someone misspelled a word, the closest match will be searched for.

What is the best way to organize access to ClickHouse for a large number of users?

Tell us how best to organize access for a large number of consumers and analysts. How do you form a queue, prioritize the maximum number of concurrent queries, and with which tools?

If the cluster is large enough, a good solution is to bring up two more servers that become the entry point for analysts. That is, do not let analysts onto specific shards of the cluster; simply create two empty servers, without data, and configure access rights on them. In this case, user settings for distributed queries are passed on to the remote servers. So you configure everything on these two servers, and the settings take effect across the whole cluster.

In principle, these servers hold no data, but the amount of RAM on them is very important for executing queries. The disk can also be used for temporary data if external aggregation or external sorting is enabled.

It is important to look at the settings related to all possible limits. If I go to the Yandex.Metrica cluster right now as an analyst and run the query select count from hits, I immediately get an exception saying that I cannot execute the query. The maximum number of rows I am allowed to scan is one hundred billion, and in total there are fifty trillion of them in one table on the cluster. This is the first limit.

Let's say I remove the row limit and run the query again. Then I see the next exception: the force_index_by_date setting is enabled. I cannot execute the query unless I specify a date range. You should not rely on analysts to specify it by hand. A typical case is when the date range is written as where event date between, covering a week. And then they simply put a parenthesis in the wrong place, and instead of an and it turns into an or: or URL match. If there is no limit, the query goes crawling through the URL column and burns a ton of resources.
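
The misplaced parenthesis is easy to reproduce outside SQL, because and binds tighter than or in most languages. The sketch below uses made-up rows and a made-up filter to show how the date restriction silently stops applying; none of the names come from a real schema.

```python
from datetime import date

# Hypothetical rows; the last one is far outside the requested week.
rows = [
    {"event_date": date(2020, 4, 1), "user": "a", "url": "/cart"},
    {"event_date": date(2020, 4, 2), "user": "b", "url": "/home"},
    {"event_date": date(2019, 1, 1), "user": "c", "url": "/cart"},
]
start, end = date(2020, 3, 30), date(2020, 4, 5)

def in_week(row):
    return start <= row["event_date"] <= end

# Intended: date range AND (user condition OR url condition)
intended = [r for r in rows
            if in_week(r) and (r["user"] == "b" or r["url"] == "/cart")]

# Mistaken: (date range AND user condition) OR url condition
mistaken = [r for r in rows
            if (in_week(r) and r["user"] == "b") or r["url"] == "/cart"]

print(len(intended), len(mistaken))
```

In the mistaken form the URL condition has to be checked against every row regardless of date, so the date index no longer narrows the scan. That is the situation a forced date-index check is meant to reject.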

In addition, ClickHouse has two priority settings. Unfortunately, they are very primitive. One is simply called priority. If priority ≠ 0 and queries with some priority are running, then while a query with a lower priority value (which means a higher priority) is running, a query with a higher priority value (which means a lower priority) is simply suspended and does not run at all during that time.

This is a very crude setting, and it is not suitable for cases where the cluster is under constant load. But if you have short, bursty queries that are important, and the cluster is mostly idle, this setting fits.

The next priority setting is called OS thread priority. It simply sets the nice value for all query execution threads for the Linux scheduler. It works so-so, but it does work. If you set the weakest nice value for low-priority queries (it is the largest in number, and therefore the lowest priority) and set -19 for high-priority queries, the CPU will spend about four times less on the low-priority queries than on the high-priority ones.

You also need to configure the maximum query execution time, say, five minutes. The minimum query execution speed is the coolest thing. This setting has been around for a long time, and it is needed not just to claim that ClickHouse is not slow, but to enforce it.

Imagine, you configure this: if some query processes fewer than one million rows per second, that is not allowed. It disgraces our good name, our good database. Let's just forbid it. There are actually two settings. One is called min execution speed, in rows per second, and the second is called timeout before checking min execution speed, fifteen seconds by default. That is, fifteen seconds are allowed, and after that, if the query is slow, it simply throws an exception and the query is aborted.
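
The interplay of the two settings can be sketched as a single check that is skipped during the grace period. This is an illustration of the rule as described above, not the server's implementation; the function, the exception class and the parameter names are hypothetical, and the values are the ones mentioned in the text.

```python
class TooSlowError(Exception):
    """Raised when a query runs below the minimum allowed speed."""

def check_execution_speed(rows_processed, elapsed_seconds,
                          min_rows_per_second, timeout_before_check):
    if elapsed_seconds < timeout_before_check:
        return  # grace period: speed is not checked yet
    speed = rows_processed / elapsed_seconds
    if speed < min_rows_per_second:
        raise TooSlowError(f"{speed:.0f} rows/s is below {min_rows_per_second}")

LIMITS = dict(min_rows_per_second=1_000_000, timeout_before_check=15)

# 10 s in: slow so far, but still inside the grace period, so nothing happens.
check_execution_speed(5_000_000, 10, **LIMITS)

# 20 s in at 1.5 million rows/s: fast enough.
check_execution_speed(30_000_000, 20, **LIMITS)

# 20 s in at 250 thousand rows/s: the query is aborted.
try:
    check_execution_speed(5_000_000, 20, **LIMITS)
    aborted = False
except TooSlowError:
    aborted = True

print(aborted)
```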

You also need to set up quotas. ClickHouse has a built-in quota feature that counts resource consumption. But, unfortunately, not hardware resources such as CPU or disks, but logical ones: the number of processed queries, and the rows and bytes read. And you can configure, for example, a maximum of one hundred queries within five minutes and a thousand queries per hour.
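
A quota with several intervals, like the "one hundred per five minutes and a thousand per hour" example above, can be modelled as follows. This is only a sketch: it uses a sliding window for simplicity, which is an assumption about the accounting, and the class and method names are made up.

```python
from collections import deque

class QuotaExceeded(Exception):
    pass

class QueryQuota:
    """Limits of the form (interval_seconds, max_queries), checked on a sliding window."""

    def __init__(self, limits):
        self.limits = limits
        self.timestamps = deque()

    def register(self, now):
        longest = max(interval for interval, _ in self.limits)
        # Forget queries that are older than the longest interval.
        while self.timestamps and self.timestamps[0] <= now - longest:
            self.timestamps.popleft()
        for interval, max_queries in self.limits:
            recent = sum(1 for t in self.timestamps if t > now - interval)
            if recent >= max_queries:
                raise QuotaExceeded(f"{max_queries} queries per {interval} s")
        self.timestamps.append(now)

quota = QueryQuota([(300, 100), (3600, 1000)])

# A runaway script sends one query per second.
for second in range(100):
    quota.register(second)

try:
    quota.register(100)  # query number 101 within five minutes
    blocked = False
except QuotaExceeded:
    blocked = True

quota.register(400)  # five minutes later the short window is empty again
print(blocked, len(quota.timestamps))
```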

Why is this important? Because some analytics queries will be run by hand directly from the ClickHouse client. And everything will be fine. But if you have advanced analysts in your company, they will write a script, and there may be a bug in the script. And this bug will cause the query to be executed in an infinite loop. That is what you need to protect yourself from.

Is it possible to give the results of one query to ten clients?

We have several users who like to come in with very large queries at the same moment. The query is large and, in principle, runs quickly, but because there are many such queries at the same time, it becomes very painful. Is it possible to execute the same query, which arrived ten times in a row, only once, and give the result to all ten clients?

The problem is that we have neither a result cache nor a cache of intermediate data. There is the operating system's page cache, which saves you from reading the data from disk again, but, unfortunately, the data will still be decompressed, deserialized, and processed again.

I would like to avoid this somehow, either by caching intermediate data or by lining up similar queries in some kind of queue and adding a result cache. We currently have one pull request in development that adds a query cache, but only for subqueries in the in and join sections, so the solution is incomplete.

However, we face this situation too. A particularly canonical example is queries with pagination. There is a report, it has several pages, and there is a query with limit 10. Then the same thing, but limit 10,10. Then another page. And the question is, why do we compute all this every time? But right now there is no solution, and there is no way to avoid it.

There is an alternative solution that sits as a sidecar next to ClickHouse: ClickHouse Proxy.

Kirill Shvakov: ClickHouse Proxy has a built-in rate limiter and a built-in result cache. A lot of settings were made there because a similar problem was being solved. The proxy lets you limit queries by putting them in a queue and configure how long the query cache lives. If the queries really were identical, the proxy will return the result many times but will go to ClickHouse only once.
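
The behaviour described above, where identical concurrent requests wait for the first one and then share its cached result, can be sketched with a per-query lock and a time-to-live. This is a generic illustration and not the proxy's code; the class name, the backend function and the TTL value are hypothetical.

```python
import threading
import time

class CoalescingCache:
    def __init__(self, run_query, ttl_seconds, clock=time.monotonic):
        self.run_query = run_query
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}    # query text -> (stored_at, result)
        self.key_locks = {}  # query text -> lock held while that query runs
        self.guard = threading.Lock()

    def get(self, query):
        with self.guard:
            key_lock = self.key_locks.setdefault(query, threading.Lock())
        with key_lock:  # identical concurrent requests wait here for the first one
            entry = self.entries.get(query)
            if entry is not None and self.clock() - entry[0] < self.ttl_seconds:
                return entry[1]
            result = self.run_query(query)
            self.entries[query] = (self.clock(), result)
            return result

calls = []

def slow_backend(query):
    """Hypothetical stand-in for a heavy query sent to the database."""
    calls.append(query)
    time.sleep(0.05)
    return f"result of {query}"

cache = CoalescingCache(slow_backend, ttl_seconds=60)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("SELECT count() FROM hits")))
    for _ in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), len(results))
```

Ten clients get an answer, but the backend is called once. Different query texts use different locks, so unrelated queries do not wait for each other.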

Nginx also has a cache in the free version, and that will work too. Nginx even has settings so that if requests arrive at the same time, it holds the others until one completes. But in ClickHouse Proxy the configuration is done much better. It was made specifically for ClickHouse, specifically for these queries, so it fits better. And it is easy to install.

What about asynchronous operations and materialized views?

There is a problem in that operations with the replacing engine are asynchronous: first the data is written, then it is collapsed. If a materialized view with some aggregates lives under the table, duplicates will be written to it. And if there is no complex logic, the data will be duplicated. What can be done about it?

There is an obvious solution: implement a trigger on a certain class of matviews for the asynchronous collapse operation. Are there any silver bullets, or plans to implement similar functionality?

It is worth understanding how deduplication works. What I am about to say is not directly related to the question, but it is worth remembering just in case.

When you insert into a replicated table, whole inserted blocks are deduplicated. If you insert the same block again, containing the same number of the same rows in the same order, the data is deduplicated. You get "Ok" in response to the insert, but in fact only one batch of data is written, and it is not duplicated.

This is needed for certainty. If you receive "Ok" on insert, your data has been inserted. If you receive an error from ClickHouse, it has not been inserted and you need to repeat the insert. But if the connection breaks during the insert, you do not know whether the data was inserted or not. The only option is to repeat the insert. If the data was in fact inserted and you insert it again, block deduplication kicks in. It is needed to avoid duplicates.

It is also important how this works for materialized views. If the data was deduplicated on insert into the main table, it will not go into the materialized view either.
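
Block deduplication and its effect on a materialized view can be modelled in a few lines. This is a toy model under stated assumptions: the class name is hypothetical, a block is identified by a hash of its rows in order, and the "view" is just a list that receives whatever was actually written.

```python
import hashlib

class ReplicatedTable:
    def __init__(self):
        self.rows = []
        self.view_rows = []      # stands in for a materialized view
        self.seen_blocks = set()

    def insert(self, block):
        digest = hashlib.sha256(repr(block).encode("utf-8")).hexdigest()
        if digest in self.seen_blocks:
            return "Ok"          # duplicate block: acknowledged, nothing written
        self.seen_blocks.add(digest)
        self.rows.extend(block)
        self.view_rows.extend(block)  # the view only sees blocks that were written
        return "Ok"

table = ReplicatedTable()
block = [(1, "click"), (2, "view")]

first = table.insert(block)
retry = table.insert(block)                      # same rows, same order
reordered = table.insert(list(reversed(block)))  # same rows, different order

print(first, retry, len(table.rows), len(table.view_rows))
```

The retry is safe because the block is byte-for-byte the same. The reordered block is treated as new data, which matches the condition above that the rows must come in the same order.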

Now to the question. Your situation is more complicated because you are writing duplicates of individual rows. That is, it is not the whole batch that is duplicated but specific rows, and they collapse in the background. Indeed, the data will collapse in the main table, but the uncollapsed data will go to the materialized view, and during merges nothing happens to materialized views. That is because a materialized view is nothing more than an insert trigger. During other operations, nothing additional happens to it.

And I cannot make you happy here. You just need to look for a specific solution for this case. For example, it may be possible to make the materialized view replacing as well, and the deduplication method might work the same way. But unfortunately, not always. If it is aggregating, it will not work.

Kirill Shvakov: We also built some crutches back in the day. The problem was that there are ad impressions, and there is some data that we can show in real time: these are just impressions. They are rarely duplicated, and if that happens, we will collapse them later anyway. And there were things that must not be duplicated: clicks and that whole story. But I wanted to show them almost immediately too.

How were the materialized views made? There were views that are written to directly: data is written to the raw data and written to the views. There, at some point, the data is not quite correct, it is duplicated, and so on. And there is a second set of tables that look exactly like the materialized views, that is, they are completely identical in structure. Every so often we recalculate the data, compute it without duplicates, and write it to those tables.

We went through an API: this will not work in ClickHouse by hand. The API looks at the date of the last addition to the table up to which the correct data is guaranteed to have been calculated, and it queries both one table and the other. From one, the query selects up to that point in time, and from the other it gets what has not been recalculated yet. And it works, but not by means of ClickHouse alone.
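
The API logic described above can be sketched as a function that stitches two sources together at a watermark. The table contents, the field names and the function name are all hypothetical; the sketch only shows the idea of reading clean data before the watermark and real-time data after it.

```python
def read_report(recalculated, realtime, watermark):
    """Rows before the watermark come from the recalculated table, the rest from real time."""
    clean = [row for row in recalculated if row["ts"] < watermark]
    fresh = [row for row in realtime if row["ts"] >= watermark]
    return clean + fresh

# Hypothetical data: the real-time table contains a duplicate at ts=1.
realtime = [
    {"ts": 1, "clicks": 5}, {"ts": 1, "clicks": 5},
    {"ts": 2, "clicks": 7},
    {"ts": 3, "clicks": 4},
]
recalculated = [{"ts": 1, "clicks": 5}, {"ts": 2, "clicks": 7}]  # duplicates removed
watermark = 3  # everything before ts=3 is guaranteed to be recalculated

report = read_report(recalculated, realtime, watermark)
total_clicks = sum(row["clicks"] for row in report)
print(total_clicks)
```

Only the tail after the watermark can still contain duplicates, so the size of that tail is the trade-off between freshness and accuracy.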

If you have some kind of API, for analysts or for users, then in principle this is an option. You are always counting, always recounting. This can be done once a day or at some other interval. You choose for yourself the range that you do not need and that is not critical.

ClickHouse has a lot of logs. How can I see everything that is happening on the server at a glance?

ClickHouse has a very large number of different logs, and the number keeps growing. In new versions, some of them are even enabled by default; in older versions they have to be enabled when updating. Still, there are more and more of them. In the end, I would like to see what is happening with my server right now, perhaps on some kind of summary dashboard.

Is there anyone in the ClickHouse team, or in the teams of your friends, who maintains ready-made dashboards that would show these logs as a finished product? After all, just looking at logs in ClickHouse is great. But it would be very cool if it were already prepared in the form of a dashboard. I would get a kick out of that.

There are dashboards, although they are not standardized. In our company, about 60 teams use ClickHouse, and the strangest thing is that many of them have dashboards that they made themselves, each slightly different. Some teams use the internal Yandex.Cloud installation. It has some ready-made reports, although not all the necessary ones. Others have their own.

My colleagues from Metrica have their own dashboard in Grafana, and I have my own for their cluster. There I look at things like the cache hit rate for the mark cache. And what makes it even harder is that we use different tools. I built my dashboard with a very old tool called Graphite-web. It is completely ugly. And I still use it that way, although Grafana would probably be more convenient and prettier.

The basic things in the dashboards are the same. These are the system metrics for the cluster: CPU, memory, disk, network. The rest are the number of concurrent queries, the number of concurrent merges, the number of queries per second, the maximum number of parts for MergeTree table partitions, replication lag, replication queue size, the number of inserted rows per second, and the number of inserted blocks per second. All of this comes not from logs but from metrics.

Vladimir Kolobaev: Alexey, I'd like to correct you a little. There is Grafana. Grafana has a data source, which is ClickHouse. That is, I can make requests from Grafana directly to ClickHouse. ClickHouse has a table with logs, and it is the same for everyone. As a result, I want to query this log table from Grafana and see the queries that my server is executing. It would be great to have a dashboard like that.

I reinvented that wheel myself. But I have a question: if it is all standardized, and Grafana is used by everyone, why doesn't Yandex have such an official dashboard?

Kirill Shvakov: In fact, the data source for ClickHouse is now supported by Altinity. I just want to point out where to dig and whom to push. You can ask them, because Yandex still makes ClickHouse itself, not the ecosystem around it. Altinity is the main company currently promoting ClickHouse. They will not abandon it; they will support it. And in principle, to upload a dashboard to the Grafana website, you only need to register and upload it, with no particular problems.

Alexey Milovidov: Over the past year, ClickHouse has gained many query profiling capabilities. There are per-query metrics on resource usage. And more recently, we added an even lower-level query profiler to see where a query spends every millisecond. But to use this functionality, I have to open the console client and type a query that I always forget. I saved it somewhere and keep forgetting where exactly.

I wish there were a tool that simply said: here are your heavy queries, grouped by query class. I click on one, and it tells me why it is heavy. There is no such solution right now. And it really is quite strange that when people ask me, "Tell me, are there any ready-made dashboards for Grafana?", I say: "Go to the Grafana website, there is a Dashboards community, and there is a dashboard from Dimka and a dashboard from Kostyan. I don't know what they are, I haven't used them myself."

How can you influence merges so that the server does not crash into OOM?

I have a table with only one partition, and it is ReplacingMergeTree. I have been writing data to it for four years. I needed to run an alter on it and delete some data.

I did that, and while this request was being processed, all the memory on all servers in the cluster was consumed, and all servers in the cluster went into OOM together. Then they all came back up together, started merging this same operation, this same data block, and fell into OOM again. Then they came up again and fell again. And this did not stop.

Then it turned out that this was actually a bug, which the guys fixed. That is very cool, thank you very much. But an aftertaste remained. And now, when I think about doing some kind of merge on a table, I have a question: why can't I somehow influence these merges? For example, limit them by the amount of RAM they need, or, in principle, by the number of them that will process this particular table.

I have a table called "Metrics"; please process it for me in two threads. There is no need to run ten or five merges in parallel, do it in two. I think I have enough memory for two, but it may not be enough for ten. Why does the fear remain? Because the table is growing, and someday I will face a situation that, in principle, is no longer caused by a bug but by the data changing in such a large amount that I simply will not have enough memory on the server. And then the server will crash into OOM during a merge. Moreover, I can cancel a mutation, but not a merge.

You know, when merging, the server will not fall into OOM, because during a merge RAM is used only for one small range of data. So everything will be fine regardless of the amount of data.

Vladimir Kolobaev: Fine. The point here is that after the bug was fixed, I downloaded the new version, and on another table, a smaller one with many partitions, I performed a similar operation. And during the merge, about 100 GB of RAM was burned on the server. I had 150 in use, 100 got eaten, and a 50 GB window was left, so I did not fall into OOM.

What currently protects me from falling into OOM if a merge really does consume 100 GB of RAM? What should I do if the RAM available for merges suddenly runs out?

Alexey Milovidov: There is a problem in that RAM consumption specifically for merges is not limited. The second problem is that once a merge has been assigned, it must be executed, because it is written to the replication log. The replication log is the set of actions needed to bring a replica into a consistent state. Unless you perform manual manipulations that roll this replication log back, the merge will have to be executed one way or another.

Of course, it would not hurt to have a RAM limit that protects against OOM "just in case". It will not help the merge complete: the merge will start again, reach some threshold, throw an exception, and then start again, and nothing good will come of that. But in principle, introducing such a limit would be useful.

How will the Golang driver for ClickHouse be developed?

The Golang driver written by Kirill Shvakov is now officially supported by the ClickHouse team. It lives in the ClickHouse repository; it is now big and real.

A small remark. There is a wonderful and much-loved storage of normal forms of infinite order: Vertica. They also have their own official Python driver, maintained by the Vertica developers. Several times the storage versions and the driver versions drifted far apart, and at some point the driver stopped working. And a second point. Support for that official driver, it seems to me, works like a one-way valve: you file an issue, and it hangs there forever.

I have two questions. Kirill's Golang driver is now almost the default way to talk to ClickHouse from Golang, unless someone still uses the HTTP interface because they like it that way. How will this driver be developed? Will it be kept in sync with breaking changes in the ClickHouse repository itself? And what is the procedure for handling issues?

Kirill Shvakov: The first question is about how everything is organized bureaucratically. That has not been discussed, so I have nothing to answer.

To answer the question about issues, we need a bit of the driver's history. I worked for a company that had a lot of data. It was an ad rotator with a huge number of events that had to be stored somewhere. At some point ClickHouse appeared. We loaded data into it, and at first everything went well, but then ClickHouse crashed. At that point we decided we did not need it.

A year later we came back to the idea of using ClickHouse, and we needed some way to write data into it. The starting conditions were: the hardware is very weak and resources are scarce. But we had always worked that way, so we looked at the native protocol.

Since we were working in Go, it was clear that we needed a Go driver. I worked on it almost full time; it was my job. We brought it to a certain state, and nobody really expected that anyone other than us would use it. Then CloudFlare came along with exactly the same problem, and for a while we worked with them very smoothly, because they had the same tasks. We did this work both in ClickHouse itself and in the driver.

At some point I simply stopped working on it, because my involvement with ClickHouse and my job changed a little. That is why issues are not being closed. From time to time, people who need something themselves commit to the repository. Then I look at the pull request and sometimes even fix something myself, but that happens rarely.

I want to get back to the driver. A few years ago, when all this started, ClickHouse was also different and had different capabilities. Now we understand how to rework the driver so that it works well. If that happens, version 2 will be incompatible in any case because of the accumulated workarounds.

I do not know how to organize this. I do not have much time myself. If other people finish the driver, I can help them and tell them what to do. But active participation by Yandex in developing the project has not been discussed yet.

Alexey Milovidov: In fact, there is no bureaucracy around these drivers yet. The only thing is that they have been moved to the official organization, that is, this driver is recognized as the official default solution for Go. There are some other drivers, but they are maintained separately.

We have no in-house development for these drivers. The question is whether we can hire a dedicated person, not for this driver specifically but to develop all the community drivers, or whether we can find someone from outside.

An external dictionary is not loaded after a restart with the lazy_load setting enabled. What should I do?

We have the lazy_load setting enabled, and after the server is restarted the dictionary does not come up by itself. It is loaded only after a user accesses it, and the first access returns an error. Is it possible to load dictionaries automatically by means of ClickHouse, or do we always have to check their readiness ourselves so that users do not get errors?

Perhaps we have an old version of ClickHouse, and that is why the dictionary was not loaded automatically. Could that be the case?

First, dictionaries can be force-loaded with the SYSTEM RELOAD DICTIONARIES query. Second, regarding the error: if the dictionary is already loaded, queries will work on the data that was loaded. If the dictionary has not been loaded yet, it will be loaded right during the query.

This is not very convenient for heavy dictionaries. For example, you need to pull a million rows from MySQL. Someone runs a simple SELECT, but that SELECT will wait for the same million rows. There are two solutions here. The first is to turn lazy_load off. The second is, when the server comes up and before you put load on it, to run SYSTEM RELOAD DICTIONARY or simply execute a query that uses the dictionary. Then the dictionary will be loaded. With lazy_load enabled, you have to monitor the availability of dictionaries yourself, because ClickHouse does not load them automatically.

The answer to the last question is that either the version is old or this needs debugging.
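The readiness check described above (make sure dictionaries are loaded before putting load on the server) could be sketched roughly as follows. This is a minimal illustration rather than an official recipe. It assumes the server exposes the HTTP interface on the default port 8123 and that the system.dictionaries table has a status column whose value is LOADED for a loaded dictionary, which may not hold on old versions. The helper names http_query, dictionary_statuses and not_ready are made up for this example.

```python
import urllib.request
from typing import Callable, Dict, Iterable, List


def http_query(sql: str, url: str = "http://localhost:8123/") -> str:
    """Send one statement to the ClickHouse HTTP interface and return the response body.

    The URL is an assumption (default HTTP port, no authentication).
    """
    req = urllib.request.Request(url, data=sql.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=600) as resp:
        return resp.read().decode("utf-8")


def dictionary_statuses(run_query: Callable[[str], str]) -> Dict[str, str]:
    """Return {dictionary name: status} parsed from system.dictionaries."""
    body = run_query(
        "SELECT name, status FROM system.dictionaries FORMAT TabSeparated"
    )
    statuses: Dict[str, str] = {}
    for line in body.splitlines():
        if line:
            name, status = line.split("\t", 1)
            statuses[name] = status
    return statuses


def not_ready(run_query: Callable[[str], str], required: Iterable[str]) -> List[str]:
    """List the required dictionaries that are missing or not in the LOADED state."""
    statuses = dictionary_statuses(run_query)
    return sorted(name for name in required if statuses.get(name) != "LOADED")
```

A deployment script might call not_ready(http_query, ["geo", "users"]) after a restart and keep the server out of the load balancer until the returned list is empty. The other option mentioned in the answer, turning lazy loading off, is a server configuration setting (dictionaries_lazy_load in the ClickHouse server config) and needs no code.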

What can be done about SYSTEM RELOAD DICTIONARIES not loading any of many dictionaries if at least one of them fails with an error?

There is another question about SYSTEM RELOAD DICTIONARIES. We have two dictionaries: one fails to load, the other loads fine. In this case SYSTEM RELOAD DICTIONARIES does not load either dictionary, and we have to load a specific one by name using SYSTEM RELOAD DICTIONARY. Is this also related to the ClickHouse version?

I have good news. This behavior has been changing. So if you update ClickHouse, it will change too. If you are not happy with the current behavior of SYSTEM RELOAD DICTIONARIES, update, and let's hope it changes for the better.
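Until an upgrade is possible, the workaround mentioned in the question (reloading each dictionary by name) can be scripted so that one broken dictionary does not block the rest. The sketch below is only an illustration: reload_each is a hypothetical helper, run_query stands for whatever client function sends a statement to the server and raises on error, and the dictionary names are assumed to be trusted identifiers because they are inserted into the statement unquoted.

```python
from typing import Callable, Dict, Iterable


def reload_each(
    run_query: Callable[[str], object], names: Iterable[str]
) -> Dict[str, str]:
    """Reload dictionaries one by one and return {name: error text} for failures.

    A failure of one dictionary does not stop the loop, so the healthy
    dictionaries still get loaded.
    """
    failures: Dict[str, str] = {}
    for name in names:
        try:
            # Names are assumed to be trusted identifiers (no quoting is done).
            run_query(f"SYSTEM RELOAD DICTIONARY {name}")
        except Exception as exc:  # any client error counts as a failed reload
            failures[name] = str(exc)
    return failures
```

The returned mapping can be logged or sent to monitoring, so that the failing dictionary is investigated separately while the others are already serving queries.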

Is there a way to configure credentials in the ClickHouse config without exposing them in error messages?

The next question is about dictionary-related errors, specifically credentials. We specified the connection credentials for the dictionary in the ClickHouse config, and when an error occurs, we get those credentials, including the password, in the response.

We worked around this by moving the credentials into the ODBC driver config. Is there a way to configure credentials in the ClickHouse config without exposing them in error messages?

The real solution here is indeed to specify these credentials in odbc.ini and to specify only the ODBC Data Source Name in ClickHouse itself. This will not happen for other dictionary sources: neither for a dictionary with a MySQL source nor for the others should you see the password in an error message. For ODBC I will take a look as well; if the problem is there, it just needs to be removed.
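To make the split concrete, here is a small sketch that renders the two pieces of configuration: an odbc.ini section that holds the credentials, and a dictionary source fragment that refers to it only by DSN. Everything specific in it is an assumption for illustration: the DSN name, host, table and password are invented, and the odbc.ini key names (Driver, Servername, Database, UserName, Password) depend on the ODBC driver in use, so check the driver's own documentation.

```python
import configparser
import io


def render_odbc_ini(dsn: str, driver: str, server: str,
                    database: str, user: str, password: str) -> str:
    """Render an odbc.ini section; the credentials live only in this file."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep key case as written
    cfg[dsn] = {
        # Key names are driver-specific; these are illustrative.
        "Driver": driver,
        "Servername": server,
        "Database": database,
        "UserName": user,
        "Password": password,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


def render_dictionary_source(dsn: str, table: str) -> str:
    """Render a dictionary <source> fragment that names only the DSN."""
    return (
        "<source>\n"
        "    <odbc>\n"
        f"        <table>{table}</table>\n"
        f"        <connection_string>DSN={dsn}</connection_string>\n"
        "    </odbc>\n"
        "</source>\n"
    )


if __name__ == "__main__":
    print(render_odbc_ini("analytics_pg", "PostgreSQL Unicode",
                          "db.example.internal", "analytics",
                          "reader", "s3cret"))
    print(render_dictionary_source("analytics_pg", "public.regions"))
```

With this layout the password never appears in the ClickHouse dictionary config, so an error message that echoes the connection string can reveal at most the DSN name.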

Bonus: Zoom backgrounds from the get-togethers

For the most persistent readers, clicking the picture opens bonus backgrounds from the get-togethers. We put out a fire together with Avito's technology mascots, confer with colleagues from a sysadmin's room or an old-school computer club, and hold daily stand-ups under a bridge against a backdrop of graffiti.


Source: www.habr.com
