Effective use of ClickHouse. Alexey Milovidov (Yandex)

Because ClickHouse is a specialized system, it is important to take its architecture into account when using it. In this talk, Alexey covers typical mistakes made when using ClickHouse that can lead to inefficient operation. Practical examples show how choosing one data processing scheme over another can change performance by orders of magnitude.

Hello everyone! My name is Alexey, and I develop ClickHouse.

First, let me please you right away: today I will not tell you what ClickHouse is. To be honest, I am tired of it. I explain what it is every time, and probably everyone already knows.

Instead, I will tell you about the possible pitfalls, that is, how ClickHouse can be used incorrectly. In fact, there is nothing to be afraid of, because we are building ClickHouse as a system that is simple, convenient, and works out of the box. You install it, and there are no problems.

Still, you need to keep in mind that this system is specialized, and you can easily run into an unusual use case that pushes it out of its comfort zone.

So, what kinds of pitfalls are there? Mostly I will be talking about obvious things. If everything is obvious to you, you can enjoy feeling smart, and those who did not know will learn something new.

The first example is simple and, unfortunately, common: a large number of inserts with small batches, that is, a large number of small inserts.

If you look at how ClickHouse performs an insert, you can send even a terabyte of data in a single request. That is not a problem.

Let's see what the performance looks like. For example, we have a table with Yandex.Metrica data: hits, about 105 columns, 700 bytes per row uncompressed. And we will insert the right way, in batches of one million rows.

Inserting into a MergeTree table gives half a million rows per second. Excellent. For a replicated table it is slightly lower, about 400,000 rows per second.

And if you enable quorum inserts, you get a bit less, but still decent performance: 250,000 rows per second. Quorum insert is an undocumented feature of ClickHouse*.

* As of 2020, it is documented.

And what happens if you do it badly? We insert one row at a time into a MergeTree table and get 59 rows per second. That is about 10,000 times slower. In ReplicatedMergeTree it is fewer rows per second still. And if quorum is enabled, it comes to two rows per second. In my opinion, this is just awful. How can it be that slow? It even says on my T-shirt that ClickHouse should not be slow. But sometimes it happens.
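
As a rough sanity check, the slowdown can be computed from the two MergeTree figures quoted in the talk. The variable names below are only illustrative.

```python
# Throughput figures quoted in the talk (rows per second, MergeTree table).
batched_rows_per_sec = 500_000   # inserting in batches of one million rows
single_row_rows_per_sec = 59     # inserting one row per INSERT

slowdown = batched_rows_per_sec / single_row_rows_per_sec
print(f"single-row inserts are about {slowdown:,.0f}x slower")
```

The ratio lands close to four orders of magnitude, which is what the talk rounds to.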

In fact, this is our shortcoming. We could have made everything work fine, but we did not. And we did not because our own scenario did not need it. We already had batches. We simply received batches at our entry point, and there were no problems. We insert them and everything works fine. But of course all sorts of scenarios are possible. For example, you may have a number of servers where the data is generated. They do not insert data very often, but together they still produce frequent inserts. And you need to avoid this somehow.

From a technical point of view, the point is that when you insert into ClickHouse, the data does not go into any memtable. We do not even have a real log-structured merge tree, just a MergeTree, because there is neither a log nor a memtable. We simply write the data straight to the file system, already arranged by columns. And if you have 100 columns, more than 200 files have to be written into a separate directory. All of this is quite heavy.

So the question arises: "How do you do it right?", given that you still need to get the data written to ClickHouse somehow.

Method 1. This is the simplest way. Use some kind of distributed queue, for example Kafka. You simply pull data out of Kafka and insert it in batches once a second. And everything will be fine: you write, and it all works well.

The downside is that Kafka is yet another distributed system. I can understand it if your company already has Kafka. That is good and convenient. But if you do not have it, you should think three times before dragging another distributed system into your project. So it is worth considering the alternatives.

Method 2. This is the old-school and at the same time very simple alternative. You have some server that generates your logs, and it simply writes them to a file. Once a second, for example, we rename this file and start a new one. A separate script, run from cron or as a daemon, takes the oldest file and writes it to ClickHouse. If you write the logs once a second, everything will be fine.

The downside of this method is that if the server where the logs are generated disappears, the data disappears with it.

Method 3. There is another interesting way that does not need temporary files at all. For example, you have some ad-serving engine or another interesting daemon that generates data. You can accumulate a batch of data directly in RAM, in a buffer. When enough time has passed, you set this buffer aside, create a new one, and insert what has already accumulated into ClickHouse from a separate thread.

On the other hand, the data also disappears on kill -9. If your server crashes, you lose this data. Another problem is that if you could not write to the database, your data keeps accumulating in RAM. Then either the RAM runs out or you simply lose the data.
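
The buffer swap described in Method 3 can be sketched roughly as follows. This is a minimal illustration, not code from the talk: `BatchBuffer`, `run_flusher`, and the `sink` callable (standing in for one bulk INSERT) are hypothetical names.

```python
import threading
from typing import Callable, List


class BatchBuffer:
    """Accumulates rows in memory and hands them off in batches.

    `sink` is a hypothetical callable that performs one bulk INSERT;
    it is an assumption of this sketch, not a ClickHouse API.
    """

    def __init__(self, sink: Callable[[List[tuple]], None]) -> None:
        self._sink = sink
        self._lock = threading.Lock()
        self._rows: List[tuple] = []

    def add(self, row: tuple) -> None:
        # Called by the data-producing code for every row.
        with self._lock:
            self._rows.append(row)

    def flush(self) -> int:
        # Swap the full buffer for an empty one while holding the lock,
        # then send the old buffer outside the lock so producers never
        # wait for the database.
        with self._lock:
            batch, self._rows = self._rows, []
        if batch:
            self._sink(batch)
        return len(batch)


def run_flusher(buffer: BatchBuffer, stop: threading.Event, interval: float = 1.0) -> None:
    # Body of the background thread: flush roughly once per `interval` seconds.
    while not stop.wait(interval):
        buffer.flush()
    buffer.flush()  # final flush on shutdown


# Usage: collect batches in a list instead of sending them anywhere.
received: List[List[tuple]] = []
buf = BatchBuffer(received.append)
for i in range(5):
    buf.add((i, f"event-{i}"))
flushed = buf.flush()
```

The sketch deliberately leaves out the two problems named above: a failed send drops the batch, and nothing bounds the buffer if the database stays unavailable.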

Method 4. Another interesting way. You have some server process, and it can send data to ClickHouse right away, but over a single connection. For example, you send an HTTP request with transfer-encoding: chunked and an insert. It then generates chunks fairly often; you can even send every row, although there will be some overhead for framing the data.

In this case, however, the data is sent to ClickHouse immediately, and ClickHouse buffers it itself.

But problems arise here too. Now you lose data both when your process is killed and when the ClickHouse process is killed, because the insert will be incomplete. In ClickHouse, inserts are atomic up to a certain threshold on the number of rows. In principle, this is an interesting approach, and it can also be used.

Method 5. Here is another interesting way. It is a server developed by the community for batching data. I have not looked at it myself, so I cannot guarantee anything. Then again, no guarantees are given for ClickHouse itself either. It is open source as well, but you may have become used to a certain standard of quality that we try to provide. As for this thing, I do not know: go to GitHub and look at the code. Maybe they wrote something decent.

* As of 2020, KittenHouse should also be considered.

Method 6. Another option is to use Buffer tables. The advantage of this method is that it is very easy to start using. Create a Buffer table and insert into it.

The downside is that the problem is not solved completely. If at the MergeTree level you need to group data into about one batch per second, then at the Buffer table level you need to get down to at most a few thousand inserts per second. If there are more than 10,000 per second, it will still be bad. And if you insert in batches, you have already seen that you get a hundred thousand rows per second, and that is on fairly heavy data.

Also, Buffer tables have no log. If something goes wrong with your server, the data will be lost.

And as a bonus, ClickHouse has recently gained the ability to pull data from Kafka. There is a table engine called Kafka. You simply create such a table, and you can attach materialized views to it. In that case it will pull the data out of Kafka itself and insert it into the tables you need.

What is especially pleasing about this feature is that we were not the ones who built it. It is a community feature. And when I say "community feature", I mean it without any disdain. We read the code and reviewed it, so it should work fine.

* As of 2020, similar support has appeared for RabbitMQ.

What else can be inconvenient or unexpected when inserting data? One case is an INSERT ... VALUES query with computed expressions inside VALUES. For example, now() is a computed expression. In this case ClickHouse is forced to run an expression interpreter for every row, and performance drops by orders of magnitude. It is better to avoid this.

* At present, this problem is completely solved; there is no longer a performance regression when using expressions in VALUES.

Another example where problems can arise is when the data in a single batch belongs to many partitions. By default, ClickHouse partitions by month. If you insert a batch of a million rows that contains data for several years, you will have several dozen partitions in it. This is equivalent to inserting batches several dozen times smaller, because internally a batch is always split by partition first.

* Recently, ClickHouse added experimental support for a compact part format and for in-memory parts with a write-ahead log, which solves this problem.
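
To make the partition effect concrete, here is a small sketch that groups a batch by month the way monthly partitioning would. The function name and the row layout are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def split_by_month(rows: Iterable[Tuple[date, str]]) -> Dict[Tuple[int, int], List[Tuple[date, str]]]:
    """Group rows by (year, month), mimicking monthly partitioning."""
    parts: Dict[Tuple[int, int], List[Tuple[date, str]]] = defaultdict(list)
    for row in rows:
        event_date = row[0]
        parts[(event_date.year, event_date.month)].append(row)
    return dict(parts)


# A batch of 36 rows spread evenly over three years: one row per month.
batch = [(date(2017 + m // 12, m % 12 + 1, 1), f"row-{m}") for m in range(36)]
parts = split_by_month(batch)
```

Here one batch of 36 rows turns into 36 single-row pieces, which is the small-insert problem all over again.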

Now let's look at the second kind of problem: data typing.

Data typing can be strict or "stringly". Stringly typing is when you simply declare that all your fields are of type String. This is bad. There is no need to do it.

Let's work out how to do it properly in the cases where you are tempted to say: we have some field, it is a string, let ClickHouse sort it out by itself, and I will not worry about it. It is still worth making some effort.

For example, we have an IP address. In one case we store it as a string, for example 192.168.1.1. In the other case it is a number of type UInt32*. 32 bits is enough for an IPv4 address.
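
The two representations can be compared with Python's standard `ipaddress` module. This is only an illustration of the conversion on the application side; the helper names are made up for this sketch.

```python
import ipaddress


def ipv4_to_uint32(text: str) -> int:
    """Convert dotted-quad text to the 32-bit integer form of the address."""
    return int(ipaddress.IPv4Address(text))


def uint32_to_ipv4(value: int) -> str:
    """Convert the integer back to text for display."""
    return str(ipaddress.IPv4Address(value))


as_text = "192.168.1.1"
as_number = ipv4_to_uint32(as_text)
text_bytes = len(as_text.encode("ascii"))  # bytes needed for the string form
number_bytes = 4                             # a UInt32 always takes 4 bytes
```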

First, oddly enough, the data compresses about equally well. There is a difference, of course, but not a big one. So there are no particular problems with disk I/O.

But there is a serious difference in CPU time and query execution time.

Let's count the unique IP addresses when they are stored as numbers. That runs at 137 million rows per second. If the same data is stored as strings, it is 37 million rows per second. I do not know why the numbers coincide like that; I ran these queries myself. Either way, it is about 4 times slower.

If you measure the difference in disk space, there is a difference there too. It is about a quarter, because there are quite a lot of unique IP addresses. If these were strings with a small number of distinct values, they would compress by dictionary quite easily to about the same volume.

And a fourfold difference in time is not something to be waved away. Maybe you really do not care, but when I see a difference like that, it makes me sad.

Let's look at the different cases.

1. The first case is when you have few distinct values. Here we use a simple practice that you probably know and can apply to any DBMS. It makes sense not only for ClickHouse. Just write numeric identifiers into the database, and convert them to strings and back on the application side.

For example, you have a region, and you try to store it as a string. It will contain values such as Moscow and Moscow Region. When I see "Moscow" written there, that is not so bad, but when it is "Moscow Region", it gets really sad. That is a lot of bytes.

Instead, we simply write a UInt32 number, 250. At Yandex this region is 250, but yours may be different. Just in case, I will mention that ClickHouse has built-in support for working with a geobase. You simply write a dictionary with regions, including a hierarchical one, so it will contain Moscow, Moscow Region, and everything else you need. And you can do the conversion at the query level.

The second option is roughly the same, but with support inside ClickHouse. This is the Enum data type. You simply list all the values you need inside the Enum. For example, the device type: desktop, mobile, tablet, TV. Four options in total.

The downside is that you need to alter it from time to time. As soon as one option is added, you do an ALTER TABLE. In fact, ALTER TABLE in ClickHouse is free. It is especially free for Enum, because the data on disk does not change. Nevertheless, ALTER acquires a lock* on the table and has to wait until all running SELECTs have finished. Only then is the ALTER executed, so there is still some inconvenience.

* In recent versions of ClickHouse, ALTER has been made fully non-blocking.

Another option that is quite specific to ClickHouse is to connect external dictionaries. You can write numbers into ClickHouse and keep your reference data in any system that is convenient for you. For example, you can use MySQL, Mongo, or Postgres. You can even build your own microservice that serves this data over HTTP. And at the ClickHouse level you write a function call that converts this data from numbers to strings.

This is a specialized but very efficient way to do a join with an external table. There are two variants. In the first, the data is fully cached, fully present in RAM, and refreshed at some interval. In the other, if the data does not fit in RAM, you can cache it partially.

Here is an example. There is Yandex.Direct, and there are ad campaigns and banners. There are probably millions of ad campaigns, and they fit in RAM reasonably well. And there are billions of banners, which do not fit. So we use a cached dictionary on top of MySQL.

The only problem is that a cached dictionary works well only if the hit rate is close to 100%. If it is lower, then for every block of data processed by a query you have to take the missing keys and go and fetch them from MySQL. As for ClickHouse, I can still guarantee that yes, it is not slow; I will not speak for other systems.
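
The dependence on hit rate can be illustrated with a toy cache. This is not how ClickHouse implements cached dictionaries; `CachedDictionary`, `fetch_missing`, and the sample data are hypothetical and only show why every block with misses costs a round trip to the external source.

```python
from typing import Callable, Dict, Iterable, List


class CachedDictionary:
    """Toy cache in front of a slow external source (a stand-in for MySQL)."""

    def __init__(self, fetch_missing: Callable[[List[int]], Dict[int, str]]) -> None:
        self._fetch_missing = fetch_missing
        self._cache: Dict[int, str] = {}
        self.external_requests = 0

    def lookup_block(self, keys: Iterable[int]) -> List[str]:
        keys = list(keys)
        # Collect keys that are not cached yet, preserving order, no repeats.
        missing = [k for k in dict.fromkeys(keys) if k not in self._cache]
        if missing:
            # One round trip to the external source per block with misses.
            self.external_requests += 1
            self._cache.update(self._fetch_missing(missing))
        return [self._cache.get(k, "") for k in keys]


# Hypothetical external source: maps banner id to a title.
source = {1: "banner one", 2: "banner two", 3: "banner three"}
cached = CachedDictionary(lambda ks: {k: source[k] for k in ks if k in source})

first = cached.lookup_block([1, 2, 2, 3])   # cold cache: one external request
second = cached.lookup_block([3, 1])        # fully cached: no external request
```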

And as a bonus, dictionaries are a very simple way to update data retroactively in ClickHouse. Say you had a report on ad campaigns: the user simply renames an ad campaign, and the name changes in all the old data and in all the reports as well. If you write the strings directly into the table, you cannot update them.

Another approach is for when you do not know where to get identifiers for your strings. You can simply hash them. The simplest option is to take a 64-bit hash.

The only problem is that with a 64-bit hash you will almost certainly have collisions, because with a billion rows the probability already becomes noticeable.

And it would not be a good idea to hash ad campaign names this way. If the ad campaigns of different companies get mixed up, the result will be something incomprehensible.
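
How noticeable is "noticeable"? A rough estimate comes from the birthday bound, assuming the hash values are uniformly distributed. The function below is a sketch of that approximation, not something from the talk.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 64) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** hash_bits
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


p_billion = collision_probability(1_000_000_000)
p_million = collision_probability(1_000_000)
```

Under this assumption, a billion distinct strings give a chance of at least one collision of a few percent, while a million give a negligible one.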

There is a simple trick. True, it is not really suitable for serious data either, but if the data is not too critical, just add the client identifier to the dictionary key. Then you will still have collisions, but only within a single client. We use this method for the link map in Yandex.Metrica. We have URLs there, and we store hashes. And we know that there are collisions, of course. But when a page is displayed, the probability that some URLs on one page of one user got stuck together, and that anyone notices, can be neglected.

As a bonus, for many operations the hashes alone are enough, and the strings themselves do not need to be stored anywhere.

Another case is when the strings are short, for example website domains. They can be stored as they are. Or, for example, the browser language "ru" is 2 bytes. Of course I feel sorry for the bytes, but do not worry, 2 bytes are no great loss. Please store it as it is and do not worry.

Another case is the opposite: there are very many strings with very many distinct values among them, and the set is potentially unbounded. A typical example is search phrases or URLs. Search phrases include typos. Let's see how many unique search phrases there are per day. It turns out they make up almost half of all events. In this case you might think that you need to normalize the data, assign identifiers, and put the strings into a separate table. But you should not do that. Just store these strings as they are.

It is better not to invent anything, because if you store them separately, you will need a join. And that join is, at best, random access to memory, if it still fits in memory. If it does not fit, there will be problems.

If the data is stored in place, it is simply read in the right order from the file system and everything is fine.

If you have URLs or some other complex long strings, consider computing some extract in advance and writing it to a separate column.

For URLs, for example, you can store the domain separately. If you really only need the domain, just use that column; the URLs will lie there and you will not even touch them.
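
On the application side, the extract can be computed once before the insert. The sketch below uses Python's standard `urllib.parse`; the column names `url` and `domain` and the helper names are assumptions, and the parsing rules differ from those of ClickHouse's own domain function.

```python
from urllib.parse import urlsplit


def extract_domain(url: str) -> str:
    """Return the host part of a URL, or an empty string if there is none."""
    return urlsplit(url).hostname or ""


def prepare_row(url: str) -> dict:
    # Compute the derived column once, at insert time.
    return {"url": url, "domain": extract_domain(url)}


row = prepare_row("https://example.com/catalog/item?id=42&utm_source=mail")
```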

Let's look at the difference. ClickHouse has a special function that extracts the domain. It is very fast; we have optimized it. And, to be honest, it does not even comply with the RFC, but it handles everything we need.

In one case we simply take the URLs and compute the domain. That takes 166 milliseconds. If we take the precomputed domain, it takes only 67 milliseconds, that is, almost three times faster. And it is faster not because we do less computation, but because we read less data.

That is why the slower query shows a higher speed in gigabytes per second: it reads more gigabytes. That is completely unnecessary data. The query appears to run faster, but it takes longer to complete.

If you look at the amount of data on disk, the URL column takes 126 megabytes and the domain column only 5 megabytes. That is 25 times less. Yet the query runs only about 4 times faster. But that is because the data is hot. If it were cold, it would probably be 25 times faster because of disk I/O.

By the way, if you measure how much shorter a domain is than a URL, it comes out about 4 times shorter. Yet for some reason the data takes 25 times less space on disk. Why? Because of compression. Both the URL and the domain are compressed, but a URL often contains a lot of garbage.

And, of course, it pays to use the right data types, the ones designed for the values in question or simply suitable for them. If you have IPv4, store it as UInt32*. If it is IPv6, use FixedString(16), because an IPv6 address is 128 bits, that is, store it directly in binary form.

But what if you sometimes have IPv4 addresses and sometimes IPv6? Yes, you can store both: one column for IPv4, another for IPv6. Of course, there is also the option of mapping IPv4 into IPv6. That works too, but if your queries often need the IPv4 address, it is better to put it in a separate column.

* ClickHouse now has separate IPv4 and IPv6 data types that store data as efficiently as numbers but present it as conveniently as strings.
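
The "map IPv4 into IPv6" option can be sketched with the standard `ipaddress` module: every address becomes 16 bytes, which is the size a FixedString(16) column would hold. The helper names are hypothetical.

```python
import ipaddress


def to_16_bytes(text: str) -> bytes:
    """Return the 16-byte binary form; IPv4 is stored as ::ffff:a.b.c.d."""
    addr = ipaddress.ip_address(text)
    if isinstance(addr, ipaddress.IPv4Address):
        # IPv4-mapped IPv6 address: 10 zero bytes, 0xffff, then the 4 IPv4 bytes.
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed


def from_16_bytes(raw: bytes) -> str:
    """Convert back to text, unwrapping IPv4-mapped addresses."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped
    return str(mapped) if mapped is not None else str(addr)


v4 = to_16_bytes("192.168.1.1")
v6 = to_16_bytes("2001:db8::1")
```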

It is also worth noting that data should be preprocessed in advance. For example, you receive raw logs. Perhaps you should not just dump them into ClickHouse right away, although it is very tempting to do nothing and have everything work. It is still worth doing whatever calculations are possible.

Take the browser version, for example. In a certain neighbouring department, which I do not want to point a finger at, the browser version is stored like this, as a string: 12.3. Then, to build a report, they take this string, split it into an array, and take the first element of the array. Naturally, everything slows down. I asked them why they do this. They told me they do not like premature optimization. Well, I do not like premature pessimization.

So in this case it would be more correct to split it into 4 columns. Do not be afraid to do so, because this is ClickHouse. ClickHouse is a columnar database. The more neat little columns, the better. If there are 5 BrowserVersion parts, make 5 columns. That is fine.
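
Splitting the version at insert time might look like the sketch below. The function name, the fixed count of four parts, and the choice to pad with zeros are assumptions made for illustration.

```python
from typing import Tuple


def split_version(version: str, parts: int = 4) -> Tuple[int, ...]:
    """Split '12.3' into a fixed number of integer components, padding with 0."""
    numbers = []
    for piece in version.split(".")[:parts]:
        # Non-numeric pieces become 0 rather than failing the whole row.
        is_number = piece.isascii() and piece.isdigit()
        numbers.append(int(piece) if is_number else 0)
    numbers.extend([0] * (parts - len(numbers)))
    return tuple(numbers)


major, minor, build, patch = split_version("12.3")
```

Each component then goes into its own numeric column, so a report on the major version reads one small column instead of parsing strings.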

Now let's look at what to do if you have very long strings, really very long ones. They should not be stored in ClickHouse at all. Instead, you can store just an identifier in ClickHouse and push the long strings into some other system.

For example, one of our analytics services has event parameters. If a great many parameters arrive for an event, we simply keep the first 512 that come in, because we can afford 512.

And if you cannot decide on your data types, you can still write the data to ClickHouse, but into a temporary table with the Log engine, which is meant for temporary data. After that you can analyze the distribution of values in your data, see what is there in general, and work out the right types.

* ClickHouse now has the LowCardinality data type, which lets you store strings efficiently with little effort.

Now let's look at another interesting case. Sometimes things work in strange ways for people. I come in and see this. And it immediately looks as if it was done by some very experienced, wise administrator with extensive experience of setting up MySQL version 3.23.

Here we see a thousand tables, each of which holds the remainder of dividing who knows what by a thousand.

In principle, I respect other people's experience, including an understanding of the suffering through which that experience may have been gained.

And the reasons are more or less clear. These are old stereotypes that may have built up while working with other systems. For example, MyISAM tables have no clustered primary key, and this way of splitting data may be a desperate attempt to get the same functionality.

Another reason is that it is difficult to perform any ALTER operations on large tables. Everything gets locked. Although in modern versions of MySQL this problem is no longer so serious.

Or, for example, microsharding, but more on that later.

There is no need to do this in ClickHouse because, first, the primary key is clustered: the data is ordered by the primary key.

Sometimes people ask me: "How does the execution time of range queries in ClickHouse change with the size of the table?" I say that it does not change at all. For example, you have a table with a billion rows and you read a range of one million rows. All is well. If the table has a trillion rows and you read a range of one million rows, it will be almost the same.

And second, things like manual partitioning are not needed. If you go in and look at what is on the file system, you will see that a table is a fairly serious thing, and there is something like partitions inside. That is, ClickHouse does everything for you and you do not have to suffer.

ALTER in ClickHouse is free when it is ALTER ADD/DROP COLUMN.

And you should not create small tables, because whether a table has 10 rows or 10,000 rows makes no difference at all. ClickHouse is a system that optimizes throughput, not latency, so there is no point in processing a handful of rows.

The right approach is to use one big table. Get rid of the old stereotypes and everything will be fine.

And as a bonus, in a recent version we now have the ability to define an arbitrary partitioning key in order to perform all kinds of maintenance operations on individual partitions.

But suppose you do need many small tables, for example when you need to process some intermediate data: you receive chunks and need to transform them before writing to the final table. For this case there is a wonderful table engine, StripeLog. It is like TinyLog, only better.

* ClickHouse now also has the table function input.

Another antipattern is microsharding. For example, you need to shard your data and you have 5 servers, and tomorrow there will be 6. And you start thinking about how to rebalance this data. So instead you split the data not into 5 shards but into 1,000 shards. Then you map each of these microshards to a separate server. And you end up with, for example, 200 ClickHouse instances on one server: separate instances on separate ports, or separate databases.

But this is not a good idea in ClickHouse, because even a single ClickHouse instance tries to use all available server resources to process one query. Say you have a server with 56 processor cores. You run a query that takes one second, and it will use all 56 cores. If you put 200 ClickHouse instances on that one server, it turns out that about 10,000 threads will start. In general, everything will be very bad.
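
The thread count follows from simple multiplication. The sketch assumes, as the example in the talk implies, that every instance uses one thread per core for a query.

```python
cores_per_server = 56        # from the example in the talk
instances_per_server = 200   # microshards placed on one machine

# Assumption: each instance sizes its worker pool to the number of cores.
threads_if_all_busy = cores_per_server * instances_per_server
oversubscription = threads_if_all_busy / cores_per_server
```

That is roughly ten thousand threads competing for 56 cores when a query touches every microshard.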

Another reason is that the distribution of work across these instances will be uneven. Some will finish earlier, some later. If all of this happened inside a single instance, ClickHouse itself would work out how to distribute the data correctly among the threads.

And one more reason is that you will have inter-process communication over TCP. The data has to be serialized and deserialized, and there is a huge number of microshards. It will simply not work efficiently.

Another antipattern, although it can hardly be called an antipattern, is a large amount of pre-aggregation.

In general, pre-aggregation is good. You had a billion rows, you aggregated them down to 1,000 rows, and now the query runs instantly. Everything is great. You can do this. For this purpose ClickHouse even has a special table engine, AggregatingMergeTree, which aggregates incrementally as data is inserted.

But there are times when you decide that you will aggregate the data this way, and also that way. In a certain neighbouring department, and again I do not want to say which one, they use SummingMergeTree tables to sum by primary key, and about 20 columns are used as the primary key. Just in case, I have changed the names of some columns for secrecy, but that is roughly how it looks.

And these problems arise. First, your data volume does not shrink very much. For example, it shrinks threefold. A factor of three would be a fair price to pay for the unlimited analytics capabilities you get when your data is not aggregated. If the data is aggregated, then instead of analytics you get only miserable statistics.

And what is especially sad about it? The fact that these people from the neighbouring department sometimes come and ask to add one more column to the primary key. That is, we aggregated the data like this, and now we want a little more. But ClickHouse has no ALTER PRIMARY KEY. So we have to write some scripts in C++. And I do not like scripts, even when they are in C++.

If you look at what ClickHouse was created for, non-aggregated data is exactly the scenario it was born for. If you use ClickHouse for non-aggregated data, you are doing everything right. If you aggregate, that is sometimes forgivable.

Another interesting case is queries in an endless loop. Sometimes I go to some production server and look at SHOW PROCESSLIST there. And every time I discover that something terrible is going on.

For example, like this. It is immediately clear that all of it could be done in a single query. Just write url IN and a list there.

Why are many such queries in a loop bad? If the index is not used, you will make many passes over the same data. But suppose the index is used: for example, you have a primary key on url and you write url = something. And you think that since only one URL is read from the table, everything will be fine. But in fact it is not, because ClickHouse does everything in batches.

When it needs to read some range of data, it reads a little more, because the index in ClickHouse is sparse. This index does not let you find a single row in the table, only a range of some kind. And the data is compressed in blocks. To read one row, you have to take the whole block and decompress it. If you run a bunch of queries, these ranges will overlap a lot, and a lot of work will be done over and over again.
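
A sparse index can be imitated in a few lines to see why a single-key lookup still reads a whole range. This is a simplified model with unique integer keys, not ClickHouse's implementation; 8192 is the default `index_granularity`, and the function names are made up.

```python
import bisect
from typing import List, Tuple

GRANULE = 8192  # rows per index mark; ClickHouse's default index_granularity


def build_sparse_index(sorted_keys: List[int]) -> List[int]:
    """Keep only the first key of every granule."""
    return sorted_keys[::GRANULE]


def rows_to_read(sorted_keys: List[int], marks: List[int], key: int) -> Tuple[int, int]:
    """Return the [start, end) row range that must be read to look up `key`."""
    # Last mark whose key is <= the searched key.
    granule = max(bisect.bisect_right(marks, key) - 1, 0)
    start = granule * GRANULE
    end = min(start + GRANULE, len(sorted_keys))
    return start, end


keys = list(range(0, 200_000, 2))          # 100,000 sorted, unique keys
marks = build_sparse_index(keys)
start, end = rows_to_read(keys, marks, 123_456)
rows_scanned = end - start
found = 123_456 in keys[start:end]
```

Looking up one key scans a full granule, so a thousand separate lookups that land in the same granules repeat the same reading and decompression a thousand times, whereas one query with a list does it once.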

And as a bonus, note that in ClickHouse you should not be afraid to pass even megabytes or hundreds of megabytes into the IN section. I remember from our own practice that if in MySQL we pass a pile of values into the IN section, say 100 megabytes of numbers, then MySQL eats 10 gigabytes of memory and nothing else happens; everything works badly.

The second point is that in ClickHouse, if your queries use the index, it is never slower than a full scan. That is, if you need to read almost the entire table, it will go sequentially and read the entire table. In general, it will figure it out by itself.

Still, there are some difficulties. For example, IN with a subquery does not use the index. But this is our problem and we need to fix it. There is nothing fundamental here. We will fix it*.

Another interesting thing is that if you have a very long query and distributed query processing is involved, this very long query will be sent to every server without compression. For example, 100 megabytes and 500 servers. Accordingly, you will have 50 gigabytes transferred over the network. It will be transferred, and then everything will execute successfully.

* It already does; everything was fixed as promised.

A fairly common case is when queries come from an API. For example, you have built some service of your own. If someone needs your service, you open up the API, and literally two days later you see that something incomprehensible is going on. Everything is overloaded, and terrible queries are arriving that should never have existed.

There is only one solution. If you have opened an API, you will have to restrict it. For example, introduce some kind of quotas. There are no other sensible options. Otherwise someone will immediately write a script and there will be problems.

ClickHouse has a special feature for this: quota accounting. Moreover, you can pass your own quota key. That can be, for example, your internal user ID. Quotas are then counted independently for each key.
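
A small sketch of passing a per-user quota key from an API layer follows. It assumes the HTTP interface and its `quota_key` URL parameter; the host, port, and user ID format are placeholders, and the helper name is hypothetical. Check the documentation of your server version before relying on it.

```python
from urllib.parse import urlencode


def query_url(base, sql, quota_key):
    """Build an HTTP-interface URL that tags the query with a quota key.

    `quota_key` is assumed to be the parameter name accepted by the
    ClickHouse HTTP interface; `base` is a placeholder address.
    """
    params = {"query": sql, "quota_key": quota_key}
    return f"{base}/?{urlencode(params)}"


# One key per internal user, so each user is counted separately.
print(query_url("http://localhost:8123", "SELECT 1", "user-42"))
```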

Now another interesting thing: manual replication.

I know of many cases where, even though ClickHouse has built-in replication support, people replicate ClickHouse by hand.

What is the idea? You have a data processing pipeline. It runs independently, for example, in different data centers. You write the same data into ClickHouse in the same way in each of them. In practice, though, the data will still diverge because of some quirks in your code. I hope they are in yours.

And from time to time you will still have to synchronize by hand. For example, once a month the admins run rsync.

In fact, it is much easier to use the replication built into ClickHouse. There may be some contraindications, though, because it requires ZooKeeper. I will not say anything bad about ZooKeeper; in principle, the system works. But people sometimes avoid it out of Java-phobia: ClickHouse is such a nice system, written in C++, that you can just use and everything will be fine, and ZooKeeper is in Java. Somehow you do not even want to look at it, and in that case you can use manual replication.

ClickHouse is a practical system. It takes your needs into account. If you have manual replication, you can create a Distributed table that points at your manual replicas and does failover between them. There is even a special option that lets you avoid flapping, even if your replicas systematically diverge.
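
The talk does not name the option, so the following only illustrates the general idea under an assumption: if replicas are always tried in the same fixed order, readers stay on one replica while it is healthy, and results do not flip between diverged copies. The function and replica names are hypothetical, and this is not ClickHouse's implementation.

```python
def first_healthy(replicas, is_healthy):
    """Pick the first healthy replica in a fixed, configured order.

    Because the order never changes, every query lands on the same
    replica until it fails, which avoids flapping between copies
    whose data has drifted apart.
    """
    for replica in replicas:
        if is_healthy(replica):
            return replica
    raise RuntimeError("no healthy replica available")


replicas = ["dc1-replica", "dc2-replica"]  # hypothetical names
print(first_healthy(replicas, lambda r: True))
```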

Further problems can arise if you use primitive table engines. ClickHouse is a construction kit with a whole pile of different table engines. For all serious cases, as the documentation says, use tables from the MergeTree family. Everything else is for special cases or for tests.

In a MergeTree table you do not have to have a date and time column. You can still use the engine. If there is no date or time, declare a default of 2000. This will work and will not consume any resources.

And in the new version of the server, you can even specify custom partitioning without a partition key. The result is the same.

On the other hand, primitive table engines do have their uses. For example, you load data once, look at it, play with it, and delete it. For that you can use Log.

For storing small volumes during intermediate processing, there are StripeLog and TinyLog.

Memory can be used if the amount of data is small and you just want to crunch something in RAM.

ClickHouse does not much like over-normalized data.

Here is a typical example: a huge number of URLs. You put them in a separate table. Then you decide to JOIN with it, but as a rule this will not work, because ClickHouse only supports hash JOIN. If there is not enough RAM for the large amount of data being joined, the JOIN will not work*.

If the data has high cardinality, do not worry: store it in denormalized form, with the URLs directly in the main table.

* ClickHouse now also has a merge join, which works when the intermediate data does not fit in RAM. But it is inefficient, and the recommendation still stands.

A couple more examples, although I already doubt whether they are anti-patterns or not.

ClickHouse has one well-known flaw. It does not know how to update*. In a way, that is even a good thing. If you have important data, for example accounting records, nobody will be able to tamper with it, because there are no updates.

* Support for updates and deletes in batch mode was added long ago.

But there are some special mechanisms that allow updates as if in the background, for example, tables of the ReplacingMergeTree type. They apply updates during background merges. You can force this with OPTIMIZE TABLE. But do not do it too often, because it will rewrite the partition completely.
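
The background-merge behavior can be pictured with a toy model. This is a simplified sketch, not ClickHouse's implementation: each "part" is a list of `(key, version, value)` tuples, and a merge keeps only the row with the highest version per key. The function name is hypothetical.

```python
def merge_replacing(parts):
    """Merge parts, keeping the latest version of each key.

    parts: iterable of lists of (key, version, value) tuples.
    On equal versions the row seen later wins.
    """
    latest = {}
    for part in parts:
        for key, version, value in part:
            if key not in latest or version >= latest[key][0]:
                latest[key] = (version, value)
    return sorted((key, ver, val) for key, (ver, val) in latest.items())


old_part = [(1, 1, "a"), (2, 1, "b")]
new_part = [(1, 2, "c")]          # an "update" is just a newer row
print(merge_replacing([old_part, new_part]))
```

Until a merge happens, both versions of key 1 exist on disk, which is why the update only appears to take effect in the background.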

Distributed JOINs in ClickHouse are also handled poorly by the query planner.

Bad, but sometimes OK.

Using ClickHouse only to read the data back with select *.

I would not recommend using ClickHouse for heavy computations. But that is not entirely true, because we are already moving away from this recommendation. We recently added the ability to apply machine learning models in ClickHouse: CatBoost. And it bothers me, because I think: "How awful. How many cycles per byte does that come to!" I really hate wasting clock cycles on bytes.

But do not be afraid: install ClickHouse and everything will be fine. If anything goes wrong, we have a community. By the way, the community is you. And if you have any problems, you can at least go to our chat, and hopefully someone will help you.

Questions

Thank you for the talk! Where can I complain about ClickHouse crashing?

You can complain to me personally.

I started using ClickHouse recently. I immediately crashed the CLI interface.

Lucky you.

A little later I crashed the server with a small select.

You have a talent.

I opened a bug on GitHub, but it was ignored.

We will take a look.

Alexey tricked me into coming to this talk by promising to tell me how you compress the data inside.

Very simply.

I figured that out yesterday. More specifics, please.

There are no terrible tricks there. It is just block-by-block compression. The default is LZ4, and you can enable ZSTD*. Blocks are from 64 kilobytes to 1 megabyte.

* There is also support for specialized compression codecs that can be chained with other algorithms.
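
A minimal sketch of what block-by-block compression implies for reads. It uses `zlib` from the Python standard library purely as a stand-in for LZ4 or ZSTD, and a made-up block size counted in rows rather than bytes; none of the names come from ClickHouse.

```python
import zlib

BLOCK_ROWS = 1000  # hypothetical block size, in rows


def compress_blocks(rows, block_rows=BLOCK_ROWS):
    """Compress rows in independent blocks (rows must not contain newlines)."""
    blocks = []
    for i in range(0, len(rows), block_rows):
        raw = "\n".join(rows[i:i + block_rows]).encode("utf-8")
        blocks.append(zlib.compress(raw))
    return blocks


def read_row(blocks, index, block_rows=BLOCK_ROWS):
    """Fetch one row: the whole containing block must be decompressed."""
    raw = zlib.decompress(blocks[index // block_rows])
    return raw.decode("utf-8").split("\n")[index % block_rows]


rows = [f"row-{i}" for i in range(2500)]
blocks = compress_blocks(rows)
print(len(blocks), read_row(blocks, 1234))
```

Reading a single row costs a full block decompression, which is the reason many tiny point queries repeat so much work.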

Are the blocks just raw data?

Not entirely raw. There are arrays. If you have a numeric column, the consecutive numbers are laid out in an array.

Got it.

Alexey, about the example with uniqExact over IPs, where uniqExact takes longer to compute over strings than over numbers, and so on. What if we pull a trick and cast at read time? You seemed to say that on disk the difference is not large. If we read strings from disk and cast them, will our aggregates be faster or not? Or will we still gain only a little here? It seems to me that you tested this but for some reason did not show it in the benchmark.

I think it will be slower than without the cast. In that case the IP address has to be parsed from a string. Of course, IP address parsing in ClickHouse is optimized too. We tried very hard, but there you have numbers written in decimal form. Very inconvenient. On the other hand, uniqExact will work more slowly on strings not only because they are strings, but also because a different specialization of the algorithm is selected. Strings are simply processed differently.
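
To illustrate the parsing step being discussed, here is a small sketch that converts IPv4 text into a 32-bit integer with the standard `ipaddress` module. It shows the kind of work a read-time cast has to do for every row; it says nothing about ClickHouse's own parser, and the function name is hypothetical.

```python
import ipaddress


def ipv4_to_uint32(text):
    """Parse dotted-decimal IPv4 text into an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


addresses = ["1.2.3.4", "10.0.0.1", "1.2.3.4"]
as_numbers = [ipv4_to_uint32(a) for a in addresses]

# Counting distinct values gives the same answer either way; the
# difference is that the numeric form is fixed-width and needs no
# parsing when stored that way from the start.
print(len(set(addresses)), len(set(as_numbers)))
```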

And what if we take a more primitive data type? For example, we took the user id that we have, wrote it as a string, and then cast it. Will that be more fun or not?

I doubt it. I think it will be even sadder, because parsing numbers is a serious problem after all. I believe my colleague here even gave a talk on how hard it is to parse numbers in decimal form, but maybe not.

Alexey, thank you very much for the talk! And thank you very much for ClickHouse! I have a question about plans. Are there plans for a feature to update dictionaries incompletely?

You mean a partial reload?

Yes, yes. Something like the ability to specify a MySQL field there, an "updated after" value, so that only that data is loaded when the dictionary is very large.

A very interesting feature. I think someone suggested it in our chat. Maybe it was even you.

I do not think so.

Great, so now there are two requests, and we can slowly start working on it. But I want to warn you right away that this feature is fairly simple to implement. In theory, you just need to write a version number into the table and then write: version less than such-and-such. That means we will most likely offer it to enthusiasts. Are you an enthusiast?

Yes, but unfortunately not in C++.

Do your colleagues know how to write C++?

I will find someone.

Great*.

* The feature was added two months after the talk: the author of the question developed it and submitted his own pull request.

Thank you!

Hello! Thank you for the talk! You mentioned that ClickHouse is very good at consuming all the resources available to it. The speaker next door, from Luxoft, talked about his solution for Russian Post. He said that they liked ClickHouse a lot but did not use it in place of their main competitor precisely because it ate all the CPU. They could not fit it into their architecture, into their ZooKeeper with Docker containers. Is it possible to limit ClickHouse somehow so that it does not consume everything that becomes available to it?

Yes, it is possible and very easy. If you want to use fewer cores, just write set max_threads = 1. That is all: the query will run on a single core. Moreover, you can specify different settings for different users. So there is no problem. And tell your colleagues from Luxoft that it is not good that they did not find this setting in the documentation.
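
As a small illustration of the same setting applied per query instead of per session, the helper below appends a `SETTINGS` clause to a query string. The helper and the table name `hits` are hypothetical, and only numeric setting values are handled.

```python
def with_settings(sql, **settings):
    """Append a SETTINGS clause to a query (numeric values only)."""
    if not settings:
        return sql
    clause = ", ".join(f"{name} = {value}" for name, value in settings.items())
    return f"{sql} SETTINGS {clause}"


# Limit one heavy query to a single thread.
print(with_settings("SELECT count() FROM hits", max_threads=1))
```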

Alexey, hello! I would like to ask the following. This is not the first time I have heard that many people are starting to use ClickHouse as storage for logs. In the talk you said not to do this, that is, not to store long strings. What do you think about that?

First, logs are, as a rule, not long strings. There are exceptions, of course. For example, some service written in Java throws an exception and it gets logged. And so on in an endless loop, until the hard drive runs out of space. The solution is very simple. If the strings are very long, cut them. What does long mean? Tens of kilobytes is bad*.

* in recent versions of ClickHouse, "adaptive index granularity" is enabled, which for the most part removes the problem of storing long strings.
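
One way to "cut them" before inserting is sketched below. The 1024-byte limit in the example is an arbitrary assumption, not a figure from the talk, and the function name is hypothetical. The truncation is done on UTF-8 bytes and drops a partially cut trailing character.

```python
def truncate_utf8(text, max_bytes):
    """Truncate text to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    # errors="ignore" drops an incomplete multi-byte sequence at the cut.
    return data[:max_bytes].decode("utf-8", errors="ignore")


stack_trace = "java.lang.IllegalStateException: " + "x" * 50_000
print(len(truncate_utf8(stack_trace, 1024)))
```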

Is a kilobyte normal?

That is fine.

Hello! Thank you for the talk! I already asked about this in the chat, but I do not remember whether I got an answer. Are there plans to extend the WITH section in the manner of CTEs?

Not yet. Our WITH section is somewhat frivolous. For us it is a minor feature.

I see. Thank you!

Thank you for the talk! Very interesting! A global question. Are there plans to implement modification and deletion of data, perhaps in the form of some kind of stubs?

Definitely. This is the first task in our queue. We are now actively thinking about how to do it all properly. After that, it is time to start pressing keys on the keyboard*.

* the keys on the keyboard were pressed and everything was done.

Will this affect system performance in any way? Will inserts be as fast as they are now?

Perhaps the deletes and the updates themselves will be very heavy, but this will not affect the performance of selects or the performance of inserts.

And one more small question. In the presentation you talked about the primary key. So we have partitioning, which is monthly by default, right? And when we specify a date range that fits within one month, only that partition is read, right?

Yes.
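
The exchange above can be sketched as a tiny calculation: given a date range, list the monthly partitions it overlaps. The `YYYYMM` integer ids and the function name are assumptions chosen for illustration.

```python
from datetime import date


def months_touched(start, end):
    """Return the YYYYMM partition ids overlapping the range [start, end]."""
    ids = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        ids.append(year * 100 + month)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return ids


# A range inside one month touches a single partition.
print(months_touched(date(2017, 3, 5), date(2017, 3, 20)))
# A range crossing a month boundary touches two.
print(months_touched(date(2016, 12, 30), date(2017, 1, 2)))
```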

A question. If we cannot pick any primary key, is it right to use the "Date" field, so that there is less re-sorting of the data in the background and it is laid out in a more orderly way? If you have no range queries and cannot even choose a primary key, is it worth putting the date in the primary key?

Yes.

Perhaps it makes sense to put into the primary key a field by which the data compresses better when sorted. For example, a user ID. A user tends to visit the same site, for example. In that case, put the user id and the time. Then your data will compress better. As for the date, if you really do not have, and never will have, range queries by date, then you do not need to put the date in the primary key.
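
The compression effect of the sort order can be seen in a rough experiment. It uses `zlib` as a stand-in compressor and synthetic rows in which each user always visits the same site; the data and names are invented for illustration and are not a benchmark of ClickHouse.

```python
import random
import zlib


def compressed_size(rows):
    """Size in bytes of the rows after zlib compression."""
    return len(zlib.compress("\n".join(rows).encode("utf-8")))


# 200 hypothetical users, 50 visits each, always to the same site.
rows = [f"{user},site-{user % 20}.example"
        for user in range(200) for _ in range(50)]

shuffled = rows[:]
random.Random(0).shuffle(shuffled)   # arrival order, unsorted
by_user = sorted(shuffled)           # sorted by the leading user id field

print("unsorted:", compressed_size(shuffled))
print("sorted by user:", compressed_size(by_user))
```

Sorting puts identical values next to each other, so the compressor sees long repeated runs instead of scattered matches.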

OK, thank you very much!

Source: www.habr.com
