In a conventional "row" DBMS, examples of which are MySQL, Postgres, and MS SQL Server, data is stored row by row.
Here, the values that belong to one row are stored together. In a columnar DBMS, values from different columns are stored separately, and the data of a single column is stored together.
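To make the difference concrete, here is a minimal Python sketch of the two layouts. The table, its column names, and its values are invented for illustration, and plain lists stand in for what a real DBMS keeps on disk.

```python
# Illustrative only: the table and its values are made up for this sketch.
rows = [
    (1, "alice", 30),
    (2, "bob", 25),
    (3, "carol", 35),
]
column_names = ("id", "name", "age")

# Row-oriented layout: all values of one row sit next to each other.
row_store = [value for row in rows for value in row]

# Column-oriented layout: all values of one column sit next to each other.
column_store = {
    name: [row[i] for row in rows]
    for i, name in enumerate(column_names)
}

# An analytical query such as "average age" touches a single column here,
# while the row layout would have to step over every id and name as well.
average_age = sum(column_store["age"]) / len(column_store["age"])
print(row_store)
print(column_store)
print(average_age)
```

The columnar layout pays off when a query reads a few columns out of many, which is the typical shape of analytical queries.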
Examples of columnar DBMSs are Vertica, Paraccel (Actian Matrix, Amazon Redshift), Sybase IQ, Exasol, Infobright, InfiniDB, MonetDB (VectorWise, Actian Vector), LucidDB, SAP HANA, Google Dremel, Google PowerDrill, Druid, and kdb+.
The company is a mail forwarder
Simplicity
ClickHouse installs on Ubuntu with a single command. If you know SQL, you can start using ClickHouse for your own needs right away. However, this does not mean that you can run "show create table" in MySQL and copy-paste the resulting SQL into ClickHouse.
Compared with MySQL, this DBMS has significant differences in the data types used in table schema definitions, so you will still need some time to adapt your table schemas and to learn the table engines before you feel comfortable.
ClickHouse works well without any additional software, but if you want to use replication you will need to install ZooKeeper. Query performance analysis is very well supported: the system tables contain all the information, and all of it can be retrieved with plain old boring SQL.
Performance
- Benchmark comparing ClickHouse with Vertica and MySQL on the following server configuration: two sockets of Intel® Xeon® CPU E5-2650 v2 @ 2.60GHz; 128 GiB RAM; md RAID-5 on 8 6TB SATA HDD, ext4.
- Benchmark comparing ClickHouse with the Amazon RedShift cloud storage.
- Excerpts from the Cloudflare blog on ClickHouse performance:
The ClickHouse database has a very simple design: all nodes in a cluster have the same functionality and use only ZooKeeper for coordination. We built a small cluster of several nodes and ran tests, during which we found that the system has impressive performance, in line with the advantages claimed in analytical DBMS benchmarks. We decided to take a closer look at the concept behind ClickHouse. The first obstacle to our research was the lack of tooling and the small size of the ClickHouse community, so we dug into the design of this DBMS to understand how it works.
ClickHouse does not support ingesting data directly from Kafka, since it is only a database, so we wrote our own adapter service in Go. It read Cap'n Proto encoded messages from Kafka, converted them to TSV, and inserted them into ClickHouse in batches through the HTTP interface. Later we rewrote this service to use a Go library together with ClickHouse's own (native) interface in order to improve performance. While evaluating batch ingestion performance, we discovered something important: in ClickHouse this performance depends heavily on the batch size, that is, the number of rows inserted at once. To understand why, we studied how ClickHouse stores data.
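The conversion step described above can be sketched roughly as follows. This is not Cloudflare's adapter (theirs was written in Go and read from Kafka); it is a minimal Python illustration that assumes rows arrive as tuples, and the names `escape_tsv` and `tsv_batches` are invented for the example. Sending each payload over HTTP is left out.

```python
from typing import Iterable, Iterator, List, Sequence


def escape_tsv(value: object) -> str:
    """Escape backslash, tab and newline so a value stays inside one TSV field."""
    text = str(value)
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
    )


def tsv_batches(rows: Iterable[Sequence[object]], batch_size: int) -> Iterator[str]:
    """Yield TSV payloads holding at most batch_size rows each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    lines: List[str] = []
    for row in rows:
        lines.append("\t".join(escape_tsv(v) for v in row))
        if len(lines) == batch_size:
            yield "\n".join(lines) + "\n"
            lines = []
    if lines:  # flush the final, possibly smaller, batch
        yield "\n".join(lines) + "\n"
```

Each yielded string is one insert payload, so `batch_size` directly controls how many rows reach the database per insert.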
The main engine, or rather the family of table engines, that ClickHouse uses to store data is MergeTree. This engine is conceptually similar to the LSM algorithm used in Google BigTable or Apache Cassandra, but it avoids building an intermediate memory table and writes data directly to disk. This gives it excellent write throughput, since each inserted batch is only sorted by the primary key, compressed, and written to disk to form a segment.
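The write path described above (sort the batch by primary key, compress it, store it as one segment) can be imitated in a few lines of Python. This is only a sketch: JSON and zlib stand in for ClickHouse's real on-disk format and codecs, a dictionary stands in for files on disk, and the function names are made up.

```python
import json
import zlib
from typing import Dict, List, Sequence, Tuple


def build_segment(
    batch: Sequence[Tuple], column_names: Sequence[str], key_column: str
) -> Dict[str, bytes]:
    """Sort one inserted batch by its key and compress each column separately."""
    key_index = list(column_names).index(key_column)
    ordered = sorted(batch, key=lambda row: row[key_index])
    segment = {}
    for i, name in enumerate(column_names):
        column_values = [row[i] for row in ordered]
        # JSON + zlib stand in for ClickHouse's real on-disk format and codecs.
        segment[name] = zlib.compress(json.dumps(column_values).encode("utf-8"))
    return segment


def read_column(segment: Dict[str, bytes], name: str) -> List:
    """Decompress a single column without touching the others."""
    return json.loads(zlib.decompress(segment[name]).decode("utf-8"))
```

Note that nothing here looks at previously written segments: every batch becomes its own immutable unit, which is why writes need no coordination.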
The absence of a memory table, or of any notion of data "freshness", also means that data can only be appended: the system does not support modification or deletion. As of today, the only way to delete data is to drop it by calendar month, since segments never cross a month boundary. The ClickHouse team is actively working on making this configurable. On the other hand, it makes writing and merging segments contention-free, so ingestion throughput scales linearly with the number of parallel inserts until I/O or the cores are saturated.
However, this also means that the system is not suited to small batches, so Kafka services and inserters are used for buffering. In addition, ClickHouse keeps merging segments in the background, so many small pieces of data end up being combined and rewritten several times, which increases the volume of writes. Too many unmerged segments, however, will cause aggressive throttling of inserts for as long as the merge is in progress. We found that the best trade-off between real-time ingestion and ingestion performance is to accept a limited number of inserts per second into a table.
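The background merge can be pictured as a k-way merge of segments that are already sorted by key. The sketch below uses Python's `heapq.merge` on in-memory lists; the row shape and the data are invented, and a real merge would also recompress the columns.

```python
import heapq
from typing import List, Tuple

Row = Tuple[int, str]  # (primary key, payload); a made-up row shape


def merge_segments(segments: List[List[Row]]) -> List[Row]:
    """Merge several key-sorted segments into one key-sorted segment."""
    # heapq.merge streams the inputs, so it never re-sorts the whole data set.
    return list(heapq.merge(*segments, key=lambda row: row[0]))


segments = [
    [(1, "a"), (4, "d")],
    [(2, "b"), (5, "e")],
    [(3, "c")],
]
merged = merge_segments(segments)
print(merged)
```

Every row that takes part in a merge is written again, which is why a stream of tiny inserts produces far more disk writes than the same rows delivered in a few large batches.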
The key to table read performance is indexing and the layout of the data on disk. No matter how fast the processing is, when the engine has to scan terabytes of data from disk and use only a fraction of it, it will take time. ClickHouse is a column store, so each segment contains one file per column, holding the sorted values for every row. As a result, all columns that do not appear in the query can be skipped from the start, and many cells can then be processed in parallel with vectorized execution. To avoid a full scan, each segment has a small index file.
Given that all columns are sorted by the primary key, the index file contains only the marks (captured rows) of every Nth row, so that it can be kept in memory even for very large tables. For example, the default setting places a mark every 8,192 rows, so the sparse index of a table with 1 trillion rows needs only about 122 million marks and fits easily in memory.
System development
The development and improvement of ClickHouse can be tracked
Popularity
ClickHouse's popularity seems to be growing rapidly, especially in the Russian-speaking community. Last year's HighLoad 2018 conference (Moscow, November 8-9, 2018) showed that giants such as vk.com and Badoo use ClickHouse, inserting data (for example, logs) from tens of thousands of servers simultaneously. In a 40-minute video
Use cases
After spending some time on research, I think there are areas where ClickHouse can be useful or can completely replace other, more traditional and popular solutions such as MySQL, PostgreSQL, ELK, Google Big Query, Amazon RedShift, TimescaleDB, Hadoop, MapReduce, Pinot, and Druid. What follows are the details of using ClickHouse to upgrade or completely replace the DBMSs listed above.
Extending MySQL and PostgreSQL
Recently, we partially replaced MySQL with ClickHouse for a newsletter platform
ClickHouse uses two compression algorithms that reduce the volume of data by approximately
Replacing ELK
In my experience, the ELK stack (ElasticSearch, Logstash and Kibana, in this case ElasticSearch specifically) requires far more resources to run than storing logs should need. ElasticSearch is a great engine if you want good full-text log search (which I don't think you really need), but I wonder why it has become the de facto standard engine for logging. Its ingestion performance, combined with Logstash, gave us problems even under fairly light workloads and required adding ever more RAM and disk space. As a database, ClickHouse is better than ElasticSearch for the following reasons:
- SQL dialect support;
- A much better compression ratio for stored data;
- Support for regex search instead of full-text search;
- Improved query scheduling and better overall performance.
Currently, the biggest problem that comes up when comparing ClickHouse with ELK is the lack of log-shipping solutions, as well as the lack of documentation and tutorials on the topic. At the same time, anyone can set up ELK by following a Digital Ocean guide, which matters a great deal for the rapid adoption of such technologies. There is a database engine here, but there is no Filebeat for ClickHouse yet. Yes, there is
Preferring minimalist solutions, I tried using FluentBit, a log shipper with a very low memory footprint, with ClickHouse, while trying to avoid Kafka. However, some minor incompatibilities still need to be fixed, such as
As an alternative to Kibana, you can use ClickHouse as a backend
Replacing Google Big Query and Amazon RedShift (a solution for large companies)
The ideal use case for BigQuery is loading 1 TB of JSON data and running analytical queries on it. Big Query is a great product that is hard to overestimate. It is much more complex software than ClickHouse running on an in-house cluster, but from the client's point of view it has a lot in common with ClickHouse. BigQuery can quickly get expensive once you start paying for every SELECT, so it is a true SaaS solution with all its pros and cons.
ClickHouse is the best choice when you run many computationally expensive queries. The more SELECT queries you run every day, the more sense it makes to replace Big Query with ClickHouse, because such a replacement can save you thousands of dollars when many terabytes of processed data are involved. This does not apply to stored data, which is quite cheap to handle in Big Query.
In an article by Alexander Zaitsev, co-founder of Altinity
Replacing TimescaleDB
TimescaleDB is a PostgreSQL extension that optimizes working with time series in a regular database.
Although ClickHouse is not a serious competitor in the time series niche, its columnar structure and vectorized query execution make it much faster than TimescaleDB in most analytical query workloads. At the same time, ClickHouse's batch ingestion performance is about 3 times higher, and it also uses 20 times less disk space, which is very important when processing large volumes of historical data.
Unlike ClickHouse, the only way to save disk space in TimescaleDB is to use ZFS or a similar file system.
Upcoming ClickHouse updates will likely introduce delta compression, which will make it even better suited to processing and storing time series data. TimescaleDB may be a better choice than bare ClickHouse in the following cases:
- small installations with very little RAM (<3 GB);
- a large number of small INSERTs that you do not want to buffer into large chunks;
- better consistency, uniformity, and ACID requirements;
- PostGIS support;
- joining with existing PostgreSQL tables, since TimescaleDB is essentially PostgreSQL.
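As a side note on the delta compression mentioned before the list, the idea is easy to demonstrate. The sketch below is a generic illustration, not ClickHouse's codec: it stores differences between consecutive timestamps and uses zlib only to show how much more compressible the result becomes. The timestamps are made up.

```python
import struct
import zlib
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original values with a running sum."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values


def packed_size(values: List[int]) -> int:
    """Size in bytes after packing as 64-bit integers and applying zlib."""
    raw = struct.pack(f"<{len(values)}q", *values)
    return len(zlib.compress(raw))


# Made-up series: one reading every 10 seconds.
timestamps = [1_546_300_800 + 10 * i for i in range(1000)]
deltas = delta_encode(timestamps)
print(packed_size(timestamps), packed_size(deltas))
```

Regularly spaced timestamps turn into a long run of identical small numbers, which a general-purpose compressor handles far better than the raw values.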
Competing with Hadoop and MapReduce systems
Hadoop and other MapReduce products can perform many complex computations, but they tend to run with high latency. ClickHouse addresses this problem by processing terabytes of data and producing results almost instantly. This makes ClickHouse far more effective for fast, interactive analytical research, which should be of interest to data scientists.
Competing with Pinot and Druid
ClickHouse's closest competitors are the columnar, scalable open source products Pinot and Druid. An excellent comparison of these systems was published in the article
That article needs updating: it says that ClickHouse does not support UPDATE and DELETE operations, which is not entirely true of recent versions.
We do not have much experience with these DBMSs, but I do not like the complexity of the underlying infrastructure required to run Druid and Pinot: it is a whole bunch of "moving parts" surrounded by Java on all sides.
Druid and Pinot are Apache incubator projects, whose progress Apache covers in detail on their GitHub project pages. Pinot entered the incubator in October 2018, and Druid joined 8 months earlier, in February.
The lack of information about how the ASF works raises some questions for me, perhaps silly ones. I wonder whether the Pinot authors have noticed that the Apache Foundation is more favorable toward Druid, and whether such an attitude toward a competitor has caused a feeling of envy. Will Druid's development slow down and Pinot's speed up if the sponsors backing the former suddenly become interested in the latter?
Disadvantages of ClickHouse
Immaturity: obviously, this is not yet a boring technology, but in any case, nothing like this is seen in other columnar DBMSs.
Small inserts perform poorly at high rates: inserts must be grouped into large chunks, because the performance of small inserts degrades in proportion to the number of columns in each row. This follows from how ClickHouse stores data on disk: each column means 1 file or more, so to insert 1 row containing 100 columns, you need to open and write at least 100 files. That is why insert buffering requires an intermediary (unless the client itself provides buffering), usually Kafka or some kind of queueing system. You can also use the Buffer table engine to later copy large chunks of data into MergeTree tables.
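Client-side buffering, mentioned above as the alternative to an intermediary, can be as simple as the following sketch. The class name and the `flush` callback are hypothetical: in a real setup the callback would perform the actual insert, and a timer would usually flush a partially filled buffer as well.

```python
from typing import Callable, List, Sequence

Row = Sequence[object]


class InsertBuffer:
    """Collect rows in memory and hand them over in large batches."""

    def __init__(self, flush: Callable[[List[Row]], None], max_rows: int) -> None:
        if max_rows <= 0:
            raise ValueError("max_rows must be positive")
        self._flush = flush
        self._max_rows = max_rows
        self._rows: List[Row] = []

    def add(self, row: Row) -> None:
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered; call this on shutdown or on a timer."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush(batch)


# Stand-in for the database: just remember every batch that was "inserted".
sent_batches: List[List[Row]] = []
buffer = InsertBuffer(flush=sent_batches.append, max_rows=3)
for i in range(7):
    buffer.add((i, f"event-{i}"))
buffer.flush()
print([len(batch) for batch in sent_batches])
```

Seven single-row writes become three inserts here. The trade-off is that rows still sitting in the buffer are lost if the process dies, which is one reason a durable queue such as Kafka is often preferred.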
Table joins are limited by the server's RAM, but at least they exist! Druid and Pinot, for example, have no such joins at all, since they are hard to implement directly in distributed systems that do not support moving large chunks of data between nodes.
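One way to see why a join can be bound by RAM is the classic hash join, in which one side of the join is loaded into an in-memory lookup table while the other side is streamed. The sketch below is a generic illustration with invented data and names, and it assumes a simple inner join on an integer key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def hash_join(
    left: Iterable[Tuple[int, str]], right: Iterable[Tuple[int, str]]
) -> Iterator[Tuple[int, str, str]]:
    """Inner join on the first field, holding the right side fully in memory."""
    lookup: Dict[int, List[str]] = defaultdict(list)
    for key, value in right:  # memory use grows with the size of this table
        lookup[key].append(value)
    for key, value in left:  # the left side is only streamed
        for match in lookup.get(key, []):
            yield key, value, match
```

In this scheme the memory cost is set by the side that is materialized, so putting the smaller table on that side keeps the join within the available RAM.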
Conclusions
In the coming years, we plan to make extensive use of ClickHouse at Qwintry, since this DBMS offers an excellent balance of performance, low overhead, scalability, and simplicity. I am confident that it will spread quickly once the ClickHouse community comes up with more ways to use it in small and medium installations.
Source: www.habr.com