In a conventional row-oriented DBMS, such as MySQL, Postgres, or MS SQL Server, data is stored in the following order:
In this case, the values that belong to a single row are physically stored next to each other. In column-oriented DBMSs, values from different columns are stored separately, and data from the same column is stored together:
Examples of column-oriented DBMSs are Vertica, Paraccel (Actian Matrix, Amazon Redshift), Sybase IQ, Exasol, Infobright, InfiniDB, MonetDB (VectorWise, Actian Vector), LucidDB, SAP HANA, Google Dremel, Google PowerDrill, Druid, and kdb+.
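The difference between the two layouts can be illustrated with a minimal Python sketch. The table, its column names, and its values are made up for illustration; real engines store the data in files on disk, not in Python lists.

```python
# Three rows of a hypothetical "hits" table: (user_id, url, duration_ms).
rows = [
    (1, "/home", 120),
    (2, "/cart", 340),
    (3, "/home", 95),
]

# Row-oriented layout: the values of each row sit next to each other.
row_store = [value for row in rows for value in row]

# Column-oriented layout: the values of each column sit next to each other.
column_store = {
    "user_id": [r[0] for r in rows],
    "url": [r[1] for r in rows],
    "duration_ms": [r[2] for r in rows],
}


def avg_duration_row_store(flat, width=3, col=2):
    # Picks every third value out of the interleaved layout. On disk this
    # would mean reading whole rows just to get at one field.
    values = flat[col::width]
    return sum(values) / len(values)


def avg_duration_column_store(store):
    # Touches only the single column the query needs.
    values = store["duration_ms"]
    return sum(values) / len(values)


print(avg_duration_row_store(row_store))
print(avg_duration_column_store(column_store))
```

Both functions return the same average; the point is only that the column layout lets an analytical query skip the columns it does not use.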
Mail forwarding company
Simplicity
ClickHouse installs on Ubuntu with a single command. If you know SQL, you can start using ClickHouse for your needs right away. However, this does not mean that you can run "show create table" in MySQL and copy-paste the SQL into ClickHouse.
Compared with MySQL, there are important data type differences in table schema definitions, so you will still need some time to adapt the schema definitions and learn the table engines before you feel comfortable.
ClickHouse works well without any additional software, but if you want to use replication you will need to install ZooKeeper. Query performance analysis works very well: the system tables contain all the information, and all of it can be retrieved with plain old boring SQL.
Performance
- Benchmark comparing ClickHouse with Vertica and MySQL on the following server configuration: two sockets of Intel® Xeon® CPU E5-2650 v2 @ 2.60GHz; 128 GiB RAM; md RAID-5 on 8 × 6 TB SATA HDDs; ext4.
- Benchmark comparing ClickHouse with the Amazon RedShift cloud data warehouse.
- Excerpts from the Cloudflare blog on ClickHouse performance:
The ClickHouse database has a very simple design: all nodes in the cluster have the same functionality and use only ZooKeeper for coordination. We built a small cluster of several nodes and ran tests, during which we found that the system has impressive performance, consistent with the advantages claimed in analytical DBMS benchmarks. We decided to take a closer look at the concept behind ClickHouse. The first obstacle to research was the lack of tools and the small ClickHouse community, so we dug into the design of this DBMS to understand how it works.
ClickHouse does not support receiving data directly from Kafka because it is only a database, so we wrote our own adapter service in Go. It read Cap'n Proto encoded messages from Kafka, converted them to TSV, and inserted them into ClickHouse in batches through the HTTP interface. We later rewrote this service to use a Go library together with ClickHouse's own interface to improve performance. While benchmarking insert performance we found something important: in ClickHouse this performance depends heavily on the batch size, that is, the number of rows inserted at once. To understand why this happens, we looked at how ClickHouse stores data.
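The adapter described above was written in Go and is not shown in the excerpt. The following Python sketch only illustrates the general idea of converting rows to TSV and grouping them into batches for the HTTP interface. The table name, the batch size, and the base URL (localhost on port 8123) are assumptions, and the sketch builds the requests without sending them.

```python
from itertools import islice
from urllib.parse import urlencode


def escape_tsv(value):
    # Escape the characters that would break the tab-separated layout.
    # NULL handling is left out of this sketch.
    text = str(value)
    return text.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def to_tsv(rows):
    # One line per row, fields separated by tabs.
    return "".join("\t".join(escape_tsv(v) for v in row) + "\n" for row in rows)


def batches(rows, batch_size):
    # Yield consecutive lists of at most batch_size rows.
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def build_insert_requests(rows, table, batch_size,
                          base_url="http://localhost:8123/"):
    # Each request is (url, body): the INSERT statement goes into the
    # "query" URL parameter and the TSV data into the POST body.
    # The table name and base_url are hypothetical placeholders.
    for chunk in batches(rows, batch_size):
        query = f"INSERT INTO {table} FORMAT TabSeparated"
        url = base_url + "?" + urlencode({"query": query})
        yield url, to_tsv(chunk).encode("utf-8")


requests = list(build_insert_requests(
    [(1, "a"), (2, "b\tc"), (3, "d")], table="logs", batch_size=2))
print(len(requests))
```

Each (url, body) pair could then be sent as a POST request, for example with urllib.request. A larger batch_size means fewer requests for the same number of rows, which is the parameter the excerpt found to matter most.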
The main engine, or rather family of table engines, that ClickHouse uses to store data is MergeTree. This engine is conceptually similar to the LSM algorithm used in Google BigTable or Apache Cassandra, but it avoids building an intermediate memory table and writes data directly to disk. This gives it excellent write throughput, because each inserted batch is only sorted by the primary key, compressed, and written to disk to form a segment (part).
The absence of a memory table or of any notion of data "freshness" also means that data can only be appended; the system does not support changing or deleting it. Currently, the only way to delete data is to drop it by calendar month, because segments never cross a month boundary. The ClickHouse team is actively working on making this configurable. On the other hand, it makes writing and merging segments contention-free, so ingest throughput scales linearly with the number of parallel inserts until I/O or the cores are saturated.
However, this also means that the system is not suited to small batches, so Kafka services and inserters are used for buffering. Further, ClickHouse keeps merging segments in the background, so many small pieces of data will be merged and rewritten multiple times, which increases write amplification. At the same time, too many unmerged segments will cause aggressive throttling of inserts for as long as merging continues. We found that the best compromise between real-time ingestion and ingestion performance is to accept a limited number of inserts per second into a table.
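The effect of batch size on rewrites can be shown with a toy model. This is a sketch of the idea only, not ClickHouse's actual merge logic: the rule "merge everything once 10 segments exist" and all names in it are assumptions made for illustration.

```python
import heapq


class ToyMergeTree:
    """Toy model: every insert becomes a sorted segment, merges rewrite rows."""

    def __init__(self):
        self.parts = []        # each segment is a list of rows sorted by key
        self.rows_written = 0  # rows written to "disk", including rewrites

    def insert(self, batch):
        part = sorted(batch)   # sort by primary key (first tuple field)
        self.parts.append(part)
        self.rows_written += len(part)

    def merge_all(self):
        # Merge all sorted segments into one; every row is written again.
        merged = list(heapq.merge(*self.parts))
        self.parts = [merged]
        self.rows_written += len(merged)


def write_amplification(batch_size, total_rows=1000, merge_every=10):
    # Insert total_rows rows in batches and merge whenever merge_every
    # segments have piled up (a made-up policy for this sketch).
    table = ToyMergeTree()
    for start in range(0, total_rows, batch_size):
        stop = min(start + batch_size, total_rows)
        table.insert([(key,) for key in range(start, stop)])
        if len(table.parts) >= merge_every:
            table.merge_all()
    return table.rows_written / total_rows


for size in (10, 100, 1000):
    print(size, write_amplification(size))
```

In this model, smaller batches create more segments, trigger more merges, and cause each row to be written more times, which matches the trade-off described in the excerpt.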
The key to table read performance is indexing and the placement of data on disk. No matter how fast processing is, when the engine needs to scan terabytes of data from disk and use only a fraction of it, that takes time. ClickHouse is a column store, so each segment contains a file for each column, holding the sorted values for every row. This way, entire columns that do not appear in the query can be skipped first, and then multiple cells can be processed in parallel with vectorized execution. To avoid a full scan, each segment also has a small index file.
Because all columns are sorted by the "primary key", the index file contains only the marks (captured rows) of every Nth row, so that it can be kept in memory even for very large tables. For example, the default setting is to "mark every 8192nd row", so a "sparse" index of a table with 1 trillion rows needs only about 122 million marks (1,000,000,000,000 / 8192), which fit easily in memory.
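The arithmetic and the lookup idea behind such a sparse index can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (a single integer key with unique values, data held in a list), not ClickHouse's on-disk format, and the function names are made up.

```python
from bisect import bisect_right

INDEX_GRANULARITY = 8192  # "mark every 8192nd row"


def marks_needed(total_rows, granularity=INDEX_GRANULARITY):
    # Ceiling division: one mark per started block of rows.
    return -(-total_rows // granularity)


def build_sparse_index(sorted_keys, granularity):
    # Keep only the key of every Nth row.
    return sorted_keys[::granularity]


def rows_to_scan(marks, granularity, total_rows, key):
    # Find the last mark whose key is <= the searched key. With unique,
    # sorted keys the row can only be inside that one block.
    pos = bisect_right(marks, key) - 1
    if pos < 0:
        return range(0, 0)
    start = pos * granularity
    return range(start, min(start + granularity, total_rows))


print(marks_needed(10**12))

keys = list(range(0, 100, 2))           # 50 sorted keys
marks = build_sparse_index(keys, 10)    # one mark per 10 rows
print(rows_to_scan(marks, 10, len(keys), 42))
```

Only the rows in the returned range have to be read and checked, instead of the whole column.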
System development
The development and improvement of ClickHouse can be followed
Popularity
ClickHouse's popularity appears to be growing rapidly, especially in the Russian-speaking community. Last year's HighLoad 2018 conference (Moscow, November 8-9, 2018) showed that giants such as vk.com and Badoo use ClickHouse, inserting data (for example, logs) from tens of thousands of servers simultaneously. In a 40-minute video
Use cases
After spending some time on research, I think there are areas where ClickHouse can be useful or can completely replace other, more traditional and popular solutions such as MySQL, PostgreSQL, ELK, Google Big Query, Amazon RedShift, TimescaleDB, Hadoop, MapReduce, Pinot, and Druid. The following sections describe in detail how ClickHouse can be used to augment or completely replace the DBMSs above.
Extending the capabilities of MySQL and PostgreSQL
We recently partially replaced MySQL with ClickHouse for our newsletter platform
ClickHouse uses two compression algorithms that reduce the volume of data by about
Replacing ELK
In my experience, the ELK stack (ElasticSearch, Logstash, and Kibana, and in this particular case ElasticSearch) needs far more resources to run than are necessary for storing logs. ElasticSearch is a great engine if you need good full-text log search (which I do not think you really need), but I wonder why it has become the de facto standard logging engine. Its ingest performance combined with Logstash gave us problems even under fairly light loads and forced us to keep adding more RAM and disk space. As a database, ClickHouse is better than ElasticSearch for the following reasons:
- Support for an SQL dialect;
- A better compression ratio for stored data;
- Regex support for regular-expression search instead of full-text search;
- Improved query planning and higher overall performance.
Currently, the biggest problem that comes up when comparing ClickHouse with ELK is the lack of log-shipping solutions, as well as the lack of documentation and tutorials on the topic. Moreover, any user can set up ELK by following the Digital Ocean guide, which matters a great deal for the rapid adoption of such technologies. There is a database engine, but there is no Filebeat for ClickHouse yet. Yes, there is
I prefer minimal solutions, so I tried using FluentBit, a log-shipping tool with a very small memory footprint, together with ClickHouse, while trying to avoid Kafka. However, minor incompatibilities need to be resolved, such as
Alternatively, Kibana can be used with ClickHouse as its backend
Replacing Google Big Query and Amazon RedShift (a solution for large companies)
The ideal use case for BigQuery is to load 1 TB of JSON data and run analytical queries on it. Big Query is an excellent product whose scalability is hard to overstate. It is far more complex software than ClickHouse, which runs on an in-house cluster, but from the client's point of view it has a lot in common with ClickHouse. BigQuery can get expensive quickly once you start paying for each SELECT, so it is a true SaaS solution with all the pros and cons that implies.
ClickHouse is the best choice when you run a lot of computationally expensive queries. The more SELECT queries you run every day, the more sense it makes to replace Big Query with ClickHouse, because such a replacement can save thousands of dollars when many terabytes of data are being processed. This does not apply to stored data, which is quite cheap to keep in Big Query.
In an article by Altinity co-founder Alexander Zaitsev
Replacing TimescaleDB
TimescaleDB is a PostgreSQL extension that optimizes working with time series in a regular database (
Although ClickHouse is not a serious competitor in the time-series niche, its columnar structure and vectorized query execution make it much faster than TimescaleDB in most analytical query workloads. At the same time, ClickHouse's batch ingest performance is about 3 times higher, and it also uses about 20 times less disk space, which is very important when processing large volumes of historical data:
Unlike ClickHouse, the only way to save disk space in TimescaleDB is to use ZFS or a similar file system.
Upcoming ClickHouse updates will introduce delta compression, which will make it even better suited to processing and storing time-series data. TimescaleDB may be a better choice than bare ClickHouse in the following cases:
- small installations with very little RAM (<3 GB);
- a large number of small INSERTs that you do not want to buffer into large chunks;
- better consistency, uniformity, and ACID requirements;
- PostGIS support;
- joins with existing PostgreSQL tables, since TimescaleDB is essentially PostgreSQL.
Competition with Hadoop and MapReduce systems
Hadoop and other MapReduce products can perform many complex computations, but they tend to run with huge latencies. ClickHouse addresses this problem by processing terabytes of data and producing results almost instantly. As a result, ClickHouse is much more effective for fast, interactive analytical research, which should be of interest to data scientists.
Competition with Pinot and Druid
ClickHouse's closest competitors are the columnar, linearly scalable open-source products Pinot and Druid. An excellent comparison of these systems is published in the article
That article needs updating: it says that ClickHouse does not support UPDATE and DELETE operations, which is not entirely true for recent versions.
We do not have much experience with these databases, but I really do not like the complexity of the infrastructure required to run Druid and Pinot: it is a whole bunch of moving parts surrounded by Java on all sides.
Druid and Pinot are Apache incubator projects, whose progress Apache covers in detail on their GitHub project pages. Pinot appeared in the incubator in October 2018, and Druid was born 8 months earlier, in February.
The lack of information about how the ASF works raises some, perhaps silly, questions for me. I wonder whether the Pinot authors noticed that the Apache Foundation is more favorable toward Druid, and whether this attitude toward a competitor caused any envy. Will Druid's development slow down and Pinot's growth accelerate if the backers of the former suddenly become interested in the latter?
Disadvantages of ClickHouse
Immaturity: Obviously, this is not yet a boring technology, but in any case, nothing like this is seen in other columnar DBMSs.
Small inserts perform poorly at high rates: inserts must be grouped into large chunks, because the performance of small inserts degrades in proportion to the number of columns in each row. This follows from how ClickHouse stores data on disk: each column means 1 file or more, so to insert a single row containing 100 columns, at least 100 files have to be opened and written. That is why insert buffering needs an intermediary (unless the client itself provides buffering), usually Kafka or some kind of queue management system. You can also use the Buffer table engine to copy large chunks of data into MergeTree tables later.
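Client-side buffering of the kind mentioned above can be sketched as follows. This is a minimal illustration, not a production buffer: the class name, the flush callback, and the row-count threshold are assumptions, and a real buffer would usually also flush on a timer. The helper file_writes simply applies the "at least one file per column per insert" rule from the paragraph above.

```python
class InsertBuffer:
    """Collects single rows and hands them on in large batches."""

    def __init__(self, flush, max_rows=10_000):
        self.flush = flush        # callback that receives a list of rows
        self.max_rows = max_rows  # hypothetical threshold for this sketch
        self.rows = []

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) >= self.max_rows:
            self.flush_now()

    def flush_now(self):
        # Send whatever is buffered, if anything, and start a new batch.
        if self.rows:
            batch, self.rows = self.rows, []
            self.flush(batch)


def file_writes(total_rows, columns, batch_size):
    # Lower bound on file writes: one file per column for every insert.
    inserts = -(-total_rows // batch_size)  # ceiling division
    return inserts * columns


sent = []
buf = InsertBuffer(sent.append, max_rows=3)
for i in range(7):
    buf.add((i, f"event-{i}"))
buf.flush_now()                 # push out the incomplete last batch
print([len(batch) for batch in sent])

print(file_writes(1000, columns=100, batch_size=1))
print(file_writes(1000, columns=100, batch_size=1000))
```

Under this rule, inserting 1000 rows of 100 columns one at a time touches at least 100,000 files, while a single batch of 1000 rows touches at least 100.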
Table joins are limited by server RAM, but at least they are there! Druid and Pinot, for example, do not have such joins at all, because they are hard to implement directly in distributed systems that do not support moving large chunks of data between nodes.
Conclusions
We plan to use ClickHouse extensively at Qwintry in the coming years, as this DBMS provides an excellent balance of performance, low overhead, scalability, and simplicity. I am sure it will start spreading quickly once the ClickHouse community comes up with more ways to use it in small to medium-sized installations.
Source: www.habr.com