How We Tested Several Time Series Databases


Over the past few years, time series databases have turned from an exotic thing (highly specialized, used either in open monitoring systems and tied to specific solutions, or in Big Data projects) into a "consumer product". In the Russian Federation, special thanks for this go to Yandex and ClickHouse. Until then, if you needed to store a large amount of time series data, you had to either accept building and maintaining a monstrous Hadoop stack, or deal with protocols specific to each individual system.

It might seem that in 2019 an article about which TSDB is worth using would consist of a single sentence: "just use ClickHouse." But... there are nuances.

Indeed, ClickHouse is being actively developed, its user base is growing, and support is very active. But have we become hostages to the public success of ClickHouse, which has overshadowed other, perhaps more efficient or more reliable, solutions?

At the beginning of last year we started reworking our own monitoring system, and the question arose of choosing a suitable database for storing the data. I want to tell the story of that choice here.

Background

First, a necessary preface. Why do we need our own monitoring system at all, and how was it designed?

We started providing support services in 2008, and by 2010 it had become clear that it was getting difficult to aggregate data about the processes happening in client infrastructure with the solutions that existed at the time (we are talking about, God forgive me, Cacti, Zabbix, and the nascent Graphite).

Our main requirements were:

  • support for many clients (dozens at the time, hundreds in the future) within a single system, together with a centralized alert management system;
  • flexibility in managing the alerting system (escalation of alerts between on-duty engineers, scheduling, a knowledge base);
  • the ability to drill down into graphs in detail (Zabbix at that time rendered graphs as images);
  • long-term storage of a large amount of data (a year or more) and the ability to retrieve it quickly.

In this article we are interested in the last point.

As for storage, the requirements were as follows:

  • the system must work fast;
  • it is desirable that the system have an SQL interface;
  • the system must be stable and have an active user base and support (we had once faced the need to support systems such as MemcacheDB, which was no longer being developed, or the MooseFS distributed storage, whose bug tracker was kept in Chinese: we did not want to repeat that story in our own project);
  • compliance with the CAP theorem: Consistency (required): the data must be up to date, because we do not want the alert management system to miss new data and fire alerts about data not arriving for all projects; Partition tolerance (required): we do not want to end up with a split-brain system; Availability (not critical, as long as there is an active replica): in case of an accident we can switch to the backup system ourselves, using code.

Oddly enough, at that time MySQL turned out to be the ideal solution for us. Our data structure was extremely simple: server id, counter id, timestamp, and value. Fast selects of hot data were ensured by a large buffer pool, and selects of historical data were ensured by SSDs.
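To make the four-field structure concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names, and the helper `recent_values` are hypothetical: the article only lists the four fields.

```python
import sqlite3

# Hypothetical table and column names; the article only lists the four fields.
SCHEMA = """
CREATE TABLE metric_values (
    server_id  INTEGER NOT NULL,
    counter_id INTEGER NOT NULL,
    ts         INTEGER NOT NULL,  -- Unix timestamp, seconds
    value      REAL    NOT NULL,
    PRIMARY KEY (server_id, counter_id, ts)
)
"""

def recent_values(conn, server_id, counter_id, since_ts):
    """Return (ts, value) pairs for one counter, oldest first."""
    cur = conn.execute(
        "SELECT ts, value FROM metric_values "
        "WHERE server_id = ? AND counter_id = ? AND ts >= ? ORDER BY ts",
        (server_id, counter_id, since_ts),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")  # in-memory stand-in, not the real MySQL setup
conn.execute(SCHEMA)
rows = [(1, 7, 1_500_000_000 + 15 * i, float(i)) for i in range(4)]
conn.executemany("INSERT INTO metric_values VALUES (?, ?, ?, ?)", rows)
print(recent_values(conn, 1, 7, 1_500_000_030))
```

The composite key on (server_id, counter_id, ts) is an assumption as well; it simply makes the "one counter over a time range" read a range scan.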


As a result, selecting the most recent two weeks of data, detailed down to the second, took about 200 ms until the data was fully rendered, and we lived with this system for quite a long time.

Meanwhile, time passed and the amount of data grew. By 2016 the data volume had reached tens of terabytes, which was a significant expense given that the SSD storage was rented.

By then, columnar databases had become widespread, and we began to think about them seriously. In a columnar database the data is stored, as you might guess, in columns, and if you look at our data it is easy to see a large number of duplicates that a columnar database could squeeze out with compression.
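The effect of the layout on compression can be illustrated with a small experiment. This is only a sketch on synthetic data: the row counts, the value pattern, and the use of zlib are our assumptions, and real columnar engines use their own codecs.

```python
import struct
import zlib

def make_rows(servers=5, counters=10, points=200, step=15):
    """Generate synthetic (server_id, counter_id, ts, value) rows."""
    rows = []
    for s in range(servers):
        for c in range(counters):
            for i in range(points):
                rows.append((s, c, 1_500_000_000 + step * i, float(i % 7)))
    return rows

def row_layout(rows):
    # One record after another, as a row store would lay them out.
    return b"".join(struct.pack("<IIId", *r) for r in rows)

def column_layout(rows):
    # All server ids, then all counter ids, then timestamps, then values.
    n = len(rows)
    servers, counters, stamps, values = zip(*rows)
    return (struct.pack(f"<{n}I", *servers)
            + struct.pack(f"<{n}I", *counters)
            + struct.pack(f"<{n}I", *stamps)
            + struct.pack(f"<{n}d", *values))

rows = make_rows()
row_size = len(zlib.compress(row_layout(rows)))
column_size = len(zlib.compress(column_layout(rows)))
print(f"row layout: {row_size} bytes, column layout: {column_size} bytes")
```

Both layouts hold exactly the same bytes before compression; only the ordering differs. In the column layout, long runs of identical server and counter ids sit next to each other, which is what a general-purpose compressor exploits.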


However, the company's key system continued to work stably, and there was no desire to experiment with switching to something else.

In 2017, at the Percona Live conference in San Jose, the ClickHouse developers announced themselves, probably for the first time. At first glance the system was production-ready (well, Yandex.Metrica is a harsh production system), support was quick and simple, and, most importantly, operation was simple. In 2018 we started the migration process. But by that time there were many "grown-up", time-tested TSDB systems, and we decided to set aside considerable time to compare the alternatives and make sure that, for our requirements, there was no better option than ClickHouse.

In addition to the storage requirements already listed, new ones appeared:

  • the new system must provide at least the same performance as MySQL on the same amount of hardware;
  • the new system's storage must take up significantly less space;
  • the DBMS must still be easy to manage;
  • we wanted to change the application as little as possible when switching DBMS.

Which systems did we start to consider?

Apache Hive/Apache Impala
An old, battle-tested Hadoop stack. In essence, it is an SQL interface built on top of data stored in native formats on HDFS.

Pros.

  • With stable operation, it is very easy to scale the data.
  • There are columnar solutions for data storage (less space).
  • Very fast execution of parallel tasks when resources are available.

Cons.

  • It is Hadoop, and it is hard to operate. If we are not ready to take a ready-made cloud solution (and we are not, in terms of cost), the whole stack has to be assembled and maintained by the admins' own hands, and we really do not want that.
  • Data really is aggregated fast.

But that speed is achieved by scaling the number of compute servers. Simply put, if we were a large company engaged in analytics, and it were critical for the business to aggregate information as quickly as possible (even at the cost of a large amount of computing resources), this might be our choice. But we were not ready to multiply our hardware fleet just to speed up tasks.

Druid/Pinot

This is much more about TSDB specifically, but again, a Hadoop stack.

There is a great article comparing the pros and cons of Druid and Pinot versus ClickHouse.

In a few words: Druid/Pinot look better than ClickHouse in cases where:

  • You have heterogeneous data (in our case we record only time series of server metrics, and, in effect, this is a single table. But there may be other cases: equipment time series, economic time series, and so on, each with its own structure, which needs to be aggregated and processed).
  • Moreover, there is a lot of this data.
  • Tables and data with time series appear and disappear (that is, some data set arrived, was analyzed, and was deleted).
  • There is no clear criterion by which the data can be partitioned.

In the opposite cases ClickHouse performs better, and that is our case.

ClickHouse

  • SQL-like.
  • Easy to manage.
  • People say it works.

Shortlisted for testing.

InfluxDB

A foreign alternative to ClickHouse. On the minus side: High Availability is present only in the commercial version, but it still has to be compared.

Shortlisted for testing.

Cassandra

On the one hand, we know that it is used to store metric time series by monitoring systems such as, for example, SignalFX or OkMeter. However, there are specifics.

Cassandra is not a columnar database in the traditional sense. It looks more like a row store, but each row can have a different number of columns, which makes it easy to organize a columnar view. In this sense, it is clear that with a limit of 2 billion columns it is possible to store some data in columns (the same time series, for instance). By comparison, MySQL has a limit of 4096 columns, and it is easy to run into the error with code 1117 if you try to do the same there.

The Cassandra engine is focused on storing large volumes of data in a distributed system without a master, and in terms of the CAP theorem mentioned above Cassandra is more about AP, that is, availability and partition tolerance. So this tool can be great if you only need to write to the database and rarely read from it. Here it is logical to use Cassandra as "cold" storage, that is, as a long-term, reliable place to keep large volumes of historical data that are rarely needed but can be retrieved when necessary. Nevertheless, for the sake of completeness, we will test it too. But, as I said earlier, there is no desire to actively rewrite the code for the chosen database solution, so we will test it in a somewhat limited way: without adapting the data structure to the specifics of Cassandra.

Prometheus

Well, out of curiosity, we decided to test the performance of the Prometheus storage, just to understand whether it is faster or slower than the current solutions, and by how much.

Testing methodology and results

So, we tested 5 databases in the following 6 configurations: ClickHouse (1 node), ClickHouse (distributed table across 3 nodes), InfluxDB, MySQL 8, Cassandra (3 nodes), and Prometheus. The test plan is as follows:

  1. load historical data for a week (840 million values per day; 208 thousand metrics);
  2. generate a write load (6 load modes were considered, see below);
  3. in parallel with writing, periodically run selects that emulate the requests of a user working with charts. To avoid overcomplicating things, we selected data for 10 metrics (that is exactly how many there are on the CPU graph) over a week.

We generate the load by emulating the behavior of our monitoring agent, which sends values for each metric once every 15 seconds. At the same time, we are interested in varying:

  • the total number of metrics being written;
  • the interval for sending values for a single metric;
  • the batch size.
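The agent emulation described above can be sketched as a generator. This is a simplified illustration, not our actual load tool: the function name `agent_ticks`, the tuple layout, and the placeholder values are assumptions made for the example.

```python
from itertools import islice

def agent_ticks(servers, metrics_per_server, interval_s, start_ts=0):
    """Yield (server_id, counter_id, ts, value) forever, one value per metric per tick."""
    tick = 0
    while True:
        ts = start_ts + tick * interval_s
        for server_id in range(servers):
            for counter_id in range(metrics_per_server):
                # Placeholder value; a real agent would report a measurement.
                yield (server_id, counter_id, ts, float(tick))
        tick += 1

# Two emulated servers with three metrics each, sending every 15 seconds.
points = list(islice(agent_ticks(servers=2, metrics_per_server=3, interval_s=15), 12))
print(points[0], points[-1])
```

Varying `servers` and `metrics_per_server` changes the total number of metrics, and `interval_s` changes the sending interval; the batch size is a property of whatever consumes the generator.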

About batch size. Since loading almost all of our test databases with single inserts is not recommended, we need a relay that collects incoming metrics, groups them, and writes them to the database as a batch insert.
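A relay of this kind can be sketched in a few lines. The class below is a hypothetical illustration, not our production relay: `BatchingRelay` and `write_batch` are names invented for the example, and the "database" is just a list that records each batch.

```python
class BatchingRelay:
    """Collect incoming points and hand them to `write_batch` in groups."""

    def __init__(self, write_batch, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._write_batch = write_batch
        self._batch_size = batch_size
        self._buffer = []

    def add(self, point):
        self._buffer.append(point)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        # Write whatever is buffered, including a final partial batch.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._write_batch(batch)

written = []  # stands in for the database: one entry per batch insert
relay = BatchingRelay(write_batch=written.append, batch_size=4)
for i in range(10):
    relay.add(("server-1", "cpu.user", 15 * i, float(i)))
relay.flush()
print([len(batch) for batch in written])  # [4, 4, 2]
```

A real relay would probably also flush on a timer, so that a slow trickle of metrics does not sit in the buffer indefinitely; that part is left out here.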

Also, to make the results easier to interpret, let us imagine that we are not just sending a bunch of metrics, but that the metrics are organized by server: 125 metrics per server. Here the server is simply a virtual entity, just to understand that, for example, 10,000 metrics correspond to about 80 servers.
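The conversion between metrics, servers, and write rate is simple arithmetic; the sketch below spells it out. The 125 metrics per server and the 15-second interval come from the text, while the function names are ours, and the interval is a parameter because it is one of the things we vary.

```python
import math

METRICS_PER_SERVER = 125  # from the article
INTERVAL_S = 15           # agent send interval from the article

def servers_for(metrics):
    """Number of virtual servers needed to account for `metrics` metrics."""
    return math.ceil(metrics / METRICS_PER_SERVER)

def values_per_second(metrics, interval_s=INTERVAL_S):
    """Average write rate when every metric sends one value per interval."""
    return metrics / interval_s

print(servers_for(10_000))                   # 80
print(round(values_per_second(10_000), 1))   # 666.7
```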

And so, taking all this into account, here are our 6 database write load modes:


There are two points here. First, for Cassandra these batch sizes turned out to be too large, so there we used values of 50 or 100. Second, since Prometheus works strictly in pull mode, i.e. it goes and collects data from the metric sources itself (and even pushgateway, despite its name, does not change the situation), the corresponding loads were implemented using a combination of static configs.
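For readers unfamiliar with pull mode, the sketch below renders a minimal Prometheus scrape job with a static target list. It only illustrates what a static config looks like; the job name, the `loadgen-N.example:9100` endpoints, and the helper `static_scrape_config` are hypothetical, and the configs actually used in the test are not reproduced here.

```python
def static_scrape_config(job_name, interval_s, targets):
    """Render a minimal Prometheus scrape job as YAML text."""
    lines = [
        "scrape_configs:",
        f"  - job_name: '{job_name}'",
        f"    scrape_interval: {interval_s}s",
        "    static_configs:",
        "      - targets:",
    ]
    lines.extend(f"          - '{target}'" for target in targets)
    return "\n".join(lines) + "\n"

# Hypothetical endpoints that expose synthetic metrics for the test.
targets = [f"loadgen-{i}.example:9100" for i in range(3)]
config_text = static_scrape_config("tsdb-load-test", 15, targets)
print(config_text)
```

With this approach the write load is controlled indirectly: the number of targets and the metrics each one exposes set the metric count, and `scrape_interval` sets the sending interval.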

The test results are as follows:


What is worth noting: fantastically fast selects from Prometheus, terribly slow selects from Cassandra, and unacceptably slow selects from InfluxDB. In terms of write speed ClickHouse beat everyone, while Prometheus does not take part in that competition, because it does the inserts itself and we do not measure anything.

As a result: ClickHouse and InfluxDB showed themselves to be the best, but an Influx cluster can only be built on the Enterprise version, which costs money, while ClickHouse costs nothing and is made in Russia. It is logical that in the USA the choice would probably favor InfluxDB, and in our country it favors ClickHouse.

Source: www.habr.com
