The theory and practice of using HBase

Hello! My name is Danil Lipovoy. Our team at Sbertech started using HBase as storage for operational data. While studying it, we accumulated experience that I wanted to systematize and describe (we hope it will be useful to many). All experiments below were carried out with HBase versions 1.2.0-cdh5.14.2 and 2.0.0-cdh6.0.0-beta1.

  1. General architecture
  2. Writing data to HBase
  3. Reading data from HBase
  4. Data caching
  5. Batch data processing: MultiGet/MultiPut
  6. Strategy for splitting tables into regions (splitting)
  7. Fault tolerance, compaction and data locality
  8. Settings and performance
  9. Load testing
  10. Conclusions

1. General architecture

The backup Master listens for the active master's heartbeat on the ZooKeeper node and, if it disappears, takes over the master's functions.

2. Writing data to HBase

First, let's look at the simplest case: writing a key-value object to a table using put(rowkey). The client must first find out where the Root Region Server (RRS), which stores the hbase:meta table, is located. It gets this information from ZooKeeper. It then contacts the RRS and reads the hbase:meta table, from which it learns which RegionServer (RS) is responsible for storing data for the given rowkey in the table of interest. For future use, the meta table is cached by the client, so subsequent calls are faster and go directly to the RS.

Next, the RS, having received the request, first writes it to the WriteAheadLog (WAL), which is needed for recovery in case of a crash. It then saves the data to the MemStore. This is an in-memory buffer that holds a sorted set of keys for a given region. A table can be divided into regions (partitions), each of which contains a disjoint set of keys. This lets you place regions on different servers to get higher performance. However, obvious as this statement seems, we will see later that it does not hold in all cases.

After the entry is placed in the MemStore, a response is returned to the client saying the entry was saved successfully. In reality, however, it is only stored in a buffer and reaches disk only after some period of time has passed or when the buffer fills up with new data.
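The write path described above can be illustrated with a small toy model. This is only a sketch, not HBase code: the class name `ToyRegion` and the `flush_threshold` parameter are hypothetical, the flush is triggered by entry count only (the time-based flush is left out), and the "disk" is just a Python list.

```python
# Toy model of the write path: WAL first, then MemStore, flush when full.
# ToyRegion and flush_threshold are illustrative names, not HBase APIs.

class ToyRegion:
    def __init__(self, flush_threshold=3):
        self.wal = []          # append-only log, written before anything else
        self.memstore = {}     # in-memory buffer for this region
        self.hfiles = []       # immutable sorted files standing in for disk
        self.flush_threshold = flush_threshold

    def put(self, rowkey, value):
        self.wal.append((rowkey, value))   # 1. log the edit for crash recovery
        self.memstore[rowkey] = value      # 2. store it in the memory buffer
        if len(self.memstore) >= self.flush_threshold:
            self.flush()                   # 3. buffer is full: write a sorted file
        return "ok"                        # the client is acknowledged either way

    def flush(self):
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}


region = ToyRegion(flush_threshold=3)
for key in ["b", "a", "c", "d"]:
    region.put(key, "v-" + key)
```

After the four puts, the first three keys sit in one sorted file, the fourth is still only in the buffer, and all four edits are in the log, which is what makes recovery of the unflushed entry possible.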

When a Delete operation is performed, the data is not physically deleted. It is only marked as deleted, and the actual removal happens when major compaction runs, which is described in more detail in section 7.

Files in HFile format accumulate in HDFS, and from time to time a minor compaction process is launched, which simply merges small files into larger ones without deleting anything. Over time, this turns into a problem that shows up only when reading data (we will come back to this a little later).

In addition to the write path described above, there is a much more efficient procedure, which is probably the strongest side of this database: BulkLoad. It consists of building the HFiles ourselves and placing them on disk, which scales very well and achieves very decent speeds. In fact, the limit here is not HBase but the hardware. Below are the results of runs on a cluster of 16 RegionServers and 16 YARN NodeManagers (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads), HBase version 1.2.0-cdh5.14.2.


Here you can see that by increasing the number of partitions (regions) in the table, as well as the number of Spark executors, we get an increase in load speed. The speed also depends on the record size. All other things being equal, large blocks give a gain in MB/sec, and small blocks give a gain in the number of records inserted per unit of time.

You can also start loading into two tables at the same time and get double the speed. Below you can see that writing 10 KB blocks to two tables at once runs at about 600 MB/sec into each (1275 MB/sec in total), which matches the write speed into a single table of 623 MB/sec (see No. 11 above).

But the second run, with 50 KB records, shows that the load speed grows only slightly, which indicates that it is approaching the limit. Keep in mind that practically no load is placed on HBase itself here: all that is required of it is to first hand out data from hbase:meta and, after the HFiles are in place, to reset the BlockCache data and flush the MemStore buffer to disk if it is not empty.

3. Reading data from HBase

Assuming the client already has all the information from hbase:meta (see section 2), the request goes directly to the RS where the required key is stored. First, the search is performed in the MemStore. Regardless of whether the data is there or not, the search is also carried out in the BlockCache buffer and, if necessary, in the HFiles. If the data is found in a file, it is placed in the BlockCache and will be returned faster on the next request. Searching an HFile is relatively fast thanks to the Bloom filter: after reading a small amount of data, it immediately determines whether the file contains the required key and, if not, moves on to the next file.
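The role of the Bloom filter in this lookup can be sketched as follows. This is a toy illustration under stated assumptions, not HBase's implementation: `ToyBloomFilter`, `make_hfile` and `lookup` are hypothetical names, the filter size and hash count are arbitrary, and each "HFile" is just a dictionary paired with its filter.

```python
import hashlib


class ToyBloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def make_hfile(rows):
    """Build a (bloom, data) pair standing in for one HFile."""
    bloom = ToyBloomFilter()
    for key in rows:
        bloom.add(key)
    return bloom, dict(rows)


def lookup(key, hfiles):
    """Scan files in order, skipping those the filter rules out."""
    for bloom, data in hfiles:
        if not bloom.might_contain(key):
            continue          # definitely absent: skip without reading the file
        if key in data:       # may be a false positive, so still check the data
            return data[key]
    return None


hfiles = [make_hfile({"k1": "v1"}), make_hfile({"k2": "v2"})]
value = lookup("k2", hfiles)
```

The useful property is that a negative answer from the filter is always correct, so most files that do not hold the key are never opened.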

Having received data from these three sources, the RS forms the response. In particular, it can return several found versions of an object at once if the client requested versioning.

4. Data caching

The MemStore and BlockCache buffers occupy up to 80% of the on-heap memory allocated to the RS (the rest is reserved for RS service tasks). If the typical usage pattern is that processes write and then immediately read the same data, it makes sense to shrink the BlockCache and grow the MemStore, because when written data does not go into the read cache, the BlockCache is used less often. The BlockCache buffer consists of two parts: LruBlockCache (always on-heap) and BucketCache (usually off-heap or on SSD). BucketCache should be used when there are many read requests and they do not fit into LruBlockCache, which leads to heavy Garbage Collector activity. At the same time, you should not expect a radical performance gain from using the read cache, but we will return to this in section 8.

There is one BlockCache for the whole RS, and there is one MemStore per table (one for each Column Family).

According to the theory, data does not go into the cache on write, and indeed the parameters CACHE_DATA_ON_WRITE for the table and "Cache DATA on Write" for the RS are set to false. In practice, however, if we write data to the MemStore, then flush it to disk (thereby clearing it), then delete the resulting file, a get request will still successfully return the data. Moreover, even if you disable the BlockCache completely, fill the table with new data, flush the MemStore to disk, delete the files and request the data from another session, it will still be retrieved from somewhere. So HBase stores not only data but also mysterious riddles.

hbase(main):001:0> create 'ns:magic', 'cf'
Created table ns:magic
Took 1.1533 seconds
hbase(main):002:0> put 'ns:magic', 'key1', 'cf:c', 'try_to_delete_me'
Took 0.2610 seconds
hbase(main):003:0> flush 'ns:magic'
Took 0.6161 seconds
hdfs dfs -mv /data/hbase/data/ns/magic/* /tmp/trash
hbase(main):002:0> get 'ns:magic', 'key1'
 cf:c      timestamp=1534440690218, value=try_to_delete_me

The "Cache DATA on Read" parameter is set to false. If you have any ideas, you are welcome to discuss them in the comments.

5. Batch data processing: MultiGet/MultiPut

Processing single requests (Get/Put/Delete) is a rather expensive operation, so whenever possible you should combine them into a List of Get or a List of Put operations, which gives a significant performance gain. This is especially true for writes, but reads have the following pitfall. The graph below shows the time to read 50 records from the MemStore. The read was performed in a single thread, and the horizontal axis shows the number of keys in a request. Here you can see that as the number grows to a thousand keys per request, execution time falls, i.e. speed increases. However, with MSLAB mode enabled by default, a sharp drop in performance begins after this threshold, and the larger the amount of data in a record, the longer the run time.
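One simple way to act on this observation is to cap the number of keys per request on the client side. The helper below is a minimal sketch; the name `batches` and the cap of 1000 keys are illustrative assumptions taken from the threshold mentioned above, and the right value depends on record size and MSLAB settings.

```python
def batches(keys, batch_size):
    """Yield consecutive slices of at most batch_size keys."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(keys), batch_size):
        yield keys[start:start + batch_size]


# 2500 hypothetical row keys split into requests of at most 1000 keys each.
keys = [f"key{i}" for i in range(2500)]
chunks = list(batches(keys, 1000))
```

Each chunk would then be sent as one multi-get or multi-put request instead of issuing 2500 single calls or one oversized call.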


The tests were run on a virtual machine with 8 cores, HBase version 2.0.0-cdh6.0.0-beta1.

MSLAB mode is designed to reduce heap fragmentation, which arises from the mixing of young- and old-generation data. As a workaround, when MSLAB is enabled the data is placed into relatively small cells (chunks) and processed in chunks. As a result, when the volume of the requested data packet exceeds the allocated size, performance drops sharply. On the other hand, turning this mode off is also inadvisable, since it leads to stalls caused by GC during intensive data processing. A good solution is to increase the chunk size when writes via put are active at the same time as reads. Note that the problem does not occur if, after writing, you run the flush command, which flushes the MemStore to disk, or if you load via BulkLoad. The table below shows that queries against the MemStore for larger data (with the same number of records) lead to slowdowns. However, by increasing the chunksize we bring processing time back to normal.

In addition to increasing the chunksize, splitting the data by region helps, i.e. table splitting. This results in fewer requests arriving at each region, and if they fit into a cell, the response stays good.

6. Strategy for splitting tables into regions (splitting)

Since HBase is a key-value store and partitioning is done by key, it is extremely important to distribute data evenly across all regions. For example, splitting such a table into three parts will divide the data across three regions:

It can happen that this leads to a sharp slowdown if the data loaded later looks, for example, like long values, most of which start with the same digit, for example:

1000001
1000002
...
1100003

Since keys are stored as a byte array, they will all start the same way and belong to the same region #1, which stores this key range. There are several splitting strategies:

HexStringSplit – turns the key into a hexadecimal-encoded string in the range "00000000" => "FFFFFFFF", left-padded with zeros.

UniformSplit – turns the key into a byte array with hexadecimal encoding in the range "00" => "FF", right-padded with zeros.

In addition, you can specify any range or set of keys for splitting and configure auto-splitting. However, one of the simplest and most effective approaches is UniformSplit combined with hash concatenation: for example, take the two most significant bytes of CRC32(rowkey) and prepend them to the rowkey itself:

hash + rowkey

Then all the data will be distributed evenly across the regions. When reading, the first two bytes are simply discarded and the original key remains. The RS also monitors the amount of data and keys in a region and, when the limits are exceeded, automatically splits it into parts.
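The hash-prefix scheme can be sketched in a few lines. This is a minimal illustration, assuming the prefix is the two high-order bytes of CRC32 in big-endian order. The function names `salt`, `unsalt` and `region_of` are hypothetical, and `region_of` models a uniform split on the first key byte into 4 regions rather than real HBase region boundaries.

```python
import zlib

SALT_LEN = 2  # two most significant bytes of CRC32(rowkey)


def salt(rowkey: bytes) -> bytes:
    """Prepend the two high-order CRC32 bytes to the row key."""
    crc = zlib.crc32(rowkey)                      # unsigned 32-bit integer
    prefix = (crc >> 16).to_bytes(SALT_LEN, "big")
    return prefix + rowkey


def unsalt(stored_key: bytes) -> bytes:
    """Drop the prefix to recover the original row key."""
    return stored_key[SALT_LEN:]


def region_of(key: bytes, num_regions: int) -> int:
    """Hypothetical uniform split on the first byte of the key."""
    return key[0] * num_regions // 256


# Sequential numeric keys like the ones in the example above.
raw_keys = [str(n).encode() for n in range(1000001, 1001001)]
plain_regions = {region_of(k, 4) for k in raw_keys}
salted_regions = {region_of(salt(k), 4) for k in raw_keys}
```

Without the prefix every key starts with the byte for "1" and falls into the same region, while the salted keys spread over several regions. The trade-off is that range scans over the original key order are no longer possible without scanning every region.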

7. Fault tolerance and data locality

Since only one region is responsible for each set of keys, the solution to problems related to RS crashes or decommissioning is to store all the necessary data in HDFS. When an RS goes down, the master detects this through the absence of a heartbeat on the ZooKeeper node. It then assigns the served region to another RS, and since the HFiles are stored in a distributed file system, the new owner reads them and continues to serve the data. However, since some of the data may be in the MemStore and may not have had time to reach the HFiles, the WAL, which is also stored in HDFS, is used to restore the operation history. After the changes are replayed, the RS can respond to requests, but the move means that some of the data and the processes serving it end up on different nodes, i.e. locality decreases.

The solution to the problem is major compaction: this procedure moves files to the nodes responsible for them (where their regions are located), and as a result the load on the network and disks rises sharply while it runs. Afterwards, however, access to the data is noticeably faster. In addition, major_compaction merges all HFiles into a single file within a region and also cleans up data according to the table settings. For example, you can specify the number of versions of an object to keep, or the time to live after which the object is physically deleted.
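The merge-and-clean part of this procedure can be illustrated with a toy function. This is a sketch under simplifying assumptions, not HBase's algorithm: `major_compact`, `TOMBSTONE` and the cell layout `(rowkey, timestamp, value)` are hypothetical, a delete marker hides every version with a timestamp at or below its own, and time-to-live handling and data locality are left out.

```python
TOMBSTONE = object()  # marker standing in for a delete


def major_compact(hfiles, max_versions=1):
    """Merge all files into one, dropping deleted cells and extra versions.

    hfiles: list of lists of (rowkey, timestamp, value) cells.
    """
    by_key = {}
    for hfile in hfiles:
        for rowkey, ts, value in hfile:
            by_key.setdefault(rowkey, []).append((ts, value))

    merged = []
    for rowkey in sorted(by_key):
        # Newest first; sort on the timestamp only so markers are never compared.
        versions = sorted(by_key[rowkey], key=lambda cell: cell[0], reverse=True)
        kept = []
        for ts, value in versions:
            if value is TOMBSTONE:
                break              # everything at or below the delete is dropped
            kept.append((rowkey, ts, value))
        merged.extend(kept[:max_versions])
    return merged


hfiles = [
    [("a", 1, "a1"), ("b", 1, "b1")],
    [("a", 2, "a2"), ("b", 2, TOMBSTONE)],
    [("b", 3, "b3"), ("c", 1, TOMBSTONE)],
]
compacted = major_compact(hfiles, max_versions=1)
```

In this example three input files collapse into one: row "a" keeps only its newest version, row "b" keeps the value written after the delete, and row "c" disappears entirely.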

This procedure can have a very positive effect on HBase performance. The picture below shows how performance degraded as a result of active data writing. Here you can see 40 threads writing to one table while 40 threads read data at the same time. The writing threads generate more and more HFiles, which are read by the other threads. As a result, more and more data has to be evicted from memory, and eventually the GC kicks in, which practically paralyzes all work. Launching major compaction cleared the resulting debris and restored performance.

The test was run on 3 DataNodes and 4 RS (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads). HBase version 1.2.0-cdh5.14.2.

Note that major compaction was launched on a "live" table, into which data was actively being written and read. There was a claim online that this could lead to incorrect responses when reading data. To check, a process was started that generated new data and wrote it to the table. It then immediately read the data back and checked whether the resulting value matched what had been written. While this process was running, major compaction was run about 200 times and not a single failure was recorded. Perhaps the problem appears rarely and only under high load, so it is safer to stop the write and read processes in a planned manner and perform the cleanup to prevent such GC stalls.

Also, major compaction does not affect the state of the MemStore; to flush it to disk and have it compacted, you need to use flush (connection.getAdmin().flush(TableName.valueOf(tblName))).

8. Settings and performance

As already mentioned, HBase shows its greatest success where it does not have to do anything, that is, when running BulkLoad. Then again, this applies to most systems and people. However, this tool is better suited to bulk storage of data in large blocks, whereas if the process requires many competing read and write requests, the Get and Put commands described above are used. To determine the optimal parameters, runs were made with various combinations of table parameters and settings:

  • 10 threads were launched simultaneously 3 times in a row (let's call this a block of threads).
  • The run time of all threads in a block was averaged, and that was the final result of the block.
  • All threads worked with the same table.
  • Before each start of a thread block, a major compaction was performed.
  • Each block performed only one of the following operations:

- Put
- Get
- Get+Put

  • Each block performed 50 iterations of its operation.
  • The record size is 100 bytes, 1000 bytes or 10000 bytes (random).
  • Blocks were launched with different numbers of requested keys (either one key or 10).
  • Blocks were run under different table settings. The parameters varied:

- BlockCache = on or off
- BlockSize = 65 KB or 16 KB
- Partitions = 1, 5 or 30
- MSLAB = on or off

So a block looks like this:

a. MSLAB mode was turned on/off.
b. A table was created with the following parameters set: BlockCache = true/none, BlockSize = 65/16 KB, Partitions = 1/5/30.
c. Compression was set to GZ.
d. 10 threads were launched simultaneously, performing 1/10 put/get/get+put operations on this table with records of 100/1000/10000 bytes, executing 50 queries in a row (random keys).
e. Step d was repeated three times.
f. The run time of all threads was averaged.
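The parameter grid in the steps above can be enumerated mechanically, which is a convenient way to check how many distinct runs one operation type needs. The sketch below is illustrative only; the dictionary keys are hypothetical labels for the settings listed above, not HBase property names.

```python
from itertools import product

# Values taken from the list of varied settings; the key names are illustrative.
grid = {
    "mslab": [True, False],
    "block_cache": [True, False],
    "block_size_kb": [65, 16],
    "partitions": [1, 5, 30],
    "record_bytes": [100, 1000, 10000],
    "keys_per_request": [1, 10],
}

# One dict per combination, in the order produced by itertools.product.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(combos))  # 2 * 2 * 2 * 3 * 3 * 2 = 144
```

Each of the three operation types (Put, Get, Get+Put) would be run against every one of these combinations.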

All possible combinations were tested. It is predictable that speed will fall as record size grows, or that disabling caching will cause a slowdown. However, the goal was to understand the degree and significance of each parameter's influence, so the collected data was fed into a linear regression, which makes it possible to assess significance using t-statistics. Below are the results for the blocks performing Put operations. The full set of combinations is 2*2*2*3*3*2 = 144 options (MSLAB, BlockCache, BlockSize, partitions, record size, keys per request), plus 72, since some were run twice. So there are 216 runs in total:

The tests were run on a mini-cluster of 3 DataNodes and 4 RS (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads). HBase version 1.2.0-cdh5.14.2.

The fastest insert run, 3.7 seconds, was obtained with MSLAB mode off, on a table with one partition, with BlockCache enabled, BlockSize = 16, records of 100 bytes, 10 per batch.
The slowest insert run, 82.8 seconds, was obtained with MSLAB mode on, on a table with one partition, with BlockCache enabled, BlockSize = 16, records of 10000 bytes, 1 per batch.

Now let's look at the model. We see good model quality in terms of R2, but it is quite clear that extrapolation is contraindicated here. The real behaviour of the system as parameters change will not be linear; this model is needed not for prediction but for understanding what happened within the given parameters. For example, here we see from Student's criterion that the BlockSize and BlockCache parameters do not matter for the Put operation (which is generally quite predictable):

But the fact that increasing the number of partitions leads to a decrease in performance is somewhat unexpected (we have already seen the positive effect of increasing the number of partitions with BulkLoad), although understandable. First, to process a request you have to generate requests to 30 regions instead of one, and the data volume is not large enough for this to pay off. Second, the total run time is determined by the slowest RS, and since the number of DataNodes is smaller than the number of RSs, some regions have zero locality. Let's look at the top five:

Now let's evaluate the results of running the Get blocks:

The number of partitions has lost significance, which is probably explained by the fact that the data is cached well and the read cache is the most significant parameter (statistically). Naturally, increasing the number of messages in a request also helps performance a great deal. The top results:

Finally, let's look at the model for the block that first performed get and then put:

All parameters are significant here. And the leaders' results:


9. Load testing

Finally, we will launch a more or less decent load, but it is always more interesting when you have something to compare against. On the website of DataStax, the key developer of Cassandra, there are load-testing results for a number of NoSQL stores, including HBase version 0.98.6-1. Loading was carried out by 40 threads, with a data size of 100 bytes, on SSD disks. Testing Read-Modify-Write operations gave the following results.

As far as I understand, reads were carried out in blocks of 100 records, and for 16 HBase nodes the DataStax test showed a performance of 10 thousand operations per second.

It is fortunate that our cluster also has 16 nodes, but not so "fortunate" that each has 64 cores (threads), while the DataStax test has only 4. On the other hand, they have SSD drives, while we have HDDs. We also have a newer version of HBase, and CPU utilization during the load barely increased (visually by 5-10 percent). Still, let's try running on this configuration. Table settings are the defaults, and MSLAB is enabled. Reading is done over the key range from 0 to 50 million at random (that is, essentially new keys every time). The table holds 50 million records, divided into 64 partitions. The keys are hashed using crc32. 40 threads are launched; each thread reads a set of 100 random keys and immediately writes 100 generated bytes back to those keys.

Test bench: 16 DataNodes and 16 RS (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads). HBase version 1.2.0-cdh5.14.2.

The average result is close to 40 thousand operations per second, which is significantly better than in the DataStax test. However, for experimental purposes, we can change the conditions a little. It is rather unlikely that all work will be done against a single table only, and only against unique keys. Let's assume there is a certain "hot" set of keys that generates the main load. So let's try to create a load with larger records (10 KB), also in batches of 100, across 4 different tables, and narrow the range of requested keys to 50 thousand. The graph below shows a run of 40 threads; each thread reads a set of 100 keys and immediately writes a random 10 KB back to those keys.

Test bench: 16 DataNodes and 16 RS (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads). HBase version 1.2.0-cdh5.14.2.

During the load, major compaction was launched several times. As shown above, without this procedure performance gradually degrades, although additional load also arises while it runs. The drawdowns are caused by various reasons. Sometimes the threads finished their work and there was a pause while they were restarted, and sometimes third-party applications put load on the cluster.

Reading and immediately writing is one of the hardest workloads for HBase. If you issue only small put requests, for example 100 bytes, combining them into batches of 10-50 thousand, you can get hundreds of thousands of operations per second, and the situation is similar for read-only requests. Note that the results are radically better than those obtained by DataStax, above all because of requests in blocks of 50 thousand.

Test bench: 16 DataNodes and 16 RS (CPU Xeon E5-2680 v4 @ 2.40GHz * 64 threads). HBase version 1.2.0-cdh5.14.2.

10. Conclusions

This system is configured quite flexibly, but the influence of a large number of parameters still remains unknown. Some of them were tested but were not included in the resulting test set. For example, preliminary experiments showed that a parameter such as DATA_BLOCK_ENCODING, which encodes information using values from neighbouring cells, has little impact, which is understandable for randomly generated data. If you use a large number of repeating objects, the gain may be significant. In general, we can say that HBase gives the impression of a fairly serious and well-thought-out database, which can be quite productive when performing operations on large blocks of data, especially if it is possible to separate the reading and writing processes in time.

If something, in your opinion, is not covered well enough, I am ready to explain it in more detail. We invite you to share your experience or to argue if you disagree with something.

Source: www.habr.com
