Features of designing a data model for NoSQL

Introduction

"You have to run as fast as you can just to stay in place,
and to get somewhere, you must run at least twice as fast!"
(c) Alice in Wonderland

Some time ago I was asked to give a lecture to our company's analysts on designing data models, because when we sit on projects for a long time (sometimes several years) we lose sight of what is happening around us in the IT world. In our company (it just so happens) many projects do not use NoSQL databases (at least for now), so in my lecture I paid separate attention to them, using HBase as an example, and tried to aim the material at those who have never worked with them. In particular, I illustrated some features of data model design with an example I read several years ago in the article "Introduction to HBase Schema Design" by Amandeep Khurana. When analyzing the examples, I compared several options for solving the same problem in order to convey the main ideas to the audience better.

Recently, "having nothing to do" (the long May weekend in self-isolation is well suited for this), I asked myself: how closely will the theoretical calculations match practice? That, in fact, is how the idea for this article was born. A developer who has been working with NoSQL for a while may not learn anything new from it (and so may skip half of the article right away). But for analysts who have not yet worked closely with NoSQL, I think it will be useful for getting a basic understanding of the features of designing data models for HBase.

Analysis of the example

In my opinion, before you start using NoSQL databases, you need to think carefully and weigh the pros and cons. Often the problem can be solved with traditional relational DBMSs. So it is better not to use NoSQL without significant reasons. If you have nevertheless decided to use a NoSQL database, keep in mind that the design approaches here are somewhat different. Some of them in particular may be unusual for those who have previously dealt only with relational DBMSs (in my observation). In the "relational" world, we usually start by modeling the problem domain and only then, if necessary, denormalize the model. In NoSQL we have to take the expected data-access scenarios into account right away and denormalize the data from the start. There are a number of other differences as well, which are discussed below.

Let us consider the following "synthetic" problem, which we will keep working with:

We need to design a storage structure for the friend lists of users of some abstract social network. To keep things simple, we assume that all our connections are directed (as on Instagram, not LinkedIn). The structure should let us efficiently:

  • Answer the question of whether user A follows user B (read pattern)
  • Add/remove a connection when user A subscribes to / unsubscribes from user B (data modification pattern)

Of course, there are many ways to solve this problem. In a regular relational database, we would most likely just create a table of relationships (possibly typed if, for example, we need to store the user group this "friend" belongs to: family, work, etc.), and add indexes/partitioning to optimize access speed. The final table would most likely look something like this:

user_id    friend_id
Vasya      Petya
Vasya      Olya

From here on, for clarity and easier reading, I will show names instead of IDs.
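As a point of comparison, the relational variant can be sketched with Python's built-in sqlite3 module. The table name `friendship` and the helper `is_following` are hypothetical names chosen for illustration; the composite primary key plays the role of the index mentioned above.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friendship ("
    " user_id TEXT NOT NULL,"
    " friend_id TEXT NOT NULL,"
    " PRIMARY KEY (user_id, friend_id))"  # composite key doubles as the lookup index
)
conn.executemany(
    "INSERT INTO friendship (user_id, friend_id) VALUES (?, ?)",
    [("Vasya", "Petya"), ("Vasya", "Olya")],
)


def is_following(user_id: str, friend_id: str) -> bool:
    """Return True if user_id follows friend_id."""
    row = conn.execute(
        "SELECT 1 FROM friendship WHERE user_id = ? AND friend_id = ?",
        (user_id, friend_id),
    ).fetchone()
    return row is not None


print(is_following("Vasya", "Olya"))   # True
print(is_following("Petya", "Vasya"))  # False
```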

In the case of HBase, we know that:

  • an efficient lookup that does not turn into a full table scan is possible only by key
    • in fact, this is why writing the SQL queries familiar to many against such databases is a bad idea; technically, of course, you can send an SQL query with joins and other logic to HBase from, say, Impala, but how efficiently will it run...

Therefore, we are forced to use the user ID as the key. And the first thought on "where and how do we store the friends' IDs?" is probably to store them in columns. This most obvious and "naive" option looks like this (let us call it Option 1 (default) for further reference):

RowKey    Columns
Vasya     1: Petya    2: Olya    3: Dasha
Petya     1: Masha    2: Vasya

Here, each row corresponds to one network user. The columns are named 1, 2, ... according to the number of friends, and the friends' IDs are stored in the columns. It is important to note that each row will have a different number of columns. In the example above, one row has three columns (1, 2 and 3), while the second has only two (1 and 2). Here we have used two HBase properties that relational databases do not have:

  • the ability to change the set of columns dynamically (add a friend -> add a column, remove a friend -> remove a column)
  • different rows can have different sets of columns

Let us check our structure against the requirements of the task:

  • Reading data: to find out whether Vasya follows Olya, we need to read the entire row by the key RowKey = "Vasya" and iterate over the column values until we "meet" Olya in them. Or iterate over the values of all columns, "not meet" Olya, and return False;
  • Modifying data: adding a friend: for this task we also need to read the entire row by the key RowKey = "Vasya" in order to count the total number of his friends. We need this total to determine the number of the column into which the new friend's ID is written.
  • Modifying data: removing a friend:
    • We need to read the entire row by the key RowKey = "Vasya" and iterate over the columns to find the one that holds the friend to be removed;
    • Next, after removing the friend, we need to "shift" all the data by one column so that there are no "gaps" in the numbering.

Now let us estimate how efficient these algorithms, which we will have to implement on the side of the "conditional application", will be, using O-notation. Let us denote the size of our hypothetical social network as n. Then the maximum number of friends a single user can have is (n-1). We can neglect this (-1) for our purposes, since it is insignificant within O-notation.

  • Reading data: in the worst case we need to read the entire row and iterate over all of its columns. So the upper bound on the cost is approximately O(n)
  • Modifying data: adding a friend: to determine the number of friends, you need to iterate over all the columns of the row and then insert a new column => O(n)
  • Modifying data: removing a friend:
    • Same as adding: in the worst case you need to go through all the columns => O(n)
    • After removing a column, we need to "shift" the others. If you implement this "head-on", in the worst case you will need up to (n-1) operations. But here and later in the practical part we will use a different approach that implements a "pseudo-shift" in a fixed number of operations, that is, it takes constant time regardless of n. This constant time (O(2), to be exact) can be neglected compared to O(n). The approach is shown in the figure below: we simply copy the data from the "last" column into the one we want to delete the data from, and then delete the last column:
      [Figure: the "pseudo-shift": the last column is copied over the removed one, then the last column is deleted]
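The Option 1 operations, including the constant-time "pseudo-shift", can be sketched on a plain Python dict that stands in for one HBase row (column number -> friend ID). This is only an in-memory model under that assumption, not the code used in the experiment, and the function names are hypothetical.

```python
from typing import Dict

# One HBase row is modeled as a dict: column number -> friend ID.
# This is an in-memory stand-in, not a real HBase client.
Row = Dict[int, str]


def is_following(row: Row, friend: str) -> bool:
    # Worst case: every column of the row is inspected => O(n).
    return any(value == friend for value in row.values())


def add_friend(row: Row, friend: str) -> None:
    # The next column number depends on how many columns already exist.
    # Against HBase that means reading the whole row first => O(n).
    next_col = len(row) + 1
    row[next_col] = friend


def remove_friend(row: Row, friend: str) -> None:
    # Find the column that holds the friend => O(n).
    target = next((col for col, value in row.items() if value == friend), None)
    if target is None:
        return
    last = len(row)
    # "Pseudo-shift": copy the last column over the removed one, then drop
    # the last column. Two operations regardless of the row size.
    row[target] = row[last]
    del row[last]


vasya = {1: "Petya", 2: "Olya", 3: "Dasha"}
remove_friend(vasya, "Petya")
print(vasya)  # {1: 'Dasha', 2: 'Olya'}
add_friend(vasya, "Masha")
print(is_following(vasya, "Olya"), is_following(vasya, "Petya"))  # True False
```

Note that the pseudo-shift does not preserve the order of friends, which is acceptable here because the task never asks for an ordered list.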

In total, in all scenarios we got an asymptotic computational complexity of O(n).
You have probably already noticed that we almost always have to read the entire row from the database, and in two cases out of three only to go through all the columns and count the total number of friends. So, as an optimization attempt, you can add a "count" column that stores the total number of friends of each network user. In this case we do not have to read the entire row to count the friends, only the single "count" column. The main thing is not to forget to update "count" when manipulating the data. That is, we get the improved Option 2 (count):

RowKey    Columns
Vasya     1: Petya    2: Olya    3: Dasha    count: 3
Petya     1: Masha    2: Vasya               count: 2

Compared to the first option:

  • Reading data: for answering the question "Does Vasya follow Olya?" nothing has changed => O(n)
  • Modifying data: adding a friend: we have simplified inserting a new friend, since we no longer need to read the entire row and iterate over its columns; we can read just the value of the "count" column and thereby immediately determine the column number for the new friend. This reduces the computational complexity to O(1)
  • Modifying data: removing a friend: when removing a friend, we can also use this column to reduce the number of I/O operations when "shifting" the data one cell to the left. But the need to iterate over the columns to find the one to delete remains, so => O(n)
  • On the other hand, when updating the data we now have to update the "count" column every time, but this takes constant time, which can be neglected within O-notation.
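The effect of the "count" column can be sketched the same way, again on a dict standing in for one HBase row. The helper names are hypothetical, and the sketch only models the logic: adding reads a single column, while checking and removing still scan.

```python
# One HBase row modeled as a dict: numbered columns hold friend IDs and the
# extra "count" column holds how many friends are stored (in-memory stand-in).
def new_row() -> dict:
    return {"count": 0}


def add_friend(row: dict, friend: str) -> None:
    # Read only "count" to learn the next column number => O(1).
    next_col = row["count"] + 1
    row[next_col] = friend
    row["count"] = next_col  # second write: keep the counter in sync


def is_following(row: dict, friend: str) -> bool:
    # Still a scan over the numbered columns => O(n).
    return any(row[col] == friend for col in range(1, row["count"] + 1))


def remove_friend(row: dict, friend: str) -> None:
    last = row["count"]
    # Finding the column to delete is still a scan => O(n).
    for col in range(1, last + 1):
        if row[col] == friend:
            row[col] = row[last]  # pseudo-shift using the known last column
            del row[last]
            row["count"] = last - 1
            return


petya = new_row()
add_friend(petya, "Masha")
add_friend(petya, "Vasya")
remove_friend(petya, "Masha")
print(petya)  # {'count': 1, 1: 'Vasya'}
```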

Overall, option 2 looks somewhat more optimal, but it is more of an "evolution instead of a revolution". To make a "revolution" we need Option 3 (column).
Let us turn everything "upside down": we will use the user ID as the column name! What is written in the column itself no longer matters to us; let it be the number 1 (in general, something useful can be stored there, for example the group "family/friends/etc."). This approach may surprise an unprepared "layperson" with no prior experience with NoSQL databases, but it is exactly this approach that lets you use the potential of HBase most efficiently for this task:

RowKey    Columns
Vasya     Petya: 1    Olya: 1    Dasha: 1
Petya     Masha: 1    Vasya: 1

Here we get several advantages at once. To understand them, let us analyze the new structure and estimate the computational complexity:

  • Reading data: to answer whether Vasya follows Olya, it is enough to read the single column "Olya": if it exists, the answer is True, otherwise False => O(1)
  • Modifying data: adding a friend: just add a new column "friend ID" => O(1)
  • Modifying data: removing a friend: just delete the friend ID column => O(1)
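A minimal sketch of Option 3, with the table modeled as a dict of dicts (row key -> {column name -> value}). This is an in-memory stand-in with hypothetical function names, not the article's implementation.

```python
from collections import defaultdict

# Table modeled as: row key -> {column name -> value}. In-memory stand-in only.
table = defaultdict(dict)


def add_friend(user: str, friend: str) -> None:
    # The friend's ID is the column name; the value 1 is just a placeholder.
    table[user][friend] = 1


def is_following(user: str, friend: str) -> bool:
    # A single-column lookup: no scan over the other columns.
    return friend in table.get(user, {})


def remove_friend(user: str, friend: str) -> None:
    # Delete just that one column; a missing column is ignored.
    table.get(user, {}).pop(friend, None)


for name in ("Petya", "Olya", "Dasha"):
    add_friend("Vasya", name)
remove_friend("Vasya", "Dasha")
print(is_following("Vasya", "Olya"), is_following("Vasya", "Dasha"))  # True False
```

With happybase these three operations would presumably map onto `Table.put`, `Table.row` with a `columns` argument, and `Table.delete` with a `columns` argument; that mapping is an assumption about one possible implementation.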

As you can see, a significant advantage of this storage model is that in all the scenarios we need, we work with only one column, avoiding reading the entire row from the database and, moreover, enumerating all the columns of that row. We could stop there, but...

You can get curious and go a little further along the path of optimizing performance and reducing I/O operations when accessing the database. What if we store the complete relationship information directly in the row key itself? That is, make the key composite, like userID.friendID? In this case, we do not even need to read the columns of the row at all (Option 4 (row)):

RowKey         Columns
Vasya.Petya    Petya: 1
Vasya.Olya     Olya: 1
Vasya.Dasha    Dasha: 1
Petya.Masha    Masha: 1
Petya.Vasya    Vasya: 1

Obviously, the estimate for all data manipulation scenarios in such a structure, as in the previous option, will be O(1). The difference from option 3 lies only in the efficiency of the I/O operations in the database.
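The composite key of Option 4 can be sketched as follows. The table is again an in-memory dict (row key -> row contents), the function names are hypothetical, and the sketch assumes that user IDs never contain the separator character.

```python
SEPARATOR = "."  # assumption: user IDs never contain this character

# Table modeled as a dict: composite row key -> row contents (in-memory stand-in).
table = {}


def row_key(user: str, friend: str) -> str:
    return f"{user}{SEPARATOR}{friend}"


def add_friend(user: str, friend: str) -> None:
    # One row per relationship; the column content mirrors the table above.
    table[row_key(user, friend)] = {friend: 1}


def is_following(user: str, friend: str) -> bool:
    # Only the existence of the row key matters; no columns are inspected.
    return row_key(user, friend) in table


def remove_friend(user: str, friend: str) -> None:
    # Removing a relationship means removing the whole row.
    table.pop(row_key(user, friend), None)


add_friend("Vasya", "Petya")
add_friend("Vasya", "Olya")
add_friend("Petya", "Vasya")
print(sorted(table))  # ['Petya.Vasya', 'Vasya.Olya', 'Vasya.Petya']
```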

Well, and a final flourish. It is easy to see that in option 4 the row key has variable length, which could potentially affect performance (recall that HBase stores data as a set of bytes and that rows in tables are sorted by key). Plus we have a separator that may need to be handled in some scenarios. To eliminate this influence, you can use hashes of userID and friendID, and since both hashes have a constant length, you can simply concatenate them without a separator. The data in the table will then look like this (Option 5 (hash)):

RowKey                                                              Columns
dc084ef00e94aef49be885f9b01f51c01918fa783851db0dc1f72f83d33a5994    Petya: 1
dc084ef00e94aef49be885f9b01f51c0f06b7714b5ba522c3cf51328b66fe28a    Olya: 1
dc084ef00e94aef49be885f9b01f51c00d2c2e5d69df6b238754f650d56c896a    Dasha: 1
1918fa783851db0dc1f72f83d33a59949ee3309645bd2c0775899fca14f311e1    Masha: 1
1918fa783851db0dc1f72f83d33a5994dc084ef00e94aef49be885f9b01f51c0    Vasya: 1
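Building such a key can be sketched with the standard hashlib module. The UTF-8 encoding of the IDs and the hexadecimal representation are assumptions made for this illustration, and the helper names are hypothetical.

```python
import hashlib


def md5_hex(value: str) -> str:
    # MD5 is used here only as a fixed-length fingerprint, not for security.
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def row_key(user_id: str, friend_id: str) -> str:
    # Two 32-character hex digests concatenated without a separator.
    return md5_hex(user_id) + md5_hex(friend_id)


key_a = row_key("Vasya", "Petya")
key_b = row_key("Vasya", "Olya")
print(len(key_a), len(key_b))    # 64 64
print(key_a[:32] == key_b[:32])  # True: same user, same key prefix
```

Because the first half of the key depends only on the user, all relationships of one user still share a common key prefix.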

Obviously, the algorithmic complexity of working with such a structure in the scenarios we are considering will be the same as for option 4, that is, O(1).
To sum up, let us collect all our computational complexity estimates in one table:

Option                Adding a friend    Checking a friend    Removing a friend
Option 1 (default)    O(n)               O(n)                 O(n)
Option 2 (count)      O(1)               O(n)                 O(n)
Option 3 (column)     O(1)               O(1)                 O(1)
Option 4 (row)        O(1)               O(1)                 O(1)
Option 5 (hash)       O(1)               O(1)                 O(1)

As you can see, options 3-5 look the most preferable and in theory ensure that all the required data manipulation scenarios run in constant time. The conditions of our task contain no explicit requirement to get a list of all of a user's friends, but in real project work it would be good for us, as good analysts, to "anticipate" that such a task may arise and to play it safe in advance. Therefore, my sympathies are on the side of option 3. But it is quite possible that in a real project this request has already been solved by other means, so without a general view of the whole problem it is better not to draw final conclusions.

Preparing the experiment

I would like to test the theoretical arguments above in practice; that was the purpose of the idea that came up over the long weekend. To do this, we need to measure the speed of our "conditional application" in all the described database usage scenarios, as well as how this time grows as the size of the social network (n) increases. The target parameter that interests us, and that we will measure during the experiment, is the time the "conditional application" spends on one "business operation". By "business operation" we mean one of the following:

  • Adding one new friend
  • Checking whether user A is a friend of user B
  • Removing one friend

Thus, taking into account the requirements outlined in the original statement, the verification scenario looks like this:

  • Writing data. Randomly generate an initial network of size n. To get closer to the "real world", the number of friends each user has is also a random variable. Measure the time it takes our "conditional application" to write all the generated data to HBase. Then divide the resulting time by the total number of friends added; this gives the average time of one "business operation".
  • Reading data. For each user, create a list of "persons" for which we need an answer on whether the user follows them or not. The length of the list is approximately the number of the user's friends; for half of the checked persons the answer should be "Yes", and for the other half "No". The check is performed in such an order that the "Yes" and "No" answers alternate (that is, in every second case we have to go through all the columns of the row for options 1 and 2). The total check time is then divided by the number of persons checked to get the average check time per subject.
  • Deleting data. Remove all friends from the user. The deletion order is random (that is, we "shuffle" the original list used to write the data). The total time is then divided by the number of friends removed to get the average time per removal.
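The "total time divided by the number of operations" measurement described above can be sketched as a small helper. The name `average_time` and the use of `time.perf_counter` are assumptions for illustration; the source does not say how the timing was implemented.

```python
import time
from typing import Callable, Iterable, Tuple


def average_time(operation: Callable[[int, int], object],
                 pairs: Iterable[Tuple[int, int]]) -> float:
    """Run operation(user, friend) for every pair and return seconds per call."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    start = time.perf_counter()
    for user, friend in pairs:
        operation(user, friend)
    elapsed = time.perf_counter() - start
    return elapsed / len(pairs)


# Example with a trivial in-memory "store" in place of HBase.
store = set()
pairs = [(0, 1), (1, 4), (1, 5)]
avg = average_time(lambda u, f: store.add((u, f)), pairs)
print(f"{avg:.9f} s per operation")
```

The same helper can wrap the add, check and remove functions of any of the five options, so that all of them are measured in the same way.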

The scenarios need to be run for each of the 5 data model options and for different sizes of the social network, to see how the time changes as it grows. Within one n, the connections in the network and the list of users to check must, of course, be the same for all 5 options.
For a better understanding, below is an example of generated data for n = 5. The "generator" I wrote produces three ID dictionaries as output:

  • the first is for insertion
  • the second is for checking
  • the third is for deletion

{0: [1], 1: [4, 5, 3, 2, 1], 2: [1, 2], 3: [2, 4, 1, 5, 3], 4: [2, 1]} # 15 friends in total

{0: [1, 10800], 1: [5, 10800, 2, 10801, 4, 10802], 2: [1, 10800], 3: [3, 10800, 1, 10801, 5, 10802], 4: [2, 10800]} # 18 subjects to check in total

{0: [1], 1: [1, 3, 2, 5, 4], 2: [1, 2], 3: [4, 1, 2, 3, 5], 4: [1, 2]} # 15 friends in total

As you can see, all IDs greater than 10 in the dictionary for checking (here 10800 and above) are exactly the ones that will give the answer False. Insertion, checking and deletion of "friends" are performed exactly in the order specified in the dictionary.
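A generator producing dictionaries of this shape might look roughly like the sketch below. It is reconstructed only from the sample above and is not the author's code: the function name `generate`, the seed, and the `missing_base` offset of 10800 are assumptions taken from, or invented around, the example data.

```python
import random


def generate(n: int, seed: int = 0, missing_base: int = 10800):
    """Return (to_insert, to_check, to_delete) dictionaries keyed by user ID."""
    rng = random.Random(seed)
    to_insert, to_check, to_delete = {}, {}, {}
    for user in range(n):
        # Random number of friends per user, drawn from IDs 1..n.
        friends = rng.sample(range(1, n + 1), rng.randint(1, n))
        to_insert[user] = friends

        # Alternate an existing friend ("Yes") with a missing ID ("No").
        existing = rng.sample(friends, (len(friends) + 1) // 2)
        checks = []
        for i, friend in enumerate(existing):
            checks += [friend, missing_base + i]
        to_check[user] = checks

        # Same friends, shuffled, for the deletion scenario.
        shuffled = friends[:]
        rng.shuffle(shuffled)
        to_delete[user] = shuffled
    return to_insert, to_check, to_delete


to_insert, to_check, to_delete = generate(5)
print(to_insert)
print(to_check)
print(to_delete)
```

Fixing the seed keeps the generated network identical across runs, which matches the requirement that all five options are tested on the same data.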

The experiment was carried out on a laptop running Windows 10, where HBase ran in one Docker container and Python with Jupyter Notebook in another. Docker was allocated 2 CPU cores and 2 GB of RAM. All the logic, both the emulation of the "conditional application" and the "plumbing" for generating test data and measuring time, was written in Python. The happybase library was used to work with HBase, and hashlib to compute the hashes (MD5) for option 5.

Taking into account the computing power of this particular laptop, runs for n = 10, 30, … 170 were chosen experimentally, such that the total running time of the full test cycle (all scenarios for all options for all n) was more or less reasonable and fit into one tea break (15 minutes on average).

It should be noted here that in this experiment we are not primarily evaluating absolute performance figures. Even a relative comparison of two different options may not be entirely correct. Right now we are interested in how the time changes as a function of n, because with the "test bench" configuration described above it is very difficult to obtain time estimates "cleansed" of the influence of random and other factors (and no such task was set).

Experiment results

The first test is how the time spent on filling the friend list changes. The result is in the graph below.
[Graph: average time to add one friend versus network size n, options 1-5]
Options 3-5, as expected, show an almost constant "business operation" time that does not depend on the growth of the network size, with an indistinguishable difference in performance among them.
Option 2 also shows constant but somewhat worse performance, almost exactly 2 times worse than options 3-5. And this cannot but please, since it correlates with the theory: in this option the number of I/O operations to/from HBase is exactly 2 times larger. This can serve as indirect evidence that our test bench gives, in principle, good accuracy.
Option 1 also, as expected, turns out to be the slowest and shows a linear increase in the time spent on adding as the network size grows.
Now let us look at the results of the second test.
[Graph: average time to check one friend versus network size n, options 1-5]
Options 3-5 again behave as expected: constant time, independent of the network size. Options 1 and 2 show a linear increase in time as the network size grows, and similar performance. Moreover, option 2 turns out to be slightly slower, apparently because of the need to read and process the additional "count" column, which becomes more noticeable as n grows. But I will still refrain from drawing any conclusions, since the accuracy of this comparison is relatively low. In addition, these ratios (which option, 1 or 2, is faster) changed from run to run (while keeping the nature of the dependence and going "neck and neck").

Well, the last graph is the result of the deletion test.

[Graph: average time to remove one friend versus network size n, options 1-5]

Again, no surprises here. Options 3-5 perform deletion in constant time.
Interestingly, options 4 and 5, unlike in the previous scenarios, show noticeably worse performance than option 3. Apparently, deleting a row is more expensive than deleting a column, which is generally logical.

Options 1 and 2, as expected, show a linear increase in time. At the same time, option 2 is consistently slower than option 1, because of the additional I/O operation to "maintain" the count column.

General conclusions of the experiment:

  • Options 3-5 show greater efficiency because they use the strengths of HBase; moreover, their performance differs from one another by a constant and does not depend on the network size.
  • No difference between options 4 and 5 was recorded. But this does not mean that option 5 should not be used. It is likely that the experimental scenario used, given the performance characteristics of the test bench, did not allow it to be detected.
  • The nature of the growth in the time required to perform "business operations" on the data generally confirmed the previously obtained theoretical calculations for all options.

Epilogue

The rough experiments carried out here should not be taken as absolute truth. There are many factors that were not taken into account and that distorted the results (these fluctuations are especially visible in the graphs at small network sizes). For example, the speed of Thrift, which happybase uses; the volume of the logic I wrote in Python and the way it was implemented (I cannot claim that the code was written optimally and made effective use of the capabilities of all the components); possibly the caching behavior of HBase; background activity of Windows 10 on my laptop; and so on. In general, we can consider that all the theoretical calculations have experimentally demonstrated their validity. Or, at least, it was not possible to refute them with such a "head-on attack".

In conclusion, recommendations for everyone who is just starting to design data models in HBase: abstract away from your previous experience with relational databases and remember the "commandments":

  • When designing, we start from the task and the data manipulation patterns, not from the domain model
  • Efficient access (without a full table scan) is possible only by key
  • Denormalization
  • Different rows can contain different columns
  • Dynamic composition of columns

Source: www.habr.com
