Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Hayi Habr! Ndinikezela ingqalelo yakho inguqulelo yenqaku
"Isebenza njani idatabase yobudlelwane".

Xa kufikwa kumaziko edatha obudlelwane andikwazi ukunceda kodwa ndicinga ukuba kukho into engekhoyo. Zisetyenziswa kuyo yonke indawo. Kukho idatabase ezininzi ezahlukeneyo ezikhoyo, ukusuka kwiSQLite encinci kwaye iluncedo ukuya kwiTeradata enamandla. Kodwa kukho amanqaku ambalwa kuphela achaza indlela i-database esebenza ngayo. Unokuzikhangela usebenzisa "umsebenzi wedatha yelaaarelational" ukubona ukuba zimbalwa kangakanani iziphumo. Ngaphezu koko, la manqaku mafutshane. Ukuba ujonge itekhnoloji ye-buzzy yamva nje (iBigData, iNoSQL okanye iJavaScript), uya kufumana amanqaku anzulu achaza indlela asebenza ngayo.

Ngaba oovimba beenkcukacha zobudlelwane badala kakhulu kwaye bayadika kakhulu ukuba bangachazwa ngaphandle kwezifundo zaseyunivesithi, amaphepha ophando kunye neencwadi?

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Njengomphuhlisi, ndiyakuthiya ukusebenzisa into endingayiqondiyo. Kwaye ukuba i-database isetyenziswe ngaphezu kweminyaka engama-40, kufuneka kubekho isizathu. Ukutyhubela iminyaka, ndichithe amakhulu eeyure ukuqonda ngokwenene ezi bhokisi zimnyama zingaqhelekanga ndizisebenzisa yonke imihla. Oovimba beenkcukacha zonxibelelwano umdla kakhulu kuba isekelwe kwiingcamango eziluncedo nezisebenzisekayo. Ukuba unomdla wokuqonda i-database, kodwa awuzange ube nexesha okanye utyekelo lokungena kwesi sihloko sibanzi, kuya kufuneka uyonwabele eli nqaku.

Nangona isihloko seli nqaku sicacile, injongo yeli nqaku asikokuqonda indlela yokusebenzisa uvimba weenkcukacha. Ngoko ke, Kuya kufuneka sele uyazi ukubhala isicelo soqhagamshelo esilula kunye nemibuzo esisiseko ULAWULO; kungenjalo unokungaqondi eli nqaku. Yiloo nto kuphela ekufuneka uyazi, ndiza kukucacisa okuseleyo.

Ndiza kuqala ngezinto ezisisiseko zesayensi yekhompyuter, njengokuntsokotha kwexesha le-algorithms (BigO). Ndiyazi ukuba abanye benu bayayithiya le ngcamango, kodwa ngaphandle kwayo awuyi kukwazi ukuqonda izinto eziyinkimbinkimbi ngaphakathi kwedatha. Kuba lo ngumxholo omkhulu, Ndiza kugxila into endicinga ukuba ibalulekile: iqhuba njani idatabase SQL ukubuza. Ndizakwazisa nje iikhonsepthi zesiseko sedatabaseukuze ekupheleni kwenqaku ube nombono wento eyenzekayo phantsi kwe-hood.

Ekubeni eli linqaku elide kunye nelobugcisa elibandakanya i-algorithms eninzi kunye nezakhiwo zedatha, thatha ixesha lakho ukulifunda. Ezinye iingqiqo zisenokuba nzima ukuziqonda; ungazitsiba kwaye ufumane umbono jikelele.

Ngolwazi ngakumbi phakathi kwenu, eli nqaku lahlulwe laba ziinxalenye ezi-3:

  • Amagqabantshintshi amanqanaba asezantsi kunye nenqanaba eliphezulu lamacandelo edatha
  • Amagqabantshintshi ngeNkqubo yokuLungisa uMbuzo
  • Isishwankathelo seNtengiselwano kunye noLawulo lwePool yeSithinteli

Buyela kwiSiseko

Kwiminyaka eyadlulayo (kwigalaksi ekude, kude lee...), abaphuhlisi kwakufuneka bazi ngokuthe ngqo inani lemisebenzi ababeyibhala. Babeyazi i-algorithms yabo kunye nezakhiwo zedatha ngentliziyo kuba babengenakukwazi ukuchitha i-CPU kunye nememori yeekhompyutheni zabo ezicothayo.

Kweli candelo, ndiza kukukhumbuza ngezinye zezi ngqiqo njengoko zibalulekile ekuqondeni uvimba weenkcukacha. Ndiza kwazisa nombono isalathisi sedatabase.

O(1) vs O(n2)

Kule mihla, abaphuhlisi abaninzi abakhathali malunga nobunzima bexesha le-algorithms ... kwaye banyanisile!

Kodwa xa ujongene nedatha eninzi (andithethi ngamawaka) okanye ukuba uyasokola kwi-milliseconds, iba yinto ebalulekileyo ukuqonda le ngcamango. Kwaye njengoko unokucinga, oovimba beenkcukacha kufuneka bajongane nezi meko zombini! Andizukwenza ukuba uchithe naliphi na ixesha elingakumbi kuneliyimfuneko ukufumana inqaku. Oku kuya kusinceda ukuba siqonde ingqikelelo yolungiselelo olusekwe kwiindleko kamva (indleko Sekelwe Ukulungiswa).

Umxholo

Ubunzima bexesha le-algorithm isetyenziselwe ukubona ukuba izakuthatha ixesha elingakanani ukwenza i-algorithm yesixa esinikiweyo sedata. Ukuchaza oku kuntsonkotha, sisebenzisa ubhalo lwemathematika u-O omkhulu. Olu bhalo lusetyenziswa nomsebenzi ochaza ukuba mingaphi imisebenzi efunwa yi-algorithm kwinani elinikiweyo lamagalelo.

Umzekelo, xa ndisithi "le algorithm inobunzima O(some_function())", ithetha ukuba i-algorithm ifuna some_function(a_certain_amount_of_data) imisebenzi ukuqhubekekisa isixa esithile sedatha.

ngoko ke Ayililo inani ledatha elibalulekileyo**, kungenjalo ** indlela inani lemisebenzi eyonyuka ngayo ngokunyusa umthamo wedatha. Ixesha elintsonkothileyo aliboneleli ngenani elichanekileyo lemisebenzi, kodwa yindlela elungileyo yokuqikelela ixesha lokwenziwa.

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Kule grafu ungabona inani lemisebenzi ngokuchasene nomthamo wedatha yegalelo kwiindidi ezahlukeneyo ze-algorithm yexesha elinzima. Ndisebenzise isikali se-logarithmic ukuzibonisa. Ngamanye amazwi, inani ledatha likhula ngokukhawuleza ukusuka kwi-1 ukuya kwibhiliyoni e-1. Siyabona ukuba:

  • I-O (1) okanye ukuntsonkotha okuthe gqolo kuhlala kungatshintshi (kungenjalo bekungayi kubizwa ngokuba yinto entsonkothileyo).
  • O(ungene(n)) ihlala iphantsi nangona iibhiliyoni zedatha.
  • Olona bunzima - O(n2), apho inani lemisebenzi likhula ngokukhawuleza.
  • Ezinye iingxaki ezimbini zonyuka ngokukhawuleza.

U mzekelo

Ngomlinganiselo omncinci wedatha, umahluko phakathi kwe-O (1) kunye ne-O (n2) ayinanto. Umzekelo, masithi une-algorithm efuna ukucubungula izinto ezingama-2000.

  • I-algorithm ye-O (1) iya kuxabisa ukusebenza kwe-1
  • I-O (log(n)) i-algorithm iya kuxabisa imisebenzi esi-7
  • I-algorithm ye-O(n) iya kuxabisa imisebenzi engama-2
  • I-algorithm ye-O(n*log(n)) iya kuxabisa imisebenzi eyi-14
  • I-algorithm ye-O(n2) iya kuxabisa imisebenzi ye-4

Umahluko phakathi kwe-O(1) kunye ne-O(n2) ibonakala inkulu (izigidi ezi-4 zemisebenzi) kodwa uya kuphulukana nobuninzi be-2 ms, ixesha nje lokuqhwanyaza amehlo akho. Ngokwenene, iiprosesa zanamhlanje zinokuqhuba amakhulu ezigidi zokusebenza ngomzuzwana. Yiyo loo nto ukusebenza kunye nokwenza ngcono akuyongxaki kwiiprojekthi ezininzi ze-IT.

Njengoko benditshilo, kusabalulekile ukwazi le ngcamango xa usebenza ngeemali ezinkulu zedatha. Ukuba ngeli xesha i-algorithm kufuneka iqhubekekise i-1 elementi (engeyonto ingako kwisiseko sedatha):

  • I-algorithm ye-O (1) iya kuxabisa ukusebenza kwe-1
  • I-O (log(n)) i-algorithm iya kuxabisa imisebenzi esi-14
  • I-algorithm ye-O(n) iya kuxabisa imisebenzi eyi-1
  • I-algorithm ye-O(n*log(n)) iya kukubiza 14 imisebenzi
  • I-algorithm ye-O(n2) iya kuxabisa i-1 imisebenzi

Andiyenzanga izibalo, kodwa ndingathi nge-O(n2) algorithm unexesha lokusela ikofu (nokuba zimbini!). Ukuba ungeze enye i-0 kumthamo wedatha, uya kuba nexesha lokuthatha i-nap.

Masingene nzulu

Ukubhekisela:

  • Ujongo olululo lwetheyibhile yehashi lufumana into kwi-O(1).
  • Ukukhangela umthi olungelelaniswe kakuhle uvelisa iziphumo kwi-O(log(n)).
  • Ukukhangela uluhlu kuvelisa iziphumo kwi-O(n).
  • Ii-algorithms zokuhlela zingcono zinobunzima O(n*log(n)).
  • I-algorithm yokhetho embi inobunzima O(n2).

Qaphela: Kula macandelo alandelayo siza kubona ezi algorithms kunye nezakhiwo zedatha.

Kukho iintlobo ezininzi zobunzima bexesha le-algorithm:

  • Imeko yemeko eqhelekileyo
  • eyona meko ilungileyo
  • kunye neyona meko imbi kakhulu

Ubunzima bexesha budla ngokuba yeyona meko imbi kakhulu.

Bendithetha kuphela malunga nobunzima bexesha le-algorithm, kodwa ubunzima buyasebenza naku:

  • ukusetyenziswa kwememori ye-algorithm
  • disk I / O ukusetyenziswa algorithm

Ewe kunjalo, kukho iingxaki ezimbi ngakumbi kune-n2, umzekelo:

  • n4: imbi kakhulu! Ezinye zee-algorithms ezikhankanyiweyo zinobu bunzima.
  • 3n: oku kubi ngakumbi! Enye ye-algorithms esiza kuyibona embindini weli nqaku inobu bunzima (kwaye isetyenziswe ngokwenene kwiinkcukacha ezininzi).
  • factorial n: awusoze ufumane iziphumo zakho nokuba unedatha encinci.
  • nn: Ukuba udibana nobu bunzima, kuya kufuneka uzibuze ukuba ngaba le yintsimi yakho yomsebenzi ...

Qaphela: Andikunikanga eyona nkcazo yegama le-O enkulu, ngumbono nje. Unokufunda eli nqaku apha I-Wikipedia yokwenyani (asymptotic) inkcazo.

MergeSort

Wenza ntoni xa ufuna ukuhlela ingqokelela? Intoni? Ubiza uhlobo () umsebenzi... Kulungile, impendulo elungileyo... Kodwa kwisiseko sedatha, kufuneka uqonde ukuba olu hlobo() lusebenza njani.

Kukho ii-algorithms zokuhlela ezilungileyo, ke ndiza kugxila kwezona zibalulekileyo: dibanisa uhlobo. Usenokungasiqondi isizathu sokuba ukuhlenga-hlengisa idatha kuluncedo ngoku, kodwa kuya kufuneka emva kwecandelo lokubuza imibuzo. Ngaphezu koko, ukuqonda ukudibanisa uhlobo kuya kusinceda kamva ukuba siqonde umsebenzi wokudityaniswa kwedatha eqhelekileyo ebizwayo uhlanganise Ujoyine (umbutho wokudibanisa).

Hlanganisa

Njengee-algorithms ezininzi eziluncedo, ukudibanisa uhlobo kuxhomekeke kwiqhinga: ukudibanisa 2 uluhlu olulungelelanisiweyo lobungakanani be-N/2 kwisiqalelo se-N esicwangcisiweyo sixabisa imisebenzi ye-N kuphela. Lo msebenzi ubizwa ngokuba kukudityaniswa.

Makhe sibone ukuba oku kuthetha ukuthini ngomzekelo olula:

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Lo mzobo ubonisa ukuba ukwakha uluhlu lokugqibela oluhlelweyo lwe-8-element, kufuneka uphindaphinde kube kanye phezu kwe-2 4-element arrays. Kuba zombini izinto ezi-4 sele zihleliwe:

  • 1) uthelekisa zombini iziqalelo zangoku kwiinkqubo ezimbini (ekuqaleni okwangoku = kuqala)
  • 2) emva koko thatha eyona incinci ukuyibeka kuluhlu lwesi-8
  • 3) kwaye uye kwinto elandelayo kuluhlu apho uthathe eyona elementi incinci
  • kwaye phinda 1,2,3 ude ufikelele kwindawo yokugqibela yoluhlu.
  • Emva koko uthatha izinto ezishiyekileyo zolunye uluhlu ukuze uzibeke kuluhlu lwesi-8.

Oku kusebenza ngenxa yokuba zombini ii-arrays ze-4-zicwangcisiwe kwaye ke akufuneki "ubuyele umva" kwezo zixhobo.

Ngoku siyaliqonda iqhinga, nantsi ipseudocode yam yokudibanisa:

array mergeSort(array a)
   if(length(a)==1)
      return a[0];
   end if

   //recursive calls
   [left_array right_array] := split_into_2_equally_sized_arrays(a);
   array new_left_array := mergeSort(left_array);
   array new_right_array := mergeSort(right_array);

   //merging the 2 small ordered arrays into a big one
   array result := merge(new_left_array,new_right_array);
   return result;

Ukudibanisa ukuhluza kwaphula ingxaki kwiingxaki ezincinci kwaye kufumaneke iziphumo zeengxaki ezincinci ukufumana isiphumo sengxaki yokuqala (qaphela: olu hlobo lwe-algorithm lubizwa ngokuba yahlula kwaye unqobe). Ukuba awuyiqondi le algorithm, ungakhathazeki; Khange ndiyiqonde ndiqala ukuyibona. Ukuba inokukunceda, ndiyibona le algorithm njengezigaba ezibini ze-algorithm:

  • Isigaba sokwahlula, apho uluhlu lwahlulahlulwe lwaba luluhlu oluncinci
  • Isigaba sokuhlela kulapho ii-arrays ezincinci zidityanisiwe (ukusebenzisa umanyano) ukwenza uluhlu olukhulu.

Isigaba sokwahlula

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Kwinqanaba lokwahlula, uluhlu luhlulwe lube ngamanyathelo angama-3. Inani elisemthethweni lamanyathelo yilog (N) (ukususela ku-N = 8, log (N) = 3).

Ndiyazi njani le nto?

Ndikrelekrele! Ngelizwi - imathematika. Ingcinga kukuba inyathelo ngalinye lahlula ubungakanani boluhlu lokuqala ngo-2. Le yingcaciso echanekileyo yelogarithm (isiseko sesi-2).

Inqanaba lokuhlela

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Kwinqanaba lokumisa, uqala ngeyuniyoni (isiqalelo esinye) uluhlu. Ngexesha ngalinye ufaka imisebenzi yokudibanisa emininzi kwaye ixabiso lilonke ngu-N = 8 imisebenzi:

  • Kwinqanaba lokuqala unodibaniso olu-4 oluxabisa imisebenzi emi-2 nganye
  • Kwinqanaba lesibini unodibaniso olu-2 oluxabisa imisebenzi emi-4 nganye
  • Kwinqanaba lesithathu unodibaniso olu-1 oluxabisa imisebenzi esi-8

Kuba kukho log(N) amanyathelo, ixabiso lilonke N * log(N) imisebenzi.

Izinto ezilungileyo zokudibanisa uhlobo

Kutheni le algorithm inamandla kangaka?

Ngenxa yokuba:

  • Ungayitshintsha ukunciphisa imemori yeenyawo ukuze ungenzi uluhlu olutsha kodwa uguqule ngokuthe ngqo uluhlu lwegalelo.

Qaphela: olu hlobo lwe-algorithm lubizwa ngokuba in-indawo (ukuhlelwa ngaphandle kwememori eyongezelelweyo).

  • Ungayitshintsha ukuze usebenzise indawo yedisk kunye nenani elincinci lememori ngexesha elinye ngaphandle kokwenza idiski ebalulekileyo ye-I/O ngaphezulu. Umbono kukulayisha kwinkumbulo kuphela ezo ndawo zisetyenzwayo ngoku. Oku kubalulekile xa ufuna ukuhlela itafile yeegigabhayithi ezininzi ene-100-megabyte memory buffer kuphela.

Qaphela: olu hlobo lwe-algorithm lubizwa ngokuba uhlobo lwangaphandle.

  • Ungayitshintsha ukuba iqhube kwiinkqubo ezininzi/imisonto/iiseva.

Umzekelo, udidi olusasaziweyo lokudibanisa yenye yezinto eziphambili hadoop (esisakhiwo kwidatha enkulu).

  • Le algorithm inokujika ilothe ibe yigolide (ngokwenene!).

Le algorithm yokuhlela isetyenziswa kuninzi (ukuba ayingabo bonke) oovimba beenkcukacha, kodwa ayikuphela kwayo. Ukuba ufuna ukwazi ngakumbi, ungafunda oku umsebenzi wophando, exoxa ngobuhle kunye nokungalunganga kwe-algorithms yokuhlela isiseko sedatha.

Uluhlu, uMthi kunye neTheyibhile yeHash

Ngoku ukuba siyawuqonda umbono wobunzima bexesha kunye nokuhlelwa, kufuneka ndikuxelele malunga nezakhiwo zedatha ezi-3. Oku kubalulekile kuba bona sisiseko sogcino-lwazi lwangoku. Ndiza kwazisa nombono isalathisi sedatabase.

Uluhlu

Uluhlu olunamacala amabini lolona luhlu lulula lwedatha. Itafile inokucingelwa njengoluhlu. Umzekelo:

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Olu luhlu luyi-2-dimensional yitafile enemiqolo nemiqolo:

  • Umgca ngamnye umele iziko
  • Iikholamu zigcina iimpawu ezichaza iqumrhu.
  • Ikholamu nganye igcina idatha yohlobo oluthile (inani elipheleleyo, umtya, umhla...).

Oku kukulungele ukugcina kunye nokubonwa kwedatha, nangona kunjalo, xa ufuna ukufumana ixabiso elithile, oku akufanelekanga.

Umzekelo, ukuba ubufuna ukufumana bonke abafana abasebenza e-UK, kuya kufuneka ujonge kumqolo ngamnye ukuze ubone ukuba lo mqolo ungowase-UK. Iyakuxabisa iintengiselwano ze-Nphi N - inani lemigca, engeyiyo imbi, kodwa ingaba kukho indlela ekhawulezayo? Ngoku lixesha lokuba siqhelane nemithi.

Qaphela: Oovimba beenkcukacha abaninzi bale mihla babonelela ngoluhlu olwandisiweyo lokugcina iitafile ngokufanelekileyo: iitafile ezilungelelanisiweyo kunye neetafile zesalathisi. Kodwa oku akutshintshi ingxaki yokufumana ngokukhawuleza imeko ethile kwiqela leekholomu.

Umthi wedatha kunye nesalathisi

Umthi wophendlo wokubini ngumthi wokubini onepropati ekhethekileyo, isitshixo kwindawo nganye kufuneka ibe:

  • mkhulu kunawo onke amaqhosha agcinwe kumthi ongezantsi wasekhohlo
  • ngaphantsi kwazo zonke izitshixo ezigcinwe kumthi ongezantsi wasekunene

Makhe sibone ukuba oku kuthetha ntoni ngokubonakalayo

Umbono

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Lo mthi une-N = 15 element. Masithi ndikhangela i208:

  • Ndiqala kwingcambu eyisitshixo sayo 136. Ukususela kwi-136<208, ndijonge i-subtree efanelekileyo ye-node 136.
  • 398>208 ngoko ke ndijonge kwi-subtree yasekhohlo ye node 398
  • 250>208 ngoko ke ndijonge kwi-subtree yasekhohlo ye node 250
  • 200<208, ngoko ke ndijonge kwi-subtree efanelekileyo ye-node 200. Kodwa i-200 ayinayo i-subtree efanelekileyo, ixabiso alikho (kuba ukuba ikhona, iya kuba kwi-subtree elungileyo engama-200).

Ngoku masithi ndikhangela i40

  • Ndiqala kwingcambu isitshixo sayo singu 136. Ukusukela ngo 136 > 40, ndijonge kwi subtree esekhohlo ye node 136.
  • 80 > 40, kungoko ndijonge kumthi ongezantsi wenode 80
  • 40= 40, i-node ikhona. Ndibuyisela i-ID yomqolo ngaphakathi kwendawo (engaboniswanga emfanekisweni) kwaye ndijonge kwitafile ye-ID yomqolo onikiweyo.
  • Ukwazi i-ID yomqolo kundivumela ukuba ndazi ngqo apho idatha ikwitafile, ukuze ndikwazi ukuyibuyisela ngoko nangoko.

Ekugqibeleni, zombini uphendlo luya kundixabisa inani lamanqanaba ngaphakathi komthi. Ukuba ufunda indawo malunga nokudibanisa uhlobo ngononophelo, kufuneka ubone ukuba kukho ilog(N) amanqanaba. Kuyavela, ilog yeendleko zokukhangela(N), akukubanga!

Masibuyele kule ngxaki yethu

Kodwa oku abstract kakhulu, ngoko ke masibuyele kwingxaki yethu. Endaweni yenani elipheleleyo, khawucinge ngomtya omele ilizwe lomntu okwitafile edlulileyo. Masithi unomthi oqulathe indawo "yelizwe" (uluhlu lwesi-3) lwetheyibhile:

  • Ukuba ufuna ukwazi ukuba ngubani osebenza e-UK
  • ujonge emthini ukuze ufumane i-node emele i-Great Britain
  • ngaphakathi "UKnode" uya kufumana indawo yeerekhodi zabasebenzi base-UK.

Olu phendlo luza kuxabisa imisebenzi yelog(N) endaweni yemisebenzi ye-N ukuba usebenzisa uluhlu ngokuthe ngqo. Le nto usandula ukuyiveza ibiyiyo isalathisi sedatabase.

Ungakha umthi wesalathiso walo naliphi na iqela leemimandla (umtya, inani, imigca emi-2, inombolo kunye nomtya, umhla...) okoko unomsebenzi wokuthelekisa izitshixo (okt amaqela entsimi) ukuze ukwazi ukucwangcisa. ulandelelwano phakathi kwezitshixo (eyimeko yazo naziphi na iindidi ezisisiseko kwiziko ledatha).

B+TreeIndex

Nangona lo mthi usebenza kakuhle ngokufumana ixabiso elithile, kukho ingxaki enkulu xa ufuna Fumana izinto ezininzi phakathi kwamaxabiso amabini. Oku kuya kubiza i-O(N) kuba kuya kufuneka ujonge indawo nganye emthini kwaye ujonge ukuba iphakathi kwala maxabiso mabini (umzekelo, ngohambo oluyalelweyo lomthi). Ngaphezu koko, lo msebenzi awuyo diski I/O enobuhlobo kuba kufuneka ufunde umthi wonke. Kufuneka sifumane indlela yokwenza ngokufanelekileyo isicelo soluhlu. Ukusombulula le ngxaki, oovimba beenkcukacha bale mihla basebenzisa inguqulelo elungisiweyo yomthi wangaphambili obizwa ngokuba yi-B+ Tree. KuMthi B+Mthi:

  • kuphela iindawo ezisezantsi (amagqabi) gcina ulwazi (indawo yemiqolo kwitheyibhile enxulumeneyo)
  • ezinye iindawo zilapha kumzila kwindawo echanekileyo ngexesha lokukhangela.

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Njengoko ubona, kukho iindawo ezininzi apha (kabini). Enyanisweni, uneendawo ezongezelelweyo, "izigqibo zesigqibo", eziya kukunceda ufumane i-node echanekileyo (egcina indawo yemigca kwitafile ehambelana nayo). Kodwa ukuntsokotha kokukhangela kusengu-O(log(N)) (kukho inqanaba elinye kuphela). Umahluko omkhulu kukuba iindawo ezikwinqanaba elisezantsi ziqhagamshelwe kubalandeli bazo.

Ngalo Mthi B+, ukuba ujonge amaxabiso phakathi kwama-40 kunye ne-100:

  • Kufuneka ujonge nje u 40 (okanye elona xabiso likufutshane emva kwe 40 ukuba 40 akakho) njengokuba wenze ngomthi odlulileyo.
  • Emva koko uqokelele iindlalifa ezingama-40 usebenzisa amakhonkco endlalifa ngqo de ufikelele kwi-100.

Masithi ufumene abalandela uM kwaye umthi unamaN node. Ukufumana indawo ethile yeendleko zelog(N) njengomthi wangaphambili. Kodwa nje ukuba ufumene le node, uya kufumana abalandela uM kwiMisebenzi kunye neembekiselo zabalandela emva kwabo. Olu phendlo luxabisa kuphela i-M+log(N) imisebenzi xa ithelekiswa nemisebenzi ye-N kumthi odlulileyo. Ngaphezu koko, awudingi ukufunda umthi ogcweleyo (kuphela i-M+log(N) nodes), nto leyo ethetha ukusetyenziswa kwediski encinci. Ukuba u-M mncinane (umz. imiqolo engama-200) no-N mkhulu (i-1 imiqolo), kuya kubakho umahluko OMKHULU.

Kodwa kukho iingxaki ezintsha apha (kwakhona!). Ukuba wongeza okanye ucima umqolo kwisiseko sedatha (kwaye ke ngoko kwisalathiso esinxulumeneyo seB+ Tree):

  • kufuneka ugcine ucwangco phakathi kweendawo ezingaphakathi kwi-B+ Tree, kungenjalo awuyi kukwazi ukufumana iindawo ezingaphakathi komthi ongalungiswanga.
  • kufuneka ugcine elona nani lincinci linokwenzeka lamanqanaba kwi-B+Mthi, kungenjalo i-O(log(N)) ixesha elintsonkothileyo libe ngu-O(N).

Ngamanye amazwi, i-B + Tree kufuneka i-odole kwaye ilinganise. Ngethamsanqa, oku kunokwenzeka ngokucima okuhlakaniphile kunye nokufaka imisebenzi. Kodwa oku kuza kwiindleko: ukufakwa kunye nokucima kumthi we-B+ kubiza i-O(log(N)). Yiyo loo nto abanye benu beyivile loo nto ukusebenzisa izalathisi ezininzi akuyombono ilungileyo. Ngokwenene, Ucotha ngokukhawuleza Faka/uhlaziyo/cima umqolo kwitafilekuba uvimba weenkcukacha ufuna ukuhlaziya izalathisi zetafile zisebenzisa i-O(log(N)) ebiza umsebenzi wesalathiso ngasinye. Ngaphezu koko, ukongeza izalathisi kuthetha umthwalo omninzi womsebenzi umphathi wentengiselwano (iya kuchazwa ekupheleni kwenqaku).

Ukufumana iinkcukacha ezithe vetshe, unokubona inqaku leWikipedia B+Umthi. Ukuba ufuna umzekelo wokuphumeza i-B + Tree kwisiseko sedatha, jonga Oku kubhaliwe и Oku kubhaliwe ukusuka kumphuhlisi okhokelayo weMySQL. Bobabini bagxininisa kwindlela i-InnoDB (injini ye-MySQL) ephatha ngayo izalathisi.

Qaphela: Umfundi undixelele ukuba, ngenxa yokulungiswa kwenqanaba eliphantsi, umthi we-B+ kufuneka ulungelelane ngokupheleleyo.

Hashtable

Ubume bethu bokugqibela bedatha ebalulekileyo yitafile ye-hash. Oku kuluncedo kakhulu xa ufuna ukukhangela ngokukhawuleza amaxabiso. Ngaphezu koko, ukuqonda itafile ye-hash kuya kusinceda kamva ukuba siqonde isiseko sedatha esiqhelekileyo sokujoyina umsebenzi obizwa ngokuba yi-hash join ( Hash ukujoyina). Olu lwakhiwo lwedatha lukwasetyenziswa nguvimba wedatha ukugcina ezinye izinto zangaphakathi (umz. itafile yokutshixa okanye ichibi le-buffer, siza kuyibona yomibini le migaqo kamva).

Itheyibhile ye-hash yisakhiwo sedatha esifumana ngokukhawuleza into ngesitshixo sayo. Ukwakha itafile ye-hash kufuneka uchaze:

  • ngundoqo kwizinto zakho
  • umsebenzi we-hash yezitshixo. I-hashes ezingundoqo ezibaliweyo zinika indawo yezinto (ezibizwayo amacandelo ).
  • umsebenzi wokuthelekisa izitshixo. Nje ukuba ufumene icandelo elichanekileyo, kufuneka ufumane into oyikhangelayo ngaphakathi kwecandelo usebenzisa olu thelekiso.

Umzekelo olula

Makhe sithathe umzekelo ocacileyo:

Usebenza njani oovimba beenkcukacha zobudlelwane (Icandelo 1)

Le theyibhile ye-hash inamacandelo ali-10. Ngenxa yokuba ndiyonqena, ndifanekisela amacandelo ama-5 kuphela, kodwa ndiyazi ukuba uhlakaniphile, ngoko ke ndiza kukuvumela ukuba ufanekise enye i-5 ngokwakho. Ndisebenzise imodyuli ye-hash ye-10 yesitshixo. Ngamanye amagama, ndigcina kuphela idijithi yokugqibela yesitshixo sento yokufumana icandelo layo:

  • ukuba idijithi yokugqibela ngu 0, into iwela kwicandelo 0,
  • ukuba idijithi yokugqibela ngu 1, into iwela kwicandelo 1,
  • ukuba idijithi yokugqibela ngu-2, into iwela kwindawo yesi-2,
  • ...

Umsebenzi wothelekiso endiwusebenzisileyo kukulingana ngokulula phakathi kwee-integer ezimbini.

Masithi ufuna ukufumana isiqalelo sama-78:

  • Itheyibhile ye-hash ibala ikhowudi ye-hash ye-78, eyi-8.
  • Itheyibhile ye-hash ijonga icandelo lesi-8, kwaye into yokuqala eyifumanayo ngama-78.
  • Ubuyisela into yama-78 kuwe
  • Ukukhangela kubiza kuphela imisebenzi emi-2 (enye ukubala ixabiso le-hash kunye nomnye ukujonga into ngaphakathi kwecandelo).

Ngoku masithi ufuna ukufumana isiqalelo sama-59:

  • Itheyibhile ye-hash ibala ikhowudi ye-hash ye-59, eyi-9.
  • Itheyibhile ye-hash ikhangela kwicandelo lesi-9, into yokuqala efunyenwe ngu-99. Ukusukela ngo-99!=59, isiqalelo sama-99 asiyonto esebenzayo.
  • Ukusebenzisa ingqiqo efanayo, into yesibini (9), eyesithathu (79), ..., yokugqibela (29) ithathwa.
  • Into ayifunyenwanga.
  • Ukukhangela kuxabisa imisebenzi esi-7.

Umsebenzi omhle we-hash

Njengoko ubona, kuxhomekeke kwixabiso olifunayo, ixabiso alifani!

Ukuba ngoku nditshintsha umsebenzi we-hash 1 yesitshixo (oko kukuthi, ukuthatha amanani ama-000 okugqibela), ujongo lwesibini lubiza kuphela ukusebenza oku-000 kuba akukho ziqalelo kwicandelo 6. Owona mceli mngeni kukufumana umsebenzi olungileyo we-hash oya kudala amabhakethi aqulethe inani elincinci kakhulu lezinto.

Kumzekelo wam, ukufumana umsebenzi olungileyo we-hash kulula. Kodwa lo ngumzekelo olula, ukufumana umsebenzi olungileyo we-hash kunzima ngakumbi xa isitshixo si:

  • umtya (umzekelo - ifani)
  • Imigca emi-2 (umzekelo - ifani kunye negama)
  • Imigca emi-2 kunye nomhla (umzekelo - ifani, igama lokuqala kunye nomhla wokuzalwa)
  • ...

Ngomsebenzi olungileyo wehashi, ukujonga itafile yehash kubiza iO(1).

Uluhlu vs itafile yehash

Kutheni ungasebenzisi uluhlu?

Hmm, umbuzo omhle.

  • Itafile ye-hash ingaba ilayishwe ngokuyinxenye kwinkumbulo, kwaye amacandelo aseleyo angahlala kwidiski.
  • Ngoluhlu kufuneka usebenzise isithuba esidityanisiweyo kwimemori. Ukuba ulayisha itafile enkulu kunzima kakhulu ukufumana indawo eyaneleyo eqhubekayo.
  • Kwitafile yehash, ungakhetha isitshixo oyifunayo (umzekelo, ilizwe kunye nefani yomntu).

Ukuze ufumane inkcazelo engakumbi, unokufunda inqaku malunga JavaImephu yeHash, okusebenza ngokufanelekileyo kwetafile ye-hash; Awudingi ukuqonda iJava ukuze uqonde iikhonsepthi eziqukwe kweli nqaku.

umthombo: www.habr.com

Yongeza izimvo