How does a relational database work (Part 1)

Hello, Habr! I present to your attention a translation of the article
"How does a relational database work".

When it comes to relational databases, I can't help thinking that something is missing. They are used everywhere. There are many different databases available, from the small and useful SQLite to the powerful Teradata. But there are only a few articles that explain how a database works. You can search for "how does a relational database work" yourself to see how few results there are. Moreover, those articles are short. If you look for the latest trendy technologies (Big Data, NoSQL or JavaScript), you will find in-depth articles explaining how they work.

Are relational databases too old and too boring to be explained outside of university courses, research papers and books?


As a developer, I hate using something I don't understand. And if databases have been used for more than 40 years, there must be a reason. Over the years, I have spent hundreds of hours trying to really understand these weird black boxes I use every day. Relational databases are very interesting because they are based on useful and reusable concepts. If you would like to understand a database but have never had the time or the inclination to dig into this broad subject, you should enjoy this article.

Although the title of this article is explicit, the aim of this article is not to understand how to use a database. Therefore, you should already know how to write a simple join query and basic CRUD queries; otherwise you might not understand this article. That is the only thing you need to know; I will explain the rest.

I will start with some computer science basics, such as the time complexity of algorithms (big O). I know that some of you hate this concept, but without it you cannot understand the cleverness inside a database. Since this is a huge topic, I will focus on what I think is essential: the way a database handles an SQL query. I will only present the basic concepts behind a database, so that at the end of the article you have an idea of what is happening under the hood.

Since this is a long and technical article that involves many algorithms and data structures, take your time reading it. Some concepts may be hard to understand; you can skip them and still get the general idea.

For the more knowledgeable among you, this article is divided into 3 parts:

  • An overview of low-level and high-level database components
  • An overview of the query optimization process
  • An overview of transaction and buffer pool management

Back to basics

A long time ago (in a galaxy far, far away...), developers had to know exactly the number of operations they were coding. They knew their algorithms and data structures by heart because they could not afford to waste the CPU and memory of their slow computers.

In this part, I will remind you of some of these concepts because they are essential to understanding a database. I will also introduce the notion of a database index.

O(1) vs O(n^2)

Nowadays, many developers don't care about the time complexity of algorithms... and they are right!

But when you deal with a large amount of data (I am not talking about thousands) or when you are fighting for milliseconds, it becomes critical to understand this concept. And as you can imagine, databases have to deal with both situations! I won't take more of your time than necessary to get the idea across. This will help us later to understand the concept of cost-based optimization.

The concept

Time complexity is used to see how long an algorithm will take for a given amount of data. To describe this complexity, we use the mathematical big O notation. This notation is used with a function that describes how many operations an algorithm needs for a given amount of input data.

For example, when I say "this algorithm has complexity O(some_function())", it means that the algorithm needs some_function(a_certain_amount_of_data) operations to process a certain amount of data.

So what matters is not the amount of data, but how the number of operations grows as the amount of data increases. Time complexity does not give the exact number of operations, but it is a good way to estimate execution time.

[Figure: number of operations versus amount of input data for different time complexities, on a logarithmic scale]

In this graph you can see the number of operations plotted against the amount of input data for different types of time complexity. I used a logarithmic scale to plot them. In other words, the amount of data quickly grows from 1 to 1 billion. We can see that:

  • O(1), or constant complexity, stays constant (otherwise it would not be called constant complexity).
  • O(log(n)) stays low even with billions of data items.
  • The worst complexity is O(n^2), where the number of operations quickly explodes.
  • The other two complexities also grow quickly.

Examples

With a small amount of data, the difference between O(1) and O(n^2) is negligible. For example, let's say you have an algorithm that needs to process 2,000 elements.

  • An O(1) algorithm will cost you 1 operation
  • An O(log(n)) algorithm will cost you 7 operations
  • An O(n) algorithm will cost you 2,000 operations
  • An O(n*log(n)) algorithm will cost you 14,000 operations
  • An O(n^2) algorithm will cost you 4,000,000 operations

The difference between O(1) and O(n^2) seems large (4 million operations), but you will lose at most 2 ms, just the time it takes to blink. Indeed, modern processors can handle hundreds of millions of operations per second. This is why performance and optimization are not an issue in many IT projects.

As I said, it is still important to know this concept when working with huge amounts of data. If this time the algorithm has to process 1,000,000 elements (which is not that much for a database):

  • An O(1) algorithm will cost you 1 operation
  • An O(log(n)) algorithm will cost you 14 operations
  • An O(n) algorithm will cost you 1,000,000 operations
  • An O(n*log(n)) algorithm will cost you 14,000,000 operations
  • An O(n^2) algorithm will cost you 1,000,000,000,000 operations

I haven't done the math, but I would say that with the O(n^2) algorithm you have time for a coffee (even two!). If you add another 0 to the amount of data, you will have time for a nap.
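
The estimates above can be reproduced with a few lines of Python. This is only a rough sketch: the function name operation_counts is made up for illustration, constant factors are ignored, and the natural logarithm is assumed because it gives values close to the rounded figures quoted in the text (a base-2 logarithm would give slightly larger numbers without changing the picture).

```python
import math


def operation_counts(n: int) -> dict:
    """Rough operation counts for n input elements (constant factors ignored)."""
    log_n = math.log(n)  # assumption: natural logarithm, see the note above
    return {
        "O(1)": 1,
        "O(log(n))": round(log_n),
        "O(n)": n,
        "O(n*log(n))": round(n * log_n),
        "O(n^2)": n * n,
    }


for n in (2_000, 1_000_000):
    print(f"n = {n:,}")
    for name, ops in operation_counts(n).items():
        print(f"  {name:<12} {ops:>20,} operations")
```

The interesting part is the last line of each group: going from 2,000 to 1,000,000 elements multiplies the O(n) cost by 500 but the O(n^2) cost by 250,000.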

Going deeper

To give you an idea:

  • A lookup in a good hash table finds an element in O(1).
  • A search in a well-balanced tree gives a result in O(log(n)).
  • A search in an array gives a result in O(n).
  • The best sorting algorithms have O(n*log(n)) complexity.
  • A bad sorting algorithm has O(n^2) complexity.

Note: In the following parts we will see these algorithms and data structures.

There are several types of time complexity:

  • the average case scenario
  • the best case scenario
  • and the worst case scenario

Time complexity is usually the worst case scenario.

I have only talked about the time complexity of an algorithm, but complexity also applies to:

  • the memory consumption of an algorithm
  • the disk I/O consumption of an algorithm

Of course, there are complexities worse than n^2, for example:

  • n^4: that is terrible! Some of the algorithms mentioned have this complexity.
  • 3^n: that is even worse! One of the algorithms we will see in the middle of this article has this complexity (and it is actually used in many databases).
  • factorial n: you will never get your results, even with a small amount of data.
  • n^n: if you end up with this complexity, you should ask yourself whether this is really your field...

Note: I did not give you the real definition of the big O notation, just the idea. You can read the Wikipedia article on it for the real (asymptotic) definition.

Merge Sort

What do you do when you need to sort a collection? What? You call the sort() function... OK, good answer... But for a database, you have to understand how this sort() function works.

There are several good sorting algorithms, so I will focus on the most important one: merge sort. You might not understand right now why sorting data is useful, but you should after the part on query optimization. Moreover, understanding merge sort will help us later to understand a common database join operation called the merge join.

Merge

Like many useful algorithms, merge sort relies on a trick: merging 2 sorted arrays of size N/2 into one sorted N-element array costs only N operations. This operation is called a merge.

Let's see what this means with a simple example:

[Figure: merging two sorted 4-element arrays into one sorted 8-element array]

This figure shows that to build the final sorted array of 8 elements, you only need to iterate once over the two 4-element arrays. Since both 4-element arrays are already sorted:

  • 1) you compare the current elements of the two arrays (at the start, current = first)
  • 2) then you take the smaller one and put it in the 8-element array
  • 3) then you move to the next element in the array from which you took the smaller element
  • and you repeat 1, 2, 3 until you reach the last element of one of the arrays.
  • Then you take the remaining elements of the other array and put them in the 8-element array.

This works because both 4-element arrays are sorted, so you never need to "go back" in those arrays.
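
The steps above can be sketched in Python as follows. The function name merge and the sample values are chosen for illustration only; the figure's actual values are not reproduced here.

```python
def merge(left: list, right: list) -> list:
    """Merge two already sorted lists into one sorted list in a single pass."""
    result = []
    i = j = 0
    # Steps 1-3: compare the two current elements, take the smaller one,
    # and advance only in the list it was taken from.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One list is exhausted: copy whatever remains of the other one.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


print(merge([1, 4, 6, 9], [2, 3, 7, 8]))  # [1, 2, 3, 4, 6, 7, 8, 9]
```

Each element is appended exactly once and neither index ever moves backwards, which is why the cost is N operations for N elements in total.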

Now that we understand this trick, here is my pseudocode for merge sort:

array mergeSort(array a)
   //an array with 0 or 1 element is already sorted
   if(length(a)<=1)
      return a;
   end if

   //recursive calls
   [left_array right_array] := split_into_2_equally_sized_arrays(a);
   array new_left_array := mergeSort(left_array);
   array new_right_array := mergeSort(right_array);

   //merging the 2 small ordered arrays into a big one
   array result := merge(new_left_array,new_right_array);
   return result;
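
For readers who prefer something they can run, here is a possible Python rendering of the pseudocode. It is a sketch rather than a tuned implementation: the name merge_sort is illustrative, and the merge step is delegated to heapq.merge from the standard library, which lazily merges already sorted inputs.

```python
from heapq import merge  # standard-library merge of sorted iterables


def merge_sort(a: list) -> list:
    """Return a sorted copy of a using top-down merge sort."""
    # An array with 0 or 1 element is already sorted.
    if len(a) <= 1:
        return a

    # Recursive calls on the two halves.
    middle = len(a) // 2
    new_left = merge_sort(a[:middle])
    new_right = merge_sort(a[middle:])

    # Merge the 2 small ordered arrays into a big one.
    return list(merge(new_left, new_right))


print(merge_sort([5, 2, 8, 1, 9, 3, 7, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```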

Merge sort breaks a problem into smaller problems, then finds the results of the smaller problems to get the result of the initial problem (note: this kind of algorithm is called divide and conquer). If you don't understand this algorithm, don't worry; I didn't understand it the first time I saw it. If it helps, I see this algorithm as a two-phase algorithm:

  • The division phase, where the array is divided into smaller arrays
  • The sorting phase, where the small arrays are put together (using the merge) to form a bigger array.

Division phase

[Figure: the division phase, an 8-element array split into single-element arrays in 3 steps]

In the division phase, the array is divided into single-element arrays in 3 steps. The formal number of steps is log(N) (since N=8, log(N) = 3).

How do I know this?

I'm a genius! In a word: mathematics. The idea is that each step divides the size of the initial array by 2. The number of steps is the number of times you can divide the initial array by two. This is the exact definition of a logarithm (in base 2).

Sorting phase

[Figure: the sorting phase, single-element arrays merged step by step into one sorted 8-element array]

In the sorting phase, you start with the single-element arrays. During each step you apply multiple merges, and the overall cost of each step is N = 8 operations:

  • In the first step you have 4 merges that cost 2 operations each
  • In the second step you have 2 merges that cost 4 operations each
  • In the third step you have 1 merge that costs 8 operations

Since there are log(N) steps, the overall cost is N * log(N) operations.
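
The arithmetic of the sorting phase can be checked with a short sketch. It assumes N is a power of two, as in the N = 8 example, and the helper name merge_steps is made up for illustration.

```python
def merge_steps(n: int) -> list:
    """For n a power of two, list (number_of_merges, cost_per_merge) per step."""
    steps = []
    size = 2  # size of the arrays produced by the first step
    while size <= n:
        steps.append((n // size, size))
        size *= 2
    return steps


steps = merge_steps(8)
total = sum(merges * cost for merges, cost in steps)

print(steps)  # [(4, 2), (2, 4), (1, 8)]
print(total)  # 24, that is 8 * log2(8)
```

Every step costs exactly N operations, and the number of steps is the number of doublings needed to go from 1 to N, which gives the N * log(N) total.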

The power of merge sort

Why is this algorithm so powerful?

Because:

  • You can modify it to reduce the memory footprint, so that you do not create new arrays but directly modify the input array.

Note: this kind of algorithm is called in-place (sorting without additional memory).

  • You can modify it to use disk space and a small amount of memory at the same time without a significant disk I/O penalty. The idea is to load into memory only the parts that are currently being processed. This matters when you need to sort a multi-gigabyte table with only a 100-megabyte memory buffer.

Note: this kind of algorithm is called external sorting.

  • You can modify it to run on multiple processes/threads/servers.

For example, distributed merge sort is one of the key components of Hadoop (which is a framework for big data).

  • This algorithm can turn lead into gold (true fact!).

This sorting algorithm is used in most (if not all) databases, but it is not the only one. If you want to know more, you can read this research paper, which discusses the pros and cons of the common sorting algorithms in a database.
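
To make the external sorting idea more concrete, here is a small Python sketch. It is not how any particular database does it: the function names, the text-file format for the sorted runs, and the chunk_size parameter (standing in for the memory buffer) are all assumptions made for illustration.

```python
import heapq
import tempfile
from itertools import islice


def _write_run(sorted_chunk):
    """Spill one sorted chunk to a temporary file, one value per line."""
    run = tempfile.TemporaryFile(mode="w+")
    run.writelines(f"{value}\n" for value in sorted_chunk)
    run.seek(0)
    return run


def _read_run(run):
    """Stream the values of one run back, one at a time."""
    for line in run:
        yield int(line)


def external_sort(values, chunk_size):
    """Yield integers in sorted order while holding about chunk_size of them in memory."""
    values = iter(values)
    runs = []
    # Phase 1: sort chunks that fit in memory and write each one to disk.
    while True:
        chunk = list(islice(values, chunk_size))
        if not chunk:
            break
        chunk.sort()
        runs.append(_write_run(chunk))
    # Phase 2: merge all runs, reading only one value per run at a time.
    try:
        yield from heapq.merge(*(_read_run(run) for run in runs))
    finally:
        for run in runs:
            run.close()


print(list(external_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3)))
# [1, 2, 3, 5, 7, 8, 9]
```

The second phase is the same merge trick as before, generalized to more than two sorted inputs; only the current element of each run has to be in memory.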

Array, Tree and Hash Table

Now that we understand the idea behind time complexity and sorting, I have to tell you about 3 data structures. This is important because they are the backbone of modern databases. I will also introduce the notion of a database index.

Array

A two-dimensional array is the simplest data structure. A table can be seen as an array. For example:

[Figure: a two-dimensional array representing a table with rows and columns]

This 2-dimensional array is a table with rows and columns:

  • Each row represents an entity
  • The columns store the properties that describe the entity.
  • Each column stores data of a certain type (integer, string, date...).

This is great for storing and visualizing data; however, when you need to find a specific value, it is not suitable.

For example, if you wanted to find all the guys who work in the UK, you would need to look at each row to determine whether that row belongs to the UK. This will cost you N operations, where N is the number of rows, which is not bad, but could there be a faster way? This is where trees come into play.

Note: Most modern databases provide advanced arrays to store tables efficiently: heap-organized tables and index-organized tables. But this does not change the problem of quickly finding a specific condition on a group of columns.

Tree and database index

A binary search tree is a binary tree with a special property: the key in each node must be:

  • greater than all keys stored in the left subtree
  • smaller than all keys stored in the right subtree

Let's see what this means visually

The idea

[Figure: a binary search tree with 15 nodes, rooted at 136]

This tree has N = 15 elements. Let's say I am looking for 208:

  • I start at the root, whose key is 136. Since 136<208, I look at the right subtree of node 136.
  • 398>208, so I look at the left subtree of node 398
  • 250>208, so I look at the left subtree of node 250
  • 200<208, so I look at the right subtree of node 200. But 200 has no right subtree, so the value does not exist (because if it did, it would be in the right subtree of 200).

Now let's say I am looking for 40

  • I start at the root, whose key is 136. Since 136 > 40, I look at the left subtree of node 136.
  • 80 > 40, so I look at the left subtree of node 80
  • 40 = 40, the node exists. I retrieve the row ID stored inside the node (not shown in the figure) and look in the table for the given row ID.
  • Knowing the row ID lets me know exactly where the data is in the table, so I can get it instantly.

In the end, both searches cost me the number of levels inside the tree. If you read the part on merge sort carefully, you should see that there are log(N) levels. So the cost of the search is log(N), not bad!
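
The two searches can be replayed with a small binary search tree in Python. Only the six keys named in the walkthrough (136, 398, 250, 200, 80, 40) are used; the other nodes of the 15-element figure are unknown and left out, and the row IDs are invented placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    row_id: int  # placeholder for the location of the row in the table
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, row_id):
    """Insert a key without rebalancing (enough for this small example)."""
    if root is None:
        return Node(key, row_id)
    if key < root.key:
        root.left = insert(root.left, key, row_id)
    else:
        root.right = insert(root.right, key, row_id)
    return root


def search(root, key):
    """Return (row_id or None, keys visited on the way down)."""
    visited = []
    node = root
    while node is not None:
        visited.append(node.key)
        if key == node.key:
            return node.row_id, visited
        node = node.left if key < node.key else node.right
    return None, visited


root = None
for row_id, key in enumerate([136, 80, 398, 40, 250, 200]):
    root = insert(root, key, row_id)

print(search(root, 208))  # (None, [136, 398, 250, 200])
print(search(root, 40))   # (3, [136, 80, 40])
```

The length of the visited list is the cost of the search: one comparison per level, never more than the height of the tree.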

Back to our problem

But this is very abstract, so let's go back to our problem. Instead of an integer, imagine a string that represents someone's country in the previous table. Suppose you have a tree that contains the "country" field (column 3) of the table:

  • If you want to know who works in the UK
  • you look in the tree to get the node that represents the UK
  • inside the "UK node" you will find the locations of the UK workers' records.

This search will cost log(N) operations instead of N operations if you use the array directly. What you have just imagined is a database index.

You can build a tree index for any group of fields (a string, a number, 2 strings, a number and a string, a date...) as long as you have a function to compare the keys (that is, the groups of fields) so that you can establish an order among the keys (which is the case for any basic type in a database).
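
Here is a minimal sketch of such an index in Python. To keep it short, a sorted list searched with the standard bisect module stands in for the tree (both rely on the keys being ordered and both locate a key in about log(N) comparisons); the table contents and the helper names build_index and index_lookup are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table: each row is (id, name, country); country is column 3.
rows = [
    (1, "Alice", "UK"),
    (2, "Bruno", "France"),
    (3, "Chen", "UK"),
    (4, "Dana", "Germany"),
]


def build_index(rows, column):
    """Sorted list of (key, row_position) pairs: a stand-in for a tree index."""
    return sorted((row[column], position) for position, row in enumerate(rows))


def index_lookup(index, value):
    """Return the positions of all rows whose indexed column equals value."""
    # Binary search for the first entry whose key is >= value.
    i = bisect_left(index, (value,))
    positions = []
    while i < len(index) and index[i][0] == value:
        positions.append(index[i][1])
        i += 1
    return positions


country_index = build_index(rows, column=2)
print(index_lookup(country_index, "UK"))                     # [0, 2]
print([rows[p] for p in index_lookup(country_index, "UK")])  # the UK rows
```

The index stores only keys and row positions, so the table itself is touched only for the rows that actually match.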

B+Tree Index

Although this tree works well for getting a specific value, there is a big problem when you need to get multiple elements between two values. This will cost O(N) because you will have to look at each node in the tree and check whether it is between these two values (for example, with an in-order traversal of the tree). Moreover, this operation is not disk I/O friendly, since you have to read the whole tree. We need to find a way to do a range query efficiently. To solve this problem, modern databases use a modified version of the previous tree called a B+Tree. In a B+Tree:

  • only the lowest nodes (the leaves) store information (the location of the rows in the associated table)
  • the other nodes are there only to route to the right node during the search.

[Figure: a B+Tree index whose leaf nodes are linked to their successors]

As you can see, there are more nodes here (twice as many). Indeed, you have additional nodes, the "decision nodes", which help you find the right node (the one that stores the location of the rows in the associated table). But the search complexity is still O(log(N)) (there is just one more level). The big difference is that the nodes at the lowest level are linked to their successors.

With this B+Tree, if you are looking for values between 40 and 100:

  • You just have to look for 40 (or the closest value after 40 if 40 does not exist), as you did with the previous tree.
  • Then you collect the successors of 40 using the direct links to the successors until you reach 100.

Let's say you find M successors and the tree has N nodes. Finding a specific node costs log(N), as in the previous tree. But once you have found this node, you get the M successors in M operations using the links to their successors. This search costs only M+log(N) operations compared with N operations in the previous tree. Moreover, you do not need to read the full tree (only M+log(N) nodes), which means less disk usage. If M is small (for example 200 rows) and N is large (1,000,000 rows), it makes a BIG difference.
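
The range query can be sketched with a simplified model of the leaf level. This is not a full B+Tree: the decision nodes are replaced by a binary search over the first key of each leaf, leaves hold a fixed number of keys, and the key values are made up. What it does keep is the essential feature, leaves linked to their successors.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list
    next: Optional["Leaf"] = None  # direct link to the successor leaf


def build_leaves(sorted_keys, leaf_size=3):
    """Cut the sorted keys into leaves and chain each leaf to the next one."""
    leaves = [Leaf(sorted_keys[i:i + leaf_size])
              for i in range(0, len(sorted_keys), leaf_size)]
    for current, successor in zip(leaves, leaves[1:]):
        current.next = successor
    return leaves


def range_scan(leaves, low, high):
    """Return all keys k with low <= k <= high."""
    # Routing step, about log(N): find the leaf that may contain low.
    # A real B+Tree walks its decision nodes here; bisect is a stand-in.
    first_keys = [leaf.keys[0] for leaf in leaves]
    leaf = leaves[max(bisect_right(first_keys, low) - 1, 0)]
    # Scan step, about M: follow the successor links until high is passed.
    result = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


leaves = build_leaves([10, 20, 40, 55, 60, 80, 100, 136, 200])
print(range_scan(leaves, 40, 100))  # [40, 55, 60, 80, 100]
```

Only the leaves between the start and the end of the range are visited; the leaf holding 136 and 200 is touched just long enough to see that 136 is past the upper bound.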

But there are new problems here (again!). If you add or remove a row in a database (and therefore in the associated B+Tree index):

  • you have to keep the order between the nodes inside the B+Tree, otherwise you will not be able to find nodes inside an unsorted tree.
  • you have to keep the lowest possible number of levels in the B+Tree, otherwise the O(log(N)) time complexity becomes O(N).

In other words, the B+Tree must be self-ordering and self-balancing. Fortunately, this is possible with smart deletion and insertion operations. But this comes at a cost: insertion and deletion in a B+Tree cost O(log(N)). This is why some of you have heard that using too many indexes is not a good idea. Indeed, you slow down the fast insertion/update/deletion of a row in a table, because the database needs to update the table's indexes with a costly O(log(N)) operation per index. Moreover, adding indexes means more work for the transaction manager (it will be described at the end of the article).

For more details, you can see the Wikipedia article on the B+Tree. If you want an example of a B+Tree implementation in a database, look at this article and this article from a lead MySQL developer. They both focus on how InnoDB (the MySQL engine) handles indexes.

Note: A reader told me that, because of low-level optimizations, the B+Tree needs to be fully balanced.

Hash table

Our last important data structure is the hash table. It is very useful when you want to look up values quickly. Moreover, understanding the hash table will help us later to understand a common database join operation called the hash join. This data structure is also used by a database to store some internal things (for example the lock table or the buffer pool; we will see both concepts later).

A hash table is a data structure that quickly finds an element by its key. To build a hash table you need to define:

  • a key for your elements
  • a hash function for the keys. The computed hashes of the keys give the locations of the elements (called buckets).
  • a function to compare the keys. Once you have found the right bucket, you have to find the element you are looking for inside the bucket using this comparison.

A simple example

Let's take a visual example:

[Figure: a hash table with 10 buckets, 5 of which are drawn]

This hash table has 10 buckets. Because I am lazy, I only drew 5 buckets, but I know you are smart, so I will let you imagine the other 5 yourself. I used the modulo 10 of the key as the hash function. In other words, I keep only the last digit of an element's key to find its bucket:

  • if the last digit is 0, the element falls into bucket 0,
  • if the last digit is 1, the element falls into bucket 1,
  • if the last digit is 2, the element falls into bucket 2,
  • ...

The comparison function I used is simply equality between two integers.

Let's say you want to get element 78:

  • The hash table computes the hash code for 78, which is 8.
  • The hash table looks in bucket 8, and the first element it finds is 78.
  • It returns element 78 to you
  • The search costs only 2 operations (one to compute the hash value and one to find the element inside the bucket).

Now let's say you want to get element 59:

  • The hash table computes the hash code for 59, which is 9.
  • The hash table looks in bucket 9; the first element found is 99. Since 99!=59, element 99 is not the right element.
  • Using the same logic, the second element (9), the third (79), ..., and the last (29) are checked.
  • The element is not found.
  • The search costs 7 operations.
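
The two lookups can be replayed with a toy hash table in Python. The class below is a sketch for illustration, not a production structure. Bucket 9 is filled with 99, 9, 79 and 29 as in the walkthrough; the two elements in between (19 and 49) are assumed, chosen so that the bucket holds six elements and the failed search costs 7 operations as stated.

```python
class HashTable:
    """Toy hash table for integer keys: modulo hash, one list per bucket."""

    def __init__(self, num_buckets=10):
        self.buckets = [[] for _ in range(num_buckets)]

    def _hash(self, key):
        # With 10 buckets this keeps only the last digit of the key.
        return key % len(self.buckets)

    def add(self, key):
        self.buckets[self._hash(key)].append(key)

    def get(self, key):
        """Return (element or None, number of operations spent)."""
        operations = 1  # computing the hash value
        for element in self.buckets[self._hash(key)]:
            operations += 1  # one comparison per element in the bucket
            if element == key:
                return element, operations
        return None, operations


table = HashTable(num_buckets=10)
for key in [78, 99, 9, 79, 19, 49, 29]:  # 19 and 49 are assumed values
    table.add(key)

print(table.get(78))  # (78, 2)
print(table.get(59))  # (None, 7)
```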

A good hash function

As you can see, depending on the value you are looking for, the cost is not the same!

If I now change the hash function to the modulo 1,000,000 of the key (that is, taking the last 6 digits), the second lookup costs only 1 operation, since there are no elements in bucket 000059. The real challenge is to find a good hash function that creates buckets containing a very small number of elements.
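
The effect of the hash function on the lookup cost can be measured with a short sketch. The key set is partly assumed (only 78, 99, 9, 79 and 29 come from the example), and the helper names are made up; buckets are kept in a dictionary so that a million empty buckets do not have to be allocated.

```python
from collections import defaultdict

keys = [78, 99, 9, 79, 19, 49, 29]  # 19 and 49 are assumed values


def build_buckets(keys, num_buckets):
    """Group the keys by key modulo num_buckets."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[key % num_buckets].append(key)
    return buckets


def lookup_cost(buckets, key, num_buckets):
    """1 operation for the hash, plus 1 per element compared in the bucket."""
    cost = 1
    for element in buckets.get(key % num_buckets, []):
        cost += 1
        if element == key:
            break
    return cost


for num_buckets in (10, 1_000_000):
    buckets = build_buckets(keys, num_buckets)
    longest = max(len(bucket) for bucket in buckets.values())
    print(num_buckets, longest, lookup_cost(buckets, 59, num_buckets))
# 10 6 7
# 1000000 1 1
```

With modulo 10, six keys collide in bucket 9; with modulo 1,000,000 every key sits alone, so a lookup never compares more than one element.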

In my example, finding a good hash function is easy. But this is a simple example; finding a good hash function is much harder when the key is:

  • a string (for example, a last name)
  • 2 strings (for example, a last name and a first name)
  • 2 strings and a date (for example, a last name, a first name and a date of birth)
  • ...

With a good hash function, a lookup in a hash table costs O(1).

Array vs hash table

Why not use an array?

Hmm, good question.

  • A hash table can be partially loaded into memory, and the remaining buckets can stay on disk.
  • With an array you have to use contiguous space in memory. If you are loading a large table, it is very hard to find enough contiguous space.
  • With a hash table, you can choose the key you want (for example, the country and the last name of a person).
For more information, you can read the article on the Java HashMap, which is an efficient implementation of a hash table; you do not need to understand Java to understand the concepts covered in that article.

Source: www.habr.com
