How Relational Databases Work (Part 1)

Hello Habr! Here is a translation of the article
"How does a relational database work".

When it comes to relational databases, I can't help thinking that something is missing. They are used everywhere. There are many different databases, from the small and useful SQLite to the powerful Teradata. But there are only a few articles that explain how a database works. You can search for "how does a relational database work" yourself to see how few results there are. Moreover, those articles are short. If you look for the latest trendy technologies (Big Data, NoSQL or JavaScript), you will find far more in-depth articles explaining how they work.

Are relational databases too old and too boring to be explained outside of university courses, research papers and books?


As a developer, I hate using something I don't understand. And if databases have been used for more than 40 years, there must be a reason. Over the years, I have spent hundreds of hours trying to really understand these weird black boxes I use every day. Relational databases are very interesting because they are based on useful and reusable concepts. If you are interested in understanding a database but have never had the time or the will to dig into this broad subject, you should like this article.

Although the title of this article is explicit, its aim is NOT to teach you how to use a database. Therefore, you should already know how to write a simple join query and basic CRUD queries; otherwise you might not understand this article. That is the only thing you need to know; I will explain the rest.

I'll start with some computer science basics, such as the time complexity of algorithms (Big O). I know some of you hate this concept, but without it you cannot understand the intricacies inside a database. Since this is a huge topic, I'll focus on what I think is essential: how a database handles an SQL query. I'll only present the basic concepts behind a database, so that by the end of the article you have an idea of what is happening under the hood.

Since this is a long and technical article that involves many algorithms and data structures, take your time to read it. Some concepts may be hard to understand; you can skip them and still get the overall idea.

For the more knowledgeable among you, this article is divided into 3 parts:

  • An overview of the low-level and high-level database components
  • An overview of the query optimization process
  • An overview of transaction and buffer pool management

Back to basics

A long time ago (in a galaxy far, far away ...), developers had to know exactly the number of operations they were coding. They knew their algorithms and data structures by heart because they could not afford to waste the CPU and memory of their slow computers.

In this section, I'll remind you of some of these concepts because they are essential to understanding a database. I'll also introduce the notion of a database index.

O(1) vs O(n^2)

Nowadays, many developers don't care about the time complexity of algorithms ... and they are right!

But when you deal with a large amount of data (I'm not talking about thousands) or when you are fighting for milliseconds, it becomes critical to understand this concept. And as you can imagine, databases have to deal with both situations! I won't take more of your time than is needed to get the idea across. It will help us later to understand the concept of cost-based optimization.

The concept

The time complexity of an algorithm is used to see how long the algorithm will take for a given amount of data. To describe this complexity, we use the mathematical big O notation. This notation is used with a function that describes how many operations an algorithm needs for a given amount of input data.

For example, when I say "this algorithm has complexity O(some_function())", it means that the algorithm needs some_function(a_certain_amount_of_data) operations to process a certain amount of data.

So what matters is not the amount of data, but how the number of operations grows when the amount of data increases. The time complexity doesn't give the exact number of operations, but it is a good way to estimate the execution time.
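To make "how the number of operations grows" concrete, here is a minimal Python sketch. The two functions are made up for illustration and are not part of the original article: one makes a single pass over the data, the other compares every element with every other element. Both simply count their basic operations.

```python
def count_linear(data):
    """Count the comparisons made by a single pass (finding the maximum)."""
    operations = 0
    largest = data[0]
    for value in data:
        operations += 1          # one comparison per element
        if value > largest:
            largest = value
    return operations


def count_quadratic(data):
    """Count the comparisons made when every element meets every element."""
    operations = 0
    for _left in data:
        for _right in data:
            operations += 1      # one comparison per pair
    return operations


# Doubling the amount of data doubles the first count and quadruples the second.
for n in (10, 20, 40):
    data = list(range(n))
    print(n, count_linear(data), count_quadratic(data))
```

The absolute counts matter less than the trend: the first function behaves like O(n), the second like O(n^2).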

[Figure: number of operations versus amount of input data for different time complexities]

In this figure you can see the number of operations versus the amount of input data for different types of time complexity. I used a logarithmic scale to plot them. In other words, the amount of data quickly goes from 1 to 1 billion. We can see that:

  • O(1), or constant complexity, stays constant (otherwise it wouldn't be called constant complexity).
  • O(log(n)) stays low even with billions of elements.
  • The worst complexity is O(n^2), where the number of operations grows rapidly.
  • The two other complexities also increase quickly.

Examples

With a small amount of data, the difference between O(1) and O(n^2) is negligible. For example, let's say you have an algorithm that needs to process 2000 elements.

  • An O(1) algorithm will cost you 1 operation
  • An O(log(n)) algorithm will cost you 7 operations
  • An O(n) algorithm will cost you 2 000 operations
  • An O(n*log(n)) algorithm will cost you 14 000 operations
  • An O(n^2) algorithm will cost you 4 000 000 operations

The difference between O(1) and O(n^2) seems large (4 million operations), but you will lose at most 2 ms, just the time it takes to blink. Indeed, current processors can handle hundreds of millions of operations per second. This is why performance and optimization are not an issue in many IT projects.

As I said, it is still important to know this concept when you face a huge amount of data. If this time the algorithm needs to process 1 000 000 elements (which is not that much for a database):

  • An O(1) algorithm will cost you 1 operation
  • An O(log(n)) algorithm will cost you 14 operations
  • An O(n) algorithm will cost you 1 000 000 operations
  • An O(n*log(n)) algorithm will cost you 14 000 000 operations
  • An O(n^2) algorithm will cost you 1 000 000 000 000 operations

I didn't do the math, but I'd say that with the O(n^2) algorithm you have time for a coffee (even a second one!). If you add another 0 to the amount of data, you'll have time for a long nap.
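For readers who do want to do the math, here is a rough Python sketch. It assumes the natural logarithm, which appears to be what the rounded figures above are based on, and an illustrative processor speed of one billion operations per second. Both are assumptions of this sketch, not statements from the article.

```python
import math

# Assumption for illustration only: one billion simple operations per second.
ASSUMED_OPS_PER_SECOND = 1_000_000_000


def estimated_operations(n):
    """Rough operation counts for n elements under each complexity class."""
    log_n = math.log(n)  # natural logarithm (an assumption, see the lead-in)
    return {
        "O(1)": 1,
        "O(log(n))": log_n,
        "O(n)": n,
        "O(n*log(n))": n * log_n,
        "O(n^2)": n * n,
    }


for n in (2_000, 1_000_000):
    print(f"n = {n}")
    for name, operations in estimated_operations(n).items():
        seconds = operations / ASSUMED_OPS_PER_SECOND
        print(f"  {name:12} {operations:>20,.0f} operations  ~{seconds:.6f} s")
```

The exact values differ slightly from the rounded figures in the lists, but the orders of magnitude are the same.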

Going deeper

To give you an idea:

  • A lookup in a good hash table finds an element in O(1).
  • A search in a well-balanced tree gives a result in O(log(n)).
  • A search in an array gives a result in O(n).
  • The best sorting algorithms have a complexity of O(n*log(n)).
  • A bad sorting algorithm has a complexity of O(n^2).
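As a small illustration of the gap between O(n) and O(log(n)), the following Python sketch counts the steps of a linear scan and of a binary search. Note the substitution: a sorted list with binary search stands in for the well-balanced tree here, because it has the same logarithmic behaviour and is shorter to write. The function names are invented for this sketch.

```python
def linear_search_steps(items, target):
    """Count the elements examined by a plain scan of the array."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count the elements examined by a binary search on a sorted array."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            break
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps


numbers = list(range(1_000_000))
print(linear_search_steps(numbers, 999_999))   # examines every element
print(binary_search_steps(numbers, 999_999))   # at most about 20 elements
```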

Note: In the next sections, we'll see these algorithms and data structures.

There are several types of time complexity:

  • the average case scenario
  • the best case scenario
  • and the worst case scenario

The time complexity usually refers to the worst case scenario.

I only talked about the time complexity of an algorithm, but complexity also applies to:

  • the memory consumption of an algorithm
  • the disk I/O consumption of an algorithm

Of course, there are complexities worse than n^2, for example:

  • n^4: ouch! Some of the algorithms I'll mention have this complexity.
  • 3^n: even worse! One of the algorithms we'll see in the middle of this article has this complexity (and it is really used in many databases).
  • n factorial: you will never get your results, even with a small amount of data.
  • n^n: if you end up with this complexity, you should ask yourself whether IT is really your field ...

Note: I didn't give you the real definition of the big O notation, just the idea. You can read the article on Wikipedia for the real (asymptotic) definition.

Merge Sort

What do you do when you need to sort a collection? What? You call the sort() function... OK, good answer... But for a database, you have to understand how this sort() function works.

There are several good sorting algorithms, so I'll focus on the most important one: merge sort. You might not see why sorting data is useful right now, but you should after the part on query optimization. Moreover, understanding merge sort will help us later to understand a common database join operation called the merge join.

Merge

Like many useful algorithms, merge sort relies on a trick: merging 2 sorted arrays of size N/2 into a sorted array of N elements costs only N operations. This operation is called a merge.

Let's see what this means with a simple example:

[Figure: merging two sorted 4-element arrays into one sorted 8-element array]

This figure shows that to build the final sorted 8-element array, you only need to iterate once over the two 4-element arrays. Since both 4-element arrays are already sorted:

  • 1) you compare the two current elements of the two arrays (at the beginning, current = first)
  • 2) then you take the smaller one and put it in the 8-element array
  • 3) and you move to the next element in the array from which you took the smaller element
  • and you repeat 1, 2, 3 until you reach the last element of one of the arrays.
  • Then you take the remaining elements of the other array and put them in the 8-element array.

This works because both 4-element arrays are sorted, so you never need to "go back" in these arrays.
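The steps above can be sketched in Python as follows. This is a minimal illustration rather than the article's own code, and the sample values are made up since the figure's values are not reproduced here.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list in a single pass."""
    result = []
    i = j = 0
    # Steps 1-3: compare the two current elements, take the smaller one,
    # and advance in the list it came from.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One list is exhausted: copy whatever remains of the other one.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


print(merge([1, 3, 5, 8], [2, 4, 6, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each element is appended to the result exactly once, which is why the merge costs N operations.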

Now that we understand this trick, here is my pseudocode for merge sort:

array mergeSort(array a)
   //base case: an array with at most one element is already sorted
   if(length(a)<=1)
      return a;
   end if

   //recursive calls
   [left_array right_array] := split_into_2_equally_sized_arrays(a);
   array new_left_array := mergeSort(left_array);
   array new_right_array := mergeSort(right_array);

   //merging the 2 small ordered arrays into a big one
   array result := merge(new_left_array,new_right_array);
   return result;
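For readers who prefer something runnable, here is one possible Python rendering of the pseudocode. It is a sketch, not the only way to write it: the merge step is delegated to `heapq.merge` from the standard library, which merges already sorted inputs, and the base case returns the array itself rather than its first element.

```python
from heapq import merge


def merge_sort(values):
    """Return a new sorted list, following the divide-and-conquer pseudocode."""
    # Base case: zero or one element is already sorted.
    if len(values) <= 1:
        return list(values)

    # Recursive calls on the two halves.
    middle = len(values) // 2
    left = merge_sort(values[:middle])
    right = merge_sort(values[middle:])

    # Merge the 2 small ordered lists into a big one.
    return list(merge(left, right))


print(merge_sort([5, 2, 8, 1, 9, 3, 7, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```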

Merge sort breaks a problem into smaller problems, then finds the results of the smaller problems to get the result of the initial problem (note: this kind of algorithm is called divide and conquer). If you don't understand this algorithm, don't worry; I didn't understand it the first time I saw it. If it helps, I see this algorithm as a two-phase algorithm:

  • The division phase, where the array is divided into smaller arrays
  • The sorting phase, where the small arrays are put together (using the merge) to form a bigger array.

Division phase

[Figure: division phase, an 8-element array split into unitary arrays in 3 steps]

During the division phase, the array is divided into unitary arrays in 3 steps. The formal number of steps is log(N) (since N = 8, log(N) = 3).

How do I know that?

I'm a genius! In one word: mathematics. The idea is that each step divides the size of the initial array by 2. The number of steps is the number of times you can divide the initial array by two. This is the exact definition of the logarithm (in base 2).

Sorting phase

[Figure: sorting phase, unitary arrays merged back into a sorted 8-element array in 3 steps]

In the sorting phase, you start with the unitary (single-element) arrays. During each step you apply several merges, and the overall cost of each step is N = 8 operations:

  • In the first step you have 4 merges that cost 2 operations each
  • In the second step you have 2 merges that cost 4 operations each
  • In the third step you have 1 merge that costs 8 operations

Since there are log(N) steps, the overall cost is N * log(N) operations.
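The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes N is a power of 2 and counts one operation per element written by a merge, as the text does.

```python
def merge_sort_cost(n):
    """Return (number of steps, cost of each step) for n elements, n a power of 2."""
    steps = n.bit_length() - 1       # log2(n) when n is a power of 2
    costs = []
    size = 1                         # size of the sorted arrays before the step
    for _ in range(steps):
        merges = n // (2 * size)     # number of merges in this step
        cost_per_merge = 2 * size    # each merge writes this many elements
        costs.append(merges * cost_per_merge)
        size *= 2
    return steps, costs


steps, costs = merge_sort_cost(8)
print(steps, costs, sum(costs))  # 3 [8, 8, 8] 24
```

Every step costs N operations and there are log(N) steps, which gives the N * log(N) total.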

The power of merge sort

Why is this algorithm so powerful?

Because:

  • You can modify it to reduce the memory footprint, so that you don't create new arrays but directly modify the input array.

Note: this kind of algorithm is called in-place (sorting without extra memory).

  • You can modify it to use disk space and a small amount of memory at the same time without a huge disk I/O penalty. The idea is to load into memory only the parts that are currently being processed. This is important when you need to sort a multi-gigabyte table with only a 100-megabyte memory buffer.

Note: this kind of algorithm is called external sorting.

  • You can modify it to run on multiple processes/threads/servers.

For example, the distributed merge sort is one of the key components of Hadoop (which is THE framework in Big Data).

  • This algorithm can turn lead into gold (true fact!).

This sorting algorithm is used in most (if not all) databases, but it is not the only one. If you want to know more, you can read this research paper, which discusses the pros and cons of the common sorting algorithms in a database.
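To give a feel for the external sorting idea mentioned above, here is a simplified Python sketch. It is not how any particular database does it. The function name, the chunk size and the one-integer-per-line file format are all assumptions made for this example, and the input is an in-memory list only to keep the sketch short.

```python
import heapq
import os
import tempfile


def external_sort(values, chunk_size):
    """Sort integers using sorted runs on disk, chunk_size values at a time."""
    run_paths = []
    try:
        # Phase 1: sort small chunks in memory and write each sorted run to disk.
        for start in range(0, len(values), chunk_size):
            chunk = sorted(values[start:start + chunk_size])
            with tempfile.NamedTemporaryFile("w", delete=False, suffix=".run") as run:
                run.writelines(f"{value}\n" for value in chunk)
                run_paths.append(run.name)

        # Phase 2: merge the runs, reading each file line by line
        # so that only the current element of each run is held in memory.
        files = [open(path) for path in run_paths]
        try:
            streams = [(int(line) for line in f) for f in files]
            return list(heapq.merge(*streams))
        finally:
            for f in files:
                f.close()
    finally:
        for path in run_paths:
            os.remove(path)


print(external_sort([9, 3, 7, 1, 8, 2, 6, 4, 5], chunk_size=4))
```

A real implementation would also stream the input and the final result instead of holding them in lists, and would pay attention to buffer sizes and the number of runs merged at once.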

Array, Tree and Hash Table

Now that we understand the idea behind time complexity and sorting, I have to tell you about 3 data structures. This is important because they are the backbone of modern databases. I'll also introduce the notion of a database index.

Array

The two-dimensional array is the simplest data structure. A table can be seen as an array. For example:

[Figure: a table represented as a two-dimensional array]

This 2-dimensional array is a table with rows and columns:

  • Each row represents a subject
  • The columns store the features that describe the subjects.
  • Each column stores a certain type of data (integer, string, date ...).

This is convenient for storing and visualizing data; however, when you need to look for a specific value, it is not well suited.

For example, if you want to find all the guys who work in the UK, you will have to look at every row to find out whether that row belongs to the UK. This will cost you N operations, where N is the number of rows, which is not bad, but could there be a faster way? This is where trees come into play.
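As a small illustration of this full scan, here is a Python sketch. The rows and column layout are hypothetical, since the table from the figure is not reproduced here.

```python
# Hypothetical rows: (id, name, country). Not the table from the figure.
rows = [
    (1, "Alice", "UK"),
    (2, "Bruno", "France"),
    (3, "Chen", "UK"),
    (4, "Dana", "USA"),
]


def find_by_country(table, country):
    """Scan every row and keep those whose country column matches."""
    matches = []
    rows_examined = 0
    for row in table:
        rows_examined += 1           # every row has to be looked at
        if row[2] == country:
            matches.append(row)
    return matches, rows_examined


matches, rows_examined = find_by_country(rows, "UK")
print(matches)         # the two UK rows
print(rows_examined)   # 4: all N rows were examined
```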

Note: Most modern databases provide advanced arrays to store tables efficiently, such as heap-organized tables or index-organized tables. But this doesn't change the problem of quickly finding a specific condition on a group of columns.

Tree and database index

A binary search tree is a binary tree with a special property: the key in each node must be:

  • greater than all the keys stored in the left subtree
  • smaller than all the keys stored in the right subtree

Let's see what this means visually.

The idea

[Figure: a binary search tree with 15 nodes, rooted at key 136]

This tree has N = 15 elements. Let's say I'm looking for 208:

  • I start at the root, whose key is 136. Since 136 < 208, I look at the right subtree of node 136.
  • 398 > 208, so I look at the left subtree of node 398.
  • 250 > 208, so I look at the left subtree of node 250.
  • 200 < 208, so I look at the right subtree of node 200. But 200 has no right subtree, so the value doesn't exist (because if it did exist, it would be in the right subtree of 200).

Now let's say I'm looking for 40:

  • I start at the root, whose key is 136. Since 136 > 40, I look at the left subtree of node 136.
  • 80 > 40, so I look at the left subtree of node 80.
  • 40 = 40, the node exists. I extract the row ID stored in the node (it is not shown in the figure) and look in the table for that row ID.
  • Knowing the row ID tells me exactly where the data is in the table, so I can get it instantly.

In the end, both searches cost me the number of levels in the tree. If you read the part on merge sort carefully, you should see that there are log(N) levels. So the cost of the search is log(N), not bad!
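The two searches can be replayed with a small Python sketch. The tree below is hypothetical: it contains the keys named in the walk-through (136, 80, 398, 40, 250, 200) plus a few made-up ones, not the full 15-node tree of the figure, and the `row_id` values are invented.

```python
class Node:
    def __init__(self, key, row_id=None):
        self.key = key
        self.row_id = row_id      # where the row lives in the table (made up here)
        self.left = None
        self.right = None


def insert(root, key, row_id=None):
    """Insert a key into the binary search tree and return the (new) root."""
    if root is None:
        return Node(key, row_id)
    if key < root.key:
        root.left = insert(root.left, key, row_id)
    elif key > root.key:
        root.right = insert(root.right, key, row_id)
    return root


def search(root, key):
    """Return (node or None, list of keys visited on the way down)."""
    visited = []
    node = root
    while node is not None:
        visited.append(node.key)
        if key == node.key:
            return node, visited
        node = node.left if key < node.key else node.right
    return None, visited


root = None
for key in (136, 80, 398, 40, 100, 250, 500, 200, 300):
    root = insert(root, key, row_id=f"row-{key}")

print(search(root, 208))     # (None, [136, 398, 250, 200]): not found
node, path = search(root, 40)
print(node.row_id, path)     # row-40 [136, 80, 40]
```

This plain tree does not rebalance itself, so the log(N) cost only holds while the tree stays balanced.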

Back to our problem

But this is very abstract, so let's go back to our problem. Instead of a simple integer, imagine the string that represents the country of someone in the previous table. Suppose you have a tree that contains the "country" field (column 3) of the table:

  • If you want to know who works in the UK
  • you look in the tree to get the node that represents the UK
  • inside the "UK node" you will find the locations of the rows of the UK workers.

This search only costs log(N) operations, instead of N operations if you use the array directly. What you have just imagined is a database index.
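Here is a rough Python sketch of that idea. Note the substitution: a sorted list searched with the standard `bisect` module stands in for the tree, since both locate a key in about log(N) steps. The rows are hypothetical and the names are invented for this example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical table rows: (name, age, country). Column 3 is the country.
rows = [
    ("Alice", 34, "UK"),
    ("Bruno", 41, "France"),
    ("Chen", 29, "UK"),
    ("Dana", 52, "USA"),
    ("Emil", 38, "Germany"),
]

# The "index": (country, row position) pairs kept sorted by country.
index = sorted((row[2], position) for position, row in enumerate(rows))
index_keys = [country for country, _ in index]


def rows_for_country(country):
    """Locate the country in the sorted index, then fetch the matching rows."""
    start = bisect_left(index_keys, country)    # binary search: about log(N) steps
    end = bisect_right(index_keys, country)
    return [rows[position] for _, position in index[start:end]]


print(rows_for_country("UK"))   # the rows for Alice and Chen
```

The table itself is never scanned: only the index is searched, and the row positions it stores lead straight to the matching rows.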

You can build a tree index for any group of columns (a string, an integer, 2 strings, an integer and a string, a date ...) as long as you have a function to compare the keys (i.e. the group of columns), so that you can establish an order among the keys (which is the case for any basic type in a database).

B+Tree Index

Although this tree works well to get a specific value, there is a BIG problem when you need to get multiple elements between two values. It will cost O(N) because you will have to look at each node in the tree and check whether it lies between these two values (e.g. with an in-order traversal of the tree). Moreover, this operation is not disk I/O friendly, since you will have to read the full tree. We need to find a way to efficiently do a range query. To solve this problem, modern databases use a modified version of the previous tree called a B+Tree. In a B+Tree:

  • only the lowest nodes (the leaves) store information (the location of the rows in the associated table)
  • the other nodes are just there to route to the right node during the search.

[Figure: a B+Tree whose leaf nodes are linked to their successors]

As you can see, there are more nodes here (twice as many). Indeed, you have additional nodes, the "decision nodes", that help you find the right node (the one that stores the location of the rows in the associated table). But the search complexity is still O(log(N)) (there is just one more level). The big difference is that the lowest nodes are linked to their successors.

With this B+Tree, if you are looking for values between 40 and 100:

  • You just have to look for 40 (or the closest value after 40 if 40 doesn't exist) as you did with the previous tree.
  • Then gather the successors of 40 using the direct links to the successors until you reach 100.

Let's say you found M successors and the tree has N nodes. The search for a specific node costs log(N), like the previous tree. But once you have this node, you get the M successors in M operations using the links to their successors. This search only costs M + log(N) operations versus N operations with the previous tree. Moreover, you don't need to read the full tree (just M + log(N) nodes), which means less disk usage. If M is small (e.g. 200 rows) and N is large (1 000 000 rows), it makes a BIG difference.
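The M + log(N) behaviour can be sketched in Python without building a full B+Tree. Here a sorted list stands in for the linked leaf level (an assumption of this sketch): a binary search finds the starting point, then the code walks forward from successor to successor. The keys are made up.

```python
from bisect import bisect_left

# Hypothetical leaf level: keys in sorted order, each "linked" to the next one.
leaf_keys = [10, 25, 40, 55, 70, 85, 100, 120, 150]


def range_query(sorted_keys, low, high):
    """Return all keys between low and high, both included."""
    # About log(N) steps: find low, or the closest key after it.
    position = bisect_left(sorted_keys, low)
    result = []
    # M steps: follow the successors until the upper bound is passed.
    while position < len(sorted_keys) and sorted_keys[position] <= high:
        result.append(sorted_keys[position])
        position += 1
    return result


print(range_query(leaf_keys, 40, 100))  # [40, 55, 70, 85, 100]
```

Keys outside the range are never visited, apart from the one that ends the walk.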

But there are new problems (again!). If you add or remove a row in a database (and therefore in the associated B+Tree index):

  • You have to keep the order between the nodes inside the B+Tree, otherwise you won't be able to find nodes in an unsorted tree.
  • You have to keep the lowest possible number of levels in the B+Tree, otherwise the O(log(N)) time complexity becomes O(N).

In other words, the B+Tree needs to be self-ordered and self-balanced. Thankfully, this is possible with smart deletion and insertion operations. But this comes at a cost: insertions and deletions in a B+Tree cost O(log(N)). That is why some of you have heard that using too many indexes is not a good idea. Indeed, you are slowing down the fast insertion/update/deletion of a row in a table, since the database needs to update the indexes of the table with a costly O(log(N)) operation per index. Moreover, adding indexes means more work for the transaction manager (which will be described at the end of the article).

For more details, you can look at the Wikipedia article about the B+Tree. If you want an example of a B+Tree implementation in a database, look at this article and this article from a core developer of MySQL. They both focus on how InnoDB (the engine of MySQL) handles indexes.

Note: A reader told me that, because of low-level optimizations, the B+Tree needs to be fully balanced.

Hash table

Our last important data structure is the hash table. It is very useful when you want to quickly look up values. Moreover, understanding the hash table will help us later to understand a common database join operation called the hash join. This data structure is also used by a database to store some internal things (e.g. the lock table or the buffer pool; we'll see both concepts later).

The hash table is a data structure that quickly finds an element by its key. To build a hash table you need to define:

  • a key for your elements
  • a hash function for the keys. The computed hashes of the keys give the locations of the elements (called buckets).
  • a function to compare the keys. Once you have found the right bucket, you have to find the element you are looking for inside the bucket using this comparison.

A simple example

Let's take a concrete example:

[Figure: a hash table with 10 buckets, 5 of which are drawn]

This hash table has 10 buckets. Since I'm lazy, I only drew 5 buckets, but I know you're smart, so I'll let you imagine the 5 others. The hash function I used is the modulo 10 of the key. In other words, I only keep the last digit of the key of an element to find its bucket:

  • if the last digit is 0, the element ends up in bucket 0,
  • if the last digit is 1, the element ends up in bucket 1,
  • if the last digit is 2, the element ends up in bucket 2,
  • ...

The comparison function I used is simply the equality between two integers.

Let's say you want to get the element 78:

  • The hash table computes the hash code for 78, which is 8.
  • It looks in bucket 8, and the first element it finds is 78.
  • It gives you back the element 78.
  • The search only costs 2 operations (one to compute the hash value and another to find the element inside the bucket).

Now let's say you want to get the element 59:

  • The hash table computes the hash code for 59, which is 9.
  • It looks in bucket 9, and the first element it finds is 99. Since 99 != 59, element 99 is not the right element.
  • Using the same logic, it looks at the second element (9), the third (79), ..., and the last (29).
  • The element doesn't exist.
  • The search costs 7 operations.
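The two lookups above can be reproduced with a short Python sketch. The contents of bucket 9 are partly guessed: the walk-through names 99, 9, 79 and 29, so 89 and 39 are made-up fillers chosen to reach the 6 elements implied by the 7 operations.

```python
NUM_BUCKETS = 10


def build_table(keys, num_buckets=NUM_BUCKETS):
    """Place each key in the bucket given by key modulo num_buckets."""
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[key % num_buckets].append(key)
    return buckets


def lookup(buckets, key):
    """Return (found, number of operations) for a key."""
    operations = 1                           # computing the hash
    for candidate in buckets[key % len(buckets)]:
        operations += 1                      # one comparison per element in the bucket
        if candidate == key:
            return True, operations
    return False, operations


# 89 and 39 are invented fillers for bucket 9 (see the note above).
table = build_table([78, 99, 9, 79, 89, 39, 29])
print(lookup(table, 78))   # (True, 2)
print(lookup(table, 59))   # (False, 7)
```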

A good hash function

As you can see, depending on the value you are looking for, the cost is not the same!

If I now change the hash function to the modulo 1 000 000 of the key (i.e. taking the last 6 digits), the second search only costs 1 operation, because there are no elements in bucket 000059. The real challenge is to find a good hash function that will create buckets containing a very small number of elements.
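The effect of the hash function on the cost can be seen in a small Python sketch. The keys are illustrative (only some of them come from the example above), and the buckets are kept in a dictionary so that empty buckets take no space, which is an implementation shortcut of this sketch.

```python
from collections import defaultdict


def lookup_cost(keys, num_buckets, target):
    """Count the operations needed to look for target with a modulo hash."""
    buckets = defaultdict(list)              # only non-empty buckets are stored
    for key in keys:
        buckets[key % num_buckets].append(key)

    operations = 1                           # computing the hash
    for candidate in buckets.get(target % num_buckets, []):
        operations += 1                      # one comparison per element in the bucket
        if candidate == target:
            break
    return operations


keys = [78, 99, 9, 79, 89, 39, 29]           # illustrative keys
print(lookup_cost(keys, 10, 59))             # 7: bucket 9 is crowded
print(lookup_cost(keys, 1_000_000, 59))      # 1: bucket 59 is empty
```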

In my example, finding a good hash function is easy. But this is a simple example; finding a good hash function is more difficult when the key is:

  • a string (for example, the last name of a person)
  • 2 strings (for example, the last name and the first name of a person)
  • 2 strings and a date (for example, the last name, the first name and the birth date of a person)
  • ...

With a good hash function, a lookup in a hash table costs O(1).

Array vs hash table

Why not use an array?

Hmm, good question.

  • A hash table can be half loaded in memory, and the other buckets can stay on disk.
  • With an array you have to use a contiguous space in memory. If you are loading a large table, it is very difficult to find enough contiguous space.
  • With a hash table you can choose the key you want (for example, the country AND the last name of a person).

For more information, you can read the article on the Java HashMap, which is an efficient implementation of a hash table; you don't need to understand Java to understand the concepts in that article.

Source: www.hab.com
