An Overview of Agile DWH Design Methodologies

Developing a data warehouse is a long and serious undertaking.

Much of a project's life depends on how well the object model and the base structure were thought out at the start.

The generally accepted approach has been, and remains, some combination of the star schema and third normal form. As a rule, it follows the principle: source data in 3NF, data marts as a star. This approach, tested by time and backed by a large body of research, is the first (and sometimes the only) thing that comes to mind for an experienced DWH specialist thinking about what an analytical warehouse should look like.

On the other hand, business in general and customer requirements in particular tend to change quickly, and data tends to grow both "in depth" and "in breadth". This is where the main drawback of the star shows up: limited flexibility.

And if, in your quiet and cozy life as a DWH developer, suddenly:

  • a task appeared to "do at least something quickly, and then we'll see";
  • a rapidly developing project appeared, with new sources being connected and the business model reworked at least once a week;
  • a customer appeared who has no idea what the system should look like or what functions it should ultimately perform, but is ready to experiment and keep refining the desired result while steadily approaching it;
  • the project manager dropped by with good news: "And now we have agile!"

Or if you are simply curious about how else warehouses can be built, welcome under the cut!

What does "flexibility" mean?

First, let's define which properties a system must have to be called "flexible".

Separately, it is worth noting that the properties described here relate specifically to the system, not to the process of developing it. So if you wanted to read about Agile as a development methodology, it is better to read other articles. For example, right here on Habr there are plenty of interesting materials (overviews and practical pieces as well as problem-oriented ones).

This does not mean that the development process and the structure of the data warehouse are completely unrelated. In general, it should be considerably easier to develop a warehouse with an agile architecture using Agile. In practice, however, you more often meet Agile development of a classic Kimball DWH, or Waterfall development of a Data Vault, than a happy coincidence of both forms of flexibility on one project.

So, which capabilities should a flexible warehouse have? There are three points:

  1. Early delivery and fast turnaround: ideally the first business result (for example, the first working reports) should be available as early as possible, that is, before the whole system is fully designed and implemented. Moreover, each subsequent revision should also take as little time as possible.
  2. Iterative refinement: each subsequent improvement should ideally not affect functionality that already works. This is often the biggest nightmare on large projects: sooner or later, individual objects acquire so many dependencies that it becomes easier to duplicate the logic entirely in a nearby copy than to add a field to an existing table. And if you are surprised that analyzing the impact of an improvement on existing objects can take longer than the improvement itself, you most likely have not yet worked with large data warehouses in banking or telecom.
  3. Constant adaptation to changing business requirements: the overall object structure should be designed not merely with possible extension in mind, but on the expectation that the direction of the next extension could not even have been dreamed of at the design stage.

And yes, meeting all of these requirements in one system is possible (in certain cases and with certain reservations, of course).

Below I will look at two of the most popular agile design methodologies for data warehouses: Anchor Model and Data Vault. Left out of scope are such excellent techniques as EAV, 6NF (in its pure form) and everything related to NoSQL solutions. This is not because they are somehow worse, and not even because the article would then threaten to reach the size of an average dissertation. It is simply that all of them belong to a slightly different class of solutions: either techniques you can apply in specific cases regardless of the overall architecture of your project (like EAV), or globally different paradigms of storing information (such as graph databases and other NoSQL options).

Problems of the "classical" approach and their solutions in flexible methodologies

By the "classical" approach I mean the good old star (regardless of the specific implementation of the underlying layers, may the followers of Kimball, Inmon and CDM forgive me).

1. Rigid cardinality of relationships

This model is based on a clear division of data into dimensions and facts. And that is, frankly, logical: in the overwhelming majority of cases data analysis comes down to analyzing certain numeric indicators (facts) across certain slices (dimensions).

Relationships between objects are established as relationships between tables using a foreign key. This looks quite natural, but it immediately leads to the first limitation on flexibility: a strict definition of relationship cardinality.

This means that at the table design stage you must determine precisely, for each pair of related objects, whether they relate as many-to-many or only as one-to-many, and "in which direction". This directly determines which table will hold the primary key and which the foreign key. Changing this relationship when new requirements arrive will most likely mean reworking the base.

For example, when designing the "cash receipt" object, you relied on the solemn assurances of the sales department and allowed for one promotion to apply to several receipt line items (but not the other way around).

Some time later, colleagues introduced a new marketing strategy in which several promotions can apply to the same line item at once. Now you need to rework the tables by splitting the relationship out into a separate object.

(All derived objects in which the receipt is joined to the promotion now need to be reworked as well.)

Relationships in Data Vault and Anchor Model

Avoiding this situation turned out to be quite simple: you do not need to trust the sales department. It is enough to store all relationships in separate tables from the start and treat them as many-to-many.

This approach was proposed by Dan Linstedt as part of the Data Vault paradigm and is fully supported by Lars Rönnbäck in the Anchor Model.

As a result, we get the first distinguishing feature of flexible methodologies:

Relationships between objects are not stored in the attributes of parent entities; they are a separate type of object.

In Data Vault such linking tables are called Links, and in Anchor Model they are called Ties. At first glance they are very similar, although their differences do not end with the name (as discussed below). In both architectures, link tables can connect any number of entities (not necessarily 2).

This redundancy, at first glance, provides significant flexibility for modifications. Such a structure becomes tolerant not only of changes in the cardinality of existing links, but also of adding new ones: if a receipt line item now also gets a link to the cashier who rang it up, that link simply appears as an add-on on top of the existing tables, without affecting any existing objects or processes.
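As a rough illustration of the idea, here is a minimal sketch in Python with the standard `sqlite3` module. All table and column names (`receipt_item`, `promotion`, `link_item_promotion`, `cashier`, `link_item_cashier`) are made up for this example and are not prescribed by Data Vault or Anchor Model. The sketch shows that moving from "one promotion per item" to "several promotions per item" only adds a row, and that a new relationship only adds a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE receipt_item (item_id INTEGER PRIMARY KEY, product TEXT);
CREATE TABLE promotion    (promo_id INTEGER PRIMARY KEY, name TEXT);
-- The relationship lives in its own table, so its cardinality is not
-- baked into either parent table (hypothetical naming).
CREATE TABLE link_item_promotion (
    item_id  INTEGER NOT NULL REFERENCES receipt_item(item_id),
    promo_id INTEGER NOT NULL REFERENCES promotion(promo_id),
    PRIMARY KEY (item_id, promo_id)
);
""")
conn.executemany("INSERT INTO receipt_item VALUES (?, ?)",
                 [(1, "milk"), (2, "bread")])
conn.executemany("INSERT INTO promotion VALUES (?, ?)",
                 [(10, "spring sale"), (11, "loyalty card")])

# Original requirement: one promotion applies to several items.
conn.executemany("INSERT INTO link_item_promotion VALUES (?, ?)",
                 [(1, 10), (2, 10)])
# New requirement: several promotions on the same item. Only a row is added.
conn.execute("INSERT INTO link_item_promotion VALUES (?, ?)", (1, 11))

# A new relationship (item to cashier) is one more table on top of the
# existing ones; receipt_item itself is not altered.
conn.executescript("""
CREATE TABLE cashier (cashier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE link_item_cashier (
    item_id    INTEGER NOT NULL REFERENCES receipt_item(item_id),
    cashier_id INTEGER NOT NULL REFERENCES cashier(cashier_id),
    PRIMARY KEY (item_id, cashier_id)
);
""")

promos_for_item_1 = [row[0] for row in conn.execute(
    "SELECT p.name FROM link_item_promotion l "
    "JOIN promotion p ON p.promo_id = l.promo_id "
    "WHERE l.item_id = 1 ORDER BY p.promo_id")]
item_columns = [row[1] for row in conn.execute("PRAGMA table_info(receipt_item)")]
```

The cost of this design is visible even here: reading the promotions of an item already needs a join.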

2. Data duplication

The second problem solved by flexible architectures is less obvious and is primarily characteristic of SCD2 dimensions (slowly changing dimensions of the second type), though not only of them.

In a classic warehouse, a dimension is usually a table containing a surrogate key (as the PK) plus a set of business keys and attributes in separate columns.

If the dimension supports versioning, version validity bounds are added to the standard set of fields, and several versions appear in the warehouse for one source row (one for each change in the versioned attributes).

If a dimension contains at least one frequently changing versioned attribute, the number of versions of that dimension will be impressive (even if the remaining attributes are not versioned or never change), and if there are several such attributes, the number of versions grows rapidly with their number. Such a dimension can take up a significant amount of disk space, even though most of the data it stores is simply duplicated values of unchanged attributes from other rows.
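To make the duplication concrete, here is a small sketch in plain Python. The attribute names and the ten daily changes are invented for the illustration; the helper `scd2_versions` is hypothetical and only mimics how a wide SCD2 row is copied on every change.

```python
def scd2_versions(initial, changes):
    """Build SCD2-style rows: every change copies the whole row
    and replaces a single attribute value."""
    rows = [dict(initial)]
    for attr, value in changes:
        new_row = dict(rows[-1])
        new_row[attr] = value
        rows.append(new_row)
    return rows


# One customer with two stable attributes and one that changes daily.
initial = {"region": "North", "category": "retail", "last_order": "2020-01-01"}
changes = [("last_order", f"2020-01-{day:02d}") for day in range(2, 12)]

rows = scd2_versions(initial, changes)

# Total attribute values stored in the wide dimension.
stored_cells = sum(len(row) for row in rows)
# Values that merely repeat the previous version of the same attribute.
repeated_cells = sum(
    1
    for prev, cur in zip(rows, rows[1:])
    for attr in cur
    if cur[attr] == prev[attr]
)
```

In this toy case 11 versions hold 33 attribute values, 20 of which are copies of `region` and `category` that never changed.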

At the same time, denormalization is also used very often: some attributes are deliberately stored as values rather than as references to a reference table or another dimension. This approach speeds up data access by reducing the number of joins when querying the dimension.

Typically this leads to the same information being stored in several places at once. For example, information about the region of residence and the customer's category may be stored simultaneously in the "Customer" dimension, in the "Purchase", "Delivery" and "Call center calls" facts, and in the "Customer - Customer manager" link table.

In general, the above also applies to ordinary (non-versioned) dimensions, but in versioned ones the scale can be different: the appearance of a new version of an object (especially retroactively) leads not just to updating all related tables, but to a cascade of new versions of related objects, when Table 1 is used to build Table 2, Table 2 is used to build Table 3, and so on. Even if no attribute of Table 1 is involved in building Table 3 (while other attributes of Table 2, obtained from other sources, are), versioning this construction will at a minimum lead to extra overhead, and at a maximum to extra versions in Table 3, which has nothing to do with the change at all, and further down the chain.

3. Non-linear growth in the complexity of rework

Meanwhile, each new data mart built on top of another increases the number of places where data can "diverge" when changes are made to the ETL. This, in turn, increases the complexity (and duration) of each subsequent revision.

If the above describes systems with rarely modified ETL processes, you can live in such a paradigm: you just need to make sure that new changes are applied correctly to all related objects. If revisions happen frequently, the probability of accidentally "missing" a few dependencies increases significantly.

If, in addition, we take into account that "versioned" ETL is considerably more complex than "non-versioned" ETL, it becomes quite hard to avoid mistakes when this whole setup is revised frequently.

Storing objects and attributes in Data Vault and Anchor Model

The approach proposed by the authors of flexible architectures can be formulated as follows:

Separate what changes from what stays the same. That is, store keys separately from attributes.

However, do not confuse a non-versioned attribute with an unchanging one: the first does not keep a history of its changes but can change (for example, when an input error is corrected or new data arrives); the second never changes.

Views differ on what exactly can be considered unchanging in Data Vault and in Anchor Model.

From the Data Vault architectural point of view, the whole set of keys can be considered unchanging: natural keys (the organization's TIN, the product code in the source system, etc.) and surrogate keys. The remaining attributes can then be divided into groups by source and/or frequency of change, with a separate table maintained for each group with an independent set of versions.

In the Anchor Model paradigm, only the entity's surrogate key is considered unchanging. Everything else (including natural keys) is just a special case of its attributes. All attributes are independent of each other by default, so a separate table must be created for each attribute.

In Data Vault, tables containing entity keys are called Hubs. Hubs always contain a fixed set of fields:

  • Natural entity keys
  • Surrogate key
  • Reference to the source
  • Record load time

Records in Hubs never change and have no versions. Outwardly, hubs are very similar to the ID-map tables used in some systems to generate surrogates; in Data Vault, however, it is recommended to use a hash of the set of business keys as the surrogate. This approach simplifies loading links and attributes from sources (there is no need to join to the hub to get the surrogate, you just compute the hash of the natural key), but it can cause other problems (related, for example, to collisions, letter case and non-printable characters in string keys, etc.).
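A minimal sketch of such a hash key in Python follows. The function name `hub_hash_key`, the choice of MD5, the `||` separator and the normalization rules (trim whitespace, upper-case) are all assumptions made for this illustration; a real project would fix its own conventions, precisely because of the case and whitespace pitfalls mentioned above.

```python
import hashlib


def hub_hash_key(*business_keys, sep="||"):
    """Derive a deterministic surrogate from one or more business keys.

    Each key part is trimmed and upper-cased before hashing so that
    'abc-1 ' and 'ABC-1' map to the same surrogate. The separator keeps
    ('AB', 'C') and ('A', 'BC') from producing the same input string.
    """
    normalized = [str(key).strip().upper() for key in business_keys]
    payload = sep.join(normalized).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


# The same key computed independently while loading a hub, a link or a
# satellite yields the same surrogate, so no lookup join is needed.
key_from_hub_load = hub_hash_key("abc-1 ", "crm")
key_from_sat_load = hub_hash_key("ABC-1", "CRM")
```

Normalization of this kind does not remove the collision risk of the hash itself; it only makes the input deterministic.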

All other entity attributes are stored in special tables called Satellites. One hub can have several satellites storing different sets of attributes.

Attributes are distributed among satellites on the principle of changing together: one satellite can hold non-versioned attributes (for example, a person's date of birth and SNILS, the Russian personal insurance account number), another can hold rarely changing versioned ones (for example, surname and passport number), and a third can hold frequently changing ones (for example, delivery address, category, date of last order, etc.). Versioning is done at the level of individual satellites, not of the entity as a whole, so it is advisable to distribute attributes so that the overlap of versions within one satellite is minimal (which reduces the total number of stored versions).

Also, to optimize the data loading process, attributes obtained from different sources are often placed in separate satellites.

Satellites link to the Hub via a foreign key (which corresponds to one-to-many cardinality). This means that multi-valued attributes (for example, several contact phone numbers for one customer) are supported by this "default" architecture.
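The following sketch, using Python's standard `sqlite3` module, shows one hub with two satellites split by change frequency. The table names (`hub_customer`, `sat_customer_identity`, `sat_customer_activity`), the column set and the helper `load_satellite` are hypothetical, and the "insert only if changed" rule is a simplified stand-in for a real loading procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,   -- surrogate (e.g. a hash of the key)
    customer_bk   TEXT NOT NULL,      -- natural (business) key
    record_source TEXT NOT NULL,
    load_ts       TEXT NOT NULL
);
-- Rarely changing attributes.
CREATE TABLE sat_customer_identity (
    customer_hk TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_ts     TEXT NOT NULL,
    surname     TEXT,
    passport_no TEXT,
    PRIMARY KEY (customer_hk, load_ts)
);
-- Frequently changing attributes.
CREATE TABLE sat_customer_activity (
    customer_hk      TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_ts          TEXT NOT NULL,
    delivery_address TEXT,
    last_order_date  TEXT,
    PRIMARY KEY (customer_hk, load_ts)
);
""")


def load_satellite(table, hub_key, load_ts, **attrs):
    """Insert a new version only if the values differ from the latest one."""
    cols = list(attrs)
    latest = conn.execute(
        f"SELECT {', '.join(cols)} FROM {table} "
        "WHERE customer_hk = ? ORDER BY load_ts DESC LIMIT 1",
        (hub_key,),
    ).fetchone()
    if latest == tuple(attrs.values()):
        return False  # nothing changed in this satellite
    placeholders = ", ".join("?" for _ in range(len(cols) + 2))
    conn.execute(
        f"INSERT INTO {table} (customer_hk, load_ts, {', '.join(cols)}) "
        f"VALUES ({placeholders})",
        (hub_key, load_ts, *attrs.values()),
    )
    return True


conn.execute("INSERT INTO hub_customer VALUES ('hk1', 'C-001', 'crm', '2020-01-01')")
for day in ("2020-01-01", "2020-01-02", "2020-01-03"):
    load_satellite("sat_customer_identity", "hk1", day,
                   surname="Ivanova", passport_no="1234 567890")
    load_satellite("sat_customer_activity", "hk1", day,
                   delivery_address="Main St 1", last_order_date=day)

identity_versions = conn.execute(
    "SELECT COUNT(*) FROM sat_customer_identity").fetchone()[0]
activity_versions = conn.execute(
    "SELECT COUNT(*) FROM sat_customer_activity").fetchone()[0]
```

After three daily loads the stable satellite holds a single version, while only the frequently changing one has accumulated three.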

In Anchor Model, tables storing keys are called Anchors. They store:

  • Surrogate keys only
  • Reference to the source
  • Record load time

Natural keys, from the Anchor Model point of view, are considered ordinary attributes. This option may seem harder to grasp, but it gives much more scope for identifying an object.

For example, data about the same entity may come from different systems, each of which uses its own natural key. In Data Vault this can lead to rather cumbersome constructions of several hubs (one per source plus a unifying master version), whereas in Anchor Model the natural key of each source goes into its own attribute and can be used during loading independently of all the others.

But there is also one insidious point here: if attributes from different systems are combined in one entity, there are most likely some "gluing" rules by which the system must understand that records from different sources correspond to one instance of the entity.

In Data Vault these rules will most likely determine how the "surrogate hub" of the master entity is formed and will in no way affect the Hubs that store the sources' natural keys and their original attributes. If at some point the merging rules change (or the attributes on which merging is based are updated), it will be enough to rebuild the surrogate hubs.

In Anchor Model such an entity will most likely be stored in a single anchor. This means that all attributes, regardless of which source they came from, will be bound to the same surrogate. Separating mistakenly merged records and, in general, tracking the validity of merging in such a system can be considerably harder, especially if the rules are complex and change frequently and the same attribute can be obtained from different sources (although it is certainly possible, since each attribute version retains a reference to its source).

In any case, if your system is supposed to implement deduplication, record merging and other MDM elements, it is worth paying special attention to how natural keys are stored in agile methodologies. The more cumbersome Data Vault design may suddenly turn out to be safer in terms of merge errors.

Anchor Model also provides an additional object type called a Knot, which is essentially a special degenerate kind of anchor that can contain only one attribute. Knots are meant for storing flat reference lists (for example, gender, marital status, customer service category, etc.). Unlike an Anchor, a Knot has no associated attribute tables, and its single attribute (the name) is always stored in the same table as the key. Knots are connected to Anchors via Tie tables in the same way that Anchors are connected to each other.
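A simplified sketch of these building blocks, again with Python's `sqlite3`, may help. Every name here (`anchor_customer`, `attr_customer_crm_code`, `attr_customer_billing_no`, `attr_customer_address`, `knot_gender`, `tie_customer_gender`, `find_customer`) is invented for the example, and the layout follows the description in this article rather than the official Anchor Modeling naming conventions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Anchor: surrogate key plus load metadata only.
CREATE TABLE anchor_customer (
    customer_id   INTEGER PRIMARY KEY,
    record_source TEXT NOT NULL,
    load_ts       TEXT NOT NULL
);
-- One table per attribute; the natural key of each source system
-- is just another attribute.
CREATE TABLE attr_customer_crm_code (
    customer_id INTEGER PRIMARY KEY REFERENCES anchor_customer(customer_id),
    crm_code    TEXT NOT NULL UNIQUE
);
CREATE TABLE attr_customer_billing_no (
    customer_id INTEGER PRIMARY KEY REFERENCES anchor_customer(customer_id),
    billing_no  TEXT NOT NULL UNIQUE
);
-- A versioned attribute keeps its own, independent history.
CREATE TABLE attr_customer_address (
    customer_id INTEGER NOT NULL REFERENCES anchor_customer(customer_id),
    valid_from  TEXT NOT NULL,
    address     TEXT NOT NULL,
    PRIMARY KEY (customer_id, valid_from)
);
-- Knot: the key and its single value live in the same table.
CREATE TABLE knot_gender (
    gender_id INTEGER PRIMARY KEY,
    gender    TEXT NOT NULL UNIQUE
);
-- Tie: connects the anchor to the knot and has no attributes of its own.
CREATE TABLE tie_customer_gender (
    customer_id INTEGER NOT NULL REFERENCES anchor_customer(customer_id),
    gender_id   INTEGER NOT NULL REFERENCES knot_gender(gender_id),
    PRIMARY KEY (customer_id, gender_id)
);
""")

conn.execute("INSERT INTO anchor_customer VALUES (1, 'crm', '2020-01-01')")
conn.execute("INSERT INTO attr_customer_crm_code VALUES (1, 'CRM-42')")
conn.execute("INSERT INTO attr_customer_billing_no VALUES (1, 'B-0007')")
conn.execute("INSERT INTO knot_gender VALUES (1, 'female')")
conn.execute("INSERT INTO tie_customer_gender VALUES (1, 1)")


def find_customer(table, column, value):
    """Resolve a surrogate from whichever natural key a source provides."""
    row = conn.execute(
        f"SELECT customer_id FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()
    return row[0] if row else None


by_crm = find_customer("attr_customer_crm_code", "crm_code", "CRM-42")
by_billing = find_customer("attr_customer_billing_no", "billing_no", "B-0007")
```

Both lookups resolve to the same surrogate, which is exactly why the merge rules that put both keys on one anchor deserve the attention discussed above.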

There is no clear consensus on the use of Knots. For example, Nikolay Golov, who actively promotes the use of Anchor Model in Russia, believes (not unreasonably) that for no reference list can you say with certainty that it will always be static and single-level, so it is better to use a full-fledged Anchor for all objects right away.

Another important difference between Data Vault and Anchor Model is whether relationships can have attributes:

In Data Vault, Links are full-fledged objects just like Hubs and can have their own attributes. In Anchor Model, Ties are used only to connect Anchors and cannot have their own attributes. This difference leads to substantially different approaches to modeling facts, which are discussed next.

Storing facts

Until now we have talked mainly about modeling dimensions. With facts, things are a little less clear-cut.

In Data Vault the typical object for storing facts is a Link, to whose Satellites the actual measures are added.

This approach looks intuitive. It gives easy access to the analyzed measures and generally resembles a traditional fact table (only the measures are stored not in the table itself but in a "neighboring" one). But there are pitfalls too: one of the typical model modifications, extending the fact key, requires adding a new foreign key to the Link. And this, in turn, "breaks" modularity and may require changes to other objects.

In Anchor Model a Tie cannot have its own attributes, so this approach will not work: absolutely all attributes and measures must be bound to a specific anchor. The conclusion is simple: every fact also needs its own anchor. For some of the things we are used to seeing as facts this may look natural: for example, the fact of a purchase maps perfectly onto an "order" or "receipt" object, a site visit maps onto a session, and so on. But there are also facts for which such a natural "carrier object" is not easy to find, for example stock balances in warehouses at the start of each day.

Accordingly, modularity problems when extending the fact key do not arise in Anchor Model (it is enough to add a new Tie to the corresponding Anchor), but designing a model to represent facts is less clear-cut: the resulting anchors may reflect the business object model in a non-obvious way.

How flexibility is achieved

In both cases the resulting construction contains significantly more tables than a traditional dimension. But it can take up significantly less disk space with the same set of versioned attributes as a traditional dimension. Naturally, there is no magic here: it is all about normalization. By distributing attributes across Satellites (in Data Vault) or individual tables (in Anchor Model), we reduce (or eliminate entirely) the duplication of values of some attributes when others change.

For Data Vault the gain depends on how attributes are distributed among Satellites, and for Anchor Model it is almost directly proportional to the average number of versions per dimension object.

However, the space saving is an important but not the main advantage of storing attributes separately. Together with separate storage of relationships, this approach gives the warehouse a modular architecture. This means that adding both individual attributes and entire new subject areas to such a model looks like a superstructure on top of the existing set of objects, without changing them. And that is exactly what makes the described methodologies flexible.

This also resembles the transition from piece production to mass production: in the traditional approach each model table is unique and requires special attention, whereas in flexible methodologies the model is a set of standard "parts". On the one hand, there are more tables, and the data loading and retrieval processes should look more complicated. On the other hand, they become standard, which means they can be automated and driven by metadata. The question "how are we going to load it?", whose answer could take up a significant part of the work of designing an improvement, is now simply moot (as is the question of how a model change affects running processes).

This does not mean that analysts are not needed at all in such a system: someone still has to work through the set of objects with attributes and figure out where and how to load it all from. But the amount of work, as well as the probability and cost of an error, is significantly reduced, both at the analysis stage and during ETL development, a substantial part of which can be reduced to editing metadata.
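As a hedged illustration of "editing metadata instead of writing ETL", the sketch below generates table definitions from a small dictionary. The `MODEL` structure, the `generate_ddl` helper and the anchor-style naming are assumptions made for this example only; real metadata-driven tooling for Data Vault or Anchor Model is considerably richer.

```python
import sqlite3

# Hypothetical metadata: entity name -> {attribute name: SQL type}.
MODEL = {
    "customer": {"surname": "TEXT", "delivery_address": "TEXT"},
    "product": {"name": "TEXT"},
}


def generate_ddl(model):
    """Produce one key table per entity and one table per attribute."""
    statements = []
    for entity, attributes in model.items():
        statements.append(
            f"CREATE TABLE IF NOT EXISTS anchor_{entity} "
            f"({entity}_id INTEGER PRIMARY KEY, record_source TEXT, load_ts TEXT)"
        )
        for attr, sql_type in attributes.items():
            statements.append(
                f"CREATE TABLE IF NOT EXISTS attr_{entity}_{attr} ("
                f"{entity}_id INTEGER NOT NULL "
                f"REFERENCES anchor_{entity}({entity}_id), "
                f"valid_from TEXT NOT NULL, {attr} {sql_type}, "
                f"PRIMARY KEY ({entity}_id, valid_from))"
            )
    return statements


conn = sqlite3.connect(":memory:")
for statement in generate_ddl(MODEL):
    conn.execute(statement)

# An "improvement" is a metadata edit; rerunning the generator only adds
# the missing table and leaves existing ones untouched.
MODEL["customer"]["category"] = "TEXT"
for statement in generate_ddl(MODEL):
    conn.execute(statement)

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

Because every table follows the same pattern, the same metadata could in principle drive the loading code as well, which is the point made above.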

The dark side

All of the above makes both approaches truly flexible, technologically advanced and suitable for iterative refinement. Of course, there is also a "fly in the ointment", which I think you can already guess.

The decomposition of data that underlies the modularity of flexible architectures leads to an increase in the number of tables and, accordingly, to join overhead when querying. To simply get all attributes of a dimension, a single select suffices in a classic warehouse, whereas a flexible architecture requires a whole series of joins. Moreover, while for reports all these joins can be written in advance, analysts accustomed to writing SQL by hand will suffer doubly.

There are several facts that ease this situation:

When working with large dimensions, all of their attributes are almost never used at the same time. This means there may be fewer joins than it seems at first glance at the model. Data Vault can also take the expected frequency of joint use into account when distributing attributes among satellites. Meanwhile, Hubs or Anchors themselves are needed mainly for generating and mapping surrogates at load time and are rarely used in queries (this is especially true for Anchors).

All joins are by key. In addition, the more "compressed" way of storing data reduces the overhead of scanning tables where that is needed (for example, when filtering by attribute value). As a result, a query against a normalized database with a bunch of joins can turn out even faster than scanning one heavy dimension with many versions per row.

For example, this article contains a detailed comparative performance test of Anchor Model against a query from a single table.

Much depends on the engine. Many modern platforms have internal join optimization mechanisms. For example, MS SQL and Oracle can "skip" joins to tables if their data is not used anywhere except in other joins and does not affect the final selection (table/join elimination), and MPP Vertica, according to the experience of colleagues from Avito, proved to be an excellent engine for Anchor Model, given some manual tuning of the query plan. On the other hand, storing an Anchor Model in, say, ClickHouse, which has limited join support, does not yet look like a very good idea.

In addition, both architectures have special techniques that make data access easier (both in terms of query performance and for end users). For example, Point-In-Time tables in Data Vault or special table functions in Anchor Model.
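A Point-In-Time table can be sketched roughly as follows, with Python's `sqlite3`. The satellite names, the two snapshot dates and the `pit_customer` layout are assumptions for the illustration: for each key and snapshot date the table stores which satellite version was current, so that a report query becomes a set of plain equality joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sat_identity (customer_hk TEXT, load_ts TEXT, surname TEXT);
CREATE TABLE sat_activity (customer_hk TEXT, load_ts TEXT, address TEXT);

INSERT INTO sat_identity VALUES ('hk1', '2020-01-01', 'Ivanova');
INSERT INTO sat_activity VALUES ('hk1', '2020-01-01', 'Main St 1'),
                                ('hk1', '2020-01-05', 'Oak Ave 7');

-- PIT table: for every key and snapshot date, remember the load_ts of the
-- satellite row that was current on that date.
CREATE TABLE pit_customer AS
SELECT s.customer_hk,
       d.snapshot_date,
       (SELECT MAX(i.load_ts) FROM sat_identity i
         WHERE i.customer_hk = s.customer_hk
           AND i.load_ts <= d.snapshot_date) AS identity_ts,
       (SELECT MAX(a.load_ts) FROM sat_activity a
         WHERE a.customer_hk = s.customer_hk
           AND a.load_ts <= d.snapshot_date) AS activity_ts
FROM (SELECT DISTINCT customer_hk FROM sat_identity) s
CROSS JOIN (SELECT '2020-01-03' AS snapshot_date
            UNION ALL SELECT '2020-01-06') d;
""")

# Consumers join on exact timestamps instead of searching version ranges.
rows = conn.execute("""
SELECT p.snapshot_date, i.surname, a.address
FROM pit_customer p
JOIN sat_identity i
  ON i.customer_hk = p.customer_hk AND i.load_ts = p.identity_ts
JOIN sat_activity a
  ON a.customer_hk = p.customer_hk AND a.load_ts = p.activity_ts
ORDER BY p.snapshot_date
""").fetchall()
```

The range search over versions is paid once, when the PIT table is built, rather than in every query.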

Summary

The main essence of the flexible architectures considered here is the modularity of their "design".

It is this property that makes the following possible:

  • After some initial preparation related to deploying metadata and writing basic ETL algorithms, you can quickly give the customer a first result in the form of a couple of reports containing data from just a few source objects. There is no need to think through the entire object model (even at a high level).
  • The data model can start working (and being useful) with just 2-3 objects and then grow gradually (regarding Anchor Model, Nikolay used a nice comparison with a mycelium).
  • Most improvements, including expanding the subject area and adding new sources, do not affect existing functionality and pose no risk of breaking something that already works.
  • Thanks to decomposition into standard elements, ETL processes in such systems look uniform, and writing them lends itself to algorithmization and, ultimately, automation.

The price of this flexibility is performance. This does not mean that acceptable performance cannot be achieved on such models. More often than not, you may simply need more effort and attention to detail to reach the desired metrics.

Appendices

Data Vault entity types

More about Data Vault:
Dan Linstedt's website
All about Data Vault in Russian
About Data Vault on Habr

Anchor Model entity types

More about Anchor Model:

Website of the Anchor Model creators
Article about the experience of implementing Anchor Model at Avito

Summary table of the common features and differences of the approaches considered:

Source: www.habr.com
