When You Operate at Sber Scale. Using Ab Initio with Hive and GreenPlum

Some time ago we faced the question of choosing an ETL tool for working with Big Data. The Informatica BDM solution we had used previously did not suit us because of its limited functionality: its use had been reduced to a framework for launching spark-submit commands. There were, in fact, few alternatives on the market capable of handling the volume of data we deal with every day. In the end we chose Ab Initio. During pilot demonstrations the product showed very high data processing speed. There is almost no information about Ab Initio in Russian, so we decided to share our experience on Habr.

Ab Initio has many classic and unusual transformations whose code can be extended using its own PDL language. For a small business such a powerful tool will probably be overkill, and most of its capabilities may prove expensive and go unused. But if your scale is close to Sber's, then Ab Initio may be of interest to you.

It helps the business accumulate knowledge globally and develop its ecosystem. For the developer, it is a chance to sharpen ETL skills, deepen knowledge of the shell, and master the PDL language. It also gives a visual picture of the loading processes and simplifies development thanks to the abundance of functional components.

In this post I will talk about the capabilities of Ab Initio and compare how it works with Hive and with GreenPlum.

  • Description of the MDW framework and the work to customize it for GreenPlum
  • Comparison of Ab Initio performance with Hive and GreenPlum
  • Ab Initio with GreenPlum in Near Real Time mode


The functionality of this product is very broad and takes a lot of time to learn. However, with the right skills and the right performance settings, the data processing results are very impressive. Working with Ab Initio can be an interesting experience for a developer. It is a fresh take on ETL development, a hybrid between a visual environment and the development of loads in a script-like language.

Businesses are developing their ecosystems, and this tool is more useful to them than ever. With Ab Initio you can accumulate knowledge about your current business and use it to expand existing businesses and open new ones. Alternatives to Ab Initio include Informatica BDM among visual development environments and Apache Spark among non-visual ones.

Description of Ab Initio

Ab Initio, like other ETL tools, is a suite of products.

Ab Initio GDE (Graphical Development Environment) is the developer's environment, in which they configure data transformations and connect them with data flows drawn as arrows. Such a set of transformations is called a graph.

The inputs and outputs of the functional components are ports, and they contain the fields computed inside the transformations. Several graphs connected by flows drawn as arrows, in the order in which they are executed, are called a plan.

There are several hundred functional components, which is a lot. Many of them are highly specialized. The capabilities of the classic transformations in Ab Initio are broader than in other ETL tools. For example, Join has several outputs. In addition to the result of joining the datasets, you can get the records of the input datasets whose keys could not be matched. You can also get rejects, errors, and a log of the transformation's operation, which can be read in the same graph as a text file and processed by other transformations.
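To make the idea of a join with extra outputs concrete, here is a minimal Python sketch. It is not Ab Initio code and does not reproduce the Join component's actual interface; the function name `join_with_unused` and the sample records are hypothetical. It only shows a join that returns the unmatched records of both inputs alongside the joined result.

```python
from collections import defaultdict


def join_with_unused(left, right, key):
    """Inner-join two lists of dicts on `key`; also return the unmatched records."""
    right_by_key = defaultdict(list)
    for rec in right:
        right_by_key[rec[key]].append(rec)

    joined, unused_left = [], []
    matched_keys = set()
    for rec in left:
        matches = right_by_key.get(rec[key])
        if not matches:
            unused_left.append(rec)  # left record whose key found no partner
            continue
        matched_keys.add(rec[key])
        for other in matches:
            joined.append({**other, **rec})

    # Right records whose key never appeared on the left side.
    unused_right = [rec for rec in right if rec[key] not in matched_keys]
    return joined, unused_left, unused_right


# Hypothetical sample data, for illustration only.
accounts = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
balances = [{"id": 2, "amount": 50}, {"id": 3, "amount": 70}]

joined, unused_accounts, unused_balances = join_with_unused(accounts, balances, "id")
print(joined)
print(unused_accounts)
print(unused_balances)
```

Having the unmatched records as separate outputs means they can be routed to further processing instead of being silently dropped.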

Or, for example, you can materialize a data receiver as a table and read data from it in the same graph.

There are also original transformations. For example, the Scan transformation has functionality similar to analytic functions. There are transformations with self-explanatory names: Create Data, Read Excel, Normalize, Sort within Groups, Run Program, Run SQL, Join with DB, and so on. Graphs can use run-time parameters, including the ability to pass parameters to or from the operating system. Files containing a ready-made set of parameters passed to a graph are called parameter sets (psets).
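The comparison between Scan and analytic functions can be illustrated with a running total per key, which in SQL would be written as a `SUM(...) OVER (PARTITION BY ... ORDER BY ...)` window function. The Python sketch below only illustrates that concept. It is not Ab Initio code, and the function name `running_total` and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def running_total(records, key, value):
    """Yield each record with a cumulative sum of `value` within its `key` group."""
    # sorted() is stable, so the input order is preserved inside each group.
    ordered = sorted(records, key=itemgetter(key))
    for _, group in groupby(ordered, key=itemgetter(key)):
        total = 0
        for rec in group:
            total += rec[value]
            yield {**rec, "running_total": total}


# Hypothetical sample data, for illustration only.
rows = [
    {"account": "A", "amount": 10},
    {"account": "B", "amount": 5},
    {"account": "A", "amount": 7},
]
result = list(running_total(rows, "account", "amount"))
for rec in result:
    print(rec)
```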

As you would expect, Ab Initio GDE has its own repository, called the EME (Enterprise Meta Environment). Developers can work with local versions of the code and check their work in to the central repository.

During or after graph execution, you can click on any flow connecting transformations and look at the data that passed between them.

You can also click on any flow and see tracking details: how many parallel partitions the transformation ran in, and how many rows and bytes were loaded in each partition.

You can split the execution of a graph into phases and mark that some transformations must run first (in phase zero), the next ones in phase one, the next in phase two, and so on.

For each transformation you can choose a so-called layout (where it will be executed): serial, or in parallel threads whose number can be specified. At the same time, the temporary files that Ab Initio creates while transformations run can be placed either on the server file system or in HDFS.

In each transformation, starting from a default template, you can write your own script in PDL, which is somewhat similar to shell.

With PDL you can extend the functionality of transformations and, in particular, dynamically (at run time) generate arbitrary code fragments depending on run-time parameters.

Ab Initio also has well-developed integration with the OS through the shell. Specifically, Sberbank uses Linux ksh. You can exchange variables with the shell and use them as graph parameters. You can launch Ab Initio graphs from the shell and administer Ab Initio.

In addition to Ab Initio GDE, the delivery includes many other products. There is its own Co>Operating System, which lays claim to being called an operating system. There is Control>Center, where you can schedule and monitor load flows. There are also products for doing development at a more primitive level than Ab Initio GDE allows.

Description of the MDW framework and the work to customize it for GreenPlum

Along with its products, the vendor supplies MDW (Metadata Driven Warehouse), a graph configurator designed to help with typical tasks of populating data warehouses or data vaults.

It contains custom (project-specific) metadata parsers and ready-made code generators out of the box.

As input, MDW receives a data model, a configuration file for setting up the database connection (Oracle, Teradata, or Hive), and some other settings. The project-specific part, for example, deploys the model to the database. The out-of-the-box part of the product generates graphs, and configuration files for them, that load data into the model's tables. Graphs (and psets) are created for several modes of initialization and of incremental work on updating entities.

For Hive and for RDBMS, different graphs are generated for initialization and for incremental data updates.

In the case of Hive, the incoming delta data is joined, using an Ab Initio Join, with the data that was in the table before the update. The data loaders in MDW (both for Hive and for RDBMS) not only insert new data from the delta but also close the validity periods of the data whose primary keys received a delta. In addition, the unchanged part of the data has to be rewritten. This has to be done because Hive has no delete or update operations.
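The full-rewrite approach described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the graph that MDW generates: the function name `rebuild_table`, the `pk` and `value` fields, and the `OPEN_END` marker for a still-valid version are all hypothetical. The column names `valid_from_ts` and `valid_to_ts` follow the query shown later in the article.

```python
OPEN_END = "9999-12-31 00:00:00"  # assumed marker for a version that is still valid


def rebuild_table(current_rows, delta_rows, load_ts):
    """Return a complete new copy of the table with the delta applied.

    Every existing row is written again, changed or not, because the target
    is assumed to support neither delete nor update.
    """
    delta_keys = {row["pk"] for row in delta_rows}
    new_table = []
    for row in current_rows:
        if row["pk"] in delta_keys and row["valid_to_ts"] == OPEN_END:
            # Close the validity period of the version superseded by the delta.
            new_table.append({**row, "valid_to_ts": load_ts})
        else:
            # Unchanged rows still have to be copied into the new table.
            new_table.append(dict(row))
    for row in delta_rows:
        new_table.append({
            "pk": row["pk"],
            "value": row["value"],
            "valid_from_ts": load_ts,
            "valid_to_ts": OPEN_END,
        })
    return new_table


# Hypothetical sample data, for illustration only.
current = [
    {"pk": 1, "value": "a", "valid_from_ts": "2020-06-01 00:00:00", "valid_to_ts": OPEN_END},
    {"pk": 2, "value": "b", "valid_from_ts": "2020-06-01 00:00:00", "valid_to_ts": OPEN_END},
]
delta = [{"pk": 1, "value": "a2"}]

rebuilt = rebuild_table(current, delta, "2020-06-04 12:00:00")
for row in rebuilt:
    print(row)
```

The cost of this approach grows with the size of the whole table rather than with the size of the delta, since every row is written again.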

In the case of RDBMS, the graphs for incremental data updates look more optimal, because an RDBMS has real update capabilities.

The received delta is loaded into an intermediate table in the database. After that, the delta is joined with the data that was in the table before the update. This is done in SQL, using a generated SQL query. Next, using the SQL commands delete+insert, new data from the delta is inserted into the target table and the validity periods of the data whose primary keys received a delta are closed.
There is no need to rewrite the unchanged data.
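The delete+insert approach can be sketched with Python's built-in `sqlite3` module standing in for the database. This is only an illustration: SQLite is not GreenPlum, the SQL is not what MDW generates, and the table names `t`, `stage`, and `delta`, the `value` column, and the `OPEN_END` marker are assumptions made for the example.

```python
import sqlite3

OPEN_END = "9999-12-31 00:00:00"  # assumed marker for a version that is still valid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t     (pk integer, value text, valid_from_ts text, valid_to_ts text);
    create table stage (pk integer, value text, valid_from_ts text, valid_to_ts text);
    create table delta (pk integer, value text);
    insert into t values (1, 'a', '2020-06-01 00:00:00', '9999-12-31 00:00:00');
    insert into t values (2, 'b', '2020-06-01 00:00:00', '9999-12-31 00:00:00');
    insert into delta values (1, 'a2');
""")


def apply_delta(conn, load_ts):
    """Apply `delta` to `t` with delete+insert; rows for untouched keys stay in place."""
    with conn:  # run the whole update as one transaction
        conn.execute("delete from stage")
        # Closed copies of the open versions that the delta supersedes.
        conn.execute(
            "insert into stage "
            "select t.pk, t.value, t.valid_from_ts, ? "
            "from t join delta on delta.pk = t.pk "
            "where t.valid_to_ts = ?",
            (load_ts, OPEN_END),
        )
        # New open versions taken from the delta.
        conn.execute(
            "insert into stage select pk, value, ?, ? from delta",
            (load_ts, OPEN_END),
        )
        # Remove only the open versions whose keys appear in the delta.
        conn.execute(
            "delete from t where valid_to_ts = ? "
            "and exists (select * from delta where delta.pk = t.pk)",
            (OPEN_END,),
        )
        conn.execute("insert into t select * from stage")


apply_delta(conn, "2020-06-04 12:00:00")
rows = conn.execute(
    "select pk, value, valid_from_ts, valid_to_ts from t order by pk, valid_from_ts"
).fetchall()
for row in rows:
    print(row)
```

Only the rows for key 1 are deleted and re-inserted. The row for key 2 is never touched, which is the property that distinguishes this approach from a full rewrite.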

So we came to the conclusion that in the case of Hive, MDW has to rewrite the entire table because Hive has no update function, and there is nothing better than a complete rewrite of the data when an update arrives. In the case of RDBMS, on the contrary, the creators of the product found it necessary to entrust the joining and updating of tables to SQL.

For a project at Sberbank we created a new, reusable implementation of the database loader for GreenPlum. It was based on the version that MDW generates for Teradata. It was Teradata, not Oracle, that came closest and fit best, because it is also an MPP system. The ways of working with Teradata and GreenPlum, as well as their syntax, turned out to be similar.

Examples of differences between the various RDBMSs that are critical for MDW are as follows. In GreenPlum, unlike Teradata, when creating tables you need to write the clause

distributed by

In Teradata you write:

delete <table> all

whereas in GreenPlum you write

delete from <table>

In Oracle, for optimization purposes, you write

delete from t where rowid in (<join of t with the delta>)

whereas in Teradata and GreenPlum you write

delete from t where exists (select * from delta where delta.pk=t.pk)

We also note that for Ab Initio to work with GreenPlum, the GreenPlum client had to be installed on all nodes of the Ab Initio cluster. This is because we connect to GreenPlum simultaneously from all nodes of our cluster. And for reads from GreenPlum to be parallel, with each parallel Ab Initio thread reading its own portion of data from GreenPlum, we had to place a construct understood by Ab Initio in the "where" section of the SQL queries

where ABLOCAL()

and define the value of this construct by specifying the following parameter of the transformation that reads from the database

ablocal_expr=«string_concat("mod(t.", string_filter_out("{$TABLE_KEY}","{}"), ",", (decimal(3))(number_of_partitions()),")=", (decimal(3))(this_partition()))»

which compiles into something like

mod(sk,10)=3

That is, you have to give GreenPlum an explicit filter for each partition. For other databases (Teradata, Oracle), Ab Initio can perform this parallelization automatically.
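The effect of such per-partition filters can be sketched in Python. The function below builds the same kind of string as the PDL expression above, and the check afterwards shows that the filters for all partitions together cover every key exactly once. The function name `ablocal_predicate` and the sample key range are hypothetical, and the brace stripping is only an approximation of what `string_filter_out` does in the original expression.

```python
def ablocal_predicate(table_key, number_of_partitions, this_partition):
    """Build a filter such as mod(t.sk,10)=3 for one parallel reader."""
    # Drop the braces around the substituted key name, e.g. "{sk}" -> "sk".
    column = table_key.replace("{", "").replace("}", "")
    return f"mod(t.{column},{number_of_partitions})={this_partition}"


partitions = 10
predicates = [ablocal_predicate("{sk}", partitions, p) for p in range(partitions)]
print(predicates[3])

# Each reader p sees only the keys with key % partitions == p, so the
# readers split the table into disjoint slices that together cover it.
keys = list(range(1, 101))  # hypothetical surrogate keys
slices = [[k for k in keys if k % partitions == p] for p in range(partitions)]
covered = sorted(k for part in slices for k in part)
print(covered == keys)
```

How evenly the rows are spread across readers depends on how the key values are distributed modulo the number of partitions.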

Comparison of Ab Initio performance with Hive and GreenPlum

Sberbank ran an experiment to compare the performance of the graphs generated by MDW for Hive and for GreenPlum. In the experiment, Hive had 5 nodes on the same cluster as Ab Initio, while GreenPlum had 4 nodes on a separate cluster. That is, Hive had some hardware advantage over GreenPlum.

We considered two pairs of graphs performing the same data update task in Hive and in GreenPlum. The graphs generated by the MDW configurator were launched:

  • initial load + incremental load of randomly generated data into a Hive table
  • initial load + incremental load of randomly generated data into the same table in GreenPlum

In both cases (Hive and GreenPlum) the loads ran in 10 parallel threads on the same Ab Initio cluster. Ab Initio stored intermediate data for its computations in HDFS (in Ab Initio terms, an MFS layout using HDFS was used). One row of randomly generated data took up 200 bytes in both cases.

The results were as follows:

Hive:

Initial load into Hive

Rows inserted                                                 6 000 000   60 000 000   600 000 000
Initial load duration, seconds                                       41          203         1 601

Incremental load into Hive

Rows in the target table at the start of the experiment       6 000 000   60 000 000   600 000 000
Delta rows applied to the target table during the experiment  6 000 000    6 000 000     6 000 000
Incremental load duration, seconds                                   88          299         2 541

GreenPlum:

Initial load into GreenPlum

Rows inserted                                                 6 000 000   60 000 000   600 000 000
Initial load duration, seconds                                       72          360         3 631

Incremental load into GreenPlum

Rows in the target table at the start of the experiment       6 000 000   60 000 000   600 000 000
Delta rows applied to the target table during the experiment  6 000 000    6 000 000     6 000 000
Incremental load duration, seconds                                  159          199           321

We see that the speed of the initial load in both Hive and GreenPlum depends linearly on the amount of data and, because of the better hardware, is somewhat faster for Hive than for GreenPlum.

Incremental load in Hive also depends linearly on the volume of previously loaded data in the target table and slows down considerably as the volume grows. This is caused by the need to rewrite the target table completely. It means that applying small changes to huge tables is not a good use case for Hive.

Incremental load in GreenPlum depends only weakly on the volume of previously loaded data in the target table and runs quite fast. This is thanks to SQL joins and the GreenPlum architecture, which allows the delete operation.

So, GreenPlum applies the delta using the delete+insert method, whereas Hive has no delete or update operations, so the entire data set had to be rewritten completely during an incremental update. The comparison of the incremental loads into the largest target table (600,000,000 rows with a 6,000,000-row delta) is the most telling, since it corresponds to the most frequent scenario of running resource-intensive loads: 2,541 seconds for Hive against 321 seconds for GreenPlum. GreenPlum beat Hive in this test by a factor of about 8.
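The ratio can be checked directly from the incremental load durations reported in the results above. The short Python calculation below uses only those figures. The variable names are illustrative.

```python
# Incremental load duration in seconds, keyed by rows already in the target table.
hive = {6_000_000: 88, 60_000_000: 299, 600_000_000: 2541}
greenplum = {6_000_000: 159, 60_000_000: 199, 600_000_000: 321}

# A ratio above 1 means GreenPlum was faster; below 1 means Hive was faster.
ratios = {rows: hive[rows] / greenplum[rows] for rows in hive}
for rows, ratio in ratios.items():
    print(f"{rows:>11,} rows in target: Hive/GreenPlum = {ratio:.2f}")
```

For the smallest table Hive is actually faster. The advantage of GreenPlum appears only as the target table grows while the delta stays the same size.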

Ab Initio with GreenPlum in Near Real Time mode

In this test we will check Ab Initio's ability to update a GreenPlum table with randomly generated portions of data in near real time. Consider the GreenPlum table dev42_1_db_usl.TESTING_SUBJ_org_finval, which we will work with.

We will use three Ab Initio graphs to work with it:

1) The graph Create_test_data.mp creates data files in HDFS with 6,000,000 rows in 10 parallel threads. The data is random, and its structure is arranged for insertion into our table.

2) The graph mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset is a graph generated by MDW for the initial insertion of data into our table in 10 parallel threads (using the test data generated by graph (1)).

3) The graph mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset is a graph generated by MDW for the incremental update of our table in 10 parallel threads, using a portion of newly received data (the delta) generated by graph (1).

Let's run the following scenario in NRT mode:

  • generate 6,000,000 test rows
  • perform the initial load: insert 6,000,000 test rows into the empty table
  • repeat the incremental load 5 times
    • generate 6,000,000 test rows
    • perform an incremental insert of 6,000,000 test rows into the table (the valid_to_ts expiry time is set on the old data, and the more recent data with the same primary key is inserted)
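The order of the runs in this scenario can be sketched as a small Python driver. The pset names are the ones given in the article. The `run_scenario` function and the `fake_run` stand-in are hypothetical: a real driver would launch each graph from the shell and wait for it to finish, and how that is done is not shown here.

```python
TABLE = "dev42_1_db_usl_testing_subj_org_finval"
GENERATE = "Create_test_data.input.pset"
INITIAL = f"mdw_load.day_one.current.{TABLE}.pset"
INCREMENTAL = f"mdw_load.regular.current.{TABLE}.pset"


def run_scenario(run_pset, increments=5):
    """Call `run_pset` for each step of the scenario, in order."""
    run_pset(GENERATE)   # generate test rows
    run_pset(INITIAL)    # initial load into the empty table
    for _ in range(increments):
        run_pset(GENERATE)     # generate a fresh delta
        run_pset(INCREMENTAL)  # apply the delta to the table


executed = []


def fake_run(pset):
    # Hypothetical stand-in: records the call instead of launching a graph.
    executed.append(pset)


run_scenario(fake_run)
for pset in executed:
    print(pset)
```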

This scenario emulates the way a real business system operates: a fairly large portion of new data appears in real time and is immediately poured into GreenPlum.

Now let's look at the log of the script:

Start Create_test_data.input.pset at 2020-06-04 11:49:11
Finish Create_test_data.input.pset at 2020-06-04 11:49:37
Start mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:49:37
Finish mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:50:42
Start Create_test_data.input.pset at 2020-06-04 11:50:42
Finish Create_test_data.input.pset at 2020-06-04 11:51:06
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:51:06
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:53:41
Start Create_test_data.input.pset at 2020-06-04 11:53:41
Finish Create_test_data.input.pset at 2020-06-04 11:54:04
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:54:04
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:56:51
Start Create_test_data.input.pset at 2020-06-04 11:56:51
Finish Create_test_data.input.pset at 2020-06-04 11:57:14
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:57:14
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:59:55
Start Create_test_data.input.pset at 2020-06-04 11:59:55
Finish Create_test_data.input.pset at 2020-06-04 12:00:23
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:00:23
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:23
Start Create_test_data.input.pset at 2020-06-04 12:03:23
Finish Create_test_data.input.pset at 2020-06-04 12:03:49
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:49
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:06:46

The picture is as follows:

Graph                                                                 Start time            End time              Duration
Create_test_data.input.pset                                           04.06.2020 11:49:11   04.06.2020 11:49:37   00:00:26
mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:49:37   04.06.2020 11:50:42   00:01:05
Create_test_data.input.pset                                           04.06.2020 11:50:42   04.06.2020 11:51:06   00:00:24
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:51:06   04.06.2020 11:53:41   00:02:35
Create_test_data.input.pset                                           04.06.2020 11:53:41   04.06.2020 11:54:04   00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:54:04   04.06.2020 11:56:51   00:02:47
Create_test_data.input.pset                                           04.06.2020 11:56:51   04.06.2020 11:57:14   00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:57:14   04.06.2020 11:59:55   00:02:41
Create_test_data.input.pset                                           04.06.2020 11:59:55   04.06.2020 12:00:23   00:00:28
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:00:23   04.06.2020 12:03:23   00:03:00
Create_test_data.input.pset                                           04.06.2020 12:03:23   04.06.2020 12:03:49   00:00:26
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:03:49   04.06.2020 12:06:46   00:02:57
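The durations in this table follow directly from the start and finish timestamps in the log. A minimal Python sketch of that calculation is shown below. It assumes log lines of the form "Start <graph> at <timestamp>" and "Finish <graph> at <timestamp>", embeds only the first four log entries as sample input, and uses the hypothetical function name `durations`.

```python
import re
from datetime import datetime

# The first four entries of the log, assumed to use the Start/Finish line format.
LOG = """\
Start Create_test_data.input.pset at 2020-06-04 11:49:11
Finish Create_test_data.input.pset at 2020-06-04 11:49:37
Start mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:49:37
Finish mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:50:42
"""

LINE = re.compile(r"^(Start|Finish) (\S+) at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$")


def durations(log_text):
    """Pair each Finish line with its Start line and return (graph, timedelta) tuples."""
    started = {}
    result = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed format
        event, graph, stamp = match.groups()
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if event == "Start":
            started[graph] = moment
        else:
            result.append((graph, moment - started.pop(graph)))
    return result


timings = durations(LOG)
for graph, elapsed in timings:
    print(graph, elapsed)
```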

We see that 6,000,000 incremental rows are processed in about 3 minutes, which is quite fast.
The data in the target table turned out to be distributed as follows:

select valid_from_ts, valid_to_ts, count(1), min(sk), max(sk) from dev42_1_db_usl.TESTING_SUBJ_org_finval group by valid_from_ts, valid_to_ts order by 1,2;

You can see how the inserted data corresponds to the times at which the graphs were launched.
This means that you can run incremental loading of data into GreenPlum from Ab Initio at a very high frequency and observe a high speed of inserting this data into GreenPlum. Of course, it will not be possible to launch once a second, because Ab Initio, like any ETL tool, needs time to "warm up" when it starts.

Conclusion

Ab Initio is currently used at Sberbank to build the Unified Semantic Data Layer (ESS). This project involves building a single version of the state of various banking business entities. Information comes from various sources, replicas of which are prepared on Hadoop. Based on business needs, a data model is prepared and data transformations are described. Ab Initio loads the information into the ESS, and the loaded data is not only of interest to the business in its own right but also serves as a source for building data marts. At the same time, the product's functionality lets you use various systems as receivers (Hive, Greenplum, Teradata, Oracle), which makes it easy to prepare data for the business in the various formats it requires.

The capabilities of Ab Initio are broad. For example, the bundled MDW framework makes it possible to build technical and business history of the data out of the box. For developers, Ab Initio makes it possible not to reinvent the wheel but to use the many existing functional components, which are essentially libraries needed when working with data.

The author is an expert in Sberbank's professional community SberProfi DWH/BigData. The SberProfi DWH/BigData professional community is responsible for developing competencies in areas such as the Hadoop ecosystem, Teradata, Oracle DB, GreenPlum, and the BI tools Qlik, SAP BO, Tableau, and others.

Source: www.habr.com
