When you have Sber scale. Using Ab Initio with Hive and GreenPlum

Some time ago we faced the question of choosing an ETL tool for working with Big Data. The Informatica BDM solution we had used before did not suit us because of its limited functionality. Its use had been reduced to a framework for launching spark-submit commands. There were not many alternatives on the market that could, in principle, handle the volume of data we deal with every day. In the end we chose Ab Initio. During pilot demonstrations the product showed very high data processing speed. There is almost no information about Ab Initio in Russian, so we decided to share our experience on Habr.

Ab Initio has many classic and unusual transformations, whose code can be extended using its own PDL language. For a small business such a powerful tool is probably overkill, and most of its features may turn out to be expensive and unused. But if your scale is close to Sber's, Ab Initio may be of interest to you.

It helps the business as a whole to accumulate knowledge and develop its ecosystem. It helps developers to improve their ETL skills, deepen their knowledge of the shell, master the PDL language, get a visual picture of loading processes, and simplify development thanks to the abundance of functional components.

In this post I will talk about the capabilities of Ab Initio and give comparative characteristics of its work with Hive and GreenPlum.

  • Description of the MDW framework and work on customizing it for GreenPlum
  • Performance comparison of Ab Initio with Hive and GreenPlum
  • Ab Initio working with GreenPlum in near real time mode


The functionality of this product is very broad and takes a lot of time to learn. However, with the proper skills and the right performance settings, the data processing results are very impressive. Using Ab Initio can give a developer an interesting experience. It is a new take on ETL development, a hybrid between a visual environment and developing loads in a script-like language.

Businesses are developing their ecosystems, and this tool is more useful than ever. With Ab Initio you can accumulate knowledge about your current business and use it to expand existing lines of business and open new ones. Alternatives to Ab Initio include Informatica BDM among visual development environments and Apache Spark among non-visual ones.

Description of Ab Initio

Ab Initio, like other ETL tools, is a set of products.


Ab Initio GDE (Graphical Development Environment) is the developer's environment, where they configure data transformations and connect them with data flows drawn as arrows. Such a set of transformations is called a graph.


The input and output connections of functional components are ports, and they contain the fields computed inside the transformations. Several graphs connected by flows drawn as arrows in the order of their execution are called a plan.

There are several hundred functional components, which is a lot. Many of them are highly specialized. The capabilities of the classic transformations in Ab Initio are wider than in other ETL tools. For example, Join has multiple outputs. In addition to the result of joining the datasets, you can get on output the records of the input datasets whose keys could not be joined. You can also get rejects, errors, and a log of the transformation's operation, which can be read in the same graph as a text file and processed by other transformations.
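The idea of a join with several outputs can be illustrated with a minimal Python sketch. This is not Ab Initio code: the function name `join_with_unused` and the record layout (lists of dictionaries) are assumptions made purely for illustration. It returns the joined records together with the records from each input whose keys found no match.

```python
from collections import defaultdict


def join_with_unused(left, right, key):
    """Inner-join two lists of dict records on `key`.

    Returns (joined, unused_left, unused_right), loosely mirroring a join
    component that has extra outputs for records that could not be joined.
    """
    # Index the right-hand input by key so that each lookup is cheap.
    right_by_key = defaultdict(list)
    for rec in right:
        right_by_key[rec[key]].append(rec)

    joined, unused_left = [], []
    matched_keys = set()
    for rec in left:
        matches = right_by_key.get(rec[key])
        if not matches:
            unused_left.append(rec)
            continue
        matched_keys.add(rec[key])
        for other in matches:
            # On a field-name clash the left-hand value wins.
            joined.append({**other, **rec})

    unused_right = [rec for rec in right if rec[key] not in matched_keys]
    return joined, unused_left, unused_right
```

In a real graph each of the three results would be a separate output port feeding its own downstream transformation.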


Or, for example, you can materialize a data receiver in the form of a table and read data from it in the same graph.

There are original transformations. For example, the Scan transformation has functionality similar to analytic functions. There are transformations with self-explanatory names: Create Data, Read Excel, Normalize, Sort within Groups, Run Program, Run SQL, Join with DB, and others. Files with a ready-made set of parameters passed to a graph are called parameter sets (psets).

As expected, Ab Initio GDE has its own repository, called EME (Enterprise Meta Environment). Developers can work with local versions of the code and check their work in to the central repository.

During or after the execution of a graph, you can click on any flow connecting transformations and look at the data that passed between them.


You can also click on any flow and see tracking details: in how many parallel partitions the transformation ran, and how many rows and bytes were loaded into each of them.


The execution of a graph can be split into phases: you mark that some transformations must run first (in phase zero), the next ones in phase one, the next ones in phase two, and so on.

For each transformation you can choose the so-called layout (where it will be executed): without parallelism or in parallel threads, the number of which can be specified. The temporary files that Ab Initio creates while transformations are running can be placed either in the server file system or in HDFS.

In each transformation, starting from the default template, you can write your own script in PDL, which is somewhat similar to shell.

With PDL you can extend the functionality of transformations and, in particular, dynamically (at run time) generate arbitrary code fragments depending on run-time parameters.

Ab Initio also has well-developed integration with the OS through the shell. Sberbank specifically uses Linux ksh. You can exchange variables with the shell and use them as graph parameters. You can launch Ab Initio graphs from the shell and administer Ab Initio.

In addition to Ab Initio GDE, the distribution includes many other products. There is its own Co>Operating System, with a claim to be called an operating system. There is Control>Center, where you can schedule and monitor load flows. There are also products for doing development at a more primitive level than Ab Initio GDE allows.

Description of the MDW framework and work on customizing it for GreenPlum

Along with its products, the vendor supplies MDW (Metadata Driven Warehouse), a graph configurator designed to help with the typical tasks of populating data warehouses or data vaults.

It contains custom (project-specific) metadata parsers and ready-made code generators out of the box.

As input, MDW receives a data model, a configuration file for setting up a connection to a database (Oracle, Teradata or Hive), and some other settings. The project-specific part, for example, deploys the model to the database. The out-of-the-box part of the product generates graphs, and configuration files for them, that load data into the model tables. Graphs (and psets) are created for several modes of initializing and incremental work on updating entities.

For Hive and for RDBMS, different graphs are generated for the initializing and the incremental data update.

In the case of Hive, the incoming delta is joined by an Ab Initio Join with the data that was in the table before the update. Data loaders in MDW (both for Hive and for RDBMS) not only insert new data from the delta, but also close the validity periods of the data for whose primary keys a delta has arrived. In addition, the unchanged part of the data has to be rewritten. This cannot be avoided, because Hive has no delete or update operations.
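The full-rewrite approach described above can be sketched in a few lines of Python. This is only an illustration and not the graph that MDW generates: the field names `pk`, `payload`, `valid_from`, `valid_to` and the `OPEN_END` marker for a still-valid record are assumptions.

```python
OPEN_END = "9999-12-31 00:00:00"  # assumed marker meaning "still valid"


def rewrite_with_delta(table, delta, load_ts):
    """Build a complete new copy of the table with the delta applied.

    Every row is written again: unchanged rows are copied as they are,
    current rows whose key arrived in the delta get their validity period
    closed, and the delta rows are appended as the new current versions.
    """
    delta_keys = {row["pk"] for row in delta}
    new_table = []
    for row in table:
        if row["pk"] in delta_keys and row["valid_to"] == OPEN_END:
            # Close the validity period of the superseded version.
            new_table.append({**row, "valid_to": load_ts})
        else:
            # Unchanged data still has to be rewritten.
            new_table.append(dict(row))
    for row in delta:
        new_table.append({
            "pk": row["pk"],
            "payload": row["payload"],
            "valid_from": load_ts,
            "valid_to": OPEN_END,
        })
    return new_table
```

The loop over `table` touches every existing row regardless of the size of the delta, which is why the cost of this approach grows with the size of the target table.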


In the case of RDBMS, the graphs for the incremental data update look more optimal, because an RDBMS has real update capabilities.


The incoming delta is loaded into an intermediate table in the database. The delta is then joined with the data that was in the table before the update. This is done by means of SQL, using a generated SQL query. Next, using the SQL commands delete + insert, the new data from the delta is inserted into the target table and the validity periods of the data for whose primary keys a delta has arrived are closed.
Unchanged data does not need to be rewritten.
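A minimal sketch of the delete + insert pattern, using Python's built-in sqlite3 module as a stand-in for the RDBMS. The table names `target`, `delta` and `closed`, the column names, and the `OPEN_END` marker are assumptions for illustration; this is not the SQL that MDW actually generates.

```python
import sqlite3

OPEN_END = "9999-12-31 00:00:00"  # assumed marker meaning "still valid"


def apply_delta(conn, load_ts):
    """Apply the rows of `delta` to `target` using delete + insert only."""
    cur = conn.cursor()
    cur.execute(
        "create temp table if not exists closed "
        "(pk, payload, valid_from, valid_to)"
    )
    cur.execute("delete from closed")
    # Join the delta with the current rows and prepare their closed copies.
    cur.execute(
        "insert into closed "
        "select t.pk, t.payload, t.valid_from, ? from target t "
        "where t.valid_to = ? "
        "and exists (select 1 from delta d where d.pk = t.pk)",
        (load_ts, OPEN_END),
    )
    # Delete only the current rows whose keys arrived in the delta.
    cur.execute(
        "delete from target where valid_to = ? "
        "and exists (select 1 from delta d where d.pk = target.pk)",
        (OPEN_END,),
    )
    # Insert the closed versions and then the new current versions.
    cur.execute(
        "insert into target "
        "select pk, payload, valid_from, valid_to from closed"
    )
    cur.execute(
        "insert into target select pk, payload, ?, ? from delta",
        (load_ts, OPEN_END),
    )
    conn.commit()
```

Only rows whose keys appear in the delta are touched, so the amount of work is driven mostly by the size of the delta rather than by the size of the target table.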

Thus, we came to the conclusion that in the case of Hive, MDW has to rewrite the entire table, because Hive has no update function, and nothing better than a complete rewrite of the data on update has been invented. In the case of RDBMS, on the contrary, the creators of the product found it necessary to entrust the joining and updating of tables to SQL.

For a project at Sberbank, we created a new, reusable implementation of the database loader for GreenPlum. It was based on the version that MDW generates for Teradata. It was Teradata, not Oracle, that turned out to be the closest and best fit, because it is also an MPP system. The ways of working with Teradata and GreenPlum, as well as their syntax, turned out to be similar.

Examples of differences between RDBMSs that are critical for MDW are as follows. In GreenPlum, unlike Teradata, when creating tables you need to write the clause

distributed by

In Teradata they write

delete <table> all

, and in GreenPlum they write

delete from <table>

In Oracle, for optimization purposes, they write

delete from t where rowid in (<join of t with the delta>)

, and in Teradata and GreenPlum they write

delete from t where exists (select * from delta where delta.pk=t.pk)

We also note that for Ab Initio to work with GreenPlum, the GreenPlum client had to be installed on all nodes of the Ab Initio cluster. This is because we connected to GreenPlum simultaneously from all nodes of our cluster. And to make reading from GreenPlum parallel, with each parallel Ab Initio thread reading its own portion of data from GreenPlum, we had to place a construct understood by Ab Initio in the "where" section of SQL queries

where ABLOCAL()

and compute the value of this construct by specifying the ablocal_expr parameter of the transformation that reads from the database

ablocal_expr=«string_concat("mod(t.", string_filter_out("{$TABLE_KEY}","{}"), ",", (decimal(3))(number_of_partitions()),")=", (decimal(3))(this_partition()))»

, which compiles into something like

mod(sk,10)=3

, i.e. you have to give GreenPlum an explicit filter for each partition. For other databases (Teradata, Oracle), Ab Initio can perform this parallelization automatically.
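To show why such a per-partition filter splits the table cleanly, here is a small Python sketch that builds the same kind of predicate string for every partition. The function names are hypothetical, and the sketch only imitates the string assembly of the expression above. It assumes an integer, non-negative surrogate key such as sk.

```python
def ablocal_predicate(table_key, number_of_partitions, this_partition):
    """Build a filter such as 'mod(t.sk,10)=3' for one reading partition.

    `table_key` may arrive wrapped in braces, e.g. '{sk}'; the braces are
    stripped first, as the original expression does with its key parameter.
    """
    column = table_key.replace("{", "").replace("}", "")
    return f"mod(t.{column},{number_of_partitions})={this_partition}"


def all_predicates(table_key, number_of_partitions):
    """Return one predicate per partition, in partition order."""
    return [
        ablocal_predicate(table_key, number_of_partitions, p)
        for p in range(number_of_partitions)
    ]


def partition_of(key_value, number_of_partitions):
    """Partition that reads a row with this non-negative integer key."""
    return key_value % number_of_partitions


if __name__ == "__main__":
    for predicate in all_predicates("{sk}", 10):
        print(predicate)
```

Because every non-negative key has exactly one remainder modulo the number of partitions, each row is read by exactly one thread and no row is skipped or read twice.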

Performance comparison of Ab Initio with Hive and GreenPlum

An experiment was conducted at Sberbank to compare the performance of MDW-generated graphs against Hive and against GreenPlum. In the Hive case there were 5 nodes on the same cluster as Ab Initio, and in the GreenPlum case there were 4 nodes on a separate cluster. That is, Hive had some hardware advantage over GreenPlum.

We considered two pairs of graphs that perform the same task of updating data in Hive and in GreenPlum. The graphs generated by the MDW configurator were launched:

  • initializing load + incremental load of randomly generated data into a Hive table
  • initializing load + incremental load of randomly generated data into the same table in GreenPlum

In both cases (Hive and GreenPlum), the loads ran in 10 parallel threads on the same Ab Initio cluster. Ab Initio saved intermediate data for its calculations in HDFS (in Ab Initio terms, an MFS layout using HDFS was used). One row of randomly generated data took 200 bytes in both cases.

The results were as follows:

Hive:

Initializing load into Hive

Rows inserted                                             6,000,000   60,000,000  600,000,000
Duration of initializing load, s                                 41          203        1,601

Incremental load into Hive

Rows in target table at start of experiment               6,000,000   60,000,000  600,000,000
Delta rows applied to target table during experiment      6,000,000    6,000,000    6,000,000
Duration of incremental load, s                                  88          299        2,541

GreenPlum:

Initializing load into GreenPlum

Rows inserted                                             6,000,000   60,000,000  600,000,000
Duration of initializing load, s                                 72          360        3,631

Incremental load into GreenPlum

Rows in target table at start of experiment               6,000,000   60,000,000  600,000,000
Delta rows applied to target table during experiment      6,000,000    6,000,000    6,000,000
Duration of incremental load, s                                 159          199          321

We see that the duration of the initializing load in both Hive and GreenPlum depends linearly on the amount of data and, because of the better hardware, it is somewhat faster for Hive than for GreenPlum.

The incremental load into Hive also depends on the amount of data previously loaded into the target table, and it slows down considerably as that amount grows. This is caused by the need to rewrite the target table completely. It means that applying small changes to huge tables is not a good use case for Hive.

The incremental load into GreenPlum depends only weakly on the amount of data previously loaded into the target table, and it runs quite fast. This is thanks to SQL joins and to the GreenPlum architecture, which allows the delete operation.

So, GreenPlum applies the delta using the delete + insert method, while Hive has no delete or update operations, so the entire data set had to be rewritten completely during the incremental update. The comparison for the largest target table (600,000,000 rows: 2,541 seconds for Hive against 321 seconds for GreenPlum) is the most indicative, since it corresponds to the most common way of running resource-intensive loads. We see that GreenPlum beat Hive in this test by about 8 times.
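The ratio can be rechecked directly from the measured durations. The short Python sketch below uses only the figures reported above (incremental load durations and the 200-byte row size); the variable and function names are chosen here for illustration.

```python
ROW_BYTES = 200  # size of one generated row, as stated for both systems

# Incremental load durations in seconds, keyed by rows already in the target.
INCREMENTAL_SECONDS = {
    "Hive": {6_000_000: 88, 60_000_000: 299, 600_000_000: 2541},
    "GreenPlum": {6_000_000: 159, 60_000_000: 199, 600_000_000: 321},
}


def hive_to_greenplum_ratio(target_rows):
    """How many times longer the Hive incremental load took."""
    hive = INCREMENTAL_SECONDS["Hive"][target_rows]
    greenplum = INCREMENTAL_SECONDS["GreenPlum"][target_rows]
    return hive / greenplum


def table_size_gb(rows, row_bytes=ROW_BYTES):
    """Approximate raw size of the target table in decimal gigabytes."""
    return rows * row_bytes / 1e9


if __name__ == "__main__":
    for rows in sorted(INCREMENTAL_SECONDS["Hive"]):
        print(
            f"{rows:>11,} rows (~{table_size_gb(rows):g} GB): "
            f"Hive/GreenPlum = {hive_to_greenplum_ratio(rows):.1f}"
        )
```

For the 600,000,000-row table (about 120 GB of raw rows) the ratio is roughly 7.9. For the smallest table it is below 1, which means Hive was faster there; the advantage of GreenPlum appears as the target table grows.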

Ab Initio working with GreenPlum in near real time mode

In this experiment we test Ab Initio's ability to update a GreenPlum table with randomly generated chunks of data in near real time. We will work with the GreenPlum table dev42_1_db_usl.TESTING_SUBJ_org_finval.

We will use three Ab Initio graphs to work with it:

1) Graph Create_test_data.mp - creates data files in HDFS with 6,000,000 rows in 10 parallel threads. The data is random, and its structure is organized for insertion into our table


2) Graph mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset - an MDW-generated graph for the initializing insert of data into our table in 10 parallel threads (the test data generated by graph (1) is used)


3) Graph mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset - an MDW-generated graph for the incremental update of our table in 10 parallel threads, using a portion of freshly arrived data (the delta) generated by graph (1)


Let's run the following script in NRT mode:

  • generate 6,000,000 test rows
  • perform the initializing load, inserting 6,000,000 test rows into an empty table
  • repeat the incremental load 5 times
    • generate 6,000,000 test rows
    • perform an incremental insert of 6,000,000 test rows into the table (the old data is given a closing validity time valid_to_ts, and the fresher data with the same primary key is inserted)
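The scenario above can be sketched as a small driver. The original runs it as a shell script; the Python version below is only an illustration under stated assumptions. The `run_graph` callable stands for whatever actually launches a pset and is hypothetical, and the log line format is likewise chosen here for illustration.

```python
from datetime import datetime

GENERATE = "Create_test_data.input.pset"
INITIAL = "mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset"
INCREMENTAL = "mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset"


def run_scenario(run_graph, increments=5, log=print):
    """Run the generate / initial load / repeated incremental load sequence.

    `run_graph` is a caller-supplied function that executes one pset and
    returns when it has finished; how it does so is outside this sketch.
    """
    def logged(pset):
        log(f"Start {pset} at {datetime.now():%Y-%m-%d %H:%M:%S}")
        run_graph(pset)
        log(f"Finish {pset} at {datetime.now():%Y-%m-%d %H:%M:%S}")

    # Generate test rows and load them into the empty table.
    logged(GENERATE)
    logged(INITIAL)
    # Each increment generates a fresh delta and applies it.
    for _ in range(increments):
        logged(GENERATE)
        logged(INCREMENTAL)
```

With the default of 5 increments the driver performs 12 graph runs: 6 data generations, 1 initializing load and 5 incremental loads.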

This scenario emulates the real mode of operation of some business system: a fairly large portion of new data appears in real time and is immediately loaded into GreenPlum.

Now let's look at the script log:

Start Create_test_data.input.pset at 2020-06-04 11:49:11
Finish Create_test_data.input.pset at 2020-06-04 11:49:37
Start mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:49:37
Finish mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:50:42
Start Create_test_data.input.pset at 2020-06-04 11:50:42
Finish Create_test_data.input.pset at 2020-06-04 11:51:06
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:51:06
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:53:41
Start Create_test_data.input.pset at 2020-06-04 11:53:41
Finish Create_test_data.input.pset at 2020-06-04 11:54:04
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:54:04
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:56:51
Start Create_test_data.input.pset at 2020-06-04 11:56:51
Finish Create_test_data.input.pset at 2020-06-04 11:57:14
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:57:14
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:59:55
Start Create_test_data.input.pset at 2020-06-04 11:59:55
Finish Create_test_data.input.pset at 2020-06-04 12:00:23
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:00:23
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:23
Start Create_test_data.input.pset at 2020-06-04 12:03:23
Finish Create_test_data.input.pset at 2020-06-04 12:03:49
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:49
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:06:46

This gives the following picture:

Graph                                                                 Start time           End time             Duration
Create_test_data.input.pset                                           04.06.2020 11:49:11  04.06.2020 11:49:37  00:00:26
mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:49:37  04.06.2020 11:50:42  00:01:05
Create_test_data.input.pset                                           04.06.2020 11:50:42  04.06.2020 11:51:06  00:00:24
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:51:06  04.06.2020 11:53:41  00:02:35
Create_test_data.input.pset                                           04.06.2020 11:53:41  04.06.2020 11:54:04  00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:54:04  04.06.2020 11:56:51  00:02:47
Create_test_data.input.pset                                           04.06.2020 11:56:51  04.06.2020 11:57:14  00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:57:14  04.06.2020 11:59:55  00:02:41
Create_test_data.input.pset                                           04.06.2020 11:59:55  04.06.2020 12:00:23  00:00:28
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:00:23  04.06.2020 12:03:23  00:03:00
Create_test_data.input.pset                                           04.06.2020 12:03:23  04.06.2020 12:03:49  00:00:26
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:03:49  04.06.2020 12:06:46  00:02:57
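The durations in this table follow directly from the log timestamps. A minimal Python sketch of that calculation is given below; it assumes log lines of the form "Start <pset> at <timestamp>" and "Finish <pset> at <timestamp>", and the function name `durations` is chosen here for illustration.

```python
import re
from datetime import datetime

# Assumed line format: "<Start|Finish> <pset name> at YYYY-MM-DD HH:MM:SS"
LINE_RE = re.compile(
    r"^(Start|Finish) (\S+) at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$"
)


def durations(log_lines):
    """Pair each Start line with the next Finish line of the same graph.

    Returns a list of (graph, timedelta) tuples in order of completion.
    Lines that do not match the assumed format are ignored.
    """
    started = {}
    result = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        event, graph, stamp = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if event == "Start":
            started[graph] = when
        elif graph in started:
            result.append((graph, when - started.pop(graph)))
    return result
```

Applied to the first four log lines, this gives 26 seconds for the data generation and 1 minute 5 seconds for the initializing load, matching the first two rows of the table.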

We see that 6,000,000 increment rows are processed in about 3 minutes, which is quite fast.
The data in the target table turned out to be distributed as follows:

select valid_from_ts, valid_to_ts, count(1), min(sk), max(sk) from dev42_1_db_usl.TESTING_SUBJ_org_finval group by valid_from_ts, valid_to_ts order by 1,2;

You can see how the inserted data corresponds to the times at which the graphs were launched.
This means that in Ab Initio you can run incremental loads into GreenPlum at a very high frequency and observe a high speed of inserting this data into GreenPlum. Of course, it will not be possible to launch a load once a second, since Ab Initio, like any ETL tool, needs time to "warm up" when it starts.

Conclusion

Ab Initio is currently used at Sberbank to build the Unified Semantic Data Layer (ESS). This project involves building a single version of the state of various banking business entities. Information comes from various sources, replicas of which are prepared on Hadoop. Based on business needs, a data model is prepared and data transformations are described. Ab Initio loads information into the ESS, and the loaded data is not only of interest to the business in itself, but also serves as a source for building data marts. At the same time, the product's functionality makes it possible to use various systems (Hive, Greenplum, Teradata, Oracle) as a receiver, which makes it easy to prepare data for the business in the various formats it requires.

Ab Initio's capabilities are broad. For example, the bundled MDW framework makes it possible to build technical and business historicity of data out of the box. For developers, Ab Initio makes it possible not to reinvent the wheel, but to use the many existing functional components, which are essentially libraries needed when working with data.

The author is an expert of the Sberbank professional community SberProfi DWH/BigData. The SberProfi DWH/BigData community is responsible for developing competencies in areas such as the Hadoop ecosystem, Teradata, Oracle DB, GreenPlum, and the BI tools Qlik, SAP BO, Tableau, and others.

Source: www.habr.com
