When you have Sber scale. Using Ab Initio with Hive and GreenPlum

Some time ago we faced the question of choosing an ETL tool for working with Big Data. The Informatica BDM solution we had been using did not suit us because of its limited functionality. In practice its use had been reduced to a framework for launching spark-submit commands. There were not many alternatives on the market that could, in principle, handle the volume of data we deal with every day. In the end we chose Ab Initio. During the pilot demonstrations the product showed very high data processing speed. There is almost no information about Ab Initio in Russian, so we decided to share our experience on Habré.

Ab Initio has many classic and unusual transformations, whose code can be extended using its own language, PDL. For a small business such a powerful tool is likely to be overkill, and most of its capabilities may turn out to be expensive and unused. But if your scale is close to Sber's, Ab Initio may be of interest to you.

It lets a business accumulate knowledge globally and develop its ecosystem. For a developer, it is a way to improve ETL skills and shell knowledge and a chance to master the PDL language; it also gives a visual picture of the loading processes and simplifies development thanks to the abundance of functional components.

In this post I will talk about the capabilities of Ab Initio and compare how it performs when working with Hive and with GreenPlum.

  • Description of the MDW framework and the work on customizing it for GreenPlum
  • Comparative performance of Ab Initio when working with Hive and GreenPlum
  • Running Ab Initio with GreenPlum in Near Real Time mode


The functionality of this product is very broad and takes a lot of time to learn. However, with the right working skills and correct performance settings, the data processing results are very impressive. Using Ab Initio as a developer can be an interesting experience. It is a new take on ETL development, a hybrid between a visual environment and developing loads in a script-like language.

Businesses are developing their ecosystems, and this tool is more useful than ever. With Ab Initio you can accumulate knowledge about your current business and use it to expand existing businesses and open new ones. Alternatives to Ab Initio include the visual development environment Informatica BDM and the non-visual development environment Apache Spark.

Description of Ab Initio

Ab Initio, like other ETL tools, is a collection of products.


Ab Initio GDE (Graphical Development Environment) is the developer's environment, in which data transformations are configured and connected to one another by data flows drawn as arrows. Such a set of transformations is called a graph.


The input and output connections of functional components are ports, and they contain the fields computed inside the transformations. Several graphs connected by flows, drawn as arrows in the order of their execution, are called a plan.

There are several hundred functional components, which is a lot. Many of them are highly specialized. The capabilities of the classic transformations in Ab Initio are broader than in other ETL tools. For example, Join has several outputs. In addition to the result of joining the datasets, you can get the records of the input datasets whose keys could not be joined. You can also get rejects, errors, and a log of the transformation's operation, which can be read in the same graph as a text file and processed by other transformations.
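To make the idea of a join with several outputs concrete, here is a minimal Python sketch. It is not Ab Initio code: the function name `join_with_unused`, the sample records and the field names are all assumptions made for illustration. It returns the joined records together with the records from each input whose keys found no match.

```python
from collections import defaultdict

def join_with_unused(left, right, key):
    """Inner-join two lists of dicts on `key`; also return the unmatched records."""
    right_by_key = defaultdict(list)
    for rec in right:
        right_by_key[rec[key]].append(rec)

    joined, unused_left = [], []
    matched_keys = set()
    for rec in left:
        matches = right_by_key.get(rec[key])
        if not matches:
            # No partner on the right side: goes to the "unused left" output.
            unused_left.append(rec)
            continue
        matched_keys.add(rec[key])
        for other in matches:
            joined.append({**other, **rec})

    # Right-side records whose key never matched: the "unused right" output.
    unused_right = [rec for rec in right if rec[key] not in matched_keys]
    return joined, unused_left, unused_right

# Hypothetical sample data.
accounts = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
balances = [{"id": 2, "amount": 50}, {"id": 3, "amount": 70}]
joined, unused_accounts, unused_balances = join_with_unused(accounts, balances, "id")
```

In a real graph each of these three results would be a separate output port that other transformations can consume.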


Or, for example, you can materialize a data receiver as a table and read the data back from it in the same graph.

There are original transformations as well. For example, the Scan transformation has functionality similar to analytic functions. There are transformations with self-explanatory names: Create Data, Read Excel, Normalize, Sort within Groups, Run Program, Run SQL, Join with DB, and so on. Graphs can use run-time parameters, including the ability to pass parameters to or from the operating system. Files with a ready-made set of parameters passed to a graph are called parameter sets (psets).

As you would expect, Ab Initio GDE has its own repository, called EME (Enterprise Meta Environment). Developers can work with local versions of the code and check in their work to the central repository.

During or after the execution of a graph, you can click on any flow connecting transformations and look at the data that passed between them.


You can also click on any flow and see the tracking details: how many parallel partitions the transformation ran in, and how many rows and bytes were loaded into each of them.


The execution of a graph can be split into phases, marking that some transformations must run first (in phase zero), the next ones in phase one, the next ones in phase two, and so on.

For each transformation you can choose its so-called layout (where it will be executed): without parallelism or in parallel threads, the number of which can be specified. The temporary files that Ab Initio creates while transformations run can be placed either in the server file system or in HDFS.

In each transformation, starting from the default template, you can write your own script in PDL, which is somewhat similar to shell.

With PDL you can extend the functionality of transformations and, in particular, generate arbitrary code fragments dynamically (at run time) depending on run-time parameters.

Ab Initio also has well-developed integration with the OS via the shell. Sberbank specifically uses Linux ksh. You can exchange variables with the shell and use them as graph parameters. You can launch Ab Initio graphs from the shell and administer Ab Initio from it.

Besides Ab Initio GDE, the delivery includes many other products. There is its own Co>Operating System, which lays claim to the title of an operating system. There is Control>Center, where you can schedule and monitor loading flows. There are also products for doing development at a more primitive level than Ab Initio GDE allows.

Description of the MDW framework and the work on customizing it for GreenPlum

Along with its products, the vendor supplies MDW (Metadata Driven Warehouse), a graph configurator designed to help with typical tasks of populating data warehouses or data vaults.

It contains custom (project-specific) metadata parsers and ready-made code generators out of the box.

As input, MDW receives a data model, a configuration file for setting up the connection to a database (Oracle, Teradata or Hive) and some other settings. The project-specific part, for example, deploys the model to the database. The out-of-the-box part of the product generates graphs, and configuration files for them, that load data into the tables of the model. Graphs (and psets) are created for several modes of initializing and incremental updates of entities.

For Hive and for an RDBMS, different graphs are generated for the initializing and the incremental data updates.

In the case of Hive, the incoming delta is joined, by means of an Ab Initio Join, with the data that was in the table before the update. The data loaders in MDW (both for Hive and for an RDBMS) not only insert new data from the delta but also close the validity periods of the data whose primary keys received a delta. In addition, the unchanged part of the data has to be rewritten. This has to be done because Hive has no delete or update operations.
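The full-rewrite approach described for Hive can be sketched in plain Python. This is only an illustration under assumptions, not the graph that MDW generates: the function name, the `value` column and the "9999-12-31" open-end marker are hypothetical, while `pk`, `valid_from_ts` and `valid_to_ts` follow the names used elsewhere in this article.

```python
def full_rewrite_merge(current_rows, delta_rows, load_ts, open_end="9999-12-31"):
    """Return the complete new table contents; nothing is updated in place."""
    delta_keys = {row["pk"] for row in delta_rows}
    new_table = []
    for row in current_rows:
        if row["pk"] in delta_keys and row["valid_to_ts"] == open_end:
            # Close the validity period of the version superseded by the delta.
            new_table.append({**row, "valid_to_ts": load_ts})
        else:
            # Unchanged rows still have to be written out again.
            new_table.append(dict(row))
    for row in delta_rows:
        # New versions from the delta become the currently valid rows.
        new_table.append({"pk": row["pk"], "value": row["value"],
                          "valid_from_ts": load_ts, "valid_to_ts": open_end})
    return new_table

current = [
    {"pk": 1, "value": "a", "valid_from_ts": "2020-06-01", "valid_to_ts": "9999-12-31"},
    {"pk": 2, "value": "b", "valid_from_ts": "2020-06-01", "valid_to_ts": "9999-12-31"},
]
delta = [{"pk": 2, "value": "b2"}]
rewritten = full_rewrite_merge(current, delta, "2020-06-04")
```

Note that every row of `current` is copied into the result even though only one key changed, which is why the cost of this approach grows with the size of the target table rather than with the size of the delta.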


In the case of an RDBMS, the graphs for incremental data updates look more optimal, because an RDBMS has real update capabilities.


The received delta is loaded into an intermediate table in the database. The delta is then joined with the data that was in the table before the update, and this is done in SQL by means of a generated SQL query. Next, using the SQL commands delete + insert, the new data from the delta is inserted into the target table and the validity periods of the data whose primary keys received a delta are closed.
There is no need to rewrite unchanged data.
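The delete + insert pattern can be tried out with Python's built-in sqlite3 module. This is a hedged sketch, not the SQL that MDW generates: the table layout, the `value` column, the `OPEN_END` marker and the exact order of the statements are assumptions; only the general idea (stage the delta, join it with the target in SQL, then delete and insert) comes from the description above.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed marker for "currently valid"

def apply_delta(conn, load_ts):
    """Apply the staged delta to table t using only insert and delete."""
    with conn:  # run the three statements in one transaction
        # Insert closed copies of the open versions that the delta supersedes.
        conn.execute(
            "insert into t (pk, value, valid_from_ts, valid_to_ts) "
            "select t.pk, t.value, t.valid_from_ts, ? "
            "from t join delta on delta.pk = t.pk "
            "where t.valid_to_ts = ?",
            (load_ts, OPEN_END),
        )
        # Delete the still-open originals for keys present in the delta.
        conn.execute(
            "delete from t where valid_to_ts = ? "
            "and exists (select * from delta where delta.pk = t.pk)",
            (OPEN_END,),
        )
        # Insert the new versions from the delta.
        conn.execute(
            "insert into t (pk, value, valid_from_ts, valid_to_ts) "
            "select pk, value, ?, ? from delta",
            (load_ts, OPEN_END),
        )

conn = sqlite3.connect(":memory:")
conn.execute("create table t (pk integer, value text, valid_from_ts text, valid_to_ts text)")
conn.execute("create table delta (pk integer, value text)")
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [(1, "a", "2020-06-01", OPEN_END), (2, "b", "2020-06-01", OPEN_END)],
)
conn.execute("insert into delta values (2, 'b2')")
apply_delta(conn, "2020-06-04")
rows = conn.execute(
    "select pk, value, valid_from_ts, valid_to_ts from t order by pk, valid_from_ts"
).fetchall()
```

Only the rows for key 2 are touched; the row for key 1 is never read back or rewritten, which is the advantage over the full-rewrite approach.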

So we came to the conclusion that in the case of Hive, MDW has to rewrite the entire table, because Hive has no update operation, and nothing better than rewriting the data completely on update has been invented. In the case of an RDBMS, on the contrary, the creators of the product found it appropriate to delegate the joining and updating of tables to SQL.

For a project at Sberbank we created a new, reusable implementation of a database loader for GreenPlum. It was based on the version that MDW generates for Teradata. It was Teradata, and not Oracle, that turned out to be the closest and best fit, because it is also an MPP system. The working methods, as well as the syntax, of Teradata and GreenPlum proved to be similar.

Examples of differences between RDBMSs that matter for MDW are as follows. In GreenPlum, unlike Teradata, when creating tables you need to write the clause

distributed by

In Teradata one writes

delete <table> all

whereas in GreenPlum one writes

delete from <table>

In Oracle, for optimization purposes, one writes

delete from t where rowid in (<join of t with the delta>)

whereas in Teradata and GreenPlum one writes

delete from t where exists (select * from delta where delta.pk=t.pk)

We also note that for Ab Initio to work with GreenPlum, the GreenPlum client had to be installed on all nodes of the Ab Initio cluster. This is because we connected to GreenPlum simultaneously from all nodes of our cluster. And in order for reading from GreenPlum to be parallel, with each parallel Ab Initio thread reading its own portion of data from GreenPlum, we had to place a construct understood by Ab Initio in the "where" section of the SQL queries

where ABLOCAL()

and define the value of this construct by setting a parameter of the transformation that reads from the database

ablocal_expr=«string_concat("mod(t.", string_filter_out("{$TABLE_KEY}","{}"), ",", (decimal(3))(number_of_partitions()),")=", (decimal(3))(this_partition()))»

which compiles into something like

mod(sk,10)=3

That is, you have to give GreenPlum an explicit filter for each partition. For other databases (Teradata, Oracle), Ab Initio can perform this parallelization automatically.
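The effect of the per-partition filter can be illustrated with a short Python sketch. It only mimics the resulting SQL predicates; it is not PDL, the helper names are hypothetical, and it assumes that partition numbers run from 0 to N-1 and that the key is a non-negative integer.

```python
from collections import Counter

def partition_filters(key_column, partitions):
    """Build one `mod(key,N)=i` predicate per partition, like the compiled example."""
    return [f"mod({key_column},{partitions})={i}" for i in range(partitions)]

def partition_of(key, partitions):
    """Partition that a non-negative integer key falls into under those predicates."""
    return key % partitions

filters = partition_filters("sk", 10)

# Every key satisfies exactly one predicate, so the ten readers together
# cover the table once, without overlaps or gaps.
rows_per_partition = Counter(partition_of(sk, 10) for sk in range(1000))
```

With keys that are spread evenly, as in this synthetic range, each of the ten parallel readers receives the same number of rows; a skewed key would make some threads read more than others.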

Comparative performance of Ab Initio when working with Hive and GreenPlum

Sberbank ran an experiment to compare the performance of MDW-generated graphs for Hive and for GreenPlum. In the experiment, Hive had 5 nodes on the same cluster as Ab Initio, and GreenPlum had 4 nodes on a separate cluster. That is, Hive had some hardware advantage over GreenPlum.

We considered two pairs of graphs performing the same task of updating data in Hive and in GreenPlum. The graphs generated by the MDW configurator were launched:

  • initializing load + incremental load of randomly generated data into a Hive table
  • initializing load + incremental load of randomly generated data into the same table in GreenPlum

In both cases (Hive and GreenPlum) the loads ran in 10 parallel threads on the same Ab Initio cluster. Ab Initio stored the intermediate data for its calculations in HDFS (in Ab Initio terms, an MFS layout using HDFS was used). One row of randomly generated data occupied 200 bytes in both cases.

The results were as follows:

Hive:

Initializing load into Hive

Rows inserted                          6 000 000    60 000 000    600 000 000
Initializing load time, seconds               41           203          1 601

Incremental load into Hive

Rows in the target table at the start of the experiment         6 000 000    60 000 000    600 000 000
Delta rows applied to the target table during the experiment    6 000 000     6 000 000      6 000 000
Incremental load time, seconds                                         88           299          2 541

GreenPlum:

Initializing load into GreenPlum

Rows inserted                          6 000 000    60 000 000    600 000 000
Initializing load time, seconds               72           360          3 631

Incremental load into GreenPlum

Rows in the target table at the start of the experiment         6 000 000    60 000 000    600 000 000
Delta rows applied to the target table during the experiment    6 000 000     6 000 000      6 000 000
Incremental load time, seconds                                        159           199            321

We see that the speed of the initializing load, both into Hive and into GreenPlum, depends linearly on the amount of data and, because of the better hardware, is somewhat higher for Hive than for GreenPlum.

The incremental load into Hive also depends linearly on the amount of previously loaded data in the target table and becomes rather slow as the volume grows. This is caused by the need to rewrite the target table completely. It means that applying small changes to huge tables is not a good use case for Hive.

The incremental load into GreenPlum depends only weakly on the amount of previously loaded data in the target table and runs quite quickly. This is thanks to SQL joins and to the GreenPlum architecture, which supports the delete operation.

So GreenPlum applies the delta using the delete + insert method, while Hive has no delete or update operations, so the entire data set had to be rewritten during an incremental update. The most telling comparison is the incremental load into a target table of 600 000 000 rows (2 541 seconds for Hive versus 321 seconds for GreenPlum), since it corresponds to the most common scenario of running resource-intensive loads. In this test GreenPlum beat Hive by a factor of about 8.
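The ratio can be checked directly from the measured incremental load times. The short Python calculation below uses only the numbers reported in the tables above; the variable names are ours.

```python
# Incremental load times in seconds, keyed by rows already in the target table.
hive_incremental_s = {6_000_000: 88, 60_000_000: 299, 600_000_000: 2541}
greenplum_incremental_s = {6_000_000: 159, 60_000_000: 199, 600_000_000: 321}

# How many times longer Hive took than GreenPlum for each table size.
speedup = {
    rows: hive_incremental_s[rows] / greenplum_incremental_s[rows]
    for rows in hive_incremental_s
}

for rows, ratio in speedup.items():
    print(f"{rows:>11,} rows in target: Hive/GreenPlum time ratio = {ratio:.2f}")
```

For the largest table the ratio is about 7.9, which is the "8 times" quoted above. For the smallest table the ratio is below 1, meaning Hive was actually faster there; the advantage of GreenPlum appears as the target table grows while the delta stays the same size.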

Running Ab Initio with GreenPlum in Near Real Time mode

In this experiment we will test the ability of Ab Initio to update a GreenPlum table with randomly generated chunks of data in near real time. Consider the GreenPlum table dev42_1_db_usl.TESTING_SUBJ_org_finval, which we will be working with.

We will use three Ab Initio graphs to work with it:

1) Graph Create_test_data.mp - creates data files in HDFS with 6 000 000 rows in 10 parallel threads. The data is random, and its structure is organized for insertion into our table



2) Graph mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset - an MDW-generated graph for the initializing insertion of data into our table in 10 parallel threads (the test data generated by graph (1) is used)


3) Graph mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset - an MDW-generated graph for the incremental update of our table in 10 parallel threads, using a portion of freshly received data (delta) generated by graph (1)


Let's run the following scenario in NRT mode:

  • generate 6 000 000 test rows
  • perform the initializing load, inserting 6 000 000 test rows into an empty table
  • repeat the incremental load 5 times
    • generate 6 000 000 test rows
    • perform an incremental insert of 6 000 000 test rows into the table (the expiration time valid_to_ts is set on the old data, and the newer data with the same primary key is inserted)

This scenario emulates the real operation of some business system: a fairly large portion of new data appears in real time and is immediately poured into GreenPlum.
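The order of steps in this scenario can be written down as a small Python driver. This is a sketch under assumptions: the article does not show how the psets are actually launched, so the launcher is passed in as a caller-supplied function (`run_pset` is a hypothetical hook), and the log format simply mirrors the start and finish lines of the log below. The pset names are the ones that appear in that log.

```python
from datetime import datetime

GENERATE = "Create_test_data.input.pset"
DAY_ONE = "mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset"
REGULAR = "mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset"

def scenario(increments=5):
    """Ordered list of psets: generate + initial load, then N x (generate + increment)."""
    steps = [GENERATE, DAY_ONE]
    for _ in range(increments):
        steps += [GENERATE, REGULAR]
    return steps

def run_scenario(run_pset, increments=5, log=print, now=datetime.now):
    """Run each step through the caller-supplied launcher and log start and finish."""
    for pset in scenario(increments):
        log(f"Start {pset} at {now():%Y-%m-%d %H:%M:%S}")
        run_pset(pset)  # hypothetical hook: the real launch command is not shown here
        log(f"Finish {pset} at {now():%Y-%m-%d %H:%M:%S}")

# Dry run with a launcher that only records what it was asked to run.
executed, lines = [], []
run_scenario(executed.append, log=lines.append)
```

Keeping the launcher as a parameter makes the sequence itself easy to check without access to an Ab Initio installation.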

Now let's look at the log of the scenario:

Start Create_test_data.input.pset at 2020-06-04 11:49:11
Finish Create_test_data.input.pset at 2020-06-04 11:49:37
Start mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:49:37
Finish mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:50:42
Start Create_test_data.input.pset at 2020-06-04 11:50:42
Finish Create_test_data.input.pset at 2020-06-04 11:51:06
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:51:06
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:53:41
Start Create_test_data.input.pset at 2020-06-04 11:53:41
Finish Create_test_data.input.pset at 2020-06-04 11:54:04
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:54:04
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:56:51
Start Create_test_data.input.pset at 2020-06-04 11:56:51
Finish Create_test_data.input.pset at 2020-06-04 11:57:14
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:57:14
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:59:55
Start Create_test_data.input.pset at 2020-06-04 11:59:55
Finish Create_test_data.input.pset at 2020-06-04 12:00:23
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:00:23
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:23
Start Create_test_data.input.pset at 2020-06-04 12:03:23
Finish Create_test_data.input.pset at 2020-06-04 12:03:49
Start mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:03:49
Finish mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 12:06:46

This gives the following picture:

Graph                                                                 Start time           End time             Duration
Create_test_data.input.pset                                           04.06.2020 11:49:11  04.06.2020 11:49:37  00:00:26
mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:49:37  04.06.2020 11:50:42  00:01:05
Create_test_data.input.pset                                           04.06.2020 11:50:42  04.06.2020 11:51:06  00:00:24
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:51:06  04.06.2020 11:53:41  00:02:35
Create_test_data.input.pset                                           04.06.2020 11:53:41  04.06.2020 11:54:04  00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:54:04  04.06.2020 11:56:51  00:02:47
Create_test_data.input.pset                                           04.06.2020 11:56:51  04.06.2020 11:57:14  00:00:23
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 11:57:14  04.06.2020 11:59:55  00:02:41
Create_test_data.input.pset                                           04.06.2020 11:59:55  04.06.2020 12:00:23  00:00:28
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:00:23  04.06.2020 12:03:23  00:03:00
Create_test_data.input.pset                                           04.06.2020 12:03:23  04.06.2020 12:03:49  00:00:26
mdw_load.regular.current.dev42_1_db_usl_testing_subj_org_finval.pset  04.06.2020 12:03:49  04.06.2020 12:06:46  00:02:57
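The durations in this summary follow mechanically from the timestamps in the log. Below is a minimal Python sketch of that calculation. It assumes log lines of the form "Start <pset> at <date> <time>" and "Finish <pset> at <date> <time>"; the function name is ours, and the sample uses the first two steps of the run.

```python
from datetime import datetime

def durations(log_lines):
    """Pair each 'Start' line with the following 'Finish' line for the same pset."""
    started, result = {}, []
    for line in log_lines:
        action, name, _, date, time = line.split()
        ts = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        if action == "Start":
            started[name] = ts
        elif action == "Finish":
            # Elapsed time since the matching start of this pset.
            result.append((name, ts - started.pop(name)))
    return result

sample = [
    "Start Create_test_data.input.pset at 2020-06-04 11:49:11",
    "Finish Create_test_data.input.pset at 2020-06-04 11:49:37",
    "Start mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:49:37",
    "Finish mdw_load.day_one.current.dev42_1_db_usl_testing_subj_org_finval.pset at 2020-06-04 11:50:42",
]
for name, elapsed in durations(sample):
    print(name, elapsed)
```

For these two steps the elapsed times are 26 seconds and 1 minute 5 seconds, matching the first two rows of the summary.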

We see that 6 000 000 incremental rows are processed in about 3 minutes, which is quite fast.
The data in the target table ended up distributed as follows:

select valid_from_ts, valid_to_ts, count(1), min(sk), max(sk) from dev42_1_db_usl.TESTING_SUBJ_org_finval group by valid_from_ts, valid_to_ts order by 1,2;

You can see that the inserted data corresponds to the times at which the graphs were launched.
This means that you can run incremental data loads into GreenPlum from Ab Initio at a very high frequency and observe a high speed of inserting this data into GreenPlum. Of course, it will not be possible to launch a load every second, since Ab Initio, like any ETL tool, needs time to "warm up" when it starts.

Conclusion

Ab Initio is currently used at Sberbank to build the Unified Semantic Data Layer (ESS). This project involves building a unified version of the state of various banking business entities. The information comes from various sources, replicas of which are prepared on Hadoop. Based on business needs, a data model is prepared and the data transformations are described. Ab Initio loads the information into the ESS, and the loaded data is not only of interest to the business in itself but also serves as a source for building data marts. At the same time, the functionality of the product allows various systems to be used as a receiver (Hive, Greenplum, Teradata, Oracle), which makes it easy to prepare data for the business in the various formats it requires.

The capabilities of Ab Initio are broad; for example, the included MDW framework makes it possible to build technical and business data history out of the box. For developers, Ab Initio makes it possible not to reinvent the wheel but to use the many existing functional components, which are essentially libraries needed when working with data.

The author is an expert in Sberbank's professional community SberProfi DWH/BigData. The SberProfi DWH/BigData professional community is responsible for developing competencies in areas such as the Hadoop ecosystem, Teradata, Oracle DB, GreenPlum, and the BI tools Qlik, SAP BO, Tableau, and others.

Source: www.habr.com
