How do you move, load and integrate very large data cheaply and quickly? What is pushdown optimization?

Any big data operation needs a lot of computing power. An ordinary transfer of data from a database to Hadoop can take weeks or cost as much as an airplane wing. Don't want to wait and spend money? Balance the load across different platforms. One way to do that is pushdown optimization.

I asked Alexey Ananyev, Russia's leading trainer in the development and administration of Informatica products, to talk about the pushdown optimization feature in Informatica Big Data Management (BDM). Have you ever studied how to work with Informatica products? Most likely it was Alexey who taught you the basics of PowerCenter and explained how to build mappings.

Alexey Ananyev, head of training at DIS Group

What is pushdown?

Many of you are already familiar with Informatica Big Data Management (BDM). The product can integrate big data from different sources, move it between different systems, provide easy access to it, let you profile it, and much more.
In the right hands, BDM can work wonders: jobs finish quickly and with minimal computing resources.

Do you want that too? Learn to use the pushdown feature in BDM to spread the computing load across different platforms. Pushdown technology lets you turn a mapping into a script and choose the environment in which that script will run. This choice lets you combine the strengths of different platforms and get the best performance out of them.

To configure the environment in which the script runs, you need to choose a pushdown type. The script can run entirely on Hadoop or be partially distributed between the source and the sink. There are 4 possible pushdown types. The mapping need not be turned into a script at all (native). The mapping can be executed as much as possible on the source (source) or entirely on the source (full). The mapping can also be turned into a Hadoop script (none).
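To keep the four types straight, here is a small Python sketch that simply restates the paragraph above as a lookup table. The names `PushdownType`, `EXECUTION_SITE` and `describe` are made up for illustration and are not part of any Informatica API; the descriptions are paraphrases of the text, not product documentation.

```python
from enum import Enum


class PushdownType(Enum):
    """The four pushdown types named in the article (labels only, hypothetical names)."""
    NATIVE = "native"
    SOURCE = "source"
    FULL = "full"
    NONE = "none"


# Where the mapping logic ends up for each type, paraphrased from the text above.
EXECUTION_SITE = {
    PushdownType.NATIVE: "Informatica server (mapping is not turned into a script)",
    PushdownType.SOURCE: "as much as possible on the source",
    PushdownType.FULL: "entirely on the source",
    PushdownType.NONE: "Hadoop (mapping becomes a Hadoop script)",
}


def describe(pushdown: PushdownType) -> str:
    """Return a one-line summary of where the work runs for a pushdown type."""
    return f"{pushdown.value}: {EXECUTION_SITE[pushdown]}"


if __name__ == "__main__":
    for pushdown in PushdownType:
        print(describe(pushdown))
```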

Pushdown optimization

The 4 types listed above can be combined in different ways, so pushdown can be tuned to the specific needs of a system. For example, it is often more appropriate to extract data from a database using the database's own capabilities, and then transform the data with Hadoop so that the database itself is not overloaded.

Let's consider a case where both the source and the destination are in a database, and the platform that executes the transformations can be chosen: depending on the settings, it will be Informatica, the database server, or Hadoop. An example like this shows the technical side of the mechanism most precisely. Naturally, this situation does not come up in real life, but it is the best fit for demonstrating the functionality.

Let's take a mapping that reads two tables in a single Oracle database, and let the results of the read be written to a table in the same database. The mapping diagram looks like this:

[Figure: mapping diagram]

As a mapping in Informatica BDM 10.2.1, it looks like this:

[Figure: the mapping in Informatica BDM 10.2.1]

Pushdown type: native

If we choose the native pushdown type, the mapping is executed on the Informatica server. The data is read from the Oracle server, transferred to the Informatica server, transformed there and passed on to Hadoop. In other words, we get an ordinary ETL process.

Pushdown type: source

When we choose the source type, we get the chance to split our process between the database (DB) server and Hadoop. When the process runs with this setting, the queries that fetch data from the tables are sent to the database, and the rest is carried out as steps on Hadoop.
The execution diagram looks like this:

[Figure: execution diagram for the source pushdown type]

Below is an example of configuring the runtime environment.

[Figure: runtime environment settings]

In this case, the mapping is executed in two steps. In its settings we can see that it has turned into a script that will be sent to the source. Moreover, joining the tables and transforming the data are done as an overridden query on the source.
In the picture below, we see the optimized mapping in BDM and the overridden query on the source.

[Figure: optimized mapping in BDM and the overridden query on the source]

In this configuration, Hadoop's role is reduced to managing the data flow, that is, orchestrating it. The result of the query is sent to Hadoop. Once the read is complete, the file is written from Hadoop to the sink.
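The two-step flow described for the source type can be pictured with a small Python sketch. It is only an illustration under loud assumptions: an in-memory SQLite database stands in for Oracle, the tables `customers`, `orders` and `report` are invented, and the query is not the SQL that BDM actually generates. The point is that the join and the transformation live in one query that the database executes, while the orchestrating side only forwards the result to the sink.

```python
import sqlite3


def make_db():
    """Create an in-memory SQLite database that stands in for the Oracle source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        CREATE TABLE report (name TEXT, total REAL);
        INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
        """
    )
    return conn


# Join and transformation (UPPER, SUM) are expressed as a single query,
# so the database does that work instead of the engine.
PUSHED_QUERY = """
    SELECT UPPER(c.name) AS name, SUM(o.amount) AS total
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
"""


def run_source_pushdown(conn):
    """Step 1: the source runs the query. Step 2: the result is forwarded to the sink."""
    rows = conn.execute(PUSHED_QUERY).fetchall()
    conn.executemany("INSERT INTO report (name, total) VALUES (?, ?)", rows)
    conn.commit()
    return rows


if __name__ == "__main__":
    print(run_source_pushdown(make_db()))
```

In this sketch only the already joined and aggregated rows cross the boundary between the database and the orchestrating code, which is the saving the source type is meant to deliver.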

Pushdown type: full

When we choose the full type, the mapping turns entirely into a database query, and the result of the query is sent to Hadoop. A diagram of this process is shown below.

[Figure: process diagram for the full pushdown type]

An example configuration is shown below.

[Figure: configuration for the full pushdown type]

As a result, we get an optimized mapping similar to the previous one. The only difference is that all the logic is passed to the receiver as an override of its insert. An example of the optimized mapping is shown below.

[Figure: optimized mapping for the full pushdown type]

Here, as in the previous case, Hadoop plays the role of conductor. But this time the source is read in full, and the data processing logic is then executed at the receiver level.
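One way to picture logic that travels with the insert is a single `INSERT ... SELECT` statement. The Python sketch below assumes, as in the article's example, that source and target sit in the same database; SQLite again stands in for Oracle, the table names are hypothetical, and the statement is an illustration rather than the override BDM produces.

```python
import sqlite3


def make_db():
    """Create an in-memory SQLite database that stands in for the Oracle database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        CREATE TABLE report (name TEXT, total REAL);
        INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
        """
    )
    return conn


# Read, join, transform and write are all part of one statement on the target side.
FULL_PUSHDOWN_SQL = """
    INSERT INTO report (name, total)
    SELECT UPPER(c.name), SUM(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
"""


def run_full_pushdown(conn):
    """Execute the whole mapping as one statement and return the number of rows in the target."""
    conn.execute(FULL_PUSHDOWN_SQL)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM report").fetchone()[0]


if __name__ == "__main__":
    print(run_full_pushdown(make_db()))
```

Here the calling code never sees the rows at all; it only triggers the statement, which matches the conductor role described above.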

Pushdown type: none

Finally, the last option is the none pushdown type, in which our mapping turns into a Hadoop script.

The optimized mapping now looks like this:

[Figure: optimized mapping for the none pushdown type]

Here the data from the source files is first read into Hadoop. Then these two files are joined using Hadoop's own means. After that, the data is transformed and loaded into the database.
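For contrast with the database-side variants, the steps above can be sketched with the join done outside the database. Plain Python lists stand in for the two source files and a dictionary-based join stands in for whatever Hadoop does internally; the function name and the data layout are assumptions made for the example.

```python
from collections import defaultdict


def engine_side_join(customers, orders):
    """Join and aggregate raw rows outside the database (stand-in for the Hadoop-side steps).

    customers: iterable of (customer_id, name)
    orders:    iterable of (order_id, customer_id, amount)
    """
    # Build the aggregate per customer from the raw order rows.
    totals = defaultdict(float)
    for _order_id, customer_id, amount in orders:
        totals[customer_id] += amount
    # Inner join: keep only customers that have at least one order, then transform the name.
    return [(name.upper(), totals[cid]) for cid, name in customers if cid in totals]


if __name__ == "__main__":
    raw_customers = [(1, "alice"), (2, "bob"), (3, "carol")]
    raw_orders = [(10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0)]
    print(engine_side_join(raw_customers, raw_orders))
```

Unlike the earlier sketches, every raw row has to leave the source before any joining or filtering happens, so the processing cost moves from the database to the engine.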

Once you understand the principles of pushdown optimization, you can organize many big data processes very efficiently. For instance, quite recently one large company took just a few weeks to unload into Hadoop the big data it had been accumulating in a storage system for several years.

Source: www.habr.com
