Any large-scale data operation demands a lot of computing power. A routine transfer of data from a database to Hadoop can take weeks or cost as much as an airplane wing. Don't want to wait and overspend? Balance the load across different platforms. One way to do that is pushdown optimization.
I asked Alexey Ananyev, Russia's leading trainer in the development and administration of Informatica products, to talk about the pushdown optimization feature in Informatica Big Data Management (BDM). Have you ever taken a course on Informatica products? Chances are it was Alexey who taught you the basics of PowerCenter and explained how to build mappings.
Alexey Ananyev, head of training at DIS Group
What is pushdown?
Many of you are already familiar with Informatica Big Data Management (BDM). The product can integrate big data from different sources, move it between different systems, provide easy access to it, let you profile it, and much more.
In the right hands, BDM can work wonders: jobs finish quickly and with minimal computing resources.
Want that too? Learn to use the pushdown feature in BDM to distribute the computing load across different platforms. Pushdown technology lets you turn a mapping into a script and choose the environment in which that script runs. This choice lets you combine the strengths of different platforms and get the most performance out of them.
To configure the script's execution environment, you select a pushdown type. The script can run entirely on Hadoop or be partially distributed between the source and the sink. There are four possible pushdown types. The mapping does not have to be turned into a script at all (native). The mapping can be executed as far as possible on the source (source) or entirely on the source (full). The mapping can also be turned into a Hadoop script (none).
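The four types can be kept straight with a small lookup table. The sketch below is only a reading aid: the names `PushdownType`, `RUNS_ON`, and `describe` are hypothetical and are not part of any Informatica API, and the descriptions paraphrase the paragraph above.

```python
from enum import Enum


class PushdownType(Enum):
    """The four pushdown types named in the article (hypothetical helper)."""
    NATIVE = "native"
    SOURCE = "source"
    FULL = "full"
    NONE = "none"


# Where the mapping logic runs for each type, paraphrased from the text.
# These strings are illustrative and are not product output.
RUNS_ON = {
    PushdownType.NATIVE: "the Informatica server (no script is generated)",
    PushdownType.SOURCE: "the source database, as far as possible",
    PushdownType.FULL: "the database, entirely",
    PushdownType.NONE: "Hadoop (the mapping becomes a Hadoop script)",
}


def describe(pushdown: PushdownType) -> str:
    """Return a one-line summary of where the logic runs."""
    return f"{pushdown.value}: logic runs on {RUNS_ON[pushdown]}"


for pushdown in PushdownType:
    print(describe(pushdown))
```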
Pushdown optimization
The four types listed can be combined in different ways, so pushdown can be tuned to the specific needs of a system. For example, it is often more sensible to extract data from a database using the database's own capabilities, and then transform the data with Hadoop so as not to overload the database itself.
Let's consider a case where both the source and the sink are in a database, and the platform that executes the transformations can be chosen: depending on the settings, it will be Informatica, the database server, or Hadoop. Such an example makes the technical side of this mechanism easiest to understand. Naturally, this situation does not arise in real life, but it is well suited to demonstrating the functionality.
Let's take a mapping that reads two tables in a single Oracle database and writes the results to a table in the same database. The mapping scheme looks like this:
As a mapping in Informatica BDM 10.2.1, it looks like this:
Pushdown type – native
If we choose the native pushdown type, the mapping is executed on the Informatica server. The data is read from the Oracle server, transferred to the Informatica server, transformed there, and passed on to Hadoop. In other words, we get an ordinary ETL process.
Pushdown type – source
When we choose the source type, we can distribute our process between the database server (DB) and Hadoop. When the process runs with this setting, the queries that select data from the tables are sent to the database, and the rest is executed as steps on Hadoop.
The execution diagram looks like this:
Below is an example of configuring the runtime environment.
In this case, the mapping is executed in two steps. In its settings we can see that it has been turned into a script that is sent to the source. Moreover, the table join and the data transformation are executed as an overridden query on the source.
In the picture below, we see the optimized mapping in BDM and the overridden query on the source.
Hadoop's role in this configuration is reduced to managing the data flow, that is, orchestrating it. The query result is sent to Hadoop. Once the read is complete, the file is written from Hadoop to the sink.
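The two-step flow of the source type can be sketched in a few lines. This is a minimal illustration under loud assumptions: an in-memory SQLite database stands in for Oracle, plain Python stands in for Hadoop, and the tables `customers`, `orders`, and `customer_totals` as well as the query itself are hypothetical, since the article does not show the actual mapping logic.

```python
import sqlite3

# ASSUMPTION: in-memory SQLite stands in for the Oracle database.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE customer_totals (name TEXT, total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
""")

# Step 1: the join and the transformation run on the source as one query
# (the "overridden query" in the article's terms; this SQL is hypothetical).
PUSHED_DOWN_QUERY = """
    SELECT UPPER(c.name) AS name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
"""
staged_rows = db.execute(PUSHED_DOWN_QUERY).fetchall()

# Step 2: the engine (Hadoop in the article, plain Python here) only relays
# the already-computed result to the sink table.
db.executemany("INSERT INTO customer_totals VALUES (?, ?)", staged_rows)
db.commit()

result = db.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(result)  # [('ALICE', 12.5), ('BOB', 1.0)]
```

The point of the sketch is that the engine in step 2 contains no join or transformation logic at all; it only moves rows.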
Pushdown type – full
When you choose the full type, the mapping is turned entirely into a database query, and the result of the query is sent to Hadoop. A diagram of such a process is shown below.
An example configuration is shown below.
As a result, we get an optimized mapping similar to the previous one. The only difference is that all the logic is moved to the sink, as an override of its insert. An example of the optimized mapping is shown below.
Here, as in the previous case, Hadoop plays the role of conductor. But here the source is read in full, and the data processing logic is then executed at the sink level.
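One plausible reading of "an override of the sink's insert", when source and sink share a database, is a single `INSERT ... SELECT` statement. The sketch below illustrates only that idea and is not the statement BDM actually generates: in-memory SQLite stands in for Oracle, and the tables `customers`, `orders`, and `customer_totals` are hypothetical.

```python
import sqlite3

# ASSUMPTION: in-memory SQLite stands in for the Oracle database.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE customer_totals (name TEXT, total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
""")

# The whole mapping (read, join, transform, write) expressed as one
# statement on the sink side. This SQL is hypothetical.
FULL_PUSHDOWN_STATEMENT = """
    INSERT INTO customer_totals (name, total)
    SELECT UPPER(c.name), SUM(o.amount)
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
"""
db.execute(FULL_PUSHDOWN_STATEMENT)
db.commit()

result = db.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(result)  # [('ALICE', 12.5), ('BOB', 1.0)]
```

In this form the orchestrating engine issues one statement and never handles the joined rows itself.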
Pushdown type – none
Finally, the last option is the none pushdown type, in which our mapping is turned into a Hadoop script.
The optimized mapping now looks like this:
Here the data from the source files is first read into Hadoop. Then the two files are joined using Hadoop's own means. After that, the data is transformed and loaded into the database.
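The read, join, transform, load sequence described above can be sketched as follows. The assumptions are the same kind as elsewhere in this illustration: in-memory SQLite stands in for Oracle, plain Python stands in for the Hadoop script, and the tables `customers`, `orders`, and `customer_totals` are hypothetical.

```python
import sqlite3
from collections import defaultdict

# ASSUMPTION: in-memory SQLite stands in for the Oracle database.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE customer_totals (name TEXT, total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0);
""")

# Step 1: read both sources as-is; no join or transformation happens in SQL.
customers = db.execute("SELECT id, name FROM customers").fetchall()
orders = db.execute("SELECT customer_id, amount FROM orders").fetchall()

# Step 2: join and aggregate inside the engine (Hadoop in the article).
names_by_id = dict(customers)
totals = defaultdict(float)
for customer_id, amount in orders:
    if customer_id in names_by_id:  # inner-join semantics
        totals[names_by_id[customer_id]] += amount

# Step 3: transform the joined data.
transformed = sorted((name.upper(), total) for name, total in totals.items())

# Step 4: load the result into the sink table in the database.
db.executemany("INSERT INTO customer_totals VALUES (?, ?)", transformed)
db.commit()

result = db.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(result)  # [('ALICE', 12.5), ('BOB', 1.0)]
```

Here the database only serves plain reads and a plain insert, while all the join and transformation work lands on the engine.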
Once you understand the principles of pushdown optimization, you can organize many big data processes very efficiently. For example, one large company recently took just a few weeks to unload into Hadoop big data from a storage system that it had been accumulating for several years.
Source: www.habr.com