The distributed computing and big data market, according to
Why is distributed computing needed in everyday business? The answer is simple and complicated at the same time. Simple, because in most cases we run fairly basic calculations per unit of information. Complicated, because there is a lot of that information. A lot. As a result, what is needed
One recent example: the pizzeria chain Dodo Pizza
Another example:
Choosing a tool
The industry standard for this kind of computing is Hadoop. Why? Because Hadoop is an excellent, well-documented framework (Habr itself carries many detailed articles on the topic) that comes with a whole set of utilities and libraries. You can feed it huge sets of both structured and unstructured data, and the system itself distributes them across the available computing capacity. Moreover, that capacity can be expanded or switched off at any time: horizontal scalability in action.
In 2017, the influential consulting company Gartner
Hadoop rests on several pillars, the best known of which are MapReduce (a system for distributing data for computation across servers) and the HDFS file system. The latter is designed specifically to store information spread across the nodes of a cluster: each block of a fixed size can be placed on several nodes, and thanks to replication the system tolerates the failure of individual nodes. Instead of a file table, a dedicated server called the NameNode is used.
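To make the block-and-replica idea concrete, here is a minimal sketch in Python. It is not how HDFS actually places data (the real placement policy is rack-aware and is decided by the NameNode); it only assumes a simple round-robin layout. The function names `place_blocks` and `surviving_replicas`, the node names, and the file size are all hypothetical and chosen for illustration.

```python
import math

def place_blocks(file_size, block_size, nodes, replication=3):
    """Split a file into fixed-size blocks and assign each block to
    `replication` distinct nodes (toy round-robin, not real HDFS policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    n_blocks = math.ceil(file_size / block_size)
    placement = {}
    for b in range(n_blocks):
        # Consecutive offsets modulo len(nodes) are distinct nodes.
        placement[b] = [nodes[(b + r) % len(nodes)] for r in range(replication)]
    return placement

def surviving_replicas(placement, failed):
    """Return, per block, the replicas that remain after some nodes fail."""
    return {b: [n for n in reps if n not in failed]
            for b, reps in placement.items()}

MB = 1024 * 1024
nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical data nodes
block_map = place_blocks(file_size=1300 * MB, block_size=512 * MB, nodes=nodes)
after_failure = surviving_replicas(block_map, failed={"dn2"})

for block, replicas in after_failure.items():
    print(f"block {block}: still readable from {replicas}")
```

With a replication factor of three, losing one node in this toy layout still leaves every block with at least two live copies, which is the property the text describes.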
The illustration below shows how MapReduce works. In the first stage, the data is split according to some criterion; in the second, it is distributed according to the available computing capacity; in the third, the computation takes place.
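The stages above can be sketched in a few lines of plain Python. This is an in-memory illustration of the MapReduce pattern using the classic word-count task, not Hadoop's actual API; the function names (`split_input`, `map_phase`, `shuffle`, `reduce_phase`) and the sample lines are hypothetical. In a real cluster each chunk would be processed on a different server.

```python
from collections import defaultdict

def split_input(records, n_chunks):
    """Stage 1: split the input into chunks (here, simple striding)."""
    return [records[i::n_chunks] for i in range(n_chunks)]

def map_phase(chunk):
    """Map: emit a (key, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle(mapped_chunks):
    """Stage 2: group the emitted values by key so that each key
    can be handed to a single reducer."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Stage 3: compute the result for each key."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["pizza order pizza", "order delivered", "pizza delivered"]
chunks = split_input(lines, n_chunks=2)
mapped = [map_phase(chunk) for chunk in chunks]  # would run in parallel
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

The map and reduce steps are pure functions of their input, which is what lets a framework run them on many machines at once and rerun them if a node fails.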
MapReduce was originally developed by Google for its own search needs. MapReduce then made its way into open source, and Apache took on the project. Google, meanwhile, gradually migrated to other solutions. An interesting tidbit: Google currently has a project called Google Cloud Dataflow, positioned as the next step after Hadoop and as a faster replacement for it.
A closer look shows that Google Cloud Dataflow is based on a variation of Apache Beam, while Apache Beam in turn includes the well-documented Apache Spark framework, which lets us speak of nearly the same execution speed for both solutions. And Apache Spark works perfectly well on top of the HDFS file system, which allows it to be deployed on Hadoop servers.
Add to this the volume of documentation and ready-made solutions available for Hadoop and Spark compared with Google Cloud Dataflow, and the choice of tool becomes obvious. Moreover, engineers can decide for themselves which code to run, for Hadoop or for Spark, depending on the task, their experience, and their qualifications.
Cloud or local server
The general trend of moving to the cloud has even given rise to an interesting term: Hadoop-as-a-service. In this scenario, the administration of the connected servers has become very important. Because, alas, despite its popularity, pure Hadoop is a rather difficult tool to configure, since a great deal has to be done by hand: configuring servers individually, monitoring their performance, and carefully tuning many parameters. In general, it is a job for an enthusiast, and there is a good chance of making a mistake somewhere or missing something.
That is why distributions that come with convenient deployment and administration tools from the start have become so popular. One of the most popular distributions that supports Spark and makes everything easier is Cloudera. It has both paid and free versions, and the free one includes all the basic functionality, with no limit on the number of nodes.
During setup, Cloudera Manager connects to your servers over SSH. An interesting point: when installing, it is better to specify that the installation be done with so-called parcels: special packages, each of which contains all the necessary components, configured to work together. Essentially, this is an improved version of a package manager.
After installation, we get a cluster management console where you can see cluster telemetry and the installed services, add or remove resources, and edit the cluster configuration.
As a result, you find yourself in the cockpit of the rocket that will carry you into the bright future of BigData. But before we say "let's go," let's take a look under the hood.
Hardware requirements
On its website, Cloudera describes various possible configurations. The general principles on which they are built are shown in the illustration:
MapReduce can blur this optimistic picture. If you look again at the diagram from the previous section, it becomes clear that in almost all cases a MapReduce job can hit a bottleneck when reading data from disk or from the network. This is also noted in the Cloudera blog. As a result, I/O speed is very important for any fast computation, including computation through Spark, which is often used for real-time processing. Therefore, when using Hadoop, it is very important that the cluster consists of balanced and fast machines, which, to put it mildly, is not always guaranteed in cloud infrastructure.
Balanced load distribution is achieved by using OpenStack virtualization on servers with powerful multi-core CPUs. Data nodes are allocated their own processor resources and specific disks. Our solution, the Atos Codex Data Lake Engine, achieves wide virtualization, which is why we benefit both in performance (the impact of the network infrastructure is minimized) and in TCO (extra physical servers are eliminated).
With BullSequana S200 servers, we get a very uniform load, free of some of these bottlenecks. The minimum configuration consists of 3 BullSequana S200 servers, each with two JBODs; additional S200 servers containing four data nodes can optionally be connected. Here is an example of the load in a TeraGen test:
Tests with different data volumes and replication values show the same results in terms of load distribution across the cluster nodes. Below is a graph of the distribution of disk access across the performance tests.
The calculations were performed on the minimum configuration of 3 BullSequana S200 servers. It includes 9 data nodes and 3 master nodes, as well as reserved virtual machines in case protection based on OpenStack Virtualization is deployed. TeraSort test result: with a block size of 512 MB, a replication factor of three, and encryption enabled, the run takes 23.1 minutes.
How can the system be expanded? Several types of extension are available for the Data Lake Engine:
- Data nodes: for every 40 TB of usable space
- Analytical nodes with the option to install a GPU
- Other options depending on business needs (for example, if you need Kafka and the like)
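As a rough planning aid, the sizing arithmetic behind such extensions can be sketched as follows. This is only an illustration under stated assumptions: it reads the 40 TB figure as one data-node extension per 40 TB of usable space, and it treats raw HDFS consumption as the data volume multiplied by the replication factor (three, as in the TeraSort test). Whether the vendor's "usable" figure already accounts for replication is not stated in the text. The function names and the 100 TB target are hypothetical.

```python
import math

def extension_nodes_needed(target_usable_tb, tb_per_extension=40):
    """Number of data-node extensions for a target usable capacity,
    assuming one extension per `tb_per_extension` TB (an assumption)."""
    return math.ceil(target_usable_tb / tb_per_extension)

def raw_tb_for(data_tb, replication=3):
    """Raw disk space consumed when every block is stored `replication` times."""
    return data_tb * replication

target_tb = 100  # hypothetical requirement
print(extension_nodes_needed(target_tb), "data-node extensions")
print(raw_tb_for(target_tb), "TB of raw disk at replication factor 3")
```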
The Atos Codex Data Lake Engine includes both the servers themselves and pre-installed software: a licensed Cloudera kit, Hadoop itself, OpenStack with virtual machines based on the RedHat Enterprise Linux kernel, and data replication and backup systems (including a backup node and Cloudera BDR, Backup and Disaster Recovery). The Atos Codex Data Lake Engine became the first virtualization solution to be certified
If you are interested in the details, we will be happy to answer your questions in the comments.
Source: www.habr.com