"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

I suggest you read the transcript of the lecture "Hadoop. ZooKeeper" from the series "Methods for Distributed Processing of Large Volumes of Data in Hadoop".

What ZooKeeper is and its place in the Hadoop ecosystem. Fallacies of distributed computing. A diagram of a standard distributed system. The difficulty of coordinating distributed systems. Typical coordination problems. The principles behind ZooKeeper's design. The ZooKeeper data model. Znode flags. Sessions. The client API. Primitives (configuration, group membership, simple locks, leader election, locks without the herd effect). ZooKeeper architecture. ZooKeeper DB. ZAB. Request processor.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Today we will talk about ZooKeeper. It is a very useful thing. Like any Apache Hadoop product, it has a logo. It depicts a person.

Before this, we mostly talked about how data can be processed there and how to store it, that is, how to use it and work with it in some way. Today I would like to talk a little about building distributed applications. ZooKeeper is one of the things that makes this easier. It is a service designed to coordinate the interaction of processes in distributed systems, in distributed applications.

The need for such applications grows every day, and that is what our course is about. On the one hand, MapReduce and similar ready-made frameworks let you hide this complexity and free the programmer from writing primitives such as process interaction and coordination. On the other hand, nobody guarantees that you will never have to do it anyway. MapReduce and other ready-made frameworks cannot always cover the cases that do not fit into them. That includes MapReduce itself and a bunch of other Apache projects: they are, in essence, distributed applications too. ZooKeeper was written to make writing them easier.

Like all Hadoop-related applications, it was developed at Yahoo! It is now also an official Apache project. It is not developed as actively as HBase. If you go to the HBase JIRA, every day there is a pile of bug reports and a pile of proposals to optimize something, i.e. life in the project is in full swing. ZooKeeper, on the one hand, is a relatively simple product, and on the other hand, that is what ensures its reliability. It is also fairly easy to use, which is why it has become a standard for applications in the Hadoop ecosystem. So I thought it would be useful to go over it to understand how it works and how to use it.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

This is a picture from one of our earlier lectures. You could say ZooKeeper is orthogonal to everything we have covered so far. Everything shown here works with ZooKeeper to one degree or another, i.e. it is a service that all these products use. Neither HDFS nor MapReduce writes its own similar service that would work specifically for it. Instead, they use ZooKeeper. This simplifies development and some things related to errors.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Where does all this come from? It seems that we launched two applications in parallel on different computers, connected them with a cable or in a mesh, and everything works. But the problem is that the network is unreliable, and if you sniff the traffic or look at what happens at a low level, at how clients interact over the network, you will often see that some packets are lost or resent. It is not for nothing that TCP was invented, which lets you establish a session and guarantee message delivery. But even TCP cannot always save you. Everything has a timeout. The network may simply drop out for a while. It may just blink. All of this means you cannot rely on the network being reliable. This is the main difference from writing parallel applications that run on a single computer or a single supercomputer, where there is no network and where there is a more reliable data exchange bus in memory. And this is a fundamental difference.

Among other things, when using the network there is always some latency. Disks have it too, but the network has more. Latency is a delay time, which can be small or quite significant.

Network topology changes. What is topology? It is the layout of our network equipment. There are data centers, there are racks standing in them, there are switches. All of this can be reconnected, moved, and so on. This also needs to be taken into account. IP names change, and the routing along which our traffic travels changes. This also needs to be taken into account.

The network can also change in terms of equipment. From practice, I can say that our network engineers really like to periodically update something on the switches. Suddenly a new firmware comes out, and they are not particularly interested in some Hadoop cluster. They have their own job. For them, the main thing is that the network works. So they want to reload something there, flash their hardware, and the hardware itself also changes periodically. All of this somehow needs to be taken into account. All of this affects our distributed application.

Usually people who start working with large volumes of data for some reason believe that network bandwidth is unlimited. If there is a file of several terabytes out there, the thinking goes, you can pull it to your server or computer, open it with cat, and look at it. Another mistake is viewing logs in Vim. Never do this, because it is bad. Vim tries to buffer everything and load everything into memory, especially when we start moving around the log and searching for something. These are things that get forgotten but are worth keeping in mind.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

It is easier to write one program that runs on one computer with one processor.

When our system grows, we want to parallelize it all, and not just on one computer but on a cluster. The question arises: how do we coordinate this? Our applications may not even interact with each other, but we have launched several processes in parallel on several servers. How do we monitor that everything is going well for them? For example, they send something over the network. They must write their state somewhere, for example in some database or log, and then this log is aggregated and analyzed somewhere. In addition, we need to take into account that a process was working and working, and suddenly some error appeared in it or it crashed. How quickly will we find out about it?

Clearly, all of this can be monitored quickly. That is also good, but monitoring is a limited thing that lets you watch some things only at the highest level.

When we want our processes to start interacting with each other, for example sending each other some data, the question also arises: how will this happen? Will there be a race condition, will they overwrite each other, does the data arrive correctly, does anything get lost along the way? We need to develop some kind of protocol, and so on.

Coordinating all these processes is not a trivial matter. It forces the developer to go down to an even lower level and write systems either from scratch or not quite from scratch, and neither is simple.

If you came up with a cryptographic algorithm, or even just implemented one, throw it away immediately, because most likely it will not work for you. It most likely contains a bunch of errors that you forgot to account for. Never use it for anything serious, because it will most likely be unstable. All the algorithms that exist have been tested by time for a very long time. They have been debugged by the community. That is a separate topic, but it is the same here. If you can avoid implementing process synchronization yourself, it is better to avoid it, because it is quite complex and leads you down the shaky path of constantly hunting for bugs.

Today we are talking about ZooKeeper. On the one hand it is a framework, on the other hand it is a service that makes life easier for the developer and simplifies, as much as possible, the implementation of logic and the coordination of our processes.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Let's recall what a standard distributed system might look like. This is what we talked about: HDFS, HBase. There is a Master process that manages the workers, the slave processes. It is responsible for coordinating and distributing tasks, restarting workers, launching new ones, and distributing the load.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

A more advanced option is a Coordination Service, that is, moving the coordination task itself into a separate process, plus running some kind of backup or standby Master in parallel, because the Master can fail. And if the Master fails, our system will not work. So we run a backup. Some of the Master's state needs to be replicated to the backup. This can also be entrusted to the Coordination Service. But in this diagram, the Master itself is responsible for coordinating the workers; here the service only coordinates data replication.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

An even more advanced option is when all coordination is handled by our service, as is usually done. It takes responsibility for making sure everything works. And if something does not work, we find out about it and try to get out of that situation. In any case, we are left with a Master that somehow interacts with the slaves and can send data, information, messages, and so on through some service.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

There is an even more advanced scheme, in which we have no Master: all nodes are master-slaves, differing in their behavior. But they still need to interact with each other, so there is still some service left to coordinate these actions. Cassandra, which works on this principle, probably fits this scheme.

It is hard to say which of these schemes works best. Each has its own pros and cons.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

And there is no need to be afraid of schemes with a Master, because, as practice shows, it is not that fragile under constant service. The main thing here is to choose the right approach: run this service on a separate powerful node so that it has enough resources, and so that, if possible, users have no access to it and cannot accidentally kill the process. At the same time, in such a scheme it is much easier to manage workers from the Master process, i.e. this scheme is simpler from an implementation point of view.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

And this scheme (above) is probably more complex, but more reliable.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

The main problem is partial failure. For example, when we send a message over the network and some kind of failure occurs, the sender will not know whether the message was received, what happened on the receiver's side, or whether the message was processed correctly, i.e. it will not receive any acknowledgment.

So we have to handle this situation. The simplest thing is to resend the message and wait until we get a response. In this case, no account is taken of whether the receiver's state has already changed. We may send the message again and add the same data twice.
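One common way to make blind retries safe can be sketched in plain Java. This is a hypothetical in-memory receiver, not anything from ZooKeeper or from the lecture: the class `DedupReceiver` and the idea of tagging each message with an id are assumptions made for illustration. The receiver remembers the ids it has already applied, so a resent message changes its state only once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical receiver (illustration only): applies each message id at most once.
class DedupReceiver {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    // Returns true if the payload was applied, false if it was a duplicate.
    // In both cases the sender can treat the return as an acknowledgment.
    boolean receive(String messageId, String payload) {
        if (!seenIds.add(messageId)) {
            return false; // already processed: acknowledge, but do not re-apply
        }
        applied.add(payload);
        return true;
    }

    // The data that has actually been stored so far.
    List<String> applied() {
        return applied;
    }
}
```

Calling `receive("m1", "x")` twice leaves a single `"x"` in `applied()`, which is the behavior a retrying sender needs.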

ZooKeeper offers ways to deal with such failures, which also makes our life easier.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

As mentioned a little earlier, this is similar to writing multithreaded programs, but the main difference is that in the distributed applications we build on different machines, the only means of communication is the network. In essence, this is a shared-nothing architecture. Each process or service running on a machine has its own memory, its own disk, and its own processor, which it shares with no one.

If we write a multithreaded program on one computer, we can use shared memory to exchange data. We have context switches there, and processes can be switched out. This affects performance. There is no such thing in a program running on a cluster, but there are problems with the network instead.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Accordingly, the first of the main problems that arise when writing distributed systems is configuration. We are writing some application. If it is simple, we hardcode all sorts of numbers in the code, but this is inconvenient, because if we decide that instead of a half-second timeout we want a one-second timeout, we need to recompile the application and roll everything out again. It is one thing when it is on one machine and you can just restart it, but when we have many machines, we have to copy everything over and over. We should try to make the application configurable.

Here we are talking about static configuration for system processes. Perhaps that is not quite accurate from the operating system's point of view; it may be static configuration for our own processes, that is, configuration that cannot simply be picked up and updated.

There is also dynamic configuration. These are the parameters that we want to change on the fly so that they get picked up right away.

What is the problem here? We updated the config and rolled it out, so what? The problem may be that we rolled out the config but missed some place, and the old config stayed there. Second, while we were rolling out, the configuration was updated in some places but not in others. Some processes of our application on one machine restarted with the new config, and somewhere else with the old one. This can leave our distributed application inconsistent in terms of configuration. This problem is common. For dynamic configuration it is even more relevant, because it implies that the configuration can be changed on the fly.

Another problem is group membership. We always have some set of workers, and we always want to know which of them are alive and which are dead. If there is a Master, it must understand which workers clients can be redirected to so that they run computations or work with data, and which they cannot. The problem that constantly arises is that we need to know who is working in our cluster.

Another typical problem is leader election, when we want to know who is in charge. One example is replication, when we have some process that accepts write operations and then replicates them to the other processes. It will be the leader; everyone else will obey it and follow it. The process must be chosen in a way that is unambiguous for everyone, so that two leaders do not end up being elected.

There is also mutually exclusive access. The problem here is more complex. There is such a thing as a mutex: when you write multithreaded programs, you want access to some resource, for example a memory cell, to be restricted to one thread at a time. Here the resource can be something more abstract. Different applications on different nodes of our network should get only exclusive access to a given resource, rather than everyone being able to change it or write something there. These are the so-called locks.

ZooKeeper lets you solve all of these problems to one degree or another. I will show with examples how it lets you do this.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

There are no blocking primitives. When we start using something, this primitive will not wait for some event to occur. Most likely it will work asynchronously, thereby allowing processes not to hang while they wait for something. This is a very useful thing.

All client requests are processed in the order of a shared queue.

And clients are able to receive notifications about changes in some state, about changes to data, before the client sees the changed data itself.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

ZooKeeper can work in two modes. The first is standalone, on a single node. This is convenient for testing. It can also work in cluster mode on any number of servers. If we have a cluster of 100 machines, it does not have to run on all 100 of them. It is enough to pick a few machines on which to run ZooKeeper. It follows the principle of high availability. On each running instance, ZooKeeper stores a full copy of the data. I will tell you later how it does this. It does not shard or partition the data. On the one hand, this is a drawback, because we cannot store much; on the other hand, there is no need to. That is not what it was designed for; it is not a database.

Data can be cached on the client side. This is a standard principle so that we do not hammer the service and load it with identical requests. A smart client usually knows about this and caches the data.

For example, something has changed here. There is some application. A new leader has been elected, which is responsible, for example, for processing write operations. And we want to replicate the data. One solution is to run a loop and constantly ask our service: has anything changed? The second option is more optimal. It is the watch mechanism, which lets clients be notified that something has changed. This approach is less expensive in terms of resources and more convenient for clients.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Client is the user that uses ZooKeeper.

Server is the ZooKeeper process itself.

Znode is the key object in ZooKeeper. ZooKeeper stores all znodes in memory and organizes them as a hierarchical structure, in the form of a tree.

There are two types of operations. The first is update/write, when some operation changes the state of our tree. The tree is shared. The second is read, when the state is only queried.

It is also possible for the client not to execute a single request and disconnect, but to establish a session through which it interacts with ZooKeeper.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

ZooKeeper's data model resembles a file system. There is a standard root, and from it we go as if through directories that branch out from the root. Then there are first-level directories, second-level ones, and so on. All of these are znodes.

Each znode can store some data, usually not very large, for example 10 kilobytes. And each znode can have some number of children.
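The file-system-like model can be illustrated with a small toy in plain Java. This is not the ZooKeeper client library: `ToyZnodeTree` and its methods are hypothetical names chosen for this sketch, and the rule that a parent must exist before a child is created is an assumption of the sketch. It only shows the idea of absolute paths, small data payloads, and children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory model of the znode tree; NOT the ZooKeeper client API.
class ToyZnodeTree {
    // Absolute path -> small data payload, kept sorted by path.
    private final TreeMap<String, byte[]> nodes = new TreeMap<>();

    ToyZnodeTree() {
        nodes.put("/", new byte[0]); // the root always exists
    }

    // Creates a node at an absolute path; the parent must already exist.
    void create(String path, byte[] data) {
        String parent = parentOf(path);
        if (!nodes.containsKey(parent)) {
            throw new IllegalStateException("no parent: " + parent);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("exists: " + path);
        }
        nodes.put(path, data);
    }

    byte[] getData(String path) {
        return nodes.get(path);
    }

    // Lists the names of the direct children of a node.
    List<String> getChildren(String path) {
        List<String> names = new ArrayList<>();
        for (String candidate : nodes.keySet()) {
            if (!candidate.equals("/") && parentOf(candidate).equals(path)) {
                names.add(candidate.substring(candidate.lastIndexOf('/') + 1));
            }
        }
        return names;
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }
}
```

After creating `/app`, `/app/config`, and `/app/workers`, listing the children of `/app` returns `config` and `workers`, much like listing a directory.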

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Znodes come in several types. They can be created, and when creating a znode we specify the type it should have.

There are two flags. The first is the ephemeral flag. Such a znode lives within a session. For example, a client has established a session. As long as that session is alive, the znode will exist. This is needed so as not to leave anything unnecessary behind. It is also suitable for cases when it is important for us to keep data primitives within a session.

The second is the sequential flag. It appends a counter to the znode's path. For example, we had a directory with application 1_5. When we created the first node, it got p_1, and the second got p_2. Each time we call this method, we pass the full path but specify only part of the name, and the number is incremented automatically because we specify the node type as sequential.

Then there is the regular znode. It lives forever and has the name we give it.
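The sequential flag described above can be modeled in a few lines of plain Java. This is a toy, not the ZooKeeper client: `ToySequentialNamer` is a hypothetical name, keeping one counter per parent is an assumption of the sketch, and the `p_1`, `p_2` naming follows the lecture's example (the real service, as far as I know, pads the counter with zeros).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the sequential flag; NOT the ZooKeeper client API.
class ToySequentialNamer {
    // This sketch keeps one counter per parent node.
    private final Map<String, Integer> counters = new HashMap<>();

    // Takes a full path whose last component is only a name prefix
    // and returns the path actually created, with the counter appended.
    String createSequential(String pathPrefix) {
        String parent = pathPrefix.substring(0, pathPrefix.lastIndexOf('/'));
        int next = counters.merge(parent, 1, Integer::sum);
        return pathPrefix + next;
    }
}
```

Two calls with the prefix `/app/p_` yield `/app/p_1` and `/app/p_2`; the caller never chooses the number itself.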

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Another useful thing is the watch flag. If we set it, the client can subscribe to certain events for a specific node. I will show later with an example how this is done. ZooKeeper itself notifies the client that the data on the node has changed. However, notifications do not guarantee that some new data has arrived. They just say that something has changed, so you still have to fetch and compare the data afterwards with separate calls.

And as I already said, the size of the data is measured in kilobytes. There is no need to store large text data there, because it is not a database; it is a coordination server.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

I'll tell you a little about sessions. If we have several servers, we can transparently move from server to server using the session identifier. This is quite convenient.

Each session has some kind of timeout. A session is kept alive by the client sending something to the server during that session. If the client has not sent anything during the timeout, the session drops; the client can also close it itself.
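The timeout rule can be sketched as a tiny state machine in plain Java. This is a toy model, not ZooKeeper code: `ToySession` is a hypothetical name, and time is passed in explicitly as milliseconds so that the behavior is easy to follow.

```java
// Toy model of a session with a timeout; NOT the ZooKeeper client API.
class ToySession {
    private final long timeoutMillis;
    private long lastHeardMillis;
    private boolean closed;

    ToySession(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeardMillis = nowMillis;
    }

    // Any request (or heartbeat) from the client refreshes the session,
    // but only if it has not already expired or been closed.
    void touch(long nowMillis) {
        if (isAlive(nowMillis)) {
            lastHeardMillis = nowMillis;
        }
    }

    // The client may also end the session explicitly.
    void close() {
        closed = true;
    }

    // The session is alive while it is open and the client was heard
    // from within the timeout window.
    boolean isAlive(long nowMillis) {
        return !closed && nowMillis - lastHeardMillis <= timeoutMillis;
    }
}
```

With a 5-second timeout, a client that was last heard from at 4 seconds is still alive at 8 seconds but not at 9.001 seconds, and a late message does not revive an expired session.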

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

The API does not have that many functions, but you can do different things with it. The create call that we saw creates a znode and takes three parameters. The first is the path to the znode, and it must be specified in full from the root. The second is some data that we want to put there. The third is the type of flag. After creation, it returns the path to the znode.

Second, you can delete a znode. The trick here is that the second parameter, in addition to the path to the znode, can specify a version. The znode will then be deleted only if the version we passed matches the one that actually exists.

If we do not want to check the version, we simply pass "-1" as the argument.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Third, exists checks whether a znode exists. It returns true if the node exists and false otherwise.

This is where the watch flag appears, which lets you set up monitoring of this node.

You can set this flag even on a node that does not exist and get a notification when it appears. This can also be useful.

There are a couple more calls, such as getData. Clearly, we can get the data stored in a znode. You can also use the watch flag here. In this case it will not be set if the node does not exist. So you need to make sure the node exists and then get the data.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

There is also setData. Here we pass a version. If we pass it, the data on the znode is updated only for that particular version.

You can also specify "-1" to skip this check.
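The version check on setData (and on delete) is a form of optimistic concurrency control, which can be sketched in plain Java. This is a toy model, not the ZooKeeper client: `ToyVersionedNode` is a hypothetical name, and returning a boolean on a version mismatch is a simplification of this sketch (a real client would report an error instead).

```java
// Toy model of version-checked updates; NOT the ZooKeeper client API.
class ToyVersionedNode {
    private String data;
    private int version = 0;

    ToyVersionedNode(String data) {
        this.data = data;
    }

    // Updates the data only if expectedVersion matches the current version,
    // or if expectedVersion is -1 (skip the check). Returns true on success.
    boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // someone else changed the node in the meantime
        }
        data = newData;
        version++; // every successful write bumps the version
        return true;
    }

    String getData() {
        return data;
    }

    int getVersion() {
        return version;
    }
}
```

A writer that read version 0 and then writes with version 0 succeeds; a second writer still holding version 0 is rejected and must re-read, while a writer passing -1 always overwrites.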

Another useful method is getChildren. With it we can get a list of all the znodes that belong to a given node. We can monitor this list by setting the watch flag.

And the sync method lets you send all changes at once, thereby guaranteeing that they are saved and that all the data has been fully changed.

If we draw an analogy with ordinary programming: when you use a method such as write, which writes something to disk, and it returns a response to you, there is no guarantee that your data has actually been written to disk. And even when the operating system is sure that everything has been written, there are mechanisms in the disk itself where the data passes through layers of buffers, and only after that is it placed on the disk.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Mostly asynchronous calls are used. This allows the client to work on different requests in parallel. You can use the synchronous approach, but it is less productive.

The two kinds of operations we talked about are update/write, which change data: create, setData, sync, delete. And read: exists, getData, getChildren.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Now a few examples of how you can build primitives for working in a distributed system. The first is related to configuration. A new worker has appeared: we added a machine and started a process. There are three questions. How does it query ZooKeeper for the configuration? If we want to change the configuration, how do we change it? And after we have changed it, how do the workers we already had get it?

ZooKeeper makes this relatively easy. For example, there is our znode tree. There is a node for our application here, and in it we create an additional node that contains the configuration data. These may or may not be separate parameters. The allowed size is small, but the configuration is usually small too, so it is quite possible to store it here.

You use the getData method to get the configuration for a worker from the node, with watch set to true. If for some reason this node does not exist, we will be notified when it appears or when it changes. If we want to know that something has changed, we set watch to true. And if the data in this node changes, we will know about it.

setData. We set the data and pass "-1", i.e. we do not check the version; we assume that we always have a single configuration and do not need to store many configs. If you need to store many, you need to add another level. Here we assume there is only one, so we only update the latest one and do not check the version. At this moment, all clients that subscribed earlier receive a notification that something has changed in this node. After they receive it, they must request the data again. The notification does not carry the data itself, only the fact of the change. After that they must ask for the new data.
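The read, watch, notify, and re-read cycle just described can be sketched in plain Java. This is a toy model, not the ZooKeeper client: `ToyConfigNode` and `ToyConfigWorker` are hypothetical names, and the sketch assumes that a watch fires only once and must be re-armed on the next read.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a config node with one-shot watches; NOT the ZooKeeper client API.
class ToyConfigNode {
    private String data;
    private List<Runnable> watchers = new ArrayList<>();

    ToyConfigNode(String data) {
        this.data = data;
    }

    // Reads the data and, if a watcher is given, registers it for the next change.
    String getData(Runnable watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    // Stores new data and notifies every registered watcher exactly once.
    // The notification carries no data; watchers must read again.
    void setData(String newData) {
        data = newData;
        List<Runnable> toNotify = watchers;
        watchers = new ArrayList<>(); // watches are one-shot in this sketch
        for (Runnable w : toNotify) {
            w.run();
        }
    }
}

// Hypothetical worker that keeps a local copy of the configuration.
class ToyConfigWorker {
    private final ToyConfigNode node;
    String current;
    int notifications;

    ToyConfigWorker(ToyConfigNode node) {
        this.node = node;
        reload();
    }

    private void reload() {
        // Re-read the data and re-arm the watch in one call.
        current = node.getData(() -> {
            notifications++;
            reload();
        });
    }
}
```

Two workers created against the same node both end up with the new value after a single `setData`, without ever polling.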

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

The second use of the primitives is group membership. We have a distributed application with a bunch of workers, and we want to know that they are all in place. So they must register themselves as working in our application. And we want to find out, either from the Master process or from somewhere else, about all the active workers we currently have.

How do we do this? For the application, we create a workers node and add a sub-level there using the create method. I have a mistake on the slide: here you need to specify sequential, and then all the workers will be created one after another. And the application, by requesting all the data about the children of this node, gets all the active workers that exist.
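The membership idea can be sketched as a toy in plain Java. This is not the ZooKeeper client: `ToyGroup` and the `worker_N` naming are hypothetical, and the sketch assumes the worker nodes are both sequential and ephemeral, so that a worker vanishes from the list when its session ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of group membership; NOT the ZooKeeper client API.
class ToyGroup {
    // Member name -> id of the session that owns it (ephemeral ownership).
    private final Map<String, Long> members = new TreeMap<>();
    private int counter = 0;

    // Registers a worker as an ephemeral, sequential child of the group node.
    String join(long sessionId) {
        counter++;
        String name = "worker_" + counter;
        members.put(name, sessionId);
        return name;
    }

    // When a session ends (closed or timed out), its nodes disappear.
    void sessionEnded(long sessionId) {
        members.values().removeIf(owner -> owner == sessionId);
    }

    // What the master sees when it lists the children of the group node.
    List<String> getChildren() {
        return new ArrayList<>(members.keySet());
    }
}
```

If three workers join and the second worker's session ends, listing the children returns only `worker_1` and `worker_3`.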

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

This is a crude implementation of how this can be done in Java code. Let's start from the end, with the main method. This is our class; let's create its method. As the first argument we use host, the address we connect to, i.e. we pass it as an argument. The second argument is the name of the group.

How does the connection happen? This is a simple example of the API being used. Everything is relatively simple here. There is a standard ZooKeeper class. We pass the hosts to it and set the timeout, for example, to 5 seconds. And we have a member called ConnectionSignal. In essence, we create a group at the path passed in. We do not write data there, although something could be written. The node here is of the persistent type. Essentially, this is an ordinary regular node that will exist all the time. This is where the session is created. This is the implementation of the client itself. Our client will periodically send messages indicating that the session is alive. And when we end the session, we call close and that's it, the session drops. The timeout is there in case something crashes on our side, so that ZooKeeper finds out about it and cuts the session.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

How do you lock a resource? Here everything is a bit more complicated. We have a set of workers, and there is some resource that we want to lock. To do this, we create a separate node, for example called lock1. If we managed to create it, we got the lock. If we could not create it, the worker tries to call getData on it, and since the node has already been created, we put a watcher on it, and the moment the state of this node changes, we will know about it. Then we can try to be the first to re-create it. If we took this node, and so took the lock, then once we no longer need the lock we release it, since the node exists only within the session. Accordingly, it will disappear. And another client, within another session, will be able to take the lock on this node, or rather, it will receive a notification that something has changed and can try to take it in time.
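The simple lock can be sketched as a toy in plain Java. This is not the ZooKeeper client: `ToySimpleLock` is a hypothetical name, sessions are represented by plain numeric ids, and the one-shot watcher list is an assumption of the sketch. It shows both the happy path and the fact that every waiter is woken when the lock node disappears.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the simple lock; NOT the ZooKeeper client API.
class ToySimpleLock {
    private Long ownerSession;                 // null: the lock node does not exist
    private List<Runnable> watchers = new ArrayList<>();

    // Tries to create the ephemeral lock node.
    // On failure, leaves a one-shot watcher on the existing node.
    boolean tryAcquire(long sessionId, Runnable onChange) {
        if (ownerSession == null) {
            ownerSession = sessionId;
            return true;
        }
        watchers.add(onChange);
        return false;
    }

    // Called when the owner releases the lock or its session ends.
    void release(long sessionId) {
        if (ownerSession == null || ownerSession != sessionId) {
            return; // only the owner's session removes the node
        }
        ownerSession = null;
        List<Runnable> toNotify = watchers;
        watchers = new ArrayList<>();
        for (Runnable w : toNotify) {
            w.run(); // every waiter wakes up and races to re-create the node
        }
    }
}
```

When the owner releases the lock, all waiting sessions are notified, but only the first one to retry gets it; the rest have to wait again.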

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Another example is how you can elect a main leader. This is a bit more complicated, but still relatively simple. What happens here? There is a main node that aggregates all the workers. We try to get data about the leader. If this succeeds, i.e. we got some data, then our worker starts following this leader. It believes there already is a leader.

If the leader has died for some reason, for example it crashed, we try to create a new leader. If we succeed, our worker becomes the leader. And if someone else managed to create a new leader at that moment, we try to find out who it is and then follow it.

Here the so-called herd effect arises, because when the leader dies, whoever gets there first in time becomes the leader.

"Hadoop. ZooKeeper" daga jerin Technostream Rukunin Mail.Ru "Hanyoyin don rarraba manyan kundin bayanai a Hadoop"

Lokacin da ake ɗaukar albarkatun, kuna iya ƙoƙarin yin amfani da wata hanya ta ɗan bambanta, wanda shine kamar haka. Misali, muna son samun makulli, amma ba tare da tasirin hert ba. Zai ƙunshi gaskiyar cewa aikace-aikacenmu yana buƙatar lissafin duk ids na kumburi don kumburin da ya riga ya kasance tare da kulle. Kuma idan kafin wannan kumburin da muka ƙirƙiri makulli don shi shine mafi ƙarancin saitin da muka karɓa, to wannan yana nufin cewa mun kama makullin. Muna duba cewa mun sami makulli. A matsayin dubawa, za a sami yanayin cewa id ɗin da muka karɓa lokacin ƙirƙirar sabon kulle ba shi da ƙaranci. Kuma idan mun karba, to muna kara yin aiki.

If there is some id smaller than our lock's, we set a watcher on it and wait for a notification until something changes. That is, someone else has taken the lock, and until it goes away we will not become the minimum id and will not acquire the lock; once it does, we can get in. And if that condition is not met, we go straight back and try to acquire the lock again, because something may have changed in the meantime.
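
The herd-free lock can be sketched along these lines. This is only an illustration under stated assumptions: the in-memory `ToyZooKeeper` and the `LockClient` class are hypothetical, and the choice to watch only the node with the next smaller id follows the commonly documented lock recipe rather than anything spelled out in the lecture.

```python
class ToyZooKeeper:
    """In-memory stand-in for the children of one lock node (illustration only)."""

    def __init__(self):
        self.children = {}   # child name -> owning session
        self.watchers = {}   # child name -> callbacks fired once on deletion
        self.counter = 0     # source of increasing sequence ids

    def create_sequential(self, session):
        name = f"lock-{self.counter:010d}"
        self.counter += 1
        self.children[name] = session
        return name

    def get_children(self):
        return sorted(self.children)

    def watch(self, name, callback):
        self.watchers.setdefault(name, []).append(callback)

    def delete(self, name):
        del self.children[name]
        for callback in self.watchers.pop(name, []):
            callback()


class LockClient:
    def __init__(self, zk, session):
        self.zk = zk
        self.node = zk.create_sequential(session)
        self.holds_lock = False
        self.wakeups = 0
        self.check()

    def check(self):
        children = self.zk.get_children()
        index = children.index(self.node)
        if index == 0:                     # our id is the minimum: lock is ours
            self.holds_lock = True
            return
        # Otherwise wait only for the node directly in front of us.
        self.zk.watch(children[index - 1], self.on_change)

    def on_change(self):
        self.wakeups += 1
        self.check()                       # something changed: check again

    def release(self):
        self.holds_lock = False
        self.zk.delete(self.node)


zk = ToyZooKeeper()
a, b, c = LockClient(zk, "a"), LockClient(zk, "b"), LockClient(zk, "c")
a.release()   # only "b" is notified; "c" keeps waiting undisturbed
```

Compared with the single-node lock, a release here wakes exactly one waiter instead of all of them.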

What does ZooKeeper consist of? There are 4 main components. The first is the Request Processor, which handles requests. Then there is ZooKeeper Atomic Broadcast (ZAB). There is the Commit Log, where all operations are written. And there is the In-memory Replicated DB, i.e. the database itself, where the whole tree is stored.

It is worth noting that all write operations go through the Request Processor, while read operations go directly to the in-memory database.

The database itself is fully replicated. Every ZooKeeper instance stores a full copy of the data.

To restore the database after a crash there is the commit log. The standard practice is that before data gets into memory it is written to the log, so that after a crash the log can be replayed and the state of the system restored. Periodic snapshots of the database are also used.
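
The log-then-memory rule and recovery from a snapshot plus the log can be illustrated with a small sketch. It is a simplification under stated assumptions: `ToyReplica`, `txn_id` and the in-memory lists standing in for durable storage are all hypothetical, and nothing here reflects ZooKeeper's actual on-disk formats.

```python
class ToyReplica:
    """One replica with a commit log, a snapshot and an in-memory tree."""

    def __init__(self):
        self.log = []               # stands in for the durable commit log
        self.snapshot = (0, {})     # (last txn_id included, copy of the tree)
        self.tree = {}              # the in-memory database
        self.last_txn_id = 0

    def write(self, path, value):
        self.last_txn_id += 1
        self.log.append((self.last_txn_id, path, value))  # 1. log first
        self.tree[path] = value                           # 2. then memory

    def take_snapshot(self):
        self.snapshot = (self.last_txn_id, dict(self.tree))

    def crash(self):
        # Memory is lost; the log and the snapshot survive.
        self.tree = {}
        self.last_txn_id = 0

    def recover(self):
        txn_id, tree = self.snapshot
        self.tree = dict(tree)
        self.last_txn_id = txn_id
        for entry_id, path, value in self.log:
            if entry_id > txn_id:          # replay only what the snapshot lacks
                self.tree[path] = value
                self.last_txn_id = entry_id


replica = ToyReplica()
replica.write("/a", 1)
replica.write("/b", 2)
replica.take_snapshot()
replica.write("/a", 3)              # logged after the snapshot
before = dict(replica.tree)
replica.crash()
replica.recover()
after = dict(replica.tree)
```

The snapshot keeps recovery short, since only log entries newer than the snapshot need to be replayed.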

ZooKeeper Atomic Broadcast is the component used to maintain the replicated data.

Internally, ZAB elects a leader from among the ZooKeeper nodes. The other nodes become its followers and wait for actions from it. If they receive write requests, they forward all of them to the leader. The leader first performs the write and then sends a message about what has changed to its followers. This must, in fact, be done atomically, i.e. the write and the broadcast of the whole change must happen atomically, which is what guarantees data consistency.
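
The flow of a write through the leader can be pictured with a deliberately simplified sketch. It is not the ZAB protocol: the real protocol involves proposals and acknowledgements from a quorum, which are omitted here, and the `Node` class with its `write`, `apply` and `read` methods is a hypothetical name chosen for illustration. The sketch only shows followers forwarding writes and the leader fanning the change out, with reads served from each node's local copy.

```python
class Node:
    """A toy ZooKeeper node: either the leader or one of its followers."""

    def __init__(self, name):
        self.name = name
        self.tree = {}          # this node's full copy of the data
        self.leader = None      # set on followers
        self.followers = []     # set on the leader

    def write(self, path, value):
        if self.leader is not None:
            # Followers do not apply writes themselves; they forward them.
            self.leader.write(path, value)
            return
        self.tree[path] = value                 # leader applies the write...
        for follower in self.followers:
            follower.apply(path, value)         # ...then tells the followers

    def apply(self, path, value):
        self.tree[path] = value

    def read(self, path):
        return self.tree.get(path)              # reads are served locally


leader, f2, f3 = Node("n1"), Node("n2"), Node("n3")
leader.followers = [f2, f3]
f2.leader = leader
f3.leader = leader

f2.write("/config", "v1")          # a client writes through follower n2
seen_on_f3 = f3.read("/config")    # another client reads from follower n3
```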

The Request Processor handles only write requests. Its main task is to transform an operation into a transactional update. This is a specially generated request.

Here it is important to note that updates for the same operation are guaranteed to be idempotent. What does that mean? If such an update is applied twice, it produces the same state, i.e. the request itself does not change. This is needed so that after a crash you can replay the log and thereby re-apply the changes that were lost at that moment. The state of the system will then be the same, i.e. it must not happen that a sequence of identical operations, for example update operations, leads to different final states of the system.
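
A small sketch may help show why converting an operation into a transactional update makes replay safe. The names `to_transaction` and `apply_transaction`, and the `increment` operation itself, are hypothetical and chosen purely for illustration; the point is only that the generated update records the resulting state rather than the action.

```python
def to_transaction(tree, op):
    """Turn a client operation into an update that records the resulting state."""
    kind, path, amount = op
    if kind == "increment":
        # "Add 1" is not safe to repeat; "set to 6" is.
        return ("set", path, tree.get(path, 0) + amount)
    raise ValueError(f"unknown operation: {kind}")


def apply_transaction(tree, txn):
    """Apply a generated update; applying it again changes nothing."""
    _, path, value = txn
    tree[path] = value


tree = {"/counter": 5}
txn = to_transaction(tree, ("increment", "/counter", 1))
apply_transaction(tree, txn)
apply_transaction(tree, txn)   # replayed from the log after a crash
```

Had the raw increment been logged instead, the replay would have left the counter at 7 rather than 6.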

source: www.habr.com