Disaster-Resilient Cloud: How It Works

Hi Habr!

After the New Year holidays, we relaunched our disaster-resilient cloud, which is built on two sites. Today we will explain how it works and show what happens to client virtual machines when individual cluster components fail and when an entire site goes down (spoiler: the machines are fine).

The storage system of the disaster-resilient cloud at the OST site.

What's inside

Under the hood, the cluster has Cisco UCS servers running the VMware ESXi hypervisor, two INFINIDAT InfiniBox F2240 storage systems, Cisco Nexus network equipment, and Brocade SAN switches. The cluster is split across two sites, OST and NORD, so each data center has an identical set of equipment. That is, in fact, what makes it disaster-resilient.

Within a single site, the main components are also duplicated (hosts, SAN switches, networking).
The two sites are connected by dedicated fiber-optic links, and these links are redundant too.

A few words about the storage. We built the first version of the disaster-resilient cloud on NetApp. This time we chose INFINIDAT, and here is why:

  • Active-Active replication. It lets a virtual machine keep running even if one of the storage systems fails completely. More on replication later.
  • Three disk controllers for higher fault tolerance. Usually there are two.
  • A ready-made solution. We received a pre-assembled rack that only needed to be connected to the network and configured.
  • Attentive technical support. INFINIDAT engineers continuously analyze storage logs and events, install new firmware versions, and help with configuration.

Here are a few photos from the unpacking:

How it works

The cloud is already fault-tolerant on its own. It protects the client against single hardware and software failures. Disaster resilience adds protection against large-scale failures within a site: for example, a storage system failure (or an SDS cluster failure, which happens fairly often 🙂), widespread errors in the storage network, and so on. Most importantly, such a cloud saves you when an entire site becomes unavailable because of a fire, a blackout, a hostile takeover, or an alien landing.

In all these cases, client virtual machines keep running, and here is why.

The cluster is designed so that any ESXi host running client virtual machines can access either of the two storage systems. If the storage system at the OST site fails, the virtual machines keep running: the hosts they run on fetch their data from the storage system at NORD.

This is what the connection scheme in the cluster looks like.

This is possible because an Inter-Switch Link is configured between the SAN fabrics of the two sites: the Fabric A SAN switch at OST connects to the Fabric A SAN switch at NORD, and likewise for the Fabric B SAN switches.

And for all these SAN fabric intricacies to make sense, Active-Active replication is configured between the two storage systems: data is written almost simultaneously to the local and the remote storage system, so RPO = 0. As a result, the original data is stored on one storage system and its replica on the other. Data is replicated at the level of storage volumes, and the VM data (its disks, configuration file, swap file, and so on) is stored on those volumes.
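
To make the RPO = 0 claim concrete, here is a minimal sketch of a synchronous replicated write. It is a simplified model, not INFINIDAT's implementation or API: the `StorageSystem` class and the `replicated_write` function are hypothetical names invented for this illustration. The only idea it shows is that a write is acknowledged after every storage system that is currently online has committed it.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSystem:
    """Hypothetical stand-in for one storage array (not a vendor API)."""
    name: str
    online: bool = True
    blocks: dict = field(default_factory=dict)

    def commit(self, lba: int, data: bytes) -> bool:
        """Store a block; report failure if the array is down."""
        if not self.online:
            return False
        self.blocks[lba] = data
        return True


def replicated_write(local: StorageSystem, remote: StorageSystem,
                     lba: int, data: bytes) -> bool:
    """Acknowledge a write only after every online copy has committed it."""
    targets = [s for s in (local, remote) if s.online]
    if not targets:
        return False  # no storage system left to accept the write
    return all(s.commit(lba, data) for s in targets)


ost = StorageSystem("OST")
nord = StorageSystem("NORD")
replicated_write(ost, nord, 0, b"vm-disk-block")
print(ost.blocks == nord.blocks)  # both copies are identical while both are up
```

While both systems are healthy, every acknowledged write exists on both of them, which is what lets a host switch to the other copy without losing data.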

An ESXi host sees the primary volume and its replica as a single disk device (Storage Device). There are 24 paths from the ESXi host to each disk device:

12 paths connect it to the local storage system (optimal paths), and the other 12 to the remote storage system (non-optimal paths). In normal operation, ESXi accesses data on the local storage system over the optimal paths. When that storage system fails, ESXi loses the optimal paths and switches to the non-optimal ones. This is what it looks like in a diagram.

Diagram of the disaster-resilient cluster.
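
The path behavior described above can be sketched in a few lines of Python. This is only an illustration of the "prefer optimal, fall back to non-optimal" rule; it is not how ESXi multipathing is implemented, and the `Path`, `host_paths`, and `usable_paths` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    array: str         # storage system the path leads to, e.g. "OST" or "NORD"
    optimal: bool      # True for paths to the local storage system
    alive: bool = True


def host_paths(local: str, remote: str,
               local_up: bool = True, remote_up: bool = True) -> list:
    """Build the 24 paths of one host: 12 optimal (local), 12 non-optimal (remote)."""
    return ([Path(local, True, local_up) for _ in range(12)]
            + [Path(remote, False, remote_up) for _ in range(12)])


def usable_paths(paths: list) -> list:
    """Return the live optimal paths, or the live non-optimal ones if none remain."""
    live = [p for p in paths if p.alive]
    optimal = [p for p in live if p.optimal]
    return optimal or live


normal = usable_paths(host_paths("OST", "NORD"))
failed = usable_paths(host_paths("OST", "NORD", local_up=False))
print({p.array for p in normal})  # I/O goes to the local storage system
print({p.array for p in failed})  # I/O falls back to the remote storage system
```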

All client networks are delivered to both sites over a common network fabric. Each site runs a Provider Edge (PE) on which the client networks are terminated. The PEs are combined into a single cluster. If the PE at one site fails, all traffic is redirected to the second site. Thanks to this, virtual machines at the site left without a PE remain reachable over the network for the client.

Now let's see what happens to client virtual machines during various failures. We will start with the mildest cases and finish with the most serious one, the failure of an entire site. In the examples, the primary site is OST, and the backup site, which holds the data replicas, is NORD.

What happens to a virtual machine if...

The Replication Link fails. Replication between the storage systems of the two sites stops.
ESXi works only with the local disk devices (over the optimal paths).
Virtual machines keep running.

The ISL (Inter-Switch Link) breaks. An unlikely event, unless some crazed excavator digs up several optical routes at once, even though they run along independent paths and enter the sites through different entry points. But still. In this case, ESXi hosts lose half of their paths and can access only their local storage systems. Replicas are still being built, but hosts cannot access them.

Virtual machines run normally.

A SAN switch fails at one of the sites. ESXi hosts lose some of their paths to the storage systems. In this case, hosts at the site where the switch failed work through only one of their HBAs.

Virtual machines continue to run normally.
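
The path losses in these SAN scenarios can be reproduced with a small topology model. The article gives only the totals (24 paths per device, 12 per storage system), so the breakdown below is an assumption made for illustration: each host has one HBA per fabric, and each storage system exposes 6 target ports per fabric (`PORTS_PER_FABRIC`). The `live_paths` function is a hypothetical helper, not a VMware or Brocade tool.

```python
from itertools import product

FABRICS = ("A", "B")
SITES = ("OST", "NORD")
PORTS_PER_FABRIC = 6  # assumption: 2 fabrics x 6 ports = 12 paths per array


def live_paths(host_site, failed_switches=frozenset(), isl_up=True,
               failed_arrays=frozenset()):
    """List the (fabric, array_site, port) paths a host at host_site can still use.

    failed_switches holds (site, fabric) pairs; failed_arrays holds site names.
    """
    paths = []
    for fabric, array_site, port in product(FABRICS, SITES, range(PORTS_PER_FABRIC)):
        if array_site in failed_arrays:
            continue
        if (host_site, fabric) in failed_switches:
            continue  # the host HBA on this fabric has lost its local switch
        if array_site != host_site:
            # Remote paths also need the ISL and the remote switch of the fabric.
            if not isl_up or (array_site, fabric) in failed_switches:
                continue
        paths.append((fabric, array_site, port))
    return paths


print(len(live_paths("OST")))                                   # 24
print(len(live_paths("OST", isl_up=False)))                     # 12
print(len(live_paths("OST", failed_switches={("OST", "A")})))   # 12
print(len(live_paths("OST", failed_switches={("OST", "A"), ("OST", "B")})))  # 0
```

Under these assumptions, a broken ISL leaves only the 12 local paths, a single failed switch leaves the 12 paths of the other fabric (that is, one HBA), and losing both local switches leaves the host with no paths at all.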

All SAN switches at one of the sites fail. Say this disaster happened at the OST site. In this case, the ESXi hosts at that site lose all paths to their disk devices. The standard VMware vSphere HA mechanism comes into play: it restarts all virtual machines of the OST site at NORD within at most 140 seconds.

Virtual machines running on hosts at the NORD site run normally.

An ESXi host fails at one site. Here the vSphere HA mechanism kicks in again: virtual machines from the failed host are restarted on other hosts, either at the same site or at the remote one. Restarting a virtual machine takes up to 1 minute.

If all ESXi hosts at the OST site fail, there are no options left: the VMs are restarted at the other site. The restart time is the same.

The storage system fails at one site. Say the storage system failed at the OST site. The ESXi hosts of the OST site then switch to working with the storage replicas at NORD. Once the failed storage system is back in service, a forced replication takes place and the OST ESXi hosts start accessing their local storage system again.

Virtual machines run normally the whole time.

One of the sites fails. In this case, all virtual machines are restarted at the backup site by the vSphere HA mechanism. The VM restart time is 140 seconds. All network settings of each virtual machine are preserved, and it remains reachable by the client over the network.

To make sure that restarting machines at the backup site goes smoothly, each site is filled only to half of its capacity. The other half is a reserve in case all virtual machines move over from the second, failed site.
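
The half-full rule is simple arithmetic: a surviving site must fit its own load plus everything arriving from the failed site. A minimal sketch follows, with a hypothetical helper name and abstract capacity units (the article does not say which resource is being counted):

```python
def can_absorb_failover(site_capacity: float, own_load: float,
                        incoming_load: float) -> bool:
    """Check that a surviving site fits its own VMs plus those of the failed site."""
    return own_load + incoming_load <= site_capacity


# Two equal sites of 100 units each, both filled to 50%: failover fits exactly.
print(can_absorb_failover(100, 50, 50))  # True
# The same sites filled to 60%: the surviving site would be overcommitted.
print(can_absorb_failover(100, 60, 60))  # False
```

With two equal sites, any fill level above 50% means that some virtual machines could not be restarted after a full site failure.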

A disaster-resilient cloud based on two data centers protects against failures like these.

This pleasure is not cheap, because in addition to the main resources you need a reserve at the second site. That is why such a cloud typically hosts business-critical services whose prolonged downtime causes large financial and reputational losses, as well as information systems that are subject to disaster-resilience requirements from regulators or internal company policies.


source: www.habr.com
