Our experience developing a CSI driver in Kubernetes for Yandex.Cloud

We are pleased to announce that Flant is expanding its contribution to Open Source tooling for Kubernetes by releasing an alpha version of a CSI (Container Storage Interface) driver for Yandex.Cloud.

Before moving on to the implementation details, let us answer why this is needed at all when Yandex already offers a Managed Service for Kubernetes.

Introduction

Why do this?

In our company, ever since we started using Kubernetes in production (that is, for several years now), we have been developing our own tool (deckhouse), which, by the way, we also plan to release soon as an Open Source project. With it we configure all of our clusters in a uniform way. There are currently more than 100 of them, running on a wide variety of hardware configurations and in all the cloud services available to us.

Clusters that use deckhouse have every component needed for operation: load balancers, monitoring with convenient charts, metrics and alerts, user authentication through external providers for access to all dashboards, and so on. There is no point in installing such a "fully loaded" cluster into a managed solution, since this is often either impossible or would require disabling half of the components.

NB: This is our experience, and it is quite specific. We are in no way suggesting that everyone should deploy Kubernetes clusters on their own instead of using ready-made solutions. By the way, we have no real experience operating Kubernetes from Yandex, and we will not give any assessment of that service in this article.

What is it and who is it for?

We have already written about the modern approach to storage in Kubernetes: how CSI works and how the community arrived at this approach.

At present, many large cloud providers have developed drivers for using their cloud disks as Persistent Volumes in Kubernetes. If a provider has no such driver but all the required functions are exposed through its API, nothing prevents you from implementing the driver yourself. That is what happened with Yandex.Cloud.

We took the CSI driver for the DigitalOcean cloud as the basis for development, along with a couple of ideas from the driver for GCP, since interaction with the APIs of these two clouds (Google and Yandex) has a number of similarities. In particular, both the GCP API and the Yandex API return an Operation object for tracking the status of long-running operations (for example, creating a new disk). To interact with the Yandex.Cloud API, the driver uses the Yandex.Cloud Go SDK.

The result of this work is published on GitHub. It may be useful to those who, for whatever reason, run their own Kubernetes installation on Yandex.Cloud virtual machines (rather than a managed cluster) and would like to use (provision) disks through CSI.

Implementation

Key features

The driver currently supports the following features:

  • Provisioning disks in all zones of the cluster according to the topology of the nodes in the cluster;
  • Deleting previously provisioned disks;
  • Offline resize for disks (Yandex.Cloud does not support expanding disks that are mounted to a virtual machine). For how the driver had to be modified to make resizing as painless as possible, see below.

In the future, we plan to implement support for creating and deleting disk snapshots.

The main difficulty and how to overcome it

The inability to expand disks in real time through the Yandex.Cloud API is a limitation that complicates the resize operation for a PV (Persistent Volume): the application pod that uses the disk has to be stopped, and this can cause application downtime.

According to the CSI specification, if the CSI controller reports that it can resize disks only "offline" (VolumeExpansion.OFFLINE), then the disk expansion process should go as follows:

If the plugin has only VolumeExpansion.OFFLINE expansion capability and volume is currently published or available on a node then ControllerExpandVolume MUST be called ONLY after either:

  • The plugin has controller PUBLISH_UNPUBLISH_VOLUME capability and ControllerUnpublishVolume has been invoked successfully.

OR ELSE

  • The plugin does NOT have controller PUBLISH_UNPUBLISH_VOLUME capability, the plugin has node STAGE_UNSTAGE_VOLUME capability, and NodeUnstageVolume has been completed successfully.

OR ELSE

  • The plugin does NOT have controller PUBLISH_UNPUBLISH_VOLUME capability, nor node STAGE_UNSTAGE_VOLUME capability, and NodeUnpublishVolume has completed successfully.

In essence, this means that the disk must be detached from the virtual machine before it is expanded.
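The three alternatives quoted from the specification can be read as a small decision function. The sketch below is only one way to encode that rule; `PluginCaps`, `VolumeState`, and `mayExpandOffline` are hypothetical names for this example and do not come from the CSI library or from the driver.

```go
package main

import "fmt"

// PluginCaps is a hypothetical summary of the capabilities a plugin reports.
type PluginCaps struct {
	ControllerPublishUnpublish bool // controller PUBLISH_UNPUBLISH_VOLUME
	NodeStageUnstage           bool // node STAGE_UNSTAGE_VOLUME
}

// VolumeState is a hypothetical record of which teardown calls have
// already completed successfully for a volume.
type VolumeState struct {
	ControllerUnpublished bool // ControllerUnpublishVolume succeeded
	NodeUnstaged          bool // NodeUnstageVolume succeeded
	NodeUnpublished       bool // NodeUnpublishVolume succeeded
}

// mayExpandOffline reports whether ControllerExpandVolume may be called for
// an OFFLINE-only plugin, following the three "either / or else" branches.
func mayExpandOffline(c PluginCaps, s VolumeState) bool {
	switch {
	case c.ControllerPublishUnpublish:
		return s.ControllerUnpublished
	case c.NodeStageUnstage:
		return s.NodeUnstaged
	default:
		return s.NodeUnpublished
	}
}

func main() {
	caps := PluginCaps{ControllerPublishUnpublish: true}
	fmt.Println(mayExpandOffline(caps, VolumeState{}))                            // disk still attached
	fmt.Println(mayExpandOffline(caps, VolumeState{ControllerUnpublished: true})) // disk detached
}
```

For a driver that attaches cloud disks to virtual machines, the first branch is the relevant one: expansion is allowed only once the controller-side unpublish (detach) has succeeded.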

Unfortunately, however, the implementation of the CSI specification through sidecars does not meet these requirements:

  • The csi-attacher sidecar container, which should be responsible for ensuring the required gap between mounts, simply does not implement this behavior for offline resize. A discussion about this has already been started.
  • What exactly is a sidecar container in this context? The CSI plugin itself does not interact with the Kubernetes API; it only responds to gRPC calls sent to it by sidecar containers. The sidecars are developed by the Kubernetes community.

In our case (the CSI plugin), the disk expansion operation looks like this:

  1. We receive the gRPC call ControllerExpandVolume;
  2. We try to expand the disk through the API, but get an error saying the operation cannot be performed because the disk is mounted;
  3. We store the disk identifier in a map that holds the disks for which an expansion operation is pending. Below, for brevity, we call this map volumeResizeRequired;
  4. We manually delete the pod that uses the disk. Kubernetes will restart it. So that the disk cannot be mounted (ControllerPublishVolume) before the expansion completes, on every mount attempt we check whether the disk is still in volumeResizeRequired and, if so, return an error;
  5. The CSI driver tries to perform the resize operation again. If the operation succeeds, we remove the disk from volumeResizeRequired;
  6. Since the disk ID is no longer in volumeResizeRequired, ControllerPublishVolume succeeds, the disk is mounted, and the pod starts.

It all looks fairly simple, but as always there are pitfalls. Disk expansion is driven by external-resizer, which, when an operation fails, uses a queue with an exponentially growing timeout of up to 1000 seconds:

func DefaultControllerRateLimiter() RateLimiter {
    return NewMaxOfRateLimiter(
        NewItemExponentialFailureRateLimiter(5*time.Millisecond, 1000*time.Second),
        // 10 qps, 100 bucket size.  This is only for retry speed and its only the overall factor (not per item)
        &BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)},
    )
}

From time to time this can stretch the disk expansion operation to 15+ minutes, during which the corresponding pod is unavailable.

The only option that let us reduce the potential downtime fairly easily and painlessly was to use our own build of external-resizer with the maximum timeout capped at 5 seconds:

workqueue.NewItemExponentialFailureRateLimiter(5*time.Millisecond, 5*time.Second)

We did not consider it necessary to urgently start a discussion and patch external-resizer upstream, because offline disk resizing is a throwback that will soon disappear from all cloud providers.

How to get started?

The driver is supported on Kubernetes 1.15 and above. For the driver to work, the following requirements must be met:

  • The --allow-privileged flag is set to true for the API server and kubelet;
  • --feature-gates=VolumeSnapshotDataSource=true,KubeletPluginsWatcher=true,CSINodeInfo=true,CSIDriverRegistry=true is enabled for the API server and kubelet;
  • Mount propagation must be enabled on the cluster. When using Docker, the daemon must be configured to allow shared mounts.

All the steps required for the installation itself are described in the README. Installation consists of creating objects in Kubernetes from manifests.

For the driver to work, you need the following:

  • Specify the Yandex.Cloud folder identifier (folder-id) in the manifest (see the documentation);
  • To interact with the Yandex.Cloud API, the CSI driver uses a service account. In the manifest, the Secret must contain the authorized keys of the service account. The documentation describes how to create a service account and obtain the keys.

All in all, give it a try; we will be glad to receive feedback and new issues if you run into any problems!

Further support

To sum up, we would like to note that we implemented this CSI driver not out of a great desire to have fun writing Go applications, but because of an urgent need within the company. Maintaining our own implementation does not seem practical to us, so if Yandex shows interest and decides to continue supporting the driver, we will gladly transfer the repository to them.

In addition, Yandex probably has its own implementation of a CSI driver in its managed Kubernetes cluster, which could be released as Open Source. We see this path as a good one too: the community would be able to use a proven driver from the service provider rather than from a third-party company.

source: www.habr.com