Migrating Cassandra to Kubernetes: features and solutions

We regularly come across the Apache Cassandra database and the need to run it inside Kubernetes-based infrastructure. In this article, we share our view of the necessary steps, the criteria, and the existing solutions (including an overview of operators) for migrating Cassandra to K8s.

"Whoever can rule a woman can also rule a state"

What is Cassandra? It is a distributed storage system designed to handle large volumes of data while ensuring high availability with no single point of failure. The project hardly needs a long introduction, so I will list only the main features of Cassandra that matter in the context of this article:

  • Cassandra is written in Java.
  • The Cassandra topology consists of several levels:
    • Node: a single deployed Cassandra instance;
    • Rack: a group of Cassandra instances, united by some attribute, located in the same datacenter;
    • Datacenter: the set of all groups of Cassandra instances located in the same datacenter;
    • Cluster: the set of all datacenters.
  • Cassandra uses an IP address to identify a node.
  • To speed up writes and reads, Cassandra keeps part of the data in RAM.

Now, on to the actual move to Kubernetes.

Checklist for the migration

When we talk about migrating Cassandra to Kubernetes, we hope that the move will make it more convenient to manage. What is required for that, and what will help?

1. Data storage

As already mentioned, Cassandra keeps part of its data in RAM, in a Memtable. The other part of the data is saved to disk in the form of SSTables. A third entity is added to these: the Commit Log, a record of all transactions, which is also saved to disk.

Write transaction diagram in Cassandra

In Kubernetes, we can use a PersistentVolume to store the data. Thanks to mechanisms that are already well established, working with data in Kubernetes gets easier every year.

We will allocate a dedicated PersistentVolume to each Cassandra pod

It is important to note that Cassandra itself implies data replication and offers built-in mechanisms for it. Therefore, if you are building a Cassandra cluster from a large number of nodes, there is no need to use distributed systems such as Ceph or GlusterFS for data storage. In this case, it makes sense to store data on the host disk using local persistent volumes or by mounting hostPath.

It is a different matter if you want to create a separate environment for developers for each feature branch. In this case, the right approach is to bring up a single Cassandra node and store the data in distributed storage, i.e. the aforementioned Ceph and GlusterFS become your options. The developer can then be sure that test data will not be lost even if one of the Kubernetes cluster nodes is lost.
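
For the host-disk variant, local persistent volumes are typically paired with a StorageClass that has no dynamic provisioner and delays binding until a pod is scheduled. The following is only a minimal sketch of that idea: the function name and the default class name "cassandra-local" are hypothetical and not taken from the article.

```shell
#!/usr/bin/env bash
# Sketch: print a StorageClass manifest for statically provisioned local disks.
# The default name "cassandra-local" is a hypothetical example.
local_storage_class() {
  local name="${1:-cassandra-local}"
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${name}
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF
}
```

The output can be piped to `kubectl apply -f -`. WaitForFirstConsumer matters here because a local volume ties the pod to the node that owns the disk, so binding should wait for the scheduling decision.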

2. Monitoring

The virtually uncontested choice for monitoring in Kubernetes is Prometheus (we covered this in detail in a related talk). How is Cassandra doing with metrics exporters for Prometheus? And, what is even more important, with matching dashboards for Grafana?

An example of Grafana graphs for Cassandra

There are only two exporters: jmx_exporter and cassandra_exporter.

We chose the first one for ourselves because:

  1. JMX Exporter is growing and developing, while Cassandra Exporter has not managed to gain enough community support. Cassandra Exporter still does not support most Cassandra versions.
  2. You can run it as a javaagent by adding the flag -javaagent:<plugin-dir-name>/cassandra-exporter.jar=--listen=:9180.
  3. There is a decent dashboard for it, which is incompatible with Cassandra Exporter.
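
For the JMX Exporter itself, the Java agent is attached with a flag of the form `-javaagent:<jar>=<port>:<config>`. As a rough sketch, a small helper can build that flag; the helper name, the jar and config paths, and the port used in the usage note are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: build the -javaagent flag for the Prometheus JMX Exporter.
# Arguments: path to the agent jar, port to listen on, path to its YAML config.
jmx_exporter_opts() {
  local jar="$1" port="$2" config="$3"
  printf -- '-javaagent:%s=%s:%s\n' "$jar" "$port" "$config"
}
```

In cassandra-env.sh this could be used as `JVM_OPTS="$JVM_OPTS $(jmx_exporter_opts /opt/jmx/jmx_prometheus_javaagent.jar 9180 /etc/cassandra/jmx.yaml)"`, where both paths are hypothetical.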

3. Choosing Kubernetes primitives

Following the Cassandra cluster structure described above, let us try to translate everything into Kubernetes terms:

  • Cassandra Node → Pod
  • Cassandra Rack → StatefulSet
  • Cassandra Datacenter → a pool of StatefulSets
  • Cassandra Cluster → ???

It turns out that some additional entity is missing to manage the whole Cassandra cluster at once. But if something does not exist, we can create it! Kubernetes has a mechanism for defining your own resources for exactly this purpose: Custom Resource Definitions.

Declaring additional resources for logs and alerts

But a Custom Resource by itself means nothing: after all, it needs a controller. You may have to turn to a Kubernetes operator for help...
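
To make the Node/Rack/Datacenter mapping from this section more tangible, here is a small sketch that lists the pods of one datacenter when each rack is its own StatefulSet. The `<cluster>-<dc>-<rack>` naming scheme is an assumption made for illustration; only the `-<ordinal>` suffix reflects how StatefulSets actually name their pods.

```shell
#!/usr/bin/env bash
# Sketch: list the pods of one Cassandra datacenter, one StatefulSet per rack.
# Arguments: cluster name, datacenter name, replicas per rack, then rack names.
# The "<cluster>-<dc>-<rack>" StatefulSet naming is a hypothetical convention.
datacenter_pods() {
  local cluster="$1" dc="$2" replicas="$3"
  shift 3
  local rack i
  for rack in "$@"; do
    for ((i = 0; i < replicas; i++)); do
      # StatefulSet pods are named <statefulset-name>-<ordinal>.
      echo "${cluster}-${dc}-${rack}-${i}"
    done
  done
}
```

For example, `datacenter_pods demo dc1 2 rack1 rack2` lists four pod names, two per rack.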

4. Pod identification

In the section above, we agreed that one Cassandra node equals one pod in Kubernetes. But pod IP addresses will be different every time. And node identification in Cassandra is based on the IP address... It turns out that after every pod removal, the Cassandra cluster will gain a new node.

There is a way out, and not just one:

  1. We can keep track of nodes by host identifiers (UUIDs that uniquely identify Cassandra instances) or by IP addresses and store all of this in some structures/tables. This method has two major drawbacks:
    • The risk of a race condition if two nodes go down at once. After coming back up, the Cassandra nodes will request an IP address from the table at the same time and compete for the same resource.
    • If a Cassandra node has lost its data, we will no longer be able to identify it.
  2. The second solution looks like a small hack, but still: we can create a Service with a ClusterIP for each Cassandra node. The problems with this implementation:
    • If the Cassandra cluster has many nodes, we will have to create many Services.
    • The ClusterIP feature is implemented via iptables. This can become a problem if the Cassandra cluster has many (1000... or even 100?) nodes. However, IPVS-based balancing can solve this problem.
  3. The third solution is to use the network of the cluster nodes for Cassandra nodes instead of a dedicated pod network by enabling the hostNetwork: true setting. This method imposes certain restrictions:
    • Replacing nodes. The new node must have the same IP address as the previous one (in clouds such as AWS or GCP this is almost impossible to do);
    • By using the network of the cluster nodes, we start competing for network resources. Therefore, placing more than one Cassandra pod on a single cluster node will be problematic.
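
The second option (one ClusterIP Service per Cassandra node) can be sketched as follows. This is an illustration under assumptions, not the manifest of any particular operator: it relies on the `statefulset.kubernetes.io/pod-name` label that the StatefulSet controller puts on each pod, exposes Cassandra's default inter-node (7000) and CQL (9042) ports, and uses a hypothetical function name.

```shell
#!/usr/bin/env bash
# Sketch: print a ClusterIP Service that targets exactly one StatefulSet pod,
# giving that Cassandra node a stable virtual IP.
pod_service() {
  local pod="$1"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ${pod}
spec:
  type: ClusterIP
  selector:
    statefulset.kubernetes.io/pod-name: ${pod}
  ports:
    - name: intra-node
      port: 7000
    - name: cql
      port: 9042
EOF
}
```

A loop such as `for i in 0 1 2; do pod_service "cassandra-$i"; echo '---'; done | kubectl apply -f -` would create one Service per pod, which also shows why the number of Services grows with the cluster.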

5. Backups

We want to save a full copy of a single Cassandra node's data on a schedule. Kubernetes provides a convenient feature for this in the form of CronJob, but here Cassandra itself puts a spoke in our wheels.

Let me remind you that Cassandra keeps part of its data in memory. To make a full backup, the data in memory (Memtables) has to be moved to disk (SSTables). At this point, the Cassandra node stops accepting connections and drops out of the cluster entirely.

After that, the backup (snapshot) is taken and the schema (keyspace) is saved. And then it turns out that the backup alone gives us nothing: we also need to save the identifiers of the data the Cassandra node was responsible for, which are special tokens.

Token distribution that identifies which data each Cassandra node is responsible for

Google has published an example script for taking a Cassandra backup in Kubernetes. The only thing the script does not take into account is flushing data to disk on the node before taking the snapshot. That is, the backup captures not the current state but a state from slightly earlier. But this helps avoid taking the node out of operation, which seems very logical.

#!/usr/bin/env bash
set -eu

# The keyspace to back up is the only argument.
if [[ -z "${1:-}" ]]; then
  echo "Please provide a keyspace" >&2
  exit 1
fi

KEYSPACE="$1"

# Take the snapshot; nodetool reports the snapshot name (a timestamp).
if ! result=$(nodetool snapshot "${KEYSPACE}"); then
  echo "Error while making snapshot" >&2
  exit 1
fi

timestamp=$(echo "$result" | awk '/Snapshot directory: / { print $3 }')

mkdir -p /tmp/backup

# Move each table's snapshot directory to /tmp/backup/<table>.
for path in $(find "/var/lib/cassandra/data/${KEYSPACE}" -name "$timestamp"); do
  table=$(echo "${path}" | awk -F "[/-]" '{print $7}')
  mkdir -p "/tmp/backup/$table"
  mv "$path" "/tmp/backup/$table"
done

tar -zcf /tmp/backup.tar.gz -C /tmp/backup .

nodetool clearsnapshot "${KEYSPACE}"

An example bash script for taking a backup from a single Cassandra node
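
If the slightly stale state is a concern, one possible variation is to flush Memtables before taking the snapshot. `nodetool flush` writes Memtables to SSTables without taking the node out of the ring, unlike `nodetool drain`. The sketch below is an assumption-laden illustration: the function name and the NODETOOL override variable are hypothetical, added only so the commands can be dry-run.

```shell
#!/usr/bin/env bash
# Sketch: flush Memtables to disk, then take a named snapshot of a keyspace.
# NODETOOL is a hypothetical override (e.g. NODETOOL=echo for a dry run).
NODETOOL="${NODETOOL:-nodetool}"

flush_and_snapshot() {
  local keyspace="$1" tag="$2"
  # Snapshot only if the flush succeeded.
  "$NODETOOL" flush "$keyspace" &&
    "$NODETOOL" snapshot -t "$tag" "$keyspace"
}
```

Calling `flush_and_snapshot my_keyspace nightly` would run the two nodetool commands in order; with `NODETOOL=echo` it only prints them.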

Ready-made solutions for Cassandra in Kubernetes

What is currently used to deploy Cassandra in Kubernetes, and which of these options best fits the requirements above?

1. Solutions based on StatefulSet or Helm charts

Using the basic StatefulSet functionality to run a Cassandra cluster is a good option. With a Helm chart and Go templates, you can give the user a flexible interface for deploying Cassandra.

This usually works fine... until something unexpected happens, such as a node failure. Standard Kubernetes tools simply cannot account for all the specifics described above. In addition, this approach is very limited in how far it can be extended for more complex uses: node replacement, backups, recovery, monitoring, and so on.

Representatives:

Both charts are equally good, but they are subject to the problems described above.

2. Solutions based on a Kubernetes Operator

These options are more interesting because they offer broad capabilities for managing the cluster. For designing a Cassandra operator, as for any other database, a good pattern looks like Sidecar <-> Controller <-> CRD:

Node management scheme in a well-designed Cassandra operator

Let us look at the existing operators.

1. Cassandra-operator from instaclustr

  • GitHub
  • Readiness: Alpha
  • License: Apache 2.0
  • Implemented in: Java

This is indeed a very promising and actively developing project from a company that offers managed Cassandra deployments. As described above, it uses a sidecar container that accepts commands over HTTP. Being written in Java, it sometimes lacks the more advanced functionality of the client-go library. Also, the operator does not support different Racks within a single Datacenter.

But the operator has advantages such as support for monitoring, high-level cluster management via CRD, and even documentation on making backups.

2. Navigator from Jetstack

  • GitHub
  • Readiness: Alpha
  • License: Apache 2.0
  • Implemented in: Golang

An operator designed to deploy DB-as-a-Service. It currently supports two databases: Elasticsearch and Cassandra. It has interesting solutions such as database access control via RBAC (for this it runs its own separate navigator-apiserver). An interesting project that would be worth a closer look, but the last commit was made a year and a half ago, which clearly reduces its potential.

3. Cassandra-operator by vgkowski

  • GitHub
  • Readiness: Alpha
  • License: Apache 2.0
  • Implemented in: Golang

We did not consider it seriously, since the last commit to the repository was more than a year ago. Development of the operator has been abandoned: the latest Kubernetes version reported as supported is 1.9.

4. Cassandra-operator by Rook

  • GitHub
  • Readiness: Alpha
  • License: Apache 2.0
  • Implemented in: Golang

An operator whose development is not progressing as quickly as we would like. It has a well-thought-out CRD structure for cluster management and solves the node identification problem using a Service with ClusterIP (the same "hack")... but that is all for now. There is currently no monitoring or backup out of the box (by the way, we took on the monitoring part ourselves). An interesting point is that you can also deploy ScyllaDB using this operator.

NB: We used this operator with minor modifications in one of our projects. No problems were noticed in the operator's work over the whole period of use (~4 months of operation).

5. CassKop from Orange

  • GitHub
  • Readiness: Alpha
  • License: Apache 2.0
  • Implemented in: Golang

The youngest operator on the list: the first commit was made on May 23, 2019. It already has a large number of the features from our list in its arsenal; more details can be found in the project repository. The operator is built on the popular operator-sdk. It supports monitoring out of the box. The main difference from other operators is the use of the CassKop plugin, which is implemented in Python and used for communication between Cassandra nodes.

Conclusions

The number of approaches and possible options for porting Cassandra to Kubernetes speaks for itself: the topic is in demand.

At this stage, you can try any of the above at your own risk: none of the developers guarantees 100% operation of their solution in a production environment. But many of these products already look promising enough to try on development benches.

I think that in the future this lady on board will come in handy!

source: www.habr.com
