Tips & tricks for working with Ceph in projects under heavy load

When using Ceph as network storage in projects with varying loads, we may run into tasks that at first glance look neither simple nor trivial. For example:

  • migrating data from an old Ceph cluster to a new one, partially reusing the old servers in the new cluster;
  • solving the problem of disk space allocation in Ceph.

In dealing with such tasks, we face the need to remove an OSD correctly, without losing data — which is especially important when handling large volumes of data. This is what the article covers.

The methods described below are relevant for any version of Ceph. They also take into account that Ceph can store a large volume of data: to prevent data loss and other problems, some actions are "split" into several smaller ones.

A brief introduction to OSD

Since two of the three recipes discussed here deal with OSDs (Object Storage Daemons), before diving into the practical part, a few words about what an OSD is in Ceph and why it matters.

First of all, the whole Ceph cluster consists of many OSDs: the more there are, the more free data capacity Ceph has. From this, the main OSD role is easy to grasp: it stores Ceph object data on the file system of a cluster node and provides network access to it (for read, write, and other requests).

Replication parameters are set at the pool level and are implemented by copying objects between different OSDs. This is where you can run into various problems, whose solutions are discussed below.

Case 1. Safely removing an OSD from a Ceph cluster without losing data

The need to remove an OSD can arise from taking a server out of the cluster — for example, to replace it with another one — which is what happened to us and prompted this article. Thus, the ultimate goal of the procedure is to remove all OSDs and mons on a given server so that it can be stopped.

For convenience, and to avoid a situation where we specify the wrong OSD while executing commands, let's define a separate variable holding the number of the OSD to be removed. We will call it ${ID}; here and below, this variable stands for the number of the OSD we are working with.

Let's look at the state before starting:

root@hv-1 ~ # ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.46857 root default
-3       0.15619      host hv-1
-5       0.15619      host hv-2
 1   ssd 0.15619      osd.1     up     1.00000  1.00000
-7       0.15619      host hv-3
 2   ssd 0.15619      osd.2     up     1.00000  1.00000

To begin removing an OSD, you need to smoothly reweight it down to zero. This way, we reduce the amount of data on the OSD by rebalancing it onto other OSDs. To do so, run the following commands:

ceph osd reweight osd.${ID} 0.98
ceph osd reweight osd.${ID} 0.88
ceph osd reweight osd.${ID} 0.78

... and so on, down to zero.

Smooth rebalancing is required so as not to lose data. This is especially true if the OSD holds a large amount of data. To make sure everything is fine after each reweight command, you can run ceph -s, or run ceph -w in a separate terminal window to watch the changes in real time.
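
The stepwise reweight together with the wait for rebalancing can be wrapped in a small helper. This is only a sketch with our own naming (drain_osd) and step size; the grep over the status output is a simplification:

```shell
# Sketch (our own helper, not a Ceph command): lower the weight
# of osd.$1 in 0.1 steps, waiting for the cluster to settle
# between steps.
drain_osd() {
    id=$1
    for w in 0.98 0.88 0.78 0.68 0.58 0.48 0.38 0.28 0.18 0.08 0; do
        ceph osd reweight "osd.${id}" "$w"
        # Crude check: wait while any PGs are still recovering
        # or backfilling (see `ceph -s` for the full picture).
        while ceph -s | grep -Eq 'recover|backfill'; do
            sleep 30
        done
    done
}

# Usage: drain_osd "${ID}"
```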

Once the OSD has been "drained", you can proceed with the standard removal procedure. To do so, put the OSD in question into the down state:

ceph osd down osd.${ID}

Let's take the OSD "out" of the cluster:

ceph osd out osd.${ID}

Stop the OSD service and unmount its partition from the file system:

systemctl stop ceph-osd@${ID}
umount /var/lib/ceph/osd/ceph-${ID}

Remove the OSD from the CRUSH map:

ceph osd crush remove osd.${ID}

Delete the OSD's auth user:

ceph auth del osd.${ID}

And finally, remove the OSD itself:

ceph osd rm osd.${ID}
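
For repeated use, the manual steps above can be collected into one helper function. A sketch with a name of our choosing (remove_osd), to be run only after the OSD has been drained to zero weight:

```shell
# Sketch: full manual removal sequence for an already-drained OSD.
remove_osd() {
    id=$1
    ceph osd down "osd.${id}"               # mark it down
    ceph osd out "osd.${id}"                # take it out of the cluster
    systemctl stop "ceph-osd@${id}"         # stop the daemon
    umount "/var/lib/ceph/osd/ceph-${id}"   # unmount its data partition
    ceph osd crush remove "osd.${id}"       # drop it from the CRUSH map
    ceph auth del "osd.${id}"               # delete its auth user
    ceph osd rm "osd.${id}"                 # remove the OSD itself
}

# Usage: remove_osd "${ID}"
```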

Note: if you are using Ceph Luminous or newer, the OSD removal steps above can be reduced to two commands:

ceph osd out osd.${ID}
ceph osd purge osd.${ID}

If you now run ceph osd tree after completing the steps described above, it should be clear that the server where the work was carried out no longer has the OSDs that the operations above were performed on:

root@hv-1 ~ # ceph osd tree
ID CLASS WEIGHT  TYPE NAME     STATUS REWEIGHT PRI-AFF
-1       0.46857 root default
-3       0.15619      host hv-1
-5       0.15619      host hv-2
-7       0.15619      host hv-3
 2   ssd 0.15619      osd.2    up     1.00000  1.00000

Note along the way that the cluster will go into the HEALTH_WARN state, and we will also see the number of OSDs and the amount of available disk space decrease.

The following describes the steps needed if you want to stop the server entirely and, accordingly, remove it from Ceph. Keep in mind that before shutting down the server, you must remove all OSDs on it.

If no OSDs are left on this server, then after removing them you need to exclude the server, hv-2, from the CRUSH map by running the following command:

ceph osd crush rm hv-2

Delete the mon from server hv-2 by running the command below on another server (i.e., in this case, on hv-1):

ceph-deploy mon destroy hv-2

After that, you can stop the server and proceed with whatever comes next (redeploying it, and so on).

Case 2. Allocating disk space in an already-built Ceph cluster

I will begin the second story with an introduction to PGs (Placement Groups). The main role of a PG in Ceph is to aggregate Ceph objects and replicate them further across OSDs. The formula for calculating the required number of PGs can be found in the relevant section of the Ceph documentation, where the topic is also covered with concrete examples.
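
As a rough illustration of the rule of thumb from that documentation — on the order of 100 PGs per OSD, divided by the pool's replication factor, rounded up to a power of two — here is a sketch; real sizing should follow the docs and the calculator discussed below:

```shell
# Rule-of-thumb PG sizing sketch: (OSDs * 100) / replication
# factor, rounded up to the next power of two.
pg_count() {
    osds=$1
    size=$2
    target=$(( osds * 100 / size ))
    pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

pg_count 9 3   # 9 OSDs, 3-way replication -> prints 512
```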

So: one of the common problems in operating Ceph is an unbalanced ratio of OSDs and PGs across the pools in Ceph.

First, because of this, a situation can arise where too many PGs are allocated to a small pool, which essentially wastes disk space in the cluster. Second, in practice there is a more serious problem: data overflowing on one of the OSDs, which sends the cluster first into HEALTH_WARN and then into HEALTH_ERR. The reason is that when Ceph calculates the available amount of data (shown as MAX AVAIL in the ceph df output, for each pool separately), it relies on the space available on the OSDs. If there is not enough space on even a single OSD, no more data can be written until the data has been properly redistributed among all OSDs.

It is worth stressing that these problems are largely decided at the Ceph cluster configuration stage. One tool you can use is Ceph PGCalc. With its help, the required number of PGs is calculated transparently. You can also turn to it when a Ceph cluster has already been configured incorrectly. Note that as part of the correction work you may need to reduce the number of PGs, a feature that is not available in older versions of Ceph (it only appeared in the Nautilus release).

So, let's picture the following: the cluster has the HEALTH_WARN status because one of the OSDs is running out of space, indicated by the error HEALTH_WARN: 1 near full osd. Below is an algorithm for getting out of this situation.

First of all, you need to redistribute the existing data among the other OSDs. We already performed a similar operation in the first case, when we "drained" a node — with the only difference being that now we only need to slightly reduce the reweight. For example, to 0.95:

ceph osd reweight osd.${ID} 0.95

This frees up disk space on the OSD and fixes the ceph health error. However, as already mentioned, this problem mostly stems from an incorrect Ceph configuration in the early stages: it is very important to reconfigure the cluster so the problem does not come back.

In our particular case, it all came down to:

  • too high a replication_count value in one of the pools,
  • too many PGs in one pool and too few in another.

Let's use the calculator mentioned above. It clearly shows what needs to be entered and, in general, there is nothing complicated about it. Having set the necessary parameters, we get the following recommendations:

Note: if you are setting up a Ceph cluster from scratch, another handy function of the calculator is generating the commands that will create the pools from scratch with the parameters given in the table.

The last column, Suggested PG Count, helps you find your way around. In our case, the second column, which shows the replication parameter, is also useful, since we decided to change the replication factor.

So, first you need to change the replication parameters — this is worth doing first, since by reducing the replication factor we free up disk space. As the command runs, you will see the available disk space increase:

ceph osd pool set $pool_name size $replication_size

Once it completes, we change the pg_num and pgp_num parameters as follows:

ceph osd pool set $pool_name pg_num $pg_number
ceph osd pool set $pool_name pgp_num $pg_number

Important: the number of PGs must be changed sequentially, pool by pool, without touching the values in the other pools until the warnings "Degraded data redundancy" and "n-number of pgs degraded" disappear.

You can also check that everything went well using the ceph health detail and ceph -s command outputs.
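
The per-pool sequence — change pg_num and pgp_num, then wait for the warnings to clear before touching the next pool — can be sketched like this (the helper name and the grep pattern are ours):

```shell
# Sketch: resize one pool and block until the cluster settles.
resize_pool() {
    pool=$1
    pgs=$2
    ceph osd pool set "$pool" pg_num "$pgs"
    ceph osd pool set "$pool" pgp_num "$pgs"
    # Do not move on to the next pool while redundancy is degraded.
    while ceph health detail | grep -Eq 'degraded|misplaced'; do
        sleep 60
    done
}

# Usage: resize_pool "$pool_name" "$pg_number"
```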

Case 3. Migrating a virtual machine from LVM to Ceph RBD

When a project runs virtual machines on rented bare-metal servers, the question of fault-tolerant storage often comes up. It is also highly desirable for that storage to have enough space... A common situation: a VM with local storage lives on the server and its disk needs to be expanded, but there is nowhere to grow, because the server has no free disk space left.

The problem can be solved in different ways — for example, by migrating to another server (if there is one) or by adding new disks to the server. But that is not always possible, so migrating from LVM to Ceph can be an excellent solution. By choosing this option, we also simplify the further process of migration between servers, since there will be no need to move local storage from one hypervisor to another. The only catch is that the VM has to be stopped while the work is carried out.

The following recipe is taken from an article on this blog, whose instructions have been tested in practice. By the way, a downtime-free migration method is also described there; however, in our case it simply was not needed, so we did not verify it. If that is critical for your project, we will be glad to hear about your results in the comments.

Let's move on to the practical part. In the example we use virsh and, accordingly, libvirt. First, make sure the Ceph pool that the data will be migrated to is connected to libvirt:

virsh pool-dumpxml $ceph_pool

The pool description must contain the connection data for Ceph, including the authorization credentials.

The next step is to convert the LVM image to Ceph RBD. How long it takes depends mainly on the size of the image:

qemu-img convert -p -O rbd /dev/main/$vm_image_name rbd:$ceph_pool/$vm_image_name

After the conversion, the LVM image will remain, which will come in handy if migrating the VM to RBD fails and you have to roll the changes back. To be able to roll back quickly, let's also back up the virtual machine's configuration file:

virsh dumpxml $vm_name > $vm_name.xml
cp $vm_name.xml $vm_name_backup.xml

... and edit the original (vm_name.xml). Find the block with the disk description (it starts with the line <disk type='file' device='disk'> and ends with </disk>) and reduce it to the following form:

<disk type='network' device='disk'>
  <driver name='qemu'/>
  <auth username='libvirt'>
    <secret type='ceph' uuid='sec-ret-uu-id'/>
  </auth>
  <source protocol='rbd' name='$ceph_pool/$vm_image_name'>
    <host name='10.0.0.1' port='6789'/>
    <host name='10.0.0.2' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>

Let's go over a few details:

  1. The source protocol specifies the address of the storage in Ceph RBD (this is the address including the names of the Ceph pool and the RBD image, as determined in the first step).
  2. The secret block specifies the ceph type, as well as the UUID of the secret used to connect to it. The UUID can be found with the virsh secret-list command.
  3. The host block specifies the addresses of the Ceph monitors.
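
If you script this migration, the <disk> element above can be generated from variables. A minimal sketch (the function name is ours; a single monitor is shown for brevity):

```shell
# Sketch: print a libvirt RBD <disk> element for the given pool,
# image, libvirt secret UUID, and monitor address.
rbd_disk_xml() {
    pool=$1; image=$2; uuid=$3; mon=$4
    cat <<EOF
<disk type='network' device='disk'>
  <driver name='qemu'/>
  <auth username='libvirt'>
    <secret type='ceph' uuid='${uuid}'/>
  </auth>
  <source protocol='rbd' name='${pool}/${image}'>
    <host name='${mon}' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
EOF
}

# Usage: rbd_disk_xml "$ceph_pool" "$vm_image_name" "sec-ret-uu-id" 10.0.0.1
```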

After editing the configuration file and completing the LVM-to-RBD conversion, you can apply the modified configuration file and start the virtual machine:

virsh define $vm_name.xml
virsh start $vm_name

Now it's time to check that the virtual machine started correctly: you can find out, for example, by connecting to it via SSH or via virsh.

If the virtual machine is working correctly and you have run into no other problems, you can delete the LVM image that is no longer in use:

lvremove main/$vm_image_name

Conclusion

We have encountered all of the described cases in practice — we hope the instructions will help other administrators solve similar problems. If you have comments or similar stories from your own experience with Ceph, we will be glad to see them in the comments!

source: www.habr.com
