Experience with CEPH

When there is more data than fits on one disk, it is time to think about RAID. As a child I often heard from my elders: "one day RAID will be a thing of the past, object storage will fill the world, and you don't even know what CEPH is", so the first thing I did in my independent life was build my own cluster. The aim of the experiment was to get acquainted with the internal structure of ceph and to understand the scope of its application. How justified is deploying ceph in medium-sized and small businesses? After several years of operation and a couple of irreversible data losses, I came to understand the subtleties: not everything is so simple. The peculiarities of CEPH put obstacles in the way of its widespread adoption, and because of them the experiments reached a dead end. Below is a description of all the steps taken, the results obtained and the conclusions drawn. If knowledgeable people share their experience and clarify some points, I will be grateful.

Note: commenters have pointed out serious errors in some of the assumptions, which call for a revision of the whole article.

CEPH Strategy

A CEPH cluster combines an arbitrary number K of disks of arbitrary size and stores data on them, duplicating each piece (4 MB by default) a given number N of times.

Consider the simplest case with two identical disks. From them you can assemble either RAID 1 or a cluster with N = 2; the result will be the same. If there are three disks and they are of different sizes, it is easy to assemble a cluster with N = 2: some of the data will be on disks 1 and 2, some on disks 1 and 3, and some on 2 and 3, whereas RAID cannot do this (you can assemble such a RAID, but it would be a perversion). If there are more disks still, it is possible to create RAID 5; CEPH has an analogue, erasure_code, which contradicts the developers' early concepts and is therefore not considered here. RAID 5 assumes a small number of disks, all in good condition. If one fails, the others must hold out until the disk is replaced and the data restored to it. CEPH with N >= 3 encourages the use of old disks. In particular, if you keep several good disks for one copy of the data and store the remaining two or three copies on a large number of old disks, the information will be safe: while the new disks are alive there are no problems, and if one of them breaks, the simultaneous failure of three disks with a service life of more than five years, preferably in different servers, is an extremely unlikely event.
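The three-disk case can be made concrete with a little arithmetic. The sketch below is an idealised bound, not a figure CEPH reports: it assumes every object has exactly two copies on two different disks and that placement is perfectly balanced. The function name usable_n2 and the disk sizes are made up for illustration.

```shell
#!/bin/bash
# usable_n2 SIZE...: upper bound on the data that fits when every object has
# two copies on distinct disks. Sizes are whole numbers in one unit (e.g. GB).
# Assumption: ideal placement; real CRUSH weights may land below this bound.
usable_n2() {
    local sum=0 max=0 s
    for s in "$@"; do
        sum=$((sum + s))
        if [ "$s" -gt "$max" ]; then max=$s; fi
    done
    local half=$((sum / 2)) rest=$((sum - max))
    # Every copy on the largest disk needs a partner copy elsewhere, so the
    # remaining disks cap the total just as much as half the raw space does.
    if [ "$half" -lt "$rest" ]; then echo "$half"; else echo "$rest"; fi
}

usable_n2 1000 2000 3000   # three unequal disks
usable_n2 1000 1000 4000   # one oversized disk is only partly usable
```

Under these assumptions the first call prints 3000 and the second 2000: a disk larger than all the others combined cannot be filled, because there is nowhere to put the second copies.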

There is a subtlety in how copies are distributed. By default it is assumed that the data is divided into a large number (~100 per disk) of placement groups (PGs), each of which is duplicated on some set of disks. Say K = 6 and N = 2; then if any two disks fail, data is guaranteed to be lost, since by probability theory there will be at least one PG located on exactly those two disks. And the loss of a single group makes all the data in the pool unavailable. If the disks are divided into three pairs and data is allowed to be stored only on the disks within one pair, such a layout is also resistant to the failure of any one disk, but if two disks fail the probability of data loss is not 100% but only 3/15, and even if three disks fail it is only 12/20. Hence entropy in data distribution does not help fault tolerance. Note also that for a file server, free RAM significantly increases response speed. The more memory in each node, and the more memory across all nodes, the faster it will be. This is undoubtedly an advantage of a cluster over a single server and, even more so, over a hardware NAS, which has very little memory built in.
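The 3/15 and 12/20 figures follow from counting failure combinations. The sketch below reproduces them for K = 6 disks in three pairs, and also estimates why random placement is so much worse. The random-placement line rests on an assumption that is not in the text: each of the roughly 300 PGs (100 per disk, two copies each) picks its pair of disks uniformly and independently.

```shell
#!/bin/bash
# choose N K: binomial coefficient C(N, K) in exact integer arithmetic.
choose() {
    local n=$1 k=$2 r=1 i=1
    while [ "$i" -le "$k" ]; do
        r=$((r * (n - k + i) / i))
        i=$((i + 1))
    done
    echo "$r"
}

pairs=3
disks=$((pairs * 2))

# Two failed disks lose data only if they form one of the fixed pairs.
two_bad=$pairs
two_total=$(choose "$disks" 2)

# Three failed disks lose data if they contain a whole pair: pick the pair,
# then any one of the remaining disks (no triple can hold two whole pairs).
three_bad=$((pairs * (disks - 2)))
three_total=$(choose "$disks" 3)

echo "paired layout, two failures:   $two_bad/$two_total"
echo "paired layout, three failures: $three_bad/$three_total"

# Random placement (assumed uniform): chance that a given pair of disks
# shares no PG at all, i.e. that a double failure would lose nothing.
pgs=$((100 * disks / 2))
awk -v t="$two_total" -v n="$pgs" 'BEGIN { printf "random layout, no shared PG: %.1e\n", (1 - 1/t)^n }'
```

The last line prints a vanishingly small number, which is the sense in which a double failure is "guaranteed" to lose data under random placement.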

It follows that CEPH is a good way to build a reliable storage system for tens of TB that can scale with minimal investment using obsolete equipment (money will be required here, of course, but little compared with commercial storage systems).

Cluster implementation

For the experiment, let's take a decommissioned computer: Intel DQ57TM + Intel Core i3 540 + 16 GB of RAM. We will organise four 2 TB disks into something like RAID10, and after a successful test we will add a second node with the same number of disks.

Install Linux. The distribution must be customisable and stable. Debian and Suse meet the requirements. Suse has a more flexible installer that lets you deselect any package; unfortunately, I could not work out which ones could be thrown out without harming the system. Install Debian via debootstrap buster. The min-base option installs a broken system that lacks drivers. The size difference compared with the full version is not large enough to bother with. Since the work is done on a physical machine, I want to take snapshots, as on virtual machines. This is provided by either LVM or btrfs (or xfs, or zfs: the difference is not big). Snapshots are not LVM's strong point. Install btrfs. The bootloader goes in the MBR. There is no point cluttering the disk with a 50 MB FAT partition when you can push the bootloader into the 1 MB partition table area and give all the space to the system. It took 700 MB on disk. I do not remember how much a base SUSE installation takes; I think about 1.1 or 1.4 GB.

Install CEPH. We ignore version 12 in the debian repository and connect directly to the upstream site for 15.2.3. We follow the instructions from the section "Install CEPH manually" with the following caveats:

  • Before connecting the repository, you must install gnupg wget ca-certificates
  • After connecting the repository, but before installing the cluster, the instructions omit the package installation step: apt -y --no-install-recommends install ceph-common ceph-mon ceph-osd ceph-mds ceph-mgr
  • While installing, CEPH will for unknown reasons try to pull in lvm2. In principle I would not mind, but that installation fails, so CEPH does not install either.

    This patch helped:

    cat << EOF >> /var/lib/dpkg/status
    Package: lvm2
    Status: install ok installed
    Priority: important
    Section: admin
    Installed-Size: 0
    Maintainer: Debian Adduser Developers <[email protected]>
    Architecture: all
    Multi-Arch: foreign
    Version: 113.118
    Description: No-install
    EOF
    

Cluster overview

ceph-osd - responsible for storing data on disk. For each disk, a network service is started that accepts and executes requests to read or write objects. Two partitions are created on the disk. One of them contains information about the cluster, the disk number and the keys to the cluster. This 1 KB of information is created once when the disk is added, and I have never seen it change afterwards. The second partition has no file system and stores CEPH binary data. Automatic installation in previous versions created a 100 MB xfs partition for the service information. I converted the disk to MBR and allocated only 16 MB; the service does not complain. I think xfs could be replaced with ext without problems. This partition is mounted in /var/lib/…, where the service reads information about the OSD and also finds a reference to the block device that holds the binary data. In theory, you could place the auxiliary files directly in /var/lib/… and give the whole disk to data. When an OSD is created via ceph-deploy, a rule is created automatically to mount the partition in /var/lib/…, and the ceph user is given rights to read the required block device. When installing manually you have to do this yourself; the documentation does not mention it. It is also advisable to set the osd memory target parameter so that there is enough physical memory.
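What "do it yourself" might look like for a manual install is sketched below: one fstab line for the small metadata partition and one udev rule that hands the data partition to the ceph user. This is a hedged illustration, not the rule ceph-deploy generates. The function names are hypothetical, and the layout (xfs metadata on partition 1, raw data on partition 2) is an assumption taken from the listings in this article.

```shell
#!/bin/bash
# Print the persistent configuration a manually created OSD needs.
# Assumptions: metadata partition is <disk>1 (xfs), data partition is <disk>2.

# osd_fstab_line DISK OSDNUM: mount entry for the metadata partition.
osd_fstab_line() {
    printf '/dev/%s1 /var/lib/ceph/osd/ceph-%s xfs noatime 0 2\n' "$1" "$2"
}

# osd_udev_rule DISK: give the ceph user access to the raw data partition.
osd_udev_rule() {
    printf 'KERNEL=="%s2", SUBSYSTEM=="block", OWNER="ceph", GROUP="ceph", MODE="0660"\n' "$1"
}

osd_fstab_line sda 0
osd_udev_rule sda
```

The first line would be appended to /etc/fstab and the second written to a rules file under /etc/udev/rules.d/ (the file name is up to you), followed by udevadm control --reload and udevadm trigger. Kernel names such as sda are not stable across reboots, so matching on a partition UUID is probably the safer choice on real hardware.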

ceph-mds. At the low level, CEPH is object storage. Block storage boils down to storing each 4 MB block as an object. File storage works on the same principle. Two pools are created: one for metadata, the other for data. They are combined into a file system. At that moment some kind of record is created, so if you delete the file system but keep both pools, you will not be able to restore it. There is a procedure for extracting files block by block; I have not tested it. The ceph-mds service is responsible for access to the file system. Each file system requires a separate instance of the service. There is an "index" option that lets you create the semblance of several file systems in one; also not tested.
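The "two pools combined into a file system" step maps onto three ceph commands. The sketch below wraps them in a dry-run helper so it can be read and executed without a cluster. The helper names run and make_cephfs, the pool naming scheme and the PG count of 8 are illustrative assumptions; a running ceph-mds is still needed before the file system becomes usable.

```shell
#!/bin/bash
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# make_cephfs NAME PGNUM: create a metadata pool and a data pool, then join
# them into one file system. Pool names derived from NAME are a convention
# chosen for this example only.
make_cephfs() {
    local name=$1 pgnum=$2
    run ceph osd pool create "${name}_metadata" "$pgnum"
    run ceph osd pool create "${name}_data" "$pgnum"
    run ceph fs new "$name" "${name}_metadata" "${name}_data"
}

# Print the commands instead of touching a cluster.
DRY_RUN=1
make_cephfs cephfs 8
```

Unset DRY_RUN to run the commands for real on a node that has the admin keyring.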

ceph-mon - this service stores the cluster map. It includes information about all OSDs, the algorithm for distributing PGs among OSDs and, most importantly, information about all objects (the details of this mechanism are not clear to me: there is a directory /var/lib/ceph/mon/…/store.db containing a large file of 26 MB, and in a cluster of 105K objects that works out to slightly more than 256 bytes per object; I think the monitor keeps a list of all objects and the PGs they live in). Damage to this directory results in the loss of all data in the cluster. Hence the conclusion that CRUSH shows how PGs are placed on OSDs, while how objects are placed in PGs is stored centrally in a database, however much the developers avoid that word. As a result, firstly, we cannot install the system on a flash drive in read-only mode, since the database is written to constantly and needs an additional disk (hardly more than 1 GB); secondly, it is necessary to keep a real-time copy of this database. If there are several monitors, fault tolerance is provided automatically, but in our case there is only one monitor, two at most. There is a theoretical procedure for restoring a monitor from OSD data. I resorted to it three times for different reasons, and three times there were no error messages and no data either. Unfortunately, this mechanism does not work. Either we run a tiny partition on each OSD and assemble a RAID to hold the database, which will surely hurt performance badly, or we allocate at least two reliable physical media, preferably USB so as not to occupy ports.
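For a single-monitor setup, a crude substitute for a real-time copy is a periodic cold archive of the monitor directory. The sketch below is one assumed way to do that, not a procedure from the article: it stops the monitor, archives /var/lib/ceph/mon/ceph-ID and starts the monitor again. The names run and backup_mon and the destination /mnt/backup are hypothetical, and with only one monitor the cluster is unavailable while the copy runs.

```shell
#!/bin/bash
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# backup_mon ID DEST: cold copy of one monitor's data directory.
# The monitor is stopped first so that store.db is not written mid-copy.
backup_mon() {
    local id=$1 dest=$2
    run systemctl stop "ceph-mon@$id"
    run tar -czf "$dest/mon-$id.tar.gz" -C /var/lib/ceph/mon "ceph-$id"
    run systemctl start "ceph-mon@$id"
}

# Print the commands only; rbd1 is the monitor name used in the listings.
DRY_RUN=1
backup_mon rbd1 /mnt/backup
```

Such an archive is only as fresh as its last run, so it narrows the window of loss rather than closing it.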

rados-gw - exports the object storage via the S3 protocol and the like. It creates many pools; it is unclear why. I did not experiment with it much.

ceph-mgr - when this service is installed, several modules are started. One of them is the autoscaler, which cannot be disabled. It strives to maintain the right number of PGs per OSD. If you want to control the ratio manually, you can turn scaling off for each pool, but in that case the module crashes with a division by 0 and the cluster status becomes ERROR. The module is written in Python, and commenting out the relevant line in it effectively disables it. I am too lazy to recall the details.
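The per-pool switch mentioned above is set with ceph osd pool set. The sketch below shows the commands behind a dry-run helper; the names run and pin_pg and the pool name are illustrative. Whether the module then crashes, as described for 15.2.3, is the article's observation and is not checked here.

```shell
#!/bin/bash
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# pin_pg POOL PGNUM: turn the autoscaler off for one pool and fix its PG count.
pin_pg() {
    run ceph osd pool set "$1" pg_autoscale_mode off
    run ceph osd pool set "$1" pg_num "$2"
    run ceph osd pool set "$1" pgp_num "$2"
}

# Print the commands only; "mypool" is a placeholder pool name.
DRY_RUN=1
pin_pg mypool 64
```

For pools created later, the same default can be set in ceph.conf with osd pool default pg autoscale mode = off, as the cluster listing in this article does.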

List of documentation used:

Installing CEPH
Recovery from complete monitor failure

Script listings:

Installing the system via debootstrap

blkdev=sdb1
mkfs.btrfs -f /dev/$blkdev
mount /dev/$blkdev /mnt
cd /mnt
for i in {@,@var,@home}; do btrfs subvolume create $i; done
mkdir snapshot @/{var,home}
for i in {var,home}; do mount -o bind @${i} @/$i; done
debootstrap buster @ http://deb.debian.org/debian; echo $?
for i in {dev,proc,sys}; do mount -o bind /$i @/$i; done
cp /etc/bash.bashrc @/etc/

chroot /mnt/@ /bin/bash
echo rbd1 > /etc/hostname
passwd
blkdev=sdb1   # set again: the chroot starts a fresh bash that does not inherit it
uuid=$(blkid -s UUID -o value /dev/$blkdev)
cat << EOF > /etc/fstab
UUID=$uuid / btrfs noatime,nodiratime,subvol=@ 0 1
UUID=$uuid /var btrfs noatime,nodiratime,subvol=@var 0 2
UUID=$uuid /home btrfs noatime,nodiratime,subvol=@home 0 2
EOF
cat << EOF >> /var/lib/dpkg/status
Package: lvm2
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <[email protected]>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install

Package: sudo
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <[email protected]>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install
EOF

apt -yq install --no-install-recommends linux-image-amd64 bash-completion ed btrfs-progs grub-pc iproute2 ssh  smartmontools ntfs-3g net-tools man
exit
grub-install --boot-directory=@/boot/ /dev/$blkdev
init 6

Creating the cluster

apt -yq install --no-install-recommends gnupg wget ca-certificates
echo 'deb https://download.ceph.com/debian-octopus/ buster main' >> /etc/apt/sources.list
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
apt update
apt -yq install --no-install-recommends ceph-common ceph-mon

echo 192.168.11.11 rbd1 >> /etc/hosts
uuid=`cat /proc/sys/kernel/random/uuid`
cat << EOF > /etc/ceph/ceph.conf
[global]
fsid = $uuid
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
mon allow pool delete = true
mon host = 192.168.11.11
mon initial members = rbd1
mon max pg per osd = 385
osd crush update on start = false
#osd memory target = 2147483648
osd memory target = 1610612736
osd scrub chunk min = 1
osd scrub chunk max = 2
osd scrub sleep = .2
osd pool default pg autoscale mode = off
osd pool default size = 1
osd pool default min size = 1
osd pool default pg num = 1
osd pool default pgp num = 1
[mon]
mgr initial modules = dashboard
EOF

ceph-authtool --create-keyring ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
cp ceph.client.admin.keyring /etc/ceph/
ceph-authtool --create-keyring bootstrap-osd.ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
cp bootstrap-osd.ceph.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
ceph-authtool ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
ceph-authtool ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
monmaptool --create --add rbd1 192.168.11.11 --fsid $uuid monmap
rm -R /var/lib/ceph/mon/ceph-rbd1/*
ceph-mon --mkfs -i rbd1 --monmap monmap --keyring ceph.mon.keyring
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-mon@rbd1
systemctl start ceph-mon@rbd1
ceph mon enable-msgr2
ceph status

# dashboard

apt -yq install --no-install-recommends ceph-mgr ceph-mgr-dashboard python3-distutils python3-yaml
mkdir /var/lib/ceph/mgr/ceph-rbd1
ceph auth get-or-create mgr.rbd1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' > /var/lib/ceph/mgr/ceph-rbd1/keyring
systemctl enable ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1
ceph config set mgr mgr/dashboard/ssl false
ceph config set mgr mgr/dashboard/server_port 7000
ceph dashboard ac-user-create root 1111115 administrator
systemctl stop ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1

Adding an OSD (partial)

apt install ceph-osd

osdnum=`ceph osd create`
mkdir -p /var/lib/ceph/osd/ceph-$osdnum
mkfs -t xfs /dev/sda1
mount -t xfs /dev/sda1 /var/lib/ceph/osd/ceph-$osdnum
cd /var/lib/ceph/osd/ceph-$osdnum
ceph auth get-or-create osd.$osdnum mon 'profile osd' mgr 'profile osd' osd 'allow *' > /var/lib/ceph/osd/ceph-$osdnum/keyring
ln -s /dev/disk/by-partuuid/d8cc3da6-02  block
ceph-osd -i $osdnum --mkfs
#chown ceph:ceph /dev/sd?2
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-osd@$osdnum
systemctl start ceph-osd@$osdnum

Summary

The main marketing advantage of CEPH is CRUSH, an algorithm for computing the location of data. Monitors distribute this algorithm to clients, after which clients directly query the required node and the required OSD. CRUSH is what provides the absence of centralisation. It is a small file that you could even print out and hang on the wall. Practice has shown that CRUSH is not an exhaustive map. If you destroy and recreate the monitors while keeping all OSDs and CRUSH, that is not enough to restore the cluster. From this it follows that each monitor stores some metadata about the whole cluster. The small size of this metadata imposes no limit on the size of the cluster, but it does require that the metadata be kept safe, which cancels out the disk saved by installing the system on a flash drive and rules out clusters with fewer than three nodes. The developer pursues an aggressive policy regarding optional features. It is far from minimalism. The documentation is at the level of "thanks for what we have, but it is very, very meagre". The ability to interact with the services at a low level is provided, but the documentation touches on this topic too superficially, so the answer is more likely no than yes. There is practically no chance of recovering data after an emergency.
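To see how small the CRUSH map really is, it can be exported and decompiled to text with ceph osd getcrushmap and crushtool. The sketch below shows the two commands behind a dry-run helper; the names run and dump_crush and the output prefix are illustrative assumptions.

```shell
#!/bin/bash
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# dump_crush PREFIX: save the compiled CRUSH map to PREFIX.bin and a
# human-readable version to PREFIX.txt.
dump_crush() {
    run ceph osd getcrushmap -o "$1.bin"
    run crushtool -d "$1.bin" -o "$1.txt"
}

# Print the commands only; /tmp/crush is a placeholder prefix.
DRY_RUN=1
dump_crush /tmp/crush
```

The resulting text file lists devices, buckets and rules, which fits the article's point: it describes where PGs go, and by itself was not enough to rebuild the cluster.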

Options for further action: abandon CEPH and use plain multi-disk btrfs (or xfs, zfs); find new information about CEPH that would allow it to be run under the conditions described; or try to write my own storage as an advanced training exercise.

Source: www.habr.com
