Experience with CEPH

When there is more data than fits on a single disk, it is time to think about RAID. As a child I often heard from my elders: "One day RAID will be a thing of the past, object storage will take over the world, and you do not even know what CEPH is." So the first thing I did in independent life was build my own cluster. The aim of the experiment was to get to know the internals of ceph and to understand the limits of its applicability. How justified is deploying ceph in a medium-sized business, or in a small one? After several years of operation and a couple of irreversible data losses, I came to understand the subtleties: not everything is that simple. Peculiarities of CEPH stand in the way of its wide adoption, and because of them the experiments have reached a dead end. Below is a description of all the steps taken, the result obtained and the conclusions drawn. If knowledgeable people share their experience and explain some of the points, I will be grateful.

Note: commenters have pointed out serious errors in some of the assumptions, which call for a revision of the whole article.

CEPH strategy

A CEPH cluster combines an arbitrary number K of disks of arbitrary size and stores data on them, duplicating each chunk (4 MB by default) a given number N of times.

Consider the simplest case with two identical disks. You can build either a RAID 1 or a cluster with N = 2 from them, and the result is the same. If there are three disks and they differ in size, a cluster with N = 2 is still easy to build: some of the data ends up on disks 1 and 2, some on 1 and 3, and some on 2 and 3, whereas with RAID this is not the case (such a RAID can be assembled, but it would be a kludge). With more disks it becomes possible to build RAID 5. CEPH has an analogue, erasure_code, but it contradicts the developers' original concepts and is therefore not considered here. RAID 5 assumes a small number of disks, all of them in good condition. If one fails, the rest must hold out until the disk is replaced and the data is restored onto it. CEPH with N >= 3, by contrast, encourages the use of old disks. In particular, if you keep a few good disks for one copy of the data and store the remaining two or three copies on a large number of old disks, the information is safe: while the new disks are alive there is no problem, and if one of them breaks, the simultaneous failure of three disks with more than five years of service, preferably in different servers, is an extremely unlikely event.
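The three-disk case can be checked with a little arithmetic. The sketch below is an idealised upper bound for N = 2 only: it assumes copies can be placed perfectly, ignores CRUSH weights and service overhead, and uses hypothetical disk sizes in GB. The function name usable_n2 is made up for this illustration.

```shell
# Idealised upper bound on usable space when every chunk is stored twice
# on two different disks. Arguments are disk sizes in GB (hypothetical).
usable_n2() (
    total=0
    largest=0
    for size in "$@"; do
        total=$(( total + size ))
        if [ "$size" -gt "$largest" ]; then largest=$size; fi
    done
    half=$(( total / 2 ))          # two copies of everything
    rest=$(( total - largest ))    # the largest disk needs partners for its copies
    if [ "$half" -lt "$rest" ]; then echo "$half"; else echo "$rest"; fi
)

usable_n2 2000 1000 1000   # three disks of different sizes
usable_n2 4000 1000 1000   # one oversized disk: part of it stays unused
```

Both calls print 2000: in the second case the extra 2 TB on the large disk has no partner disk to hold the second copy.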

There is a subtlety in how the copies are distributed. By default the data is assumed to be split into a large number (~100 per disk) of placement groups (PGs), each of which is duplicated on some set of disks. Say K = 6 and N = 2. If any two disks fail, data loss is practically guaranteed, because by probability theory there will be at least one PG that lives on exactly those two disks. And the loss of a single group makes all the data in the pool unavailable. If instead the disks are split into three pairs and data may only be stored on the disks within one pair, the layout still survives the failure of any single disk, but when two disks fail the probability of data loss is not 100% but only 3/15, and even when three disks fail it is only 12/20. Hence entropy in the distribution of data does not help fault tolerance.

Note also that for a file server, free RAM greatly improves responsiveness. The more memory in each node, and the more memory across all nodes, the faster it gets. This is undoubtedly an advantage of a cluster over a single server and, even more so, over a hardware NAS with very little built-in memory.
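The 3/15 and 12/20 figures for the three-pair layout can be reproduced by brute-force enumeration. This is a minimal sketch under the assumptions of the text (six disks, data confined to fixed pairs); the disk numbering and the function name count_failures are chosen only for the illustration.

```shell
# Disks are numbered 0..5; disks 0-1, 2-3 and 4-5 form the three mirror pairs,
# so two disks belong to the same pair when their numbers divided by 2 match.
count_failures() (
    failed=$1            # how many disks fail at once: 2 or 3
    total=0              # number of possible failure combinations
    fatal=0              # combinations that take out a whole pair
    a=0
    while [ "$a" -lt 6 ]; do
        b=$(( a + 1 ))
        while [ "$b" -lt 6 ]; do
            if [ "$failed" -eq 2 ]; then
                total=$(( total + 1 ))
                if [ $(( a / 2 )) -eq $(( b / 2 )) ]; then
                    fatal=$(( fatal + 1 ))
                fi
            else
                c=$(( b + 1 ))
                while [ "$c" -lt 6 ]; do
                    total=$(( total + 1 ))
                    # a < b < c, so a complete pair is always adjacent
                    if [ $(( a / 2 )) -eq $(( b / 2 )) ] || [ $(( b / 2 )) -eq $(( c / 2 )) ]; then
                        fatal=$(( fatal + 1 ))
                    fi
                    c=$(( c + 1 ))
                done
            fi
            b=$(( b + 1 ))
        done
        a=$(( a + 1 ))
    done
    echo "$fatal/$total"
)

count_failures 2   # prints 3/15
count_failures 3   # prints 12/20
```

With PGs spread over every possible pair of disks, each of the 15 two-disk combinations is expected to hold some PG, which is why the default layout loses data on any double failure.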

It follows that CEPH is a good way to build, with minimal investment and from outdated hardware, a reliable storage system for tens of TB that can be scaled later (scaling will of course cost money, but little compared with commercial storage systems).

Cluster implementation

For the experiment we take a decommissioned computer: Intel DQ57TM + Intel Core i3 540 + 16 GB of RAM. Four 2 TB disks are organised into something like RAID10; after a successful test we will add a second node with the same number of disks.

Install Linux. The distribution has to be customisable and stable. Debian and SUSE meet the requirements. SUSE has a more flexible installer that lets you deselect any package; unfortunately, I could not work out which ones can be dropped without harming the system. We install Debian with debootstrap buster. The min-base option installs a non-working system that lacks drivers. The difference in size compared with the full variant is not large enough to worry about. Since the work is done on a physical machine, I want to take snapshots, as on virtual machines. Either LVM or btrfs (or xfs, or zfs, the difference is small) offers this. Snapshots are not LVM's strong point. We go with btrfs, and the bootloader goes into the MBR. There is no point in cluttering the disk with a 50 MB FAT partition when the bootloader can be pushed into the 1 MB partition table area and all the space given to the system. The installation took 700 MB on disk. How much a basic SUSE installation takes I do not remember; about 1.1 or 1.4 GB, I think.

Install CEPH. We ignore version 12 in the Debian repository and connect directly to the upstream site for 15.2.3. We follow the instructions from the "Installing CEPH manually" section with the following caveats:

  • Before adding the repository, you must install gnupg wget ca-certificates
  • After adding the repository, but before setting up the cluster, the instructions omit the package installation step: apt -y --no-install-recommends install ceph-common ceph-mon ceph-osd ceph-mds ceph-mgr
  • During installation CEPH will, for unknown reasons, try to install lvm2. That would not matter in itself, but the installation fails, so CEPH does not install either.

    This patch helped:

    cat << EOF >> /var/lib/dpkg/status
    Package: lvm2
    Status: install ok installed
    Priority: important
    Section: admin
    Installed-Size: 0
    Maintainer: Debian Adduser Developers <[email protected]>
    Architecture: all
    Multi-Arch: foreign
    Version: 113.118
    Description: No-install
    EOF
    

Cluster Overview

ceph-osd is responsible for storing data on disk. For each disk a network service is started that accepts and executes requests to read or write objects. Two partitions are created on the disk. One of them holds information about the cluster, the disk number and the cluster keys. This 1 KB of information is created once, when the disk is added, and has never been seen to change afterwards. The second partition has no file system and stores CEPH binary data. Automatic installation in earlier versions created a 100 MB xfs partition for the service information. I converted the disk to MBR and allocated only 16 MB; the service does not complain. I think xfs could be replaced with ext without trouble. This partition is mounted at /var/lib/…, where the service reads the information about the OSD and also finds a link to the block device that holds the binary data. In theory the auxiliary files could be placed directly in /var/lib/… and the whole disk given over to data. When an OSD is created with ceph-deploy, a rule is created automatically to mount the partition at /var/lib/…, and the ceph user is granted the rights to read the required block device. With a manual installation you have to do this yourself; the documentation does not mention it. It is also advisable to set the osd memory target parameter so that there is enough physical memory.
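The osd memory target value is given in bytes, so it helps to compute it rather than type it. The sketch below is only an illustration: the split rule (reserve some memory for the system, divide the rest evenly between OSDs) and the 4096 MiB reserve are assumptions of this example, not a recommendation from the Ceph documentation, and both function names are hypothetical.

```shell
# Convert MiB to the byte value written in ceph.conf.
mib_to_bytes() ( echo $(( $1 * 1024 * 1024 )) )

# Assumed rule of thumb: keep some memory for the OS and the other daemons,
# then split what is left evenly between the OSDs on the node.
osd_memory_target() (
    total_mib=$1       # physical RAM of the node
    reserved_mib=$2    # memory kept back for everything else (assumption)
    osd_count=$3       # number of OSDs on the node
    mib_to_bytes $(( (total_mib - reserved_mib) / osd_count ))
)

mib_to_bytes 1536                 # prints 1610612736 (1.5 GiB)
osd_memory_target 16384 4096 4    # prints 3221225472 (3 GiB per OSD)
```

The first value matches the osd memory target used in the ceph.conf listing further down. The target is a goal the OSD tries to stay near, not a hard limit, so some headroom is worth leaving.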

ceph-mds. At the low level, CEPH is an object store. Block storage comes down to storing each 4 MB block as an object. File storage works on the same principle. Two pools are created: one for metadata, the other for data. They are combined into a file system. At that moment some kind of record is created, so if you delete the file system but keep both pools, you will not be able to restore it. There is a procedure for extracting files block by block; I have not tested it. The ceph-mds service is responsible for access to the file system. Each file system requires a separate instance of the service. There is an "index" option that lets you create the semblance of several file systems inside one; this is untested as well.
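The two-pool layout described above maps onto three commands. The sketch below only prints them, so it can be read and checked without a cluster; the pool names, the file system name and the PG count of 1 are hypothetical, and the helper cephfs_commands exists only in this example.

```shell
# Print (not run) the commands that tie a metadata pool and a data pool
# into one file system. All names and the PG count are placeholders.
cephfs_commands() (
    fs_name=$1
    pg_num=$2
    echo "ceph osd pool create ${fs_name}_metadata ${pg_num}"
    echo "ceph osd pool create ${fs_name}_data ${pg_num}"
    echo "ceph fs new ${fs_name} ${fs_name}_metadata ${fs_name}_data"
)

cephfs_commands cephfs 1
# To execute them on a node that has the admin keyring:
#   cephfs_commands cephfs 1 | sh
```

A ceph-mds instance still has to be running before the new file system becomes usable.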

ceph-mon - this service stores the map of the cluster. It includes information about all OSDs, the algorithm for distributing PGs across OSDs and, most importantly, information about all objects (the details of this mechanism are unclear to me: there is a directory /var/lib/ceph/mon/…/store.db, its largest file is 26 MB, and in a cluster of 105K objects that comes to slightly more than 256 bytes per object, so I assume the monitor keeps a list of all objects and the PGs they lie in). Damage to this directory results in the loss of all data in the cluster. From this I concluded that CRUSH shows how PGs are placed on OSDs, while how objects are placed in PGs is stored centrally in a database, however much the developers avoid that word. As a consequence, first, we cannot install the system on a flash drive in read-only mode, because the database is written to constantly and needs an additional disk (hardly more than 1 GB); and second, a real-time copy of this database is required. With several monitors, fault tolerance is provided automatically, but in our case there is only one monitor, two at most. There is a theoretical procedure for rebuilding a monitor from OSD data. I resorted to it three times for various reasons, and all three times there were no error messages, and no data either. Unfortunately, this mechanism does not work. Either we use a tiny partition on each OSD and assemble a RAID to hold the database, which will probably hurt performance badly, or we allocate at least two reliable physical media, preferably USB, so as not to occupy ports.
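The per-object estimate is plain division. As a quick check of the figures quoted above, assuming 1 MB means 1024 * 1024 bytes and 105K means 105000 objects (the helper name is made up for this example):

```shell
# Average number of monitor-store bytes per object.
# $1 = size of the largest store.db file in MB, $2 = number of objects.
bytes_per_object() ( echo $(( $1 * 1024 * 1024 / $2 )) )

bytes_per_object 26 105000   # prints 259
```

This only confirms the arithmetic; it says nothing about what the monitor actually keeps in the store.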

rados-gw - exports the object store over the S3 protocol and similar ones. It creates a lot of pools, and it is not clear why. I did not experiment with it much.

ceph-mgr - installing this service starts several modules. One of them is autoscale, which cannot be disabled. It strives to maintain the correct number of PGs per OSD. If you want to control the ratio manually, you can turn scaling off for each pool, but then the module crashes with a division by 0 and the cluster status becomes ERROR. The module is written in Python, and commenting out the relevant line in it effectively disables it. I could not be bothered to remember the details.
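For choosing the PG count by hand, a commonly cited rule of thumb is about 100 PGs per OSD, divided by the number of copies and rounded up to a power of two. The sketch below applies that rule; treat the result as a starting point for the total across all pools, not as an exact answer. The function names are hypothetical.

```shell
# Round a number up to the next power of two.
next_pow2() (
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do p=$(( p * 2 )); done
    echo "$p"
)

# Rule of thumb: (OSDs * 100) / copies, rounded up to a power of two.
suggest_pg_num() (
    osd_count=$1
    copies=$2
    next_pow2 $(( osd_count * 100 / copies ))
)

suggest_pg_num 4 2   # four OSDs, N = 2: prints 256
```

The value can then be applied per pool with ceph osd pool set followed by the pool name, pg_num and the number.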

List of sources used:

Installing CEPH
Recovery from a complete monitor failure

Script listings:

Installing the system via debootstrap

# exported so that the variable is still set inside the chroot shell below
export blkdev=sdb1
mkfs.btrfs -f /dev/$blkdev
mount /dev/$blkdev /mnt
cd /mnt
for i in {@,@var,@home}; do btrfs subvolume create $i; done
mkdir snapshot @/{var,home}
for i in {var,home}; do mount -o bind @${i} @/$i; done
debootstrap buster @ http://deb.debian.org/debian; echo $?
for i in {dev,proc,sys}; do mount -o bind /$i @/$i; done
cp /etc/bash.bashrc @/etc/

chroot /mnt/@ /bin/bash
echo rbd1 > /etc/hostname
passwd
uuid=`blkid | grep $blkdev | cut -d '"' -f 2`
cat << EOF > /etc/fstab
UUID=$uuid / btrfs noatime,nodiratime,subvol=@ 0 1
UUID=$uuid /var btrfs noatime,nodiratime,subvol=@var 0 2
UUID=$uuid /home btrfs noatime,nodiratime,subvol=@home 0 2
EOF
cat << EOF >> /var/lib/dpkg/status
Package: lvm2
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <[email protected]>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install

Package: sudo
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <[email protected]>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install
EOF

apt -yq install --no-install-recommends linux-image-amd64 bash-completion ed btrfs-progs grub-pc iproute2 ssh  smartmontools ntfs-3g net-tools man
exit
grub-install --boot-directory=@/boot/ /dev/$blkdev
init 6

Creating the cluster

apt -yq install --no-install-recommends gnupg wget ca-certificates
echo 'deb https://download.ceph.com/debian-octopus/ buster main' >> /etc/apt/sources.list
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
apt update
apt -yq install --no-install-recommends ceph-common ceph-mon

echo 192.168.11.11 rbd1 >> /etc/hosts
uuid=`cat /proc/sys/kernel/random/uuid`
cat << EOF > /etc/ceph/ceph.conf
[global]
fsid = $uuid
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
mon allow pool delete = true
mon host = 192.168.11.11
mon initial members = rbd1
mon max pg per osd = 385
osd crush update on start = false
#osd memory target = 2147483648
osd memory target = 1610612736
osd scrub chunk min = 1
osd scrub chunk max = 2
osd scrub sleep = .2
osd pool default pg autoscale mode = off
osd pool default size = 1
osd pool default min size = 1
osd pool default pg num = 1
osd pool default pgp num = 1
[mon]
mgr initial modules = dashboard
EOF

ceph-authtool --create-keyring ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
cp ceph.client.admin.keyring /etc/ceph/
ceph-authtool --create-keyring bootstrap-osd.ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
cp bootstrap-osd.ceph.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
ceph-authtool ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
ceph-authtool ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
monmaptool --create --add rbd1 192.168.11.11 --fsid $uuid monmap
rm -R /var/lib/ceph/mon/ceph-rbd1/*
ceph-mon --mkfs -i rbd1 --monmap monmap --keyring ceph.mon.keyring
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-mon@rbd1
systemctl start ceph-mon@rbd1
ceph mon enable-msgr2
ceph status

# dashboard

apt -yq install --no-install-recommends ceph-mgr ceph-mgr-dashboard python3-distutils python3-yaml
mkdir /var/lib/ceph/mgr/ceph-rbd1
ceph auth get-or-create mgr.rbd1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' > /var/lib/ceph/mgr/ceph-rbd1/keyring
systemctl enable ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1
ceph config set mgr mgr/dashboard/ssl false
ceph config set mgr mgr/dashboard/server_port 7000
ceph dashboard ac-user-create root 1111115 administrator
systemctl stop ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1

Adding an OSD (partial)

apt install ceph-osd

osdnum=`ceph osd create`
mkdir -p /var/lib/ceph/osd/ceph-$osdnum
mkfs -t xfs /dev/sda1
mount -t xfs /dev/sda1 /var/lib/ceph/osd/ceph-$osdnum
cd /var/lib/ceph/osd/ceph-$osdnum
ceph auth get-or-create osd.$osdnum mon 'profile osd' mgr 'profile osd' osd 'allow *' > /var/lib/ceph/osd/ceph-$osdnum/keyring
ln -s /dev/disk/by-partuuid/d8cc3da6-02  block
ceph-osd -i $osdnum --mkfs
#chown ceph:ceph /dev/sd?2
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-osd@$osdnum
systemctl start ceph-osd@$osdnum

Summary

The main marketing advantage of CEPH is CRUSH, the algorithm for calculating where data is located. Monitors distribute this algorithm to clients, after which clients talk directly to the required node and the required OSD. CRUSH is what provides the absence of centralisation. It is a small file that you could even print out and hang on the wall. Practice has shown that CRUSH is not an exhaustive map. Destroying and recreating the monitors while keeping all OSDs and CRUSH is not enough to restore the cluster. From this I conclude that each monitor stores some metadata about the whole cluster. The small size of this metadata places no limit on the size of the cluster, but it does have to be kept safe, which rules out saving a disk by installing the system on a flash drive and rules out clusters with fewer than three nodes. The developers pursue an aggressive policy regarding optional features. It is far from minimalism. The documentation is at the level of "thanks for what there is, but it is very, very thin". The ability to interact with the services at a low level is provided, but the documentation covers this topic too superficially, so the answer is more likely no than yes. There is almost no chance of recovering data after an emergency.

Options for further action: abandon CEPH and use a plain multi-disk btrfs (or xfs, zfs); learn new things about CEPH that would allow it to be operated under the stated conditions; or try to write my own storage as an exercise in advanced training.

Source: www.habr.com
