When there is more data than fits on a single disk, it is time to think about RAID. As a child I often heard from my elders: "one day RAID will be a thing of the past, object storage will take over the world, and you don't even know what CEPH is," so the first thing I did in my independent life was build my own cluster. The goal of the experiment was to get familiar with the internals of ceph and to understand the limits of its applicability. How justified is deploying ceph in a medium-sized business, or in a small one? After several years of operation and a couple of irreversible data losses, it became clear that not everything is so simple. Peculiarities of CEPH get in the way of its wide adoption, and because of them the experiments reached a dead end. Below is a description of all the steps taken, the result obtained and the conclusions drawn. If knowledgeable people share their experience and clarify some points, I will be grateful.
Note: commenters have pointed out serious errors in some of the assumptions, which call for a revision of the whole article.
CEPH strategy
A CEPH cluster combines an arbitrary number K of disks of arbitrary size and stores data on them, duplicating each chunk (4 MB by default) a specified number N of times.
Consider the simplest case with two identical disks. They can be used to build either RAID 1 or a cluster with N=2, and the result is the same. If there are three disks of different sizes, it is easy to build a cluster with N=2: some of the data will sit on disks 1 and 2, some on 1 and 3, and some on 2 and 3, whereas RAID cannot do this (such a RAID can be assembled, but it would be a perversion). If there are more disks, RAID 5 becomes possible; CEPH has an analogue, erasure_code, which contradicts the developers' early concepts and is therefore not considered here. RAID 5 assumes a small number of disks, all in good condition. If one fails, the rest must hold out until the disk is replaced and the data is restored onto it. CEPH with N>=3, by contrast, encourages the use of old disks. In particular, if you keep a few good disks for one copy of the data and store the remaining two or three copies on a large number of old disks, the information will be safe: as long as the new disks are alive there are no problems, and if one of them breaks, the simultaneous failure of three disks more than five years old, preferably in different servers, is an extremely unlikely event.
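The capacity side of the N=2 example can be sketched with a little arithmetic. This is a minimal illustration, assuming ideal placement where every chunk lands on two different disks; `usable_two_copies` is a hypothetical helper, not a Ceph tool, and a real cluster will usually get somewhat less than this upper bound.

```bash
#!/usr/bin/env bash
# Upper bound on usable space when every chunk is stored on 2 different disks.
# Arguments: disk sizes in GB (integers). Hypothetical helper for illustration.
usable_two_copies() {
    local total=0 largest=0 size
    for size in "$@"; do
        total=$(( total + size ))
        if (( size > largest )); then largest=$size; fi
    done
    local half=$(( total / 2 ))          # two copies of everything
    local rest=$(( total - largest ))    # room the other disks have for copies of the largest one
    # The largest disk can only be filled as far as the others can mirror it.
    if (( rest < half )); then echo "$rest"; else echo "$half"; fi
}

usable_two_copies 1000 1000        # two equal disks, same as RAID 1 -> 1000
usable_two_copies 1000 2000 3000   # three different disks -> 3000
usable_two_copies 500 500 4000     # one oversized disk is mostly wasted -> 1000
```

The last call shows why very uneven disk sizes pay off poorly: the space on the big disk beyond what the small ones can mirror stays unused.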
There is a subtlety in how copies are distributed. By default the data is assumed to be split into a large number (~100 per disk) of placement groups (PGs), each of which is duplicated on some set of disks. Say K=6 and N=2: then if any two disks fail, data is guaranteed to be lost, because by probability theory there will be at least one PG that lives on exactly those two disks. And the loss of a single group makes all the data in the pool unavailable. If instead the disks are split into three pairs and data is only allowed to be stored on disks within one pair, the layout is still resistant to the failure of any single disk, but with two failures the probability of data loss is not 100% but only 3/15, and even with three failed disks it is only 12/20. Hence, entropy in data distribution does not contribute to fault tolerance. Note also that for a file server, free RAM considerably improves response time. The more memory in each node, and the more memory across all nodes, the faster it gets. This is undoubtedly an advantage of a cluster over a single server and, even more so, over a hardware NAS, which ships with very little memory.
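The 3/15 and 12/20 figures can be checked by brute force. The sketch below enumerates every combination of failed disks for the paired layout, assuming disks 0..5 are grouped as (0,1), (2,3), (4,5). The second function estimates the random-placement case under the simplifying assumption that each PG picks its pair of disks uniformly and independently; both function names are made up for this illustration.

```bash
#!/usr/bin/env bash
# Paired layout: data is lost when both disks of at least one pair have failed.
# Prints "losing_combinations/total_combinations" for a given number of failures.
paired_loss_odds() {
    local failures=$1 mask bits i lost=0 total=0
    for (( mask = 0; mask < 64; mask++ )); do      # every subset of 6 disks
        bits=0
        for (( i = 0; i < 6; i++ )); do
            bits=$(( bits + ((mask >> i) & 1) ))
        done
        (( bits == failures )) || continue
        total=$(( total + 1 ))
        for (( i = 0; i < 6; i += 2 )); do
            # 3 is binary 11: both members of the pair starting at bit i are down.
            if (( ((mask >> i) & 3) == 3 )); then
                lost=$(( lost + 1 ))
                break
            fi
        done
    done
    echo "$lost/$total"
}

# Random layout: chance that a given pair of disks holds no PG at all,
# assuming each of the PGs picks one of the possible pairs uniformly.
pair_unused_chance() {
    local pgs=$1 pairs=$2
    awk -v n="$pgs" -v k="$pairs" 'BEGIN { printf "%.3e\n", ((k - 1) / k) ^ n }'
}

paired_loss_odds 2          # -> 3/15
paired_loss_odds 3          # -> 12/20
pair_unused_chance 300 15   # 6 disks * ~100 PGs / 2 copies = 300 PGs over 15 possible pairs
```

With around 300 PGs the chance that any particular pair of disks shares no PG is vanishingly small, which is why two failures in the random layout are treated as certain data loss.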
It follows that CEPH is a good way to build a reliable storage system for tens of TB out of outdated hardware with minimal investment, with room to scale (scaling will cost money, of course, but little compared to commercial storage systems).
Cluster implementation
For the experiment we take a decommissioned computer: Intel DQ57TM + Intel Core i3 540 + 16 GB RAM. We arrange four 2 TB disks into something like RAID10, and after a successful test we add a second node with the same number of disks.
Install Linux. The distribution has to be customizable and stable. Debian and Suse meet these requirements. Suse has a more flexible installer that lets you deselect any package; unfortunately, I could not figure out which ones can be dropped without harming the system. We install Debian via debootstrap buster. The minbase variant installs a non-working system that lacks drivers, and the size difference compared to the full version is not large enough to be worth the trouble. Since the work is done on a physical machine, I want to be able to take snapshots as on virtual machines. Either LVM or btrfs (or xfs, or zfs, the difference is small) provides this. Snapshots are not LVM's strong point, so we go with btrfs. The bootloader goes into the MBR. There is no point in cluttering the disk with a 50 MB FAT partition when the bootloader fits into the 1 MB partition table area and all the remaining space can go to the system. The installation took 700 MB on disk. I do not remember how much a base SUSE install takes, about 1.1 or 1.4 GB I think.
Install CEPH. We ignore version 12 in the Debian repository and add the repository for 15.2.3 directly from the project site. We follow the instructions from the "installing CEPH manually" section, with the following caveats:
- Before adding the repository, you need to install gnupg wget ca-certificates
- After adding the repository, but before creating the cluster, the guide leaves out the package installation step: apt -y --no-install-recommends install ceph-common ceph-mon ceph-osd ceph-mds ceph-mgr
- During installation, CEPH tries to install lvm2 for reasons unknown. In principle I would not mind, but that installation fails, so CEPH does not get installed either.
This patch helped:
cat << EOF >> /var/lib/dpkg/status
Package: lvm2
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <adduser@packages.debian.org>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install
EOF
Cluster overview
ceph-osd - responsible for storing data on disk. For each disk a network service is started that accepts and executes requests to read or write objects. Two partitions are created on the disk. One of them holds information about the cluster, the disk number and the cluster keys. This information, about 1 KB in size, is created once when the disk is added, and I have never seen it change afterwards. The second partition has no file system and stores CEPH's binary data. Automatic installation in earlier versions created a 100 MB xfs partition for the service information. I converted the disk to MBR and allocated only 16 MB, and the service does not complain. I think xfs could be replaced with ext without any trouble. This partition is mounted under /var/lib/…, where the service reads the information about the OSD and also finds a link to the block device that holds the binary data. In theory, you could place the auxiliary files directly in /var/lib/… and give the whole disk to data. When an OSD is created through ceph-deploy, a rule is created automatically to mount the partition under /var/lib/…, and the ceph user is granted rights to read the required block device. With a manual installation you have to do this yourself; the documentation says nothing about it. It is also advisable to set the osd memory target parameter so that there is enough physical memory.
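The osd memory target value is given in bytes, which makes it easy to mistype. As a rough sketch, the helper below splits whatever RAM remains after a reserve evenly between OSD daemons. The function name and the 10 GiB reserve are assumptions made for this illustration, chosen so that the result matches the value used in the configuration listing further down; they are not a recommendation from the Ceph documentation.

```bash
#!/usr/bin/env bash
# Split the RAM left after a reserve evenly between OSD daemons.
# Arguments: total RAM in GiB, reserve in GiB (OS, page cache, other daemons), number of OSDs.
# Prints the per-OSD memory target in bytes. Hypothetical helper.
osd_memory_target_bytes() {
    local total_gib=$1 reserve_gib=$2 osd_count=$3
    echo $(( (total_gib - reserve_gib) * 1024 * 1024 * 1024 / osd_count ))
}

osd_memory_target_bytes 16 10 4   # 16 GiB node, 4 OSDs -> 1610612736 (1.5 GiB each)
```

The printed number can be pasted into ceph.conf as the value of osd memory target.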
ceph-mds. At the low level, CEPH is an object store. Block storage boils down to saving each 4 MB block as an object. File storage works on the same principle. Two pools are created: one for metadata, the other for data. They are combined into a file system. At that moment some record is created, so if you delete the file system but keep both pools, you will not be able to restore it. There is a procedure for extracting files block by block; I have not tested it. The ceph-mds service is responsible for access to the file system. Each file system requires a separate instance of the service. There is an "index" option that lets you create something resembling several file systems inside one; this was not tested either.
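The mapping from a block device to 4 MB objects is plain integer arithmetic. The sketch below assumes the default 4 MB object size mentioned above and only computes which object a byte offset falls into; the function names are made up, and how Ceph actually names those objects is deliberately left out.

```bash
#!/usr/bin/env bash
OBJECT_SIZE=$(( 4 * 1024 * 1024 ))   # 4 MB, the default chunk size

# Prints "object_index offset_inside_object" for a byte offset on the block device.
locate_block() {
    local offset=$1
    echo "$(( offset / OBJECT_SIZE )) $(( offset % OBJECT_SIZE ))"
}

# Number of objects needed to hold a device of the given size in bytes (rounded up).
object_count() {
    local size=$1
    echo $(( (size + OBJECT_SIZE - 1) / OBJECT_SIZE ))
}

locate_block 10485760      # byte 10 MiB -> "2 2097152" (third object, 2 MiB into it)
object_count 1073741824    # a 1 GiB device -> 256 objects
```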
ceph-mon - this service stores the cluster map. The map includes information about all OSDs, the algorithm for distributing PGs across OSDs and, most importantly, information about all objects (the details of this mechanism are not clear to me: there is a directory /var/lib/ceph/mon/…/store.db containing a large 26 MB file, and the cluster holds 105K objects, which works out to roughly 256 bytes per object, so I think the monitor keeps a list of all objects and the PGs they belong to). Damage to this directory leads to the loss of all data in the cluster. Hence the conclusion that CRUSH describes how PGs are placed on OSDs, while how objects are placed in PGs is stored centrally in a database, no matter how carefully the developers avoid that word. As a consequence, first, we cannot install the system on a flash drive in read-only mode, because the database is written to constantly, so an additional disk is needed for it (hardly more than 1 GB); second, we need a real-time copy of this database. If there are several monitors, fault tolerance is provided automatically, but in our case there is only one monitor, two at most. There is a theoretical procedure for restoring a monitor from OSD data. I resorted to it three times for various reasons, and three times there were no error messages, and no data either. Unfortunately, this mechanism does not work. Either we use a tiny partition on each OSD and assemble a RAID to store the database, which will surely hurt performance badly, or we set aside at least two reliable physical media, preferably USB, so as not to occupy ports.
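The per-object estimate above is a single division. As a quick check, assuming the 26 MB figure means 26 MiB (the function name is hypothetical):

```bash
#!/usr/bin/env bash
# Average number of monitor store bytes per object.
# Arguments: store size in MiB, number of objects in the cluster.
store_bytes_per_object() {
    local store_mib=$1 objects=$2
    echo $(( store_mib * 1024 * 1024 / objects ))
}

store_bytes_per_object 26 105000   # figures quoted in the text -> 259
```

The number says nothing about what the store actually contains; it only shows where the "about 256 bytes per object" guess comes from.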
rados-gw - exports the object store over the S3 protocol and similar ones. It creates a lot of pools, and it is not clear why. I did not experiment with it much.
ceph-mgr - installing this service starts several modules. One of them is autoscale, which cannot be disabled. It tries to maintain the correct PG/OSD ratio. If you want to manage the ratio manually, you can turn off scaling for each pool, but then the module crashes with a division by 0 and the cluster status becomes ERROR. The module is written in Python, and commenting out the relevant line in it disables it. I am too lazy to recall the details.
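For anyone managing the ratio by hand, the usual rule of thumb can be sketched as follows: aim for roughly 100 PGs per OSD, divide by the number of copies, and round to a power of two. This is a simplified illustration with hypothetical function names; it always rounds up, which is an assumption of this sketch rather than the exact rule the autoscale module applies.

```bash
#!/usr/bin/env bash
# Smallest power of two that is >= the argument.
next_power_of_two() {
    local n=$1 p=1
    while (( p < n )); do p=$(( p * 2 )); done
    echo "$p"
}

# Suggested pg_num for a single pool.
# Arguments: number of OSDs, number of copies N, target PGs per OSD (default 100).
suggest_pg_num() {
    local osds=$1 replicas=$2 per_osd=${3:-100}
    next_power_of_two $(( osds * per_osd / replicas ))
}

suggest_pg_num 4 2   # one node with four disks, N=2 -> 256
suggest_pg_num 8 2   # after adding the second node -> 512
```

If several pools share the same OSDs, the total has to be split between them, which this sketch does not attempt.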
Script listings:
Installing the system via debootstrap
export blkdev=sdb1  # exported so the variable is still set inside the chroot shell below
mkfs.btrfs -f /dev/$blkdev
mount /dev/$blkdev /mnt
cd /mnt
for i in {@,@var,@home}; do btrfs subvolume create $i; done
mkdir snapshot @/{var,home}
for i in {var,home}; do mount -o bind @${i} @/$i; done
debootstrap buster @ http://deb.debian.org/debian; echo $?
for i in {dev,proc,sys}; do mount -o bind /$i @/$i; done
cp /etc/bash.bashrc @/etc/
chroot /mnt/@ /bin/bash
echo rbd1 > /etc/hostname
passwd
uuid=`blkid | grep $blkdev | cut -d '"' -f 2`
cat << EOF > /etc/fstab
UUID=$uuid / btrfs noatime,nodiratime,subvol=@ 0 1
UUID=$uuid /var btrfs noatime,nodiratime,subvol=@var 0 2
UUID=$uuid /home btrfs noatime,nodiratime,subvol=@home 0 2
EOF
cat << EOF >> /var/lib/dpkg/status
Package: lvm2
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <adduser@packages.debian.org>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install

Package: sudo
Status: install ok installed
Priority: important
Section: admin
Installed-Size: 0
Maintainer: Debian Adduser Developers <adduser@packages.debian.org>
Architecture: all
Multi-Arch: foreign
Version: 113.118
Description: No-install
EOF
apt -yq install --no-install-recommends linux-image-amd64 bash-completion ed btrfs-progs grub-pc iproute2 ssh smartmontools ntfs-3g net-tools man
exit
grub-install --boot-directory=@/boot/ /dev/$blkdev
init 6
Creating the cluster
apt -yq install --no-install-recommends gnupg wget ca-certificates
echo 'deb https://download.ceph.com/debian-octopus/ buster main' >> /etc/apt/sources.list
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
apt update
apt -yq install --no-install-recommends ceph-common ceph-mon
echo 192.168.11.11 rbd1 >> /etc/hosts
uuid=`cat /proc/sys/kernel/random/uuid`
cat << EOF > /etc/ceph/ceph.conf
[global]
fsid = $uuid
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
mon allow pool delete = true
mon host = 192.168.11.11
mon initial members = rbd1
mon max pg per osd = 385
osd crush update on start = false
#osd memory target = 2147483648
osd memory target = 1610612736
osd scrub chunk min = 1
osd scrub chunk max = 2
osd scrub sleep = .2
osd pool default pg autoscale mode = off
osd pool default size = 1
osd pool default min size = 1
osd pool default pg num = 1
osd pool default pgp num = 1
[mon]
mgr initial modules = dashboard
EOF
ceph-authtool --create-keyring ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
cp ceph.client.admin.keyring /etc/ceph/
ceph-authtool --create-keyring bootstrap-osd.ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
cp bootstrap-osd.ceph.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
ceph-authtool ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
ceph-authtool ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
monmaptool --create --add rbd1 192.168.11.11 --fsid $uuid monmap
rm -R /var/lib/ceph/mon/ceph-rbd1/*
ceph-mon --mkfs -i rbd1 --monmap monmap --keyring ceph.mon.keyring
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-mon@rbd1
systemctl start ceph-mon@rbd1
ceph mon enable-msgr2
ceph status
# dashboard
apt -yq install --no-install-recommends ceph-mgr ceph-mgr-dashboard python3-distutils python3-yaml
mkdir /var/lib/ceph/mgr/ceph-rbd1
ceph auth get-or-create mgr.rbd1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' > /var/lib/ceph/mgr/ceph-rbd1/keyring
systemctl enable ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1
ceph config set mgr mgr/dashboard/ssl false
ceph config set mgr mgr/dashboard/server_port 7000
ceph dashboard ac-user-create root 1111115 administrator
systemctl stop ceph-mgr@rbd1
systemctl start ceph-mgr@rbd1
Adding an OSD (partial)
apt install ceph-osd
osdnum=`ceph osd create`
mkdir -p /var/lib/ceph/osd/ceph-$osdnum
mkfs -t xfs /dev/sda1
mount -t xfs /dev/sda1 /var/lib/ceph/osd/ceph-$osdnum
cd /var/lib/ceph/osd/ceph-$osdnum
ceph auth get-or-create osd.$osdnum mon 'profile osd' mgr 'profile osd' osd 'allow *' > /var/lib/ceph/osd/ceph-$osdnum/keyring
ln -s /dev/disk/by-partuuid/d8cc3da6-02 block
ceph-osd -i $osdnum --mkfs
#chown ceph:ceph /dev/sd?2
chown ceph:ceph -R /var/lib/ceph
systemctl enable ceph-osd@$osdnum
systemctl start ceph-osd@$osdnum
Summary
The main marketing advantage of CEPH is CRUSH, the algorithm that computes where data is located. Monitors distribute this algorithm to clients, after which clients talk directly to the required node and the required OSD. CRUSH ensures the absence of centralization. It is a small file that you could even print out and hang on the wall. Practice has shown that CRUSH is not an exhaustive map. If you destroy and recreate the monitors while keeping all the OSDs and CRUSH, that is not enough to restore the cluster. Hence the conclusion that each monitor stores some metadata about the whole cluster. The small volume of this metadata places no limit on cluster size, but its safety has to be guaranteed, which rules out saving disks by installing the system on a flash drive and rules out clusters with fewer than three nodes.
The developers pursue an aggressive policy regarding optional features. It is far from minimalism. The documentation is at the level of "thanks for what there is, but it is very, very meagre." Interacting with the services at a low level is possible, but the documentation touches on the subject too superficially, so in practice the answer is closer to no than yes. There is practically no chance of recovering data after an emergency.
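The idea of computing a location instead of looking it up can be shown with a toy. The sketch below is not CRUSH and does not use Ceph's hash function: it is a deliberately simplified stand-in that hashes an object name with cksum to pick a PG and then derives N distinct OSDs from the PG number. All names and the placement rule are assumptions made purely for illustration.

```bash
#!/usr/bin/env bash
# Toy placement: hash the object name to a PG, then derive N distinct OSDs from the PG number.
# Arguments: object name, number of PGs, number of OSDs, number of copies (must be <= OSDs).
place_object() {
    local name=$1 pg_num=$2 osd_count=$3 replicas=$4
    local hash pg i osds=""
    # cksum prints "<checksum> <byte count>"; only the checksum is used.
    hash=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    pg=$(( hash % pg_num ))
    for (( i = 0; i < replicas; i++ )); do
        osds+=" $(( (pg + i) % osd_count ))"   # consecutive OSDs, so copies never share a disk
    done
    echo "pg=$pg osds=${osds# }"
}

place_object "backup-2020.tar" 128 6 2
place_object "backup-2020.tar" 128 6 2   # same input, same answer, no lookup table involved
```

Any client that knows the four parameters gets the same answer without asking a central service, which is the property the marketing claim rests on.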
Options for further action: abandon CEPH and use plain multi-disk btrfs (or xfs, zfs); learn something new about CEPH that would allow it to be used under the conditions described; or try writing my own storage system as an exercise in professional development.
Source: www.habr.com
