What do LVM and Matryoshka dolls have in common?

Good day.
I would like to share with the community my practical experience of building a data storage system for KVM using md RAID + LVM.

The program includes:

  • Building an md RAID 1 from NVMe SSDs.
  • Assembling an md RAID 6 from SATA SSDs and regular drives.
  • Peculiarities of TRIM/DISCARD operation on SSD RAID 1/6.
  • Creating a bootable md RAID 1/6 array on a common set of disks.
  • Installing the system onto NVMe RAID 1 when there is no NVMe support in the BIOS.
  • Using LVM cache and LVM thin.
  • Using BTRFS snapshots and send/receive for backups.
  • Using LVM thin snapshots and thin_delta for BTRFS-style backups.

If you're interested, welcome under the cut.

Disclaimer

The author bears no responsibility for the consequences of using or not using the materials/examples/code/tips/data from this article. By reading or using this material in any way, you take responsibility for all consequences of those actions. Possible consequences include:

  • NVMe SSDs fried to a crisp.
  • Completely used-up write endurance and failed SSD drives.
  • Loss of all data on all drives, including the backup copies.
  • Broken computer hardware.
  • Wasted time, nerves and money.
  • Any other consequences not listed above.

Hardware

Available:

A motherboard from around 2013 on the Z87 chipset, complete with an Intel Core i7 / Haswell.

  • 4-core, 8-thread processor
  • 32 GB DDR3 RAM
  • 1 x 16 or 2 x 8 PCIe 3.0
  • 1 x 4 + 1 x 1 PCIe 2.0
  • 6 x 6 Gbps SATA 3 connectors

An LSI SAS9211-8I SAS adapter flashed into IT / HBA mode. The RAID-capable firmware was deliberately replaced with the HBA firmware so that:

  1. You could throw this adapter out at any moment and replace it with any other one you come across.
  2. TRIM/Discard works on the disks, because... in the RAID firmware these commands are not supported at all, while the HBA, generally speaking, doesn't care what commands are transmitted over the bus.

Hard drives - 8 pieces of HGST Travelstar 7K1000 with a capacity of 1 TB in the 2.5 form factor, like for laptops. These drives were previously in a RAID 6 array. They will find a use in the new system, too. For storing local backups.

Additionally purchased:

6 pieces of the SATA SSD model Samsung 860 QVO 2TB. These SSDs had to be large; an SLC cache, reliability and a low price were desirable. Discard/zero support was mandatory, which is verified by this line in dmesg:

kernel: ata1.00: Enabling discard_zeroes_data
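Discard capability can also be double-checked with lsblk, which reports the discard granularity and whether discarded blocks read back as zeroes (the DISC-ZERO column); a quick hedged check, not part of the original recipe:

#lsblk --discard /dev/sda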

2 pieces of the NVMe SSD model Samsung SSD 970 EVO 500GB.

What matters for these SSDs is random read/write speed and an endurance sufficient for your needs. Heatsinks for them. Mandatory. Really. Otherwise, you will fry them to a crisp during the first RAID synchronization.

A StarTech PEX8M2E2 adapter for 2 x NVMe SSD, installed in a PCIe 3.0 8x slot. This, again, is just an HBA, but for NVMe. It differs from cheap adapters in that it does not require PCIe bifurcation support from the motherboard, thanks to a built-in PCIe switch. It will work even in the most ancient system with PCIe, even if it is a x1 PCIe 1.0 slot. Naturally, at the appropriate speed. There is no RAID there. There is no built-in BIOS on board. So your system will not magically learn to boot from NVMe, much less do NVMe RAID, thanks to this device.

This component is here only because there was just one free 8x PCIe 3.0 slot in the system; if there are 2 free slots, it can easily be replaced with two PEX4M2E1 or analogues, which can be bought anywhere at a price of 600 rubles.

The rejection of all kinds of hardware or chipset/BIOS RAIDs was made deliberately, so that the entire system, except for the SSDs/HDDs themselves, can be replaced while keeping all the data. Ideally, so that even the installed operating system can be preserved when moving to completely new/different hardware. The main thing is that there are SATA and PCIe ports. It's like a live CD or a bootable flash drive, only very fast and a little bulky.

Humor: Otherwise, you know how it goes - sometimes you urgently need to take the whole array away with you. But I don't want to lose the data. To this end, all the mentioned media are conveniently placed on sleds in the 5.25 bays of a standard case.

Well, and, of course, to experiment with various methods of SSD caching in Linux.

Hardware RAIDs are boring. You turn them on. They either work or they don't. And with mdadm there are always options.

Software

Previously, Debian 8 Jessie, which is close to EOL, was installed on the hardware. RAID 6 was assembled from the above-mentioned HDDs paired with LVM. It ran virtual machines under kvm/libvirt.

Since the author has suitable experience in creating portable bootable SATA/NVMe flash drives, and in order not to break the familiar apt patterns, Ubuntu 18.04 was chosen as the target system - it is already sufficiently stabilized, yet still has 3 years of support ahead of it.

The mentioned system contains all the hardware drivers we need out of the box. We don't need any third-party software or drivers.

Preparing for installation

To install the system we need the Ubuntu Desktop image. The server system comes with some kind of vigorous installer which shows excessive independence that cannot be disabled: it insists on shoving the UEFI system partition onto one of the disks, spoiling all the beauty. Accordingly, it installs only in UEFI mode. It offers no options.

We are not happy with this.

Why? Unfortunately, UEFI boot is very poorly compatible with boot software RAID, because... nobody offers us redundancy for the UEFI ESP partition. There are recipes online that suggest placing the ESP partition on a flash drive in a USB port, but that is a point of failure. There are recipes using software mdadm RAID 1 with metadata version 0.9 that do not prevent the UEFI BIOS from seeing this partition, but this lives until the happy moment when the BIOS or another hardware OS writes something to the ESP and forgets to sync it to the other mirrors.

In addition, UEFI boot depends on NVRAM, which will not move along with the disks to a new system, because it is part of the motherboard.

So, we will not reinvent a new wheel. We already have a ready-made, time-tested grandfather's bicycle, now called Legacy/BIOS boot, bearing the proud name of CSM on UEFI-compatible systems. We will just take it off the shelf, lubricate it, pump up the tires and wipe it with a damp cloth.

The desktop version of Ubuntu also cannot be installed properly with the Legacy bootloader, but here, as they say, at least there are options.

And so, we gather the hardware and boot the system from an Ubuntu Live bootable flash drive. We will need to download packages, so we set up whichever network works for you. If nothing works, you can load the necessary packages onto a flash drive in advance.

We enter the Desktop environment, launch the terminal emulator, and off we go:

#sudo bash

How's that…? The line above is the canonical trigger for holy wars about sudo. With greater capabilities comes greater responsibility. The question is whether you can take it upon yourself. Many people think that using sudo this way is at the very least careless. However:

#apt-get install mdadm lvm2 thin-provisioning-tools btrfs-tools util-linux lsscsi nvme-cli mc

Why not ZFS...? When we install software on our computer, we essentially lend our hardware to the developers of that software to drive.
When we trust this software with the safety of our data, we take out a loan equal to the cost of restoring that data, which we will have to pay off one day.

From this point of view, ZFS is a Ferrari, and mdadm+lvm is more like a bicycle.

Subjectively, the author would much rather lend a bicycle on credit to strangers than a Ferrari. There, the price of the issue is not high. No license needed. Simpler than the traffic rules. Parking is free. Cross-country capability is better. You can always attach legs to a bicycle, and you can repair a bicycle with your own hands.

Why BTRFS...? To boot the operating system we need a file system that is supported in Legacy/BIOS GRUB out of the box and at the same time supports live snapshots. We will use it for the /boot partition. Besides that, the author prefers to use this FS for / (root) as well, not forgetting to note that for any other software you can create separate partitions on LVM and mount them into the necessary directories.

We will not store any virtual machine images or databases on this FS.
This FS will only be used to create snapshots of the system without shutting it down, and to then transfer those snapshots to a backup disk using send/receive.

Besides that, the author generally prefers to keep a minimum of software directly on the hardware and to run all other software in virtual machines, using things like forwarding GPUs and PCI-USB host controllers into KVM via IOMMU.

The only things left on the hardware are data storage, virtualization and backup.

If you trust ZFS more, then, in principle, for the described application they are interchangeable.

However, the author deliberately ignores the built-in mirroring/RAID and redundancy features of ZFS, BTRFS and LVM.

As an additional argument, BTRFS is able to turn random writes into sequential ones, which has an extremely positive effect on the speed of synchronizing snapshots/backups on HDDs.

Let's rescan all the devices:

#udevadm control --reload-rules && udevadm trigger

Let's look around:

#lsscsi && nvme list
[0:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sda
[1:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sdb
[2:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sdc
[3:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sdd
[4:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sde
[5:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sdf
[6:0:0:0] disk ATA HGST HTS721010A9 A3J0 /dev/sdg
[6:0:1:0] disk ATA HGST HTS721010A9 A3J0 /dev/sdh
[6:0:2:0] disk ATA HGST HTS721010A9 A3J0 /dev/sdi
[6:0:3:0] disk ATA HGST HTS721010A9 A3B0 /dev/sdj
[6:0:4:0] disk ATA HGST HTS721010A9 A3B0 /dev/sdk
[6:0:5:0] disk ATA HGST HTS721010A9 A3B0 /dev/sdl
[6:0:6:0] disk ATA HGST HTS721010A9 A3J0 /dev/sdm
[6:0:7:0] disk ATA HGST HTS721010A9 A3J0 /dev/sdn
Node SN Model Namespace Usage Format FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1 S466NXXXXXXX15L Samsung SSD 970 EVO 500GB 1 0,00 GB / 500,11 GB 512 B + 0 B 2B2QEXE7
/dev/nvme1n1 S5H7NXXXXXXX48N Samsung SSD 970 EVO 500GB 1 0,00 GB / 500,11 GB 512 B + 0 B 2B2QEXE7

Disk layout

NVMe SSD

But we are not going to partition these drives in any way. Our BIOS doesn't see these drives anyway. So they will go entirely into software RAID. We won't even create partitions there. If you want to follow the "canon" or do it "as a matter of principle", create one large partition, like on the HDDs.

SATA HDD

There is no need to invent anything special here. We will create one partition for everything. We will create a partition because the BIOS sees these disks and may even try to boot from them. We will even install GRUB on these disks later, so that the system can suddenly do exactly that.

#cat >hdd.part << EOF
label: dos
label-id: 0x00000000
device: /dev/sdg
unit: sectors

/dev/sdg1 : start= 2048, size= 1953523120, type=fd, bootable
EOF
#sfdisk /dev/sdg < hdd.part
#sfdisk /dev/sdh < hdd.part
#sfdisk /dev/sdi < hdd.part
#sfdisk /dev/sdj < hdd.part
#sfdisk /dev/sdk < hdd.part
#sfdisk /dev/sdl < hdd.part
#sfdisk /dev/sdm < hdd.part
#sfdisk /dev/sdn < hdd.part

SATA SSD

This is where it gets interesting for us.

Firstly, our drives are 2 TB in size. This is within the acceptable range for MBR, which is what we will use. If necessary, it can be replaced with GPT. GPT disks have a compatibility layer that allows MBR-compatible systems to see the first 4 partitions if they are located within the first 2 terabytes. The main thing is that the boot partition and the bios_grub partition on these disks must be at the beginning. This even allows booting from GPT drives in Legacy/BIOS mode.

But this is not our case.

Here we will create two partitions. The first one, 1 GB in size, will be used for the RAID 1 /boot.

The second one will be used for RAID 6 and will take up all the remaining free space, except for a small unallocated area at the end of the drives.

What is this unallocated area? According to sources on the net, our SATA SSDs have a dynamically expandable SLC cache ranging in size from 6 to 78 gigabytes. We get 6 gigabytes "for free" thanks to the difference between "gigabytes" and "gibibytes" in the drive's data sheet. The remaining 72 gigabytes are allocated from unused space.

It should be noted here that the cache is SLC, while the space is occupied in 4-bit MLC mode. Which for us effectively means that for every 4 gigabytes of free space we get only 1 gigabyte of SLC cache.

Multiply 72 gigabytes by 4 and we get 288 gigabytes. This is the free space we will leave unpartitioned to let the drives make full use of the SLC cache.

Thus, we will effectively get up to 312 gigabytes of SLC cache from six drives in total. Of all the drives, 2 will be used in RAID for redundancy.

This amount of cache will allow us, in real life, to very rarely encounter a situation where a write does not go to the cache. This extremely well compensates for the saddest drawback of QLC memory - the extremely low write speed when data is written bypassing the cache. If your loads do not correspond to this, then I recommend that you think hard about how long your SSDs will last under such loads, taking into account the TBW from the data sheet.
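As a sanity check on this arithmetic, here is how the sda2 size used below relates to the reserved area (a sketch assuming the drive reports the 3907029168 sectors typical of a 2 TB model):

#TOTAL=3907029168                # 2 TB drive: 2000398934016 B / 512 B sectors
#END=$((2099200 + 3300950016))   # first sector after sda2
#echo $(( (TOTAL - END) * 512 / 1024 / 1024 / 1024 ))
288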

#cat >ssd.part << EOF
label: dos
label-id: 0x00000000
device: /dev/sda
unit: sectors

/dev/sda1 : start= 2048, size= 2097152, type=fd, bootable
/dev/sda2 : start= 2099200, size= 3300950016, type=fd
EOF
#sfdisk /dev/sda < ssd.part
#sfdisk /dev/sdb < ssd.part
#sfdisk /dev/sdc < ssd.part
#sfdisk /dev/sdd < ssd.part
#sfdisk /dev/sde < ssd.part
#sfdisk /dev/sdf < ssd.part

Creating the arrays

First of all, we need to rename the machine. This is necessary because the hostname is part of the array name somewhere inside mdadm and affects something somewhere. Of course, the arrays can be renamed later, but that is an unnecessary step.

#mcedit /etc/hostname
#mcedit /etc/hosts
#hostname
vdesk0

NVMe SSD

#mdadm --create --verbose --assume-clean /dev/md0 --level=1 --raid-devices=2 /dev/nvme[0-1]n1

Why --assume-clean...? To avoid initializing the arrays. For both RAID levels 1 and 6 this is valid. Everything can work without initialization if it is a new array. Moreover, initializing an SSD array upon creation is a waste of TBW resource. Where possible, we use TRIM/DISCARD on the assembled SSD arrays to "initialize" them.

For RAID 1 SSD arrays, DISCARD is supported out of the box.

For RAID 6 SSD arrays, DISCARD must be enabled via a kernel module parameter.

This should only be done if all the SSDs used in level 4/5/6 arrays on this system have working discard_zeroes_data support. Sometimes you come across strange drives that tell the kernel this function is supported, when in fact it isn't there, or the function doesn't always work. At present, support is available almost everywhere; however, there are old drives and firmware with bugs. For this reason, DISCARD support is disabled by default for RAID 6.
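If the raid456 module is already loaded in the live session, the parameter can presumably be inspected and, at your own risk, flipped at runtime through sysfs (the persistent modprobe.d setting is configured later in this article):

#cat /sys/module/raid456/parameters/devices_handle_discard_safely
#echo Y >/sys/module/raid456/parameters/devices_handle_discard_safely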

Attention, the following command will destroy all data on the NVMe drives by "initializing" the array with "zeroes".

#blkdiscard /dev/md0

If problems arise, try specifying a step size.

#blkdiscard --step 65536 /dev/md0

SATA SSD

#mdadm --create --verbose --assume-clean /dev/md1 --level=1 --raid-devices=6 /dev/sd[a-f]1
#blkdiscard /dev/md1
#mdadm --create --verbose --assume-clean /dev/md2 --chunk=512 --level=6 --raid-devices=6 /dev/sd[a-f]2

Why so big...? Increasing the chunk size has a positive effect on random read speed in blocks up to and including the chunk size. This happens because one operation of the appropriate size or smaller can be completed entirely on a single device. Therefore, the IOPS of all devices are summed up. According to statistics, 99% of IO does not exceed 512K.

RAID 6 IOPS per write is always less than or equal to the IOPS of a single drive. Whereas for random reads, IOPS can be several times greater than that of a single drive, and here the block size is of key importance.
The author sees no point in trying to optimize a parameter that is bad in RAID 6 by design, and instead optimizes what RAID 6 is good at.
We will compensate for RAID 6's poor random writes with an NVMe cache and thin-provisioning tricks.

We have not yet enabled DISCARD for RAID 6. So we will not "initialize" this array for now. We will do this later, after installing the OS.

SATA HDD

#mdadm --create --verbose --assume-clean /dev/md3 --chunk=512 --level=6 --raid-devices=8 /dev/sd[g-n]1

LVM on NVMe RAID

For speed, we want to place the root file system on the NVMe RAID 1, i.e. /dev/md0.
However, we will still need this fast array for other needs, such as swap, metadata and LVM-cache and LVM-thin metadata, so we will create an LVM VG on this array.

#pvcreate /dev/md0
#vgcreate root /dev/md0

Let's create a partition for the root file system.

#lvcreate -L 128G --name root root

Let's create a swap partition the size of the RAM.

#lvcreate -L 32G --name swap root

OS installation

All in all, we now have everything needed to install the system.

Launch the system installation wizard from the Ubuntu Live environment. A normal installation. Only at the stage of selecting the disks for installation do you need to specify the following:

  • /dev/md1 - mount point /boot, FS - BTRFS
  • /dev/root/root (aka /dev/mapper/root-root) - mount point / (root), FS - BTRFS
  • /dev/root/swap (aka /dev/mapper/root-swap) - use as the swap partition
  • Install the bootloader on /dev/sda

When you select BTRFS as the root file system, the installer will automatically create two BTRFS volumes named "@" for / (root), and "@home" for /home.

Let's start the installation...

The installation will end with a modal dialog box reporting an error while installing the bootloader. Unfortunately, you won't be able to exit this dialog by standard means and continue the installation. We log out of the session and log in again, landing on a clean Ubuntu Live desktop. Open the terminal, and once more:

#sudo bash

Let's create a chroot environment to finish the installation:

#mkdir /mnt/chroot
#mount -o defaults,space_cache,noatime,nodiratime,discard,subvol=@ /dev/mapper/root-root /mnt/chroot
#mount -o defaults,space_cache,noatime,nodiratime,discard,subvol=@home /dev/mapper/root-root /mnt/chroot/home
#mount -o defaults,space_cache,noatime,nodiratime,discard /dev/md1 /mnt/chroot/boot
#mount --bind /proc /mnt/chroot/proc
#mount --bind /sys /mnt/chroot/sys
#mount --bind /dev /mnt/chroot/dev
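While we are here, we can verify the subvolume layout the installer created; the "@" and "@home" subvolumes mentioned above should both be listed:

#btrfs subvolume list /mnt/chroot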

Let's configure the network and the hostname in the chroot:

#cat /etc/hostname >/mnt/chroot/etc/hostname
#cat /etc/hosts >/mnt/chroot/etc/hosts
#cat /etc/resolv.conf >/mnt/chroot/etc/resolv.conf

Let's enter the chroot environment:

#chroot /mnt/chroot

First of all, let's deliver the packages:

#apt-get install --reinstall mdadm lvm2 thin-provisioning-tools btrfs-tools util-linux lsscsi nvme-cli mc debsums hdparm

Let's check and fix all the packages that were installed crookedly because of the incomplete system installation:

#CORRUPTED_PACKAGES=$(debsums -s 2>&1 | awk '{print $6}' | uniq)
#apt-get install --reinstall $CORRUPTED_PACKAGES

If something doesn't work, you may need to edit /etc/apt/sources.list first.

Let's set the parameter of the RAID 6 module to enable TRIM/DISCARD:

#cat >/etc/modprobe.d/raid456.conf << EOF
options raid456 devices_handle_discard_safely=1
EOF

Let's tweak our arrays a little:

#cat >/etc/udev/rules.d/60-md.rules << EOF
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/stripe_cache_size", ATTR{md/stripe_cache_size}="32768"
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/sync_speed_min", ATTR{md/sync_speed_min}="48000"
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/sync_speed_max", ATTR{md/sync_speed_max}="300000"
EOF
#cat >/etc/udev/rules.d/62-hdparm.rules << EOF
SUBSYSTEM=="block", ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", RUN+="/sbin/hdparm -B 254 /dev/%k"
EOF
#cat >/etc/udev/rules.d/63-blockdev.rules << EOF
SUBSYSTEM=="block", ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", RUN+="/sbin/blockdev --setra 1024 /dev/%k"
SUBSYSTEM=="block", ACTION=="add|change", KERNEL=="md*", RUN+="/sbin/blockdev --setra 0 /dev/%k"
EOF

What is this..? We have created a set of udev rules that will do the following:

  • Set a block (stripe) cache size for RAID 6 that is adequate for the year 2020. The default value, it seems, has not changed since Linux was created and has long been insufficient.
  • Reserve a minimum of IO for the duration of array checks/syncs. This is to keep your arrays from getting stuck in a state of eternal synchronization under load.
  • Limit the maximum IO during array checks/syncs. This is necessary so that syncing/checking SSD RAIDs does not fry your drives to a crisp. This is especially true for NVMe. (Remember the heatsink? I wasn't joking.)
  • Forbid the disks (HDD) to stop spindle rotation via APM and set the sleep timeout for the disk controllers to 7 hours. You can disable APM completely if your drives can do it (-B 255). With the default value, the drives stop after five seconds. Then the OS wants to flush the disk cache, the disks spin up again, and everything starts over. Disks have a limited maximum number of spindle spin-ups. Such a simple default cycle can easily kill your disks in a couple of years. Not all disks suffer from this, but ours are "laptop" ones, with the corresponding default settings, which make the RAID look like a mini-MAID.
  • Set the readahead on the (rotating) disks to 1 megabyte - two consecutive blocks/chunks of RAID 6.
  • Disable readahead on the arrays themselves.
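After reloading the rules (as we did earlier with udevadm), their effect can be verified through sysfs; a hedged check that is not part of the original recipe:

#udevadm control --reload-rules && udevadm trigger
#cat /sys/block/md2/md/stripe_cache_size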

Let's edit /etc/fstab:

#cat >/etc/fstab << EOF
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
# file-system mount-point type options dump pass
/dev/mapper/root-root / btrfs defaults,space_cache,noatime,nodiratime,discard,subvol=@ 0 1
UUID=$(blkid -o value -s UUID /dev/md1) /boot btrfs defaults,space_cache,noatime,nodiratime,discard 0 2
/dev/mapper/root-root /home btrfs defaults,space_cache,noatime,nodiratime,discard,subvol=@home 0 2
/dev/mapper/root-swap none swap sw 0 0
EOF

Why this way..? We will look for the /boot partition by its UUID. The array naming could theoretically change.

We will look for the remaining partitions by their LVM names in the /dev/mapper/vg-lv notation, because these identify partitions quite uniquely.

We don't use UUIDs for LVM, because the UUID of an LVM volume and of its snapshots can be identical. Mount /dev/mapper/root-root.. twice? Yes. Exactly. A feature of BTRFS. This file system can be mounted multiple times with different subvols.

Because of this same feature, I recommend never creating LVM snapshots of active BTRFS volumes. You may get a surprise when you reboot.

Let's regenerate the mdadm configuration:

#/usr/share/mdadm/mkconf | sed 's/#DEVICE/DEVICE/g' >/etc/mdadm/mdadm.conf

Let's adjust the LVM settings:

#cat >>/etc/lvm/lvmlocal.conf << EOF

activation {
thin_pool_autoextend_threshold=90
thin_pool_autoextend_percent=5
}
allocation {
cache_pool_max_chunks=2097152
}
devices {
global_filter=["r|^/dev/.*_corig$|","r|^/dev/.*_cdata$|","r|^/dev/.*_cmeta$|","r|^/dev/.*gpv$|","r|^/dev/images/.*$|","r|^/dev/mapper/images.*$|","r|^/dev/backup/.*$|","r|^/dev/mapper/backup.*$|"] issue_discards=1
}
EOF

What is this..? We have enabled automatic expansion of LVM thin pools: upon reaching 90% of occupied space, they grow by 5% of their volume.

We have increased the maximum number of cache blocks for the LVM cache.

We have forbidden LVM to search for LVM volumes (PVs) on:

  • devices holding the LVM cache (cdata)
  • devices cached using the LVM cache, bypassing the cache (_corig). In this case, the cached device itself will still be scanned through the cache (just the plain device).
  • devices holding the LVM cache metadata (cmeta)
  • all devices in the VG named images. Here we will keep the disk images of the virtual machines, and we don't want LVM on the host to activate volumes belonging to the guest OS.
  • all devices in the VG named backup. Here we will keep the backup copies of the virtual machine images.
  • all devices whose name ends with "gpv" (guest physical volume)

We have enabled DISCARD support when freeing space on the LVM VGs. Be careful. This makes deleting LVs on the SSDs quite time-consuming. This especially applies to SSD RAID 6. However, according to the plan, we will use thin provisioning, so this will not hinder us at all.

Let's update the initramfs image:

#update-initramfs -u -k all

Let's install and configure grub:

#apt-get install grub-pc
#apt-get purge os-prober
#dpkg-reconfigure grub-pc

Which disks should you select? All of the sd* ones. The system must be able to boot from any of the SATA drives or SSDs in use.
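If you would rather skip the interactive dialog, the same result should be achievable by installing GRUB onto each drive directly; a hedged equivalent, not from the original article:

#for d in /dev/sd[a-n]; do grub-install "$d"; done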

Why purge os-prober..? For its excessive independence and playful little hands.

It does not work correctly when one of the RAIDs is in a degraded state. It tries to search for an OS on partitions that are used by the virtual machines running on this hardware.

If you need it, you can keep it, but keep all of the above in mind. I recommend searching online for recipes for taming its naughty hands.

With this, the initial installation is complete. Time to reboot into the freshly installed OS. Don't forget to remove the bootable Live CD/USB.

#exit
#reboot

Select any of the SATA SSDs as the boot device.

LVM on SATA SSD

At this point we have already booted into the new OS, configured the network and apt, opened a terminal emulator, and run:

#sudo bash

Let's continue.

Let's "initialize" the SATA SSD array:

#blkdiscard /dev/md2

If that doesn't work, try:

#blkdiscard --step 65536 /dev/md2

Let's create an LVM VG on the SATA SSD:

#pvcreate /dev/md2
#vgcreate data /dev/md2

Why another VG..? In fact, we already have a VG named root. Why not add everything into one VG?

If there are several PVs in a VG, then for the VG to be activated correctly all the PVs must be present (online). The exception is LVM RAID, which we deliberately do not use.

We really want that, in the event of a failure (read: data loss) on any of the RAID 6 arrays, the operating system boots normally and gives us the opportunity to solve the problem.

To achieve this, at the first level of abstraction we isolate each type of physical "media" into a separate VG.

Scientifically speaking, different RAID arrays belong to different "reliability domains". You should not create an extra common point of failure for them by cramming them into one VG.

The presence of LVM at the "hardware" level will allow us to arbitrarily cut pieces out of different RAID arrays and combine them in different ways. For example - running at the same time bcache + LVM thin, bcache + BTRFS, LVM cache + LVM thin, a complex ZFS configuration with caches, or any other hellish mixture, in order to try it all and compare.

At the "hardware" level we will not use anything other than the good old "thick" LVM volumes. The exception to this rule may be the backup partition.

I think that by this moment many readers have already begun to suspect something about the nesting dolls.

LVM on SATA HDD

#pvcreate /dev/md3
#vgcreate backup /dev/md3

Yet another new VG..? We really want that, if the disk array which we will use for data backups fails, our operating system keeps working normally while maintaining the usual access to the non-backup data. Therefore, to avoid VG activation problems, we create a separate VG.

Setting up the LVM cache

Let's create an LV on the NVMe RAID 1 to use it as a caching device.

#lvcreate -L 70871154688B --name cache root

Why so little...? The thing is that our NVMe SSDs also have an SLC cache. 4 gigabytes "for free" and 18 gigabytes dynamic, thanks to the free space occupied in 3-bit MLC mode. Once this cache is exhausted, the NVMe SSDs will not be much faster than our SATA SSDs with cache. Actually, for this reason it makes no sense for us to make the LVM cache partition much larger than twice the size of the NVMe drive's SLC cache. For the NVMe drives in use, the author considers 32-64 gigabytes of cache sufficient.

The given partition size is required to organize 64 gigabytes of cache, the cache metadata, and a metadata backup.

Additionally, I note that after a dirty system shutdown, LVM will mark the entire cache as dirty and will synchronize it all over again. Moreover, this will be repeated every time lvchange is used on this device until the system is rebooted again. Therefore, I recommend immediately recreating the cache with an appropriate script (a sketch of such a script follows below).
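A minimal sketch of such a recreation script, assuming the volume names used in this article (it merely replays the cache-creation commands shown below, so it only makes sense once the cache has been set up):

#cat >/root/recreate-cache.sh << 'EOF'
#!/bin/bash
# Hypothetical helper: drop and rebuild the LVM cache after a dirty shutdown.
lvconvert -y --uncache cache/cachedata
lvcreate -y -L 64G -n cache cache /dev/root/cache
lvcreate -y -L 1G -n cachemeta cache /dev/root/cache
lvconvert -y --type cache-pool --cachemode writeback --chunksize 64k --poolmetadata cache/cachemeta cache/cache
lvconvert -y --type cache --cachepool cache/cache cache/cachedata
EOF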

Let's create an LV on the SATA RAID 6 to use it as the cached device.

#lvcreate -L 3298543271936B --name cache data

Why only three terabytes..? So that, if needed, the SATA SSD RAID 6 can be used for other purposes as well. The size of the cached space can be increased dynamically, on the fly, without stopping the system. To do this, you need to temporarily stop and re-enable the cache, but the distinctive advantage of LVM-cache over, for example, bcache is that this can be done on the fly.

Let's create a new VG for the cache.

#pvcreate /dev/root/cache
#pvcreate /dev/data/cache
#vgcreate cache /dev/root/cache /dev/data/cache

Let's create an LV on the device to be cached.

#lvcreate -L 3298539077632B --name cachedata cache /dev/data/cache

Here we immediately took all the free space on /dev/data/cache, so that all the other needed partitions get created directly on /dev/root/cache. If you created something in the wrong place, you can move it with pvmove.

Let's create and enable the cache:

#lvcreate -y -L 64G -n cache cache /dev/root/cache
#lvcreate -y -L 1G -n cachemeta cache /dev/root/cache
#lvconvert -y --type cache-pool --cachemode writeback --chunksize 64k --poolmetadata cache/cachemeta cache/cache
#lvconvert -y --type cache --cachepool cache/cache cache/cachedata

Why such sizes..? Through practical experiments, the author found out that the best result is achieved when the LVM cache block size coincides with the LVM thin block size. Moreover, the smaller the size, the better the configuration performs at random writes.

64k is the minimum block size allowed for LVM thin.

Careful, writeback..! Yes. This type of cache defers write synchronization to the cached device. This means that if the cache is lost, you may lose data on the cached device. Later, the author will tell you what measures, besides NVMe RAID 1, can be taken to compensate for this risk.

This cache type was chosen deliberately, to compensate for RAID 6's poor random write performance.

Let's check what we've got:

#lvs -a -o lv_name,lv_size,devices --units B cache
LV LSize Devices
[cache] 68719476736B cache_cdata(0)
[cache_cdata] 68719476736B /dev/root/cache(0)
[cache_cmeta] 1073741824B /dev/root/cache(16384)
cachedata 3298539077632B cachedata_corig(0)
[cachedata_corig] 3298539077632B /dev/data/cache(0)
[lvol0_pmspare] 1073741824B /dev/root/cache(16640)

Only [cachedata_corig] should be located on /dev/data/cache. If something is wrong, use pvmove.

If necessary, the cache can be disabled with a single command:

#lvconvert -y --uncache cache/cachedata

This is done online. LVM will simply sync the cache to the disk, remove it, and rename cachedata_corig back to cachedata.

Setting up LVM thin

Let's roughly estimate how much space we need for the LVM thin metadata:

#thin_metadata_size --block-size=64k --pool-size=6terabytes --max-thins=100000 -u bytes
thin_metadata_size - 3385794560 bytes estimated metadata area size for "--block-size=64kibibytes --pool-size=6terabytes --max-thins=100000"

Round up to 4 gigabytes: 4294967296B

Double it and add 4194304B for the LVM PV metadata: 8594128896B
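The same arithmetic as a shell one-liner:

#echo $(( 2 * 4294967296 + 4194304 ))
8594128896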
Let's create a separate partition on the NVMe RAID 1 to hold the LVM thin metadata and its backup copy:

#lvcreate -L 8594128896B --name images root

What for..? A question may arise here: why place the LVM thin metadata separately if it will be cached on NVMe anyway and will work fast.

Speed does matter here, but it is far from the main reason. The point is that the cache is a point of failure. Something could happen to it, and if the LVM thin metadata is cached, everything will be lost completely. Without intact metadata it is next to impossible to assemble the thin volumes.

By moving the metadata to a separate non-cached, yet fast, volume, we guarantee the safety of the metadata in the event of cache loss or corruption. In that case, all the damage caused by the cache loss will be localized inside the thin volumes, which simplifies the recovery procedure by orders of magnitude. With high probability this damage will be repaired using the FS journals.

Moreover, if a snapshot of a thin volume was taken earlier, and the cache was fully synchronized at least once after that, then, thanks to the internal design of LVM thin, the integrity of the snapshot is guaranteed in the event of cache loss.

Let's create a new VG that will be responsible for thin provisioning:

#pvcreate /dev/root/images
#pvcreate /dev/cache/cachedata
#vgcreate images /dev/root/images /dev/cache/cachedata

Let's create a pool:

#lvcreate -L 274877906944B --poolmetadataspare y --poolmetadatasize 4294967296B --chunksize 64k -Z y -T images/thin-pool
Why -Z y? Besides what this mode is actually intended for - preventing data from one virtual machine from leaking into another virtual machine when space is redistributed - zeroing is additionally used to increase the speed of random writes in blocks smaller than 64k. Any write smaller than 64k into a previously unallocated area of a thin volume becomes 64K edge-aligned in the cache. This allows the operation to be completed entirely through the cache, bypassing the cached device.

Let's move the LVs to the corresponding PVs:

#pvmove -n images/thin-pool_tdata /dev/root/images /dev/cache/cachedata
#pvmove -n images/lvol0_pmspare /dev/cache/cachedata /dev/root/images
#pvmove -n images/thin-pool_tmeta /dev/cache/cachedata /dev/root/images

Let's check:

#lvs -a -o lv_name,lv_size,devices --units B images
LV LSize Devices
[lvol0_pmspare] 4294967296B /dev/root/images(0)
thin-pool 274877906944B thin-pool_tdata(0)
[thin-pool_tdata] 274877906944B /dev/cache/cachedata(0)
[thin-pool_tmeta] 4294967296B /dev/root/images(1024)

Let's create a thin volume for tests:

#lvcreate -V 64G --thin-pool thin-pool --name test images

Let's install the packages for testing and monitoring:

#apt-get install sysstat fio

This is how you can observe the behavior of our storage configuration in real time:

#watch 'lvs --rows --reportformat basic --quiet -ocache_dirty_blocks,cache_settings cache/cachedata && (lvdisplay cache/cachedata | grep Cache) && (sar -p -d 2 1 | grep -E "sd|nvme|DEV|md1|md2|md3|md0" | grep -v Average | sort)'

This is how we can test our configuration:

#fio --loops=1 --size=64G --runtime=4 --filename=/dev/images/test --stonewall --ioengine=libaio --direct=1 \
--name=4kQD32read --bs=4k --iodepth=32 --rw=randread \
--name=8kQD32read --bs=8k --iodepth=32 --rw=randread \
--name=16kQD32read --bs=16k --iodepth=32 --rw=randread \
--name=32KQD32read --bs=32k --iodepth=32 --rw=randread \
--name=64KQD32read --bs=64k --iodepth=32 --rw=randread \
--name=128KQD32read --bs=128k --iodepth=32 --rw=randread \
--name=256KQD32read --bs=256k --iodepth=32 --rw=randread \
--name=512KQD32read --bs=512k --iodepth=32 --rw=randread \
--name=4Kread --bs=4k --rw=read \
--name=8Kread --bs=8k --rw=read \
--name=16Kread --bs=16k --rw=read \
--name=32Kread --bs=32k --rw=read \
--name=64Kread --bs=64k --rw=read \
--name=128Kread --bs=128k --rw=read \
--name=256Kread --bs=256k --rw=read \
--name=512Kread --bs=512k --rw=read \
--name=Seqread --bs=1m --rw=read \
--name=Longread --bs=8m --rw=read \
--name=Longwrite --bs=8m --rw=write \
--name=Seqwrite --bs=1m --rw=write \
--name=512Kwrite --bs=512k --rw=write \
--name=256write --bs=256k --rw=write \
--name=128write --bs=128k --rw=write \
--name=64write --bs=64k --rw=write \
--name=32write --bs=32k --rw=write \
--name=16write --bs=16k --rw=write \
--name=8write --bs=8k --rw=write \
--name=4write --bs=4k --rw=write \
--name=512KQD32write --bs=512k --iodepth=32 --rw=randwrite \
--name=256KQD32write --bs=256k --iodepth=32 --rw=randwrite \
--name=128KQD32write --bs=128k --iodepth=32 --rw=randwrite \
--name=64KQD32write --bs=64k --iodepth=32 --rw=randwrite \
--name=32KQD32write --bs=32k --iodepth=32 --rw=randwrite \
--name=16KQD32write --bs=16k --iodepth=32 --rw=randwrite \
--name=8KQD32write --bs=8k --iodepth=32 --rw=randwrite \
--name=4kQD32write --bs=4k --iodepth=32 --rw=randwrite \
| grep -E 'read|write|test' | grep -v ioengine

Careful! Resource! This code will run 36 different tests, each for 4 seconds. Half of the tests are writes. In 4 seconds you can write a lot to NVMe. Up to 3 gigabytes per second. So each run of the write tests can eat up to 216 gigabytes of your SSD's write resource.

Reads and writes mixed together? Yes. It makes sense to run the read and write tests separately. Moreover, it makes sense to ensure that all the caches are synchronized, so that a previously made write does not affect the reads.

The results will differ greatly between the first run and the subsequent ones, as the cache and the thin volume fill up, and also depending on whether the system managed to synchronize the caches filled during the previous run.

Among other things, I recommend measuring the speed on an already full thin volume from which a snapshot has just been taken. The author had the opportunity to observe how random writes accelerate sharply right after the first snapshot is created, especially while the cache is not yet full. This happens thanks to the copy-on-write write semantics, the alignment of the cache and thin volume blocks, and the fact that a random write to RAID 6 turns into a random read from RAID 6 followed by a write to the cache. In our configuration, random reads from RAID 6 are up to 6 times (the number of SATA SSDs in the array) faster than writes. And since the blocks for CoW are allocated sequentially from the thin pool, the writes, for the most part, also become sequential.

Both of these features can be used to your advantage.

Cache-"coherent" snapshots

To reduce the risk of data loss in the event of cache damage/loss, the author proposes introducing a practice of rotating snapshots that guarantees their integrity in such a case.

Firstly, because the thin volume metadata resides on a non-cached device, the metadata will stay consistent and the possible losses will be isolated within the data blocks.

The following snapshot rotation cycle guarantees the integrity of the data inside the snapshots in case of cache loss (a rough code sketch is given after the explanations below):

  1. For each thin volume named <name>, create a snapshot named <name>.cached
  2. Set the migration threshold to a reasonably high value: #lvchange --quiet --cachesettings "migration_threshold=16384" cache/cachedata
  3. In a loop, we check the number of dirty blocks in the cache: #lvs --rows --reportformat basic --quiet -ocache_dirty_blocks cache/cachedata | awk '{print $2}' until we get zero. If the zero takes too long to appear, it can be forced by temporarily switching the cache into writethrough mode. However, taking into account the speed characteristics of our SATA and NVMe SSD arrays, as well as their TBW resource, you will either manage to catch the moment quickly without changing the cache mode, or your hardware will eat up its entire resource within a few days. Due to resource limitations, the system is, in principle, incapable of sustaining a 100% write load all the time. Our NVMe SSDs under a 100% write load would exhaust their resource in 3-4 days. The SATA SSDs would last only twice as long. Therefore, we will assume that most of the load goes to reads, while for writes we have relatively short bursts of extremely high activity combined with a low load on average.
  4. As soon as we have caught (or made) a zero, we rename <name>.cached to <name>.committed. The old <name>.committed is deleted.
  5. Optionally, if the cache is 100% full, it can be recreated by a script, thereby clearing it. With a half-empty cache the system writes much faster.
  6. Set the migration threshold to zero: #lvchange --quiet --cachesettings "migration_threshold=0" cache/cachedata This will stop the cache from syncing to the main media.
  7. We wait until plenty of changes accumulate in the cache: #lvs --rows --reportformat basic --quiet -ocache_dirty_blocks cache/cachedata | awk '{print $2}' or until the timer fires.
  8. We repeat from the beginning.

Why the complications with the migration threshold...? The thing is that in real practice a "random" write is actually almost never completely random. If we wrote something to a 4-kilobyte sector, there is a high probability that within the next couple of minutes a write will be made to the same sector or to one of the neighboring (+- 32K) sectors.

By setting the migration threshold to zero, we defer write synchronization to the SATA SSDs and aggregate several changes into a single 64K block in the cache. This significantly saves the resource of the SATA SSDs.

Where is the code..? Unfortunately, the author considers himself insufficiently competent at developing bash scripts, because he is 100% self-taught and practices "google"-driven development, and therefore believes that the terrible code coming out of his hands should not be used by anyone else.

I believe that professionals in this field will be able to independently depict all the logic described above if needed, and, perhaps, even beautifully design it as a systemd service, as the author tried to do.
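Purely for illustration, here is a rough, hedged sketch of one pass of the rotation cycle described above, for a single hypothetical thin volume images/vm0; it is not the author's script and has none of the error handling a real systemd service would need:

#cat >/root/rotate-snapshot.sh << 'EOF'
#!/bin/bash
# Hypothetical sketch of the cache-"coherent" snapshot rotation (steps 1-8 above).
VG="images"; LV="vm0"    # hypothetical thin volume
dirty_blocks() {
    lvs --rows --reportformat basic --quiet -ocache_dirty_blocks cache/cachedata | awk '{print $2}'
}
# Step 1: take a fresh snapshot.
lvcreate --quiet --snapshot --name "$LV.cached" "$VG/$LV"
# Step 2: raise the migration threshold so the cache drains to the SATA SSDs.
lvchange --quiet --cachesettings "migration_threshold=16384" cache/cachedata
# Step 3: wait until there are no dirty blocks left in the cache.
while [ "$(dirty_blocks)" != "0" ]; do sleep 10; done
# Step 4: the drained snapshot becomes the new committed one.
lvremove --quiet -y "$VG/$LV.committed" 2>/dev/null
lvrename --quiet "$VG/$LV.cached" "$LV.committed"
# Step 6: stop migration so new writes aggregate in the cache again.
lvchange --quiet --cachesettings "migration_threshold=0" cache/cachedata
EOF

Run it from a timer to cover steps 7 and 8.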

Such a simple snapshot rotation scheme will allow us not only to constantly have one snapshot fully synchronized to the SATA SSDs, but also, using the thin_delta utility, to find out which blocks were changed after its creation, and thus localize the damage on the main volumes, greatly simplifying recovery.

TRIM/DISCARD in libvirt/KVM

Since the data storage will be used by KVM running libvirt, it would be a good idea to teach our VMs not only to take up free space, but also to free up what is no longer needed.

This is done by emulating TRIM/DISCARD support on the virtual disks. For this, you need to change the controller type to virtio-scsi and edit the xml.

#virsh edit vmname
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='writethrough' io='threads' discard='unmap'/>
  <source dev='/dev/images/vmname'/>
  <backingStore/>
  <target dev='sda' bus='scsi'/>
  <alias name='scsi0-0-0-0'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

<controller type='scsi' index='0' model='virtio-scsi'>
  <alias name='scsi0'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</controller>

Such DISCARDs coming from the guest OS are correctly processed by LVM, and the blocks are correctly freed both in the cache and in the thin pool. In our case this happens mostly in a deferred manner, when the next snapshot is deleted.
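Inside the guest, assuming a reasonably recent Linux, trims can then be issued manually for all mounted file systems (or left to the distribution's periodic fstrim.timer):

#fstrim -av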

BTRFS backup

Use ready-made scripts with extreme caution and at your own risk. The author wrote this code himself, exclusively for himself. I am sure that many experienced Linux users have similar tools of their own, and there is no need to copy someone else's.

Let's create a volume on the backup device:

#lvcreate -L 256G --name backup backup

Let's format it as BTRFS:

#mkfs.btrfs /dev/backup/backup

Let's create the mount points and mount the root subsections of the file systems:

#mkdir /backup
#mkdir /backup/btrfs
#mkdir /backup/btrfs/root
#mkdir /backup/btrfs/back
#ln -s /boot /backup/btrfs
# cat >>/etc/fstab << EOF

/dev/mapper/root-root /backup/btrfs/root btrfs defaults,space_cache,noatime,nodiratime 0 2
/dev/mapper/backup-backup /backup/btrfs/back btrfs defaults,space_cache,noatime,nodiratime 0 2
EOF
#mount -a
#update-initramfs -u
#update-grub

Let's create directories for the backups:

#mkdir /backup/btrfs/back/remote
#mkdir /backup/btrfs/back/remote/root
#mkdir /backup/btrfs/back/remote/boot

Let's create a directory for the backup scripts:

#mkdir /root/btrfs-backup

Let's copy the script:

A lot of scary bash code. Use at your own risk. Don't write angry letters to the author...
#cat >/root/btrfs-backup/btrfs-backup.sh << 'EOF'
#!/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

SCRIPT_FILE="$(realpath $0)"
SCRIPT_DIR="$(dirname $SCRIPT_FILE)"
SCRIPT_NAME="$(basename -s .sh $SCRIPT_FILE)"

LOCK_FILE="/dev/shm/$SCRIPT_NAME.lock"
DATE_PREFIX='%Y-%m-%d'
DATE_FORMAT=$DATE_PREFIX'-%H-%M-%S'
DATE_REGEX='[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]'
BASE_SUFFIX=".@base"
PEND_SUFFIX=".@pend"
SNAP_SUFFIX=".@snap"
MOUNTS="/backup/btrfs/"
BACKUPS="/backup/btrfs/back/remote/"

function terminate ()
{
echo "$1" >&2
exit 1
}

function wait_lock()
{
flock 98
}

function wait_lock_or_terminate()
{
echo "Wating for lock..."
wait_lock || terminate "Failed to get lock. Exiting..."
echo "Got lock..."
}

function suffix()
{
FORMATTED_DATE=$(date +"$DATE_FORMAT")
echo "$SNAP_SUFFIX.$FORMATTED_DATE"
}

function filter()
{
FORMATTED_DATE=$(date --date="$1" +"$DATE_PREFIX")
echo "$SNAP_SUFFIX.$FORMATTED_DATE"
}

function backup()
{
SOURCE_PATH="$MOUNTS$1"
TARGET_PATH="$BACKUPS$1"
SOURCE_BASE_PATH="$MOUNTS$1$BASE_SUFFIX"
TARGET_BASE_PATH="$BACKUPS$1$BASE_SUFFIX"
TARGET_BASE_DIR="$(dirname $TARGET_BASE_PATH)"
SOURCE_PEND_PATH="$MOUNTS$1$PEND_SUFFIX"
TARGET_PEND_PATH="$BACKUPS$1$PEND_SUFFIX"
if [ -d "$SOURCE_BASE_PATH" ] then
echo "$SOURCE_BASE_PATH found"
else
echo "$SOURCE_BASE_PATH File not found creating snapshot of $SOURCE_PATH to $SOURCE_BASE_PATH"
btrfs subvolume snapshot -r $SOURCE_PATH $SOURCE_BASE_PATH
sync
if [ -d "$TARGET_BASE_PATH" ] then
echo "$TARGET_BASE_PATH found out of sync with source... removing..."
btrfs subvolume delete -c $TARGET_BASE_PATH
sync
fi
fi
if [ -d "$TARGET_BASE_PATH" ] then
echo "$TARGET_BASE_PATH found"
else
echo "$TARGET_BASE_PATH not found. Synching to $TARGET_BASE_DIR"
btrfs send $SOURCE_BASE_PATH | btrfs receive $TARGET_BASE_DIR
sync
fi
if [ -d "$SOURCE_PEND_PATH" ] then
echo "$SOURCE_PEND_PATH found removing..."
btrfs subvolume delete -c $SOURCE_PEND_PATH
sync
fi
btrfs subvolume snapshot -r $SOURCE_PATH $SOURCE_PEND_PATH
sync
if [ -d "$TARGET_PEND_PATH" ] then
echo "$TARGET_PEND_PATH found removing..."
btrfs subvolume delete -c $TARGET_PEND_PATH
sync
fi
echo "Sending $SOURCE_PEND_PATH to $TARGET_PEND_PATH"
btrfs send -p $SOURCE_BASE_PATH $SOURCE_PEND_PATH | btrfs receive $TARGET_BASE_DIR
sync
TARGET_DATE_SUFFIX=$(suffix)
btrfs subvolume snapshot -r $TARGET_PEND_PATH "$TARGET_PATH$TARGET_DATE_SUFFIX"
sync
btrfs subvolume delete -c $SOURCE_BASE_PATH
sync
btrfs subvolume delete -c $TARGET_BASE_PATH
sync
mv $SOURCE_PEND_PATH $SOURCE_BASE_PATH
mv $TARGET_PEND_PATH $TARGET_BASE_PATH
sync
}

function list()
{
LIST_TARGET_BASE_PATH="$BACKUPS$1$BASE_SUFFIX"
LIST_TARGET_BASE_DIR="$(dirname $LIST_TARGET_BASE_PATH)"
LIST_TARGET_BASE_NAME="$(basename -s .$BASE_SUFFIX $LIST_TARGET_BASE_PATH)"
find "$LIST_TARGET_BASE_DIR" -maxdepth 1 -mindepth 1 -type d -printf "%fn" | grep "${LIST_TARGET_BASE_NAME/$BASE_SUFFIX/$SNAP_SUFFIX}.$DATE_REGEX"
}

function remove()
{
REMOVE_TARGET_BASE_PATH="$BACKUPS$1$BASE_SUFFIX"
REMOVE_TARGET_BASE_DIR="$(dirname $REMOVE_TARGET_BASE_PATH)"
btrfs subvolume delete -c $REMOVE_TARGET_BASE_DIR/$2
sync
}

function removeall()
{
DATE_OFFSET="$2"
FILTER="$(filter "$DATE_OFFSET")"
while read -r SNAPSHOT ; do
remove "$1" "$SNAPSHOT"
done < <(list "$1" | grep "$FILTER")

}

(
COMMAND="$1"
shift

case "$COMMAND" in
"--help")
echo "Help"
;;
"suffix")
suffix
;;
"filter")
filter "$1"
;;
"backup")
wait_lock_or_terminate
backup "$1"
;;
"list")
list "$1"
;;
"remove")
wait_lock_or_terminate
remove "$1" "$2"
;;
"removeall")
wait_lock_or_terminate
removeall "$1" "$2"
;;
*)
echo "None.."
;;
esac
) 98>$LOCK_FILE

EOF

What does it even do..? It contains a set of simple commands for creating BTRFS snapshots and copying them to another FS using BTRFS send/receive.

The first launch may take a while, because... at first, all the data will be copied. Subsequent launches will be very fast, because... only the changes get copied.

Another script, which we will put into cron:

Some more bash code
#cat >/root/btrfs-backup/cron-daily.sh << 'EOF'
#!/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

SCRIPT_FILE="$(realpath $0)"
SCRIPT_DIR="$(dirname $SCRIPT_FILE)"
SCRIPT_NAME="$(basename -s .sh $SCRIPT_FILE)"

BACKUP_SCRIPT="$SCRIPT_DIR/btrfs-backup.sh"
RETENTION="-60 day"
$BACKUP_SCRIPT backup root/@
$BACKUP_SCRIPT removeall root/@ "$RETENTION"
$BACKUP_SCRIPT backup root/@home
$BACKUP_SCRIPT removeall root/@home "$RETENTION"
$BACKUP_SCRIPT backup boot/
$BACKUP_SCRIPT removeall boot/ "$RETENTION"
EOF

What does it do..? It creates and synchronizes incremental snapshots of the listed BTRFS volumes onto the backup FS. After that, it deletes all the snapshots created 60 days ago. After a launch, date-stamped snapshots of the listed volumes appear in the /backup/btrfs/back/remote/ subdirectories.

Let's give the code execution rights:

#chmod +x /root/btrfs-backup/cron-daily.sh
#chmod +x /root/btrfs-backup/btrfs-backup.sh

Let's test it and put it into cron:

#/usr/bin/nice -n 19 /usr/bin/ionice -c 3 /root/btrfs-backup/cron-daily.sh 2>&1 | /usr/bin/logger -t btrfs-backup
#cat /var/log/syslog | grep btrfs-backup
#crontab -e
0 2 * * * /usr/bin/nice -n 19 /usr/bin/ionice -c 3 /root/btrfs-backup/cron-daily.sh 2>&1 | /usr/bin/logger -t btrfs-backup

LVM thin backup

Let's create a thin pool on the backup device:

#lvcreate -L 274877906944B --poolmetadataspare y --poolmetadatasize 4294967296B --chunksize 64k -Z y -T backup/thin-pool

Let's install ddrescue, because... the scripts will use this tool:

#apt-get install gddrescue

Let's create a directory for the scripts:

#mkdir /root/lvm-thin-backup

Let's copy the scripts:

Lots of bash inside...
#cat >/root/lvm-thin-backup/lvm-thin-backup.sh << 'EOF'
#!/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

SCRIPT_FILE="$(realpath $0)"
SCRIPT_DIR="$(dirname $SCRIPT_FILE)"
SCRIPT_NAME="$(basename -s .sh $SCRIPT_FILE)"

LOCK_FILE="/dev/shm/$SCRIPT_NAME.lock"
DATE_PREFIX='%Y-%m-%d'
DATE_FORMAT=$DATE_PREFIX'-%H-%M-%S'
DATE_REGEX='[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]'
BASE_SUFFIX=".base"
PEND_SUFFIX=".pend"
SNAP_SUFFIX=".snap"
BACKUPS="backup"
BACKUPS_POOL="thin-pool"

export LVM_SUPPRESS_FD_WARNINGS=1

function terminate ()
{
echo "$1" >&2
exit 1
}

function wait_lock()
{
flock 98
}

function wait_lock_or_terminate()
{
echo "Wating for lock..."
wait_lock || terminate "Failed to get lock. Exiting..."
echo "Got lock..."
}

function suffix()
{
FORMATTED_DATE=$(date +"$DATE_FORMAT")
echo "$SNAP_SUFFIX.$FORMATTED_DATE"
}

function filter()
{
FORMATTED_DATE=$(date --date="$1" +"$DATE_PREFIX")
echo "$SNAP_SUFFIX.$FORMATTED_DATE"
}

function read_thin_id {
lvs --rows --reportformat basic --quiet -othin_id "$1/$2" | awk '{print $2}'
}

function read_pool_lv {
lvs --rows --reportformat basic --quiet -opool_lv "$1/$2" | awk '{print $2}'
}

function read_lv_dm_path {
lvs --rows --reportformat basic --quiet -olv_dm_path "$1/$2" | awk '{print $2}'
}

function read_lv_active {
lvs --rows --reportformat basic --quiet -olv_active "$1/$2" | awk '{print $2}'
}

function read_lv_chunk_size {
lvs --rows --reportformat basic --quiet --units b --nosuffix -ochunk_size "$1/$2" | awk '{print $2}'
}

function read_lv_size {
lvs --rows --reportformat basic --quiet --units b --nosuffix -olv_size "$1/$2" | awk '{print $2}'
}

function activate_volume {
lvchange -ay -Ky "$1/$2"
}

function deactivate_volume {
lvchange -an "$1/$2"
}

function read_thin_metadata_snap {
dmsetup status "$1" | awk '{print $7}'
}

function thindiff()
{
DIFF_VG="$1"
DIFF_SOURCE="$2"
DIFF_TARGET="$3"
DIFF_SOURCE_POOL=$(read_pool_lv $DIFF_VG $DIFF_SOURCE)
DIFF_TARGET_POOL=$(read_pool_lv $DIFF_VG $DIFF_TARGET)

if [ "$DIFF_SOURCE_POOL" == "" ] then
(>&2 echo "Source LV is not thin.")
exit 1
fi

if [ "$DIFF_TARGET_POOL" == "" ] then
(>&2 echo "Target LV is not thin.")
exit 1
fi

if [ "$DIFF_SOURCE_POOL" != "$DIFF_TARGET_POOL" ] then
(>&2 echo "Source and target LVs belong to different thin pools.")
exit 1
fi

DIFF_POOL_PATH=$(read_lv_dm_path $DIFF_VG $DIFF_SOURCE_POOL)
DIFF_SOURCE_ID=$(read_thin_id $DIFF_VG $DIFF_SOURCE)
DIFF_TARGET_ID=$(read_thin_id $DIFF_VG $DIFF_TARGET)
DIFF_POOL_PATH_TPOOL="$DIFF_POOL_PATH-tpool"
DIFF_POOL_PATH_TMETA="$DIFF_POOL_PATH"_tmeta
DIFF_POOL_METADATA_SNAP=$(read_thin_metadata_snap $DIFF_POOL_PATH_TPOOL)

if [ "$DIFF_POOL_METADATA_SNAP" != "-" ] then
(>&2 echo "Thin pool metadata snapshot already exist. Assuming stale one. Will release metadata snapshot in 5 seconds.")
sleep 5
dmsetup message $DIFF_POOL_PATH_TPOOL 0 release_metadata_snap
fi

dmsetup message $DIFF_POOL_PATH_TPOOL 0 reserve_metadata_snap
DIFF_POOL_METADATA_SNAP=$(read_thin_metadata_snap $DIFF_POOL_PATH_TPOOL)

if [ "$DIFF_POOL_METADATA_SNAP" == "-" ] then
(>&2 echo "Failed to create thin pool metadata snapshot.")
exit 1
fi

#We keep output in variable because metadata snapshot need to be released early.
DIFF_DATA=$(thin_delta -m$DIFF_POOL_METADATA_SNAP --snap1 $DIFF_SOURCE_ID --snap2 $DIFF_TARGET_ID $DIFF_POOL_PATH_TMETA)

dmsetup message $DIFF_POOL_PATH_TPOOL 0 release_metadata_snap

echo $"$DIFF_DATA" | grep -E 'different|left_only|right_only' | sed 's/</"/g' | sed 's/ /"/g' | awk -F'"' '{print $6 "t" $8 "t" $11}' | sed 's/different/copy/g' | sed 's/left_only/copy/g' | sed 's/right_only/discard/g'

}

function thinsync()
{
SYNC_VG="$1"
SYNC_PEND="$2"
SYNC_BASE="$3"
SYNC_TARGET="$4"
SYNC_PEND_POOL=$(read_pool_lv $SYNC_VG $SYNC_PEND)
SYNC_BLOCK_SIZE=$(read_lv_chunk_size $SYNC_VG $SYNC_PEND_POOL)
SYNC_PEND_PATH=$(read_lv_dm_path $SYNC_VG $SYNC_PEND)

activate_volume $SYNC_VG $SYNC_PEND

while read -r SYNC_ACTION SYNC_OFFSET SYNC_LENGTH ; do
SYNC_OFFSET_BYTES=$((SYNC_OFFSET * SYNC_BLOCK_SIZE))
SYNC_LENGTH_BYTES=$((SYNC_LENGTH * SYNC_BLOCK_SIZE))
if [ "$SYNC_ACTION" == "copy" ] then
ddrescue --quiet --force --input-position=$SYNC_OFFSET_BYTES --output-position=$SYNC_OFFSET_BYTES --size=$SYNC_LENGTH_BYTES "$SYNC_PEND_PATH" "$SYNC_TARGET"
fi

if [ "$SYNC_ACTION" == "discard" ] then
blkdiscard -o $SYNC_OFFSET_BYTES -l $SYNC_LENGTH_BYTES "$SYNC_TARGET"
fi
done < <(thindiff "$SYNC_VG" "$SYNC_PEND" "$SYNC_BASE")
}

function discard_volume()
{
DISCARD_VG="$1"
DISCARD_LV="$2"
DISCARD_LV_PATH=$(read_lv_dm_path "$DISCARD_VG" "$DISCARD_LV")
if [ "$DISCARD_LV_PATH" != "" ] then
echo "$DISCARD_LV_PATH found"
else
echo "$DISCARD_LV not found in $DISCARD_VG"
exit 1
fi
DISCARD_LV_POOL=$(read_pool_lv $DISCARD_VG $DISCARD_LV)
DISCARD_LV_SIZE=$(read_lv_size "$DISCARD_VG" "$DISCARD_LV")
lvremove -y --quiet "$DISCARD_LV_PATH" || exit 1
lvcreate --thin-pool "$DISCARD_LV_POOL" -V "$DISCARD_LV_SIZE"B --name "$DISCARD_LV" "$DISCARD_VG" || exit 1
}

function backup()
{
SOURCE_VG="$1"
SOURCE_LV="$2"
TARGET_VG="$BACKUPS"
TARGET_LV="$SOURCE_VG-$SOURCE_LV"
SOURCE_BASE_LV="$SOURCE_LV$BASE_SUFFIX"
TARGET_BASE_LV="$TARGET_LV$BASE_SUFFIX"
SOURCE_PEND_LV="$SOURCE_LV$PEND_SUFFIX"
TARGET_PEND_LV="$TARGET_LV$PEND_SUFFIX"
SOURCE_BASE_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_BASE_LV")
SOURCE_PEND_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_PEND_LV")
TARGET_BASE_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_BASE_LV")
TARGET_PEND_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_PEND_LV")

if [ "$SOURCE_BASE_LV_PATH" != "" ] then
echo "$SOURCE_BASE_LV_PATH found"
else
echo "Source base not found creating snapshot of $SOURCE_VG/$SOURCE_LV to $SOURCE_VG/$SOURCE_BASE_LV"
lvcreate --quiet --snapshot --name "$SOURCE_BASE_LV" "$SOURCE_VG/$SOURCE_LV" || exit 1
SOURCE_BASE_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_BASE_LV")
activate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
echo "Discarding $SOURCE_BASE_LV_PATH as we need to bootstrap."
SOURCE_BASE_POOL=$(read_pool_lv $SOURCE_VG $SOURCE_BASE_LV)
SOURCE_BASE_CHUNK_SIZE=$(read_lv_chunk_size $SOURCE_VG $SOURCE_BASE_POOL)
discard_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
sync
if [ "$TARGET_BASE_LV_PATH" != "" ] then
echo "$TARGET_BASE_LV_PATH found out of sync with source... removing..."
lvremove -y --quiet $TARGET_BASE_LV_PATH || exit 1
TARGET_BASE_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_BASE_LV")
sync
fi
fi
SOURCE_BASE_SIZE=$(read_lv_size "$SOURCE_VG" "$SOURCE_BASE_LV")
if [ "$TARGET_BASE_LV_PATH" != "" ] then
echo "$TARGET_BASE_LV_PATH found"
else
echo "$TARGET_VG/$TARGET_LV not found. Creating empty volume."
lvcreate --thin-pool "$BACKUPS_POOL" -V "$SOURCE_BASE_SIZE"B --name "$TARGET_BASE_LV" "$TARGET_VG" || exit 1
echo "Have to rebootstrap. Discarding source at $SOURCE_BASE_LV_PATH"
activate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
SOURCE_BASE_POOL=$(read_pool_lv $SOURCE_VG $SOURCE_BASE_LV)
SOURCE_BASE_CHUNK_SIZE=$(read_lv_chunk_size $SOURCE_VG $SOURCE_BASE_POOL)
discard_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
TARGET_BASE_POOL=$(read_pool_lv $TARGET_VG $TARGET_BASE_LV)
TARGET_BASE_CHUNK_SIZE=$(read_lv_chunk_size $TARGET_VG $TARGET_BASE_POOL)
TARGET_BASE_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_BASE_LV")
echo "Discarding target at $TARGET_BASE_LV_PATH"
discard_volume "$TARGET_VG" "$TARGET_BASE_LV"
sync
fi
if [ "$SOURCE_PEND_LV_PATH" != "" ] then
echo "$SOURCE_PEND_LV_PATH found removing..."
lvremove -y --quiet "$SOURCE_PEND_LV_PATH" || exit 1
sync
fi
lvcreate --quiet --snapshot --name "$SOURCE_PEND_LV" "$SOURCE_VG/$SOURCE_LV" || exit 1
SOURCE_PEND_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_PEND_LV")
sync
if [ "$TARGET_PEND_LV_PATH" != "" ] then
echo "$TARGET_PEND_LV_PATH found removing..."
lvremove -y --quiet $TARGET_PEND_LV_PATH
sync
fi
lvcreate --quiet --snapshot --name "$TARGET_PEND_LV" "$TARGET_VG/$TARGET_BASE_LV" || exit 1
TARGET_PEND_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_PEND_LV")
SOURCE_PEND_LV_SIZE=$(read_lv_size "$SOURCE_VG" "$SOURCE_PEND_LV")
lvresize -L "$SOURCE_PEND_LV_SIZE"B "$TARGET_PEND_LV_PATH"
activate_volume "$TARGET_VG" "$TARGET_PEND_LV"
echo "Synching $SOURCE_PEND_LV_PATH to $TARGET_PEND_LV_PATH"
thinsync "$SOURCE_VG" "$SOURCE_PEND_LV" "$SOURCE_BASE_LV" "$TARGET_PEND_LV_PATH" || exit 1
sync

TARGET_DATE_SUFFIX=$(suffix)
lvcreate --quiet --snapshot --name "$TARGET_LV$TARGET_DATE_SUFFIX" "$TARGET_VG/$TARGET_PEND_LV" || exit 1
sync
lvremove --quiet -y "$SOURCE_BASE_LV_PATH" || exit 1
sync
lvremove --quiet -y "$TARGET_BASE_LV_PATH" || exit 1
sync
lvrename -y "$SOURCE_VG/$SOURCE_PEND_LV" "$SOURCE_BASE_LV" || exit 1
lvrename -y "$TARGET_VG/$TARGET_PEND_LV" "$TARGET_BASE_LV" || exit 1
sync
deactivate_volume "$TARGET_VG" "$TARGET_BASE_LV"
deactivate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
}

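# verify: byte-compare the source and target .base snapshots with cmp.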
function verify()
{
SOURCE_VG="$1"
SOURCE_LV="$2"
TARGET_VG="$BACKUPS"
TARGET_LV="$SOURCE_VG-$SOURCE_LV"
SOURCE_BASE_LV="$SOURCE_LV$BASE_SUFFIX"
TARGET_BASE_LV="$TARGET_LV$BASE_SUFFIX"
TARGET_BASE_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_BASE_LV")
SOURCE_BASE_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_BASE_LV")

if [ "$SOURCE_BASE_LV_PATH" != "" ] then
echo "$SOURCE_BASE_LV_PATH found"
else
echo "$SOURCE_BASE_LV_PATH not found"
exit 1
fi
if [ "$TARGET_BASE_LV_PATH" != "" ] then
echo "$TARGET_BASE_LV_PATH found"
else
echo "$TARGET_BASE_LV_PATH not found"
exit 1
fi
activate_volume "$TARGET_VG" "$TARGET_BASE_LV"
activate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
echo Comparing "$SOURCE_BASE_LV_PATH" with "$TARGET_BASE_LV_PATH"
cmp "$SOURCE_BASE_LV_PATH" "$TARGET_BASE_LV_PATH"
echo Done...
deactivate_volume "$TARGET_VG" "$TARGET_BASE_LV"
deactivate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
}

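# resync: walk the mismatches reported by cmp and re-copy the chunk containing each
# one with ddrescue until the volumes converge, avoiding a full re-copy.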
function resync()
{
SOURCE_VG="$1"
SOURCE_LV="$2"
TARGET_VG="$BACKUPS"
TARGET_LV="$SOURCE_VG-$SOURCE_LV"
SOURCE_BASE_LV="$SOURCE_LV$BASE_SUFFIX"
TARGET_BASE_LV="$TARGET_LV$BASE_SUFFIX"
TARGET_BASE_LV_PATH=$(read_lv_dm_path "$TARGET_VG" "$TARGET_BASE_LV")
SOURCE_BASE_LV_PATH=$(read_lv_dm_path "$SOURCE_VG" "$SOURCE_BASE_LV")

if [ "$SOURCE_BASE_LV_PATH" != "" ] then
echo "$SOURCE_BASE_LV_PATH found"
else
echo "$SOURCE_BASE_LV_PATH not found"
exit 1
fi
if [ "$TARGET_BASE_LV_PATH" != "" ] then
echo "$TARGET_BASE_LV_PATH found"
else
echo "$TARGET_BASE_LV_PATH not found"
exit 1
fi
activate_volume "$TARGET_VG" "$TARGET_BASE_LV"
activate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
SOURCE_BASE_POOL=$(read_pool_lv $SOURCE_VG $SOURCE_BASE_LV)
SYNC_BLOCK_SIZE=$(read_lv_chunk_size $SOURCE_VG $SOURCE_BASE_POOL)

echo Synchronizing "$SOURCE_BASE_LV_PATH" to "$TARGET_BASE_LV_PATH"

CMP_OFFSET=0
while [[ "$CMP_OFFSET" != "" ]] ; do
CMP_MISMATCH=$(cmp -i "$CMP_OFFSET" "$SOURCE_BASE_LV_PATH" "$TARGET_BASE_LV_PATH" | grep differ | awk '{print $5}' | sed 's/,//g' )
if [[ "$CMP_MISMATCH" != "" ]] ; then
CMP_OFFSET=$(( CMP_MISMATCH + CMP_OFFSET ))
SYNC_OFFSET_BYTES=$(( ( CMP_OFFSET / SYNC_BLOCK_SIZE ) * SYNC_BLOCK_SIZE ))
SYNC_LENGTH_BYTES=$(( SYNC_BLOCK_SIZE ))
echo "Synching $SYNC_LENGTH_BYTES bytes at $SYNC_OFFSET_BYTES from $SOURCE_BASE_LV_PATH to $TARGET_BASE_LV_PATH"
ddrescue --quiet --force --input-position=$SYNC_OFFSET_BYTES --output-position=$SYNC_OFFSET_BYTES --size=$SYNC_LENGTH_BYTES "$SOURCE_BASE_LV_PATH" "$TARGET_BASE_LV_PATH"
else
CMP_OFFSET=""
fi
done
echo Done...
deactivate_volume "$TARGET_VG" "$TARGET_BASE_LV"
deactivate_volume "$SOURCE_VG" "$SOURCE_BASE_LV"
}

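# list: print the dated backup snapshots that exist for the given source VG/LV.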
function list()
{
LIST_SOURCE_VG="$1"
LIST_SOURCE_LV="$2"
LIST_TARGET_VG="$BACKUPS"
LIST_TARGET_LV="$LIST_SOURCE_VG-$LIST_SOURCE_LV"
LIST_TARGET_BASE_LV="$LIST_TARGET_LV$SNAP_SUFFIX"
lvs -olv_name | grep "$LIST_TARGET_BASE_LV.$DATE_REGEX"
}

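# remove: delete a single snapshot from the backup VG.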
function remove()
{
REMOVE_TARGET_VG="$BACKUPS"
REMOVE_TARGET_LV="$1"
lvremove -y "$REMOVE_TARGET_VG/$REMOVE_TARGET_LV"
sync
}

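# removeall: delete every dated snapshot of the given source volume that matches
# the retention filter built from the date offset in $3 (see the filter function
# defined earlier).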
function removeall()
{
DATE_OFFSET="$3"
FILTER="$(filter "$DATE_OFFSET")"
while read -r SNAPSHOT ; do
remove "$SNAPSHOT"
done < <(list "$1" "$2" | grep "$FILTER")

}

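# Entry point: the dispatcher runs in a subshell that holds $LOCK_FILE open on file
# descriptor 98, which wait_lock_or_terminate (defined earlier) locks so that only
# one instance runs at a time.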
(
COMMAND="$1"
shift

case "$COMMAND" in
"--help")
echo "Help"
;;
"suffix")
suffix
;;
"filter")
filter "$1"
;;
"backup")
wait_lock_or_terminate
backup "$1" "$2"
;;
"list")
list "$1" "$2"
;;
"thindiff")
thindiff "$1" "$2" "$3"
;;
"thinsync")
thinsync "$1" "$2" "$3" "$4"
;;
"verify")
wait_lock_or_terminate
verify "$1" "$2"
;;
"resync")
wait_lock_or_terminate
resync "$1" "$2"
;;
"remove")
wait_lock_or_terminate
remove "$1"
;;
"removeall")
wait_lock_or_terminate
removeall "$1" "$2" "$3"
;;
*)
echo "None.."
;;
esac
) 98>$LOCK_FILE

EOF

What does it do? It contains a set of commands for manipulating thin snapshots and for synchronizing the difference between two thin snapshots, obtained via thin_delta, to another block device using ddrescue and blkdiscard.
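
To make the parsing pipeline in thindiff easier to follow, here is a hedged sketch of what it consumes and produces. The thin_delta XML below is invented for illustration; offsets and lengths are counted in pool chunks, which thinsync later multiplies by the chunk size to get byte positions:

<superblock uuid="" time="2" transaction="5" data_block_size="128" nr_data_blocks="524288">
  <diff left_time="1" right_time="2">
    <same begin="0" length="64"/>
    <different begin="64" length="2"/>
    <left_only begin="66" length="1"/>
    <right_only begin="67" length="3"/>
  </diff>
</superblock>

After the grep/sed/awk pipeline (the awk field numbers rely on the four-space indentation of the range lines), this becomes the tab-separated action list that thinsync reads:

copy	64	2
copy	66	1
discard	67	3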

Another script, which we will put in cron:

#cat > /root/lvm-thin-backup/cron-daily.sh << 'EOF'
#!/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

SCRIPT_FILE="$(realpath $0)"
SCRIPT_DIR="$(dirname $SCRIPT_FILE)"
SCRIPT_NAME="$(basename -s .sh $SCRIPT_FILE)"

BACKUP_SCRIPT="$SCRIPT_DIR/lvm-thin-backup.sh"
RETENTION="-60 days"

$BACKUP_SCRIPT backup images linux-dev
$BACKUP_SCRIPT backup images win8
$BACKUP_SCRIPT backup images win8-data
#etc

$BACKUP_SCRIPT removeall images linux-dev "$RETENTION"
$BACKUP_SCRIPT removeall images win8 "$RETENTION"
$BACKUP_SCRIPT removeall images win8-data "$RETENTION"
#etc

EOF

What does it do? It uses the previous script to create and synchronize backups of the listed thin volumes. The script will leave inactive snapshots of the listed volumes, which are needed to track changes since the last synchronization.

This script needs to be edited to specify the list of thin volumes for which backup copies should be made; the names given are purely illustrative. If you wish, you can write a script that synchronizes all volumes, as in the sketch below.
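
A minimal sketch of such an "all volumes" wrapper, under the assumption that thin volumes are exactly the LVs that report a pool_lv, and that the .base/.pend service snapshots must be skipped (all names here are illustrative):

#!/bin/bash
# Hypothetical helper: back up every thin volume in the images VG.
VG="images"
BACKUP_SCRIPT="/root/lvm-thin-backup/lvm-thin-backup.sh"
while read -r LV POOL ; do
# Skip non-thin LVs (no pool) and the service snapshots of the backup script.
[ -z "$POOL" ] && continue
case "$LV" in
*.base|*.pend) continue ;;
esac
"$BACKUP_SCRIPT" backup "$VG" "$LV"
done < <(lvs --noheadings -o lv_name,pool_lv "$VG")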

Let's make them executable:

#chmod +x /root/lvm-thin-backup/cron-daily.sh
#chmod +x /root/lvm-thin-backup/lvm-thin-backup.sh

Let's test it and put it in cron:

#/usr/bin/nice -n 19 /usr/bin/ionice -c 3 /root/lvm-thin-backup/cron-daily.sh 2>&1 | /usr/bin/logger -t lvm-thin-backup
#cat /var/log/syslog | grep lvm-thin-backup
#crontab -e
0 3 * * * /usr/bin/nice -n 19 /usr/bin/ionice -c 3 /root/lvm-thin-backup/cron-daily.sh 2>&1 | /usr/bin/logger -t lvm-thin-backup

The first run will take a while, because the thin volumes will be synchronized by copying all of the space in use. Thanks to LVM thin metadata, we know which blocks are actually in use, so only the used blocks of the thin volumes are copied.

Subsequent runs will copy the data incrementally, thanks to change tracking via LVM thin metadata.
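
Restoring is outside the scope of these scripts, but the same tools work in reverse. A hedged sketch, using names from the listings below and assuming the hypothetical guest that uses the volume is stopped first:

#lvchange -ay -K backup/images-win8.snap.2020-03-17-04-51-46
#ddrescue --force /dev/backup/images-win8.snap.2020-03-17-04-51-46 /dev/images/win8
#lvchange -an backup/images-win8.snap.2020-03-17-04-51-46

The -K flag activates the snapshot even though thin snapshots carry the activation-skip flag by default.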

Let's see what we get:

#time /root/btrfs-backup/cron-daily.sh
real 0m2,967s
user 0m0,225s
sys 0m0,353s

#time /root/lvm-thin-backup/cron-daily.sh
real 1m2,710s
user 0m12,721s
sys 0m6,671s

#ls -al /backup/btrfs/back/remote/*
/backup/btrfs/back/remote/boot:
total 0
drwxr-xr-x 1 root root 1260 ΠΌΠ°Ρ€ 26 09:11 .
drwxr-xr-x 1 root root 16 ΠΌΠ°Ρ€ 6 09:30 ..
drwxr-xr-x 1 root root 322 ΠΌΠ°Ρ€ 26 02:00 .@base
drwxr-xr-x 1 root root 516 ΠΌΠ°Ρ€ 6 09:39 [email protected]
drwxr-xr-x 1 root root 516 ΠΌΠ°Ρ€ 6 09:39 [email protected]
...
/backup/btrfs/back/remote/root:
total 0
drwxr-xr-x 1 root root 2820 ΠΌΠ°Ρ€ 26 09:11 .
drwxr-xr-x 1 root root 16 ΠΌΠ°Ρ€ 6 09:30 ..
drwxr-xr-x 1 root root 240 ΠΌΠ°Ρ€ 26 09:11 @.@base
drwxr-xr-x 1 root root 22 ΠΌΠ°Ρ€ 26 09:11 @home.@base
drwxr-xr-x 1 root root 22 ΠΌΠ°Ρ€ 6 09:39 @[email protected]
drwxr-xr-x 1 root root 22 ΠΌΠ°Ρ€ 6 09:39 @[email protected]
...
drwxr-xr-x 1 root root 240 ΠΌΠ°Ρ€ 6 09:39 @[email protected]
drwxr-xr-x 1 root root 240 ΠΌΠ°Ρ€ 6 09:39 @[email protected]
...

#lvs -olv_name,lv_size images && lvs -olv_name,lv_size backup
LV LSize
linux-dev 128,00g
linux-dev.base 128,00g
thin-pool 1,38t
win8 128,00g
win8-data 2,00t
win8-data.base 2,00t
win8.base 128,00g
LV LSize
backup 256,00g
images-linux-dev.base 128,00g
images-linux-dev.snap.2020-03-08-10-09-11 128,00g
images-linux-dev.snap.2020-03-08-10-09-25 128,00g
...
images-win8-data.base 2,00t
images-win8-data.snap.2020-03-16-14-11-55 2,00t
images-win8-data.snap.2020-03-16-14-19-50 2,00t
...
images-win8.base 128,00g
images-win8.snap.2020-03-17-04-51-46 128,00g
images-win8.snap.2020-03-18-03-02-49 128,00g
...
thin-pool <2,09t

What does this have to do with nesting dolls?

Most likely you have guessed: an LVM logical volume (LV) can itself serve as an LVM physical volume (PV) for another VG. LVM can be nested recursively, like matryoshka dolls, which gives it tremendous flexibility. A minimal sketch follows below.
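
A minimal sketch of such nesting (all names here are invented):

#lvcreate -L 32G -n nested-pv vg-outer
#pvcreate /dev/vg-outer/nested-pv
#vgcreate vg-inner /dev/vg-outer/nested-pv
#lvcreate -L 16G -n inner-lv vg-inner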

PS

In the next article, we will try to use several similar mobile storage/KVM systems as the basis for building a geo-distributed storage/VM cluster with redundancy across several continents, using home desktops, home Internet and P2P networks.

Source: www.habr.com
