Lots of free RAM, an Intel P4500 NVMe drive, and everything is extremely slow - the story of an unsuccessful swap partition addition

In this article I will describe a situation that recently occurred on one of the servers in our VPS cloud and left me stumped for several hours. I have been configuring and troubleshooting Linux servers for about 15 years, but this case does not fit my experience at all: I made several false assumptions and got a little desperate before I managed to identify the cause of the problem and fix it.

Introduction

We operate a medium-sized cloud built on standard servers with the following configuration: 32 cores, 256 GB of RAM and a 4 TB PCI-E Intel P4500 NVMe drive. We really like this configuration because it removes the need to worry about IO: it is enough to set the right limit at the VM instance type level. Because the Intel P4500 NVMe has impressive performance, we can give the machines their full IOPS allocation and copy backups to the backup server at the same time, with zero IOWAIT.

We are among those old believers who do not use hyperconverged SDN and other stylish, fashionable, youthful things to store VM volumes. We believe that the simpler the system, the easier it is to troubleshoot when "the main guru has gone to the mountains." As a result, we store VM volumes in QCOW2 format on XFS or EXT4, deployed on top of LVM2.

We are also forced to use QCOW2 by the product we use for orchestration, Apache CloudStack.

To make a backup, we take a full image of the volume as an LVM2 snapshot (yes, we know that LVM2 snapshots are slow, but the Intel P4500 helps us here too). We run lvcreate -s .. and use dd to send the backup copy to a remote server with ZFS storage. Here we are still slightly progressive: ZFS can store the data in compressed form, and we can quickly restore it with dd or extract individual VM volumes using mount -o loop ....

Of course, instead of copying the full image of the LVM2 volume, you can mount the file system read-only (RO) and copy the QCOW2 images themselves. However, we found that XFS reacted badly to this, and not immediately but at unpredictable moments. We really do not like it when hypervisor hosts suddenly "hang" on weekends, at night or on holidays because of errors whose timing nobody can predict. Therefore, on XFS we do not mount snapshots RO to extract volumes; we simply copy the entire LVM2 volume.

In our case the speed of copying to the backup server is determined by the performance of the backup server itself, which is about 600-800 MB/s for incompressible data; the other limit is the 10 Gbit/s link that connects the backup server to the cluster.

Backups from 8 hypervisor servers are uploaded to one backup server simultaneously. As a result, the disk and network subsystems of the backup server, being slower, keep the disk subsystems of the hypervisor hosts from being overloaded: the backup server simply cannot absorb, say, 8 GB/s, which the hypervisor hosts could easily produce.
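The arithmetic behind this throttling effect can be sketched in a few lines of shell. The figures are the ones quoted above (8 hosts, 600-800 MB/s, a 10 Gbit/s link); the even split between hosts and the helper name per_host_mbs are simplifying assumptions made for illustration only.

```shell
#!/bin/bash
# Rough per-hypervisor backup throughput, ASSUMING the backup server's
# bandwidth is shared evenly between all hosts copying at the same time.

# per_host_mbs TOTAL_MB_PER_S HOSTS -> integer MB/s available to each host
per_host_mbs() {
    echo $(( $1 / $2 ))
}

# 10 Gbit/s expressed in MB/s (decimal units, protocol overhead ignored)
link_mbs=$(( 10 * 1000 / 8 ))

echo "link ceiling:   ${link_mbs} MB/s"
echo "per host, low:  $(per_host_mbs 600 8) MB/s"
echo "per host, high: $(per_host_mbs 800 8) MB/s"
```

Under these assumptions each hypervisor gets roughly 75-100 MB/s, far below what a single NVMe drive can deliver.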

The copying process described above matters a great deal for the rest of the story, including its details: the fast Intel P4500 drive, the use of NFS and, probably, the use of ZFS.

The backup story

Each hypervisor node has a small 8 GB SWAP partition, and we "roll out" the hypervisor node itself with dd from a reference image. For the system volume on the servers we use 2xSATA SSD RAID1 or 2xSAS HDD RAID1 on an LSI or HP hardware controller. In general, we do not care at all what is inside, since our system volume works in "almost read-only" mode, apart from SWAP. And since the servers have plenty of RAM and 30-40% of it is free, we do not think about SWAP.

The backup process. The job looks something like this:

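#!/bin/bash

# Backup target directory (the backup server's storage is reached over NFS)
mkdir -p /mnt/backups/volumes

DIR=/mnt/images-snap
VOL=images/volume
DATE=$(date "+%d")          # day of the month, 01-31
HOSTNAME=$(hostname)

# Snapshot the volume, giving the snapshot all free space in the volume group
lvcreate -s -n $VOL-snap -l100%FREE $VOL
# Copy the snapshot to the backup server, reading with direct IO
ionice -c3 dd iflag=direct if=/dev/$VOL-snap bs=1M of=/mnt/backups/volumes/$HOSTNAME-$DATE.raw
# Drop the snapshot once the copy has finished
lvremove -f $VOL-snap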

Note ionice -c3. In fact, it is completely useless on NVMe devices, because their IO scheduler is set to:

cat /sys/block/nvme0n1/queue/scheduler
[none] 

However, we have a number of legacy nodes with ordinary SSD RAIDs where it is relevant, so the script is carried over AS IS. All in all, it is just a curious piece of code that illustrates how pointless ionice is with such a configuration.
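To see on which nodes ionice can have any effect, the active scheduler of every block device can be listed. This is a minimal sketch; the helper names active_scheduler and list_schedulers are hypothetical, and the bracketed-entry format is the one shown in the output above.

```shell
#!/bin/bash
# Print the active I/O scheduler for each block device.
# sysfs marks the active scheduler with square brackets,
# e.g. "[none]" or "mq-deadline kyber [bfq] none".

# active_scheduler "LINE" -> the bracketed entry, without the brackets
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# list_schedulers -> one "device: scheduler" line per block device
list_schedulers() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}      # strip the sysfs prefix
        dev=${dev%%/*}            # keep only the device name
        echo "$dev: $(active_scheduler "$(cat "$f")")"
    done
}
```

Calling list_schedulers on a node prints one line per device; devices reporting none have no scheduler that could act on the idle class requested by ionice -c3.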

Note the iflag=direct flag for dd. We read with direct IO, bypassing the buffer cache, to avoid needlessly evicting IO buffers while reading. However, we do not use oflag=direct, because we ran into ZFS performance problems when using it.

We have been using this scheme successfully for several years without any problems.

And then it began... We discovered that one of the nodes was no longer being backed up, and that the previous backup had run with a monstrous IOWAIT of 50%. When we tried to understand why the copy was not happening, we ran into the following:

Volume group "images" not found

We started to think that "the end has come for the Intel P4500." However, before shutting the server down to replace the drive, we still had to make a backup. We repaired LVM2 by restoring the metadata from an LVM2 metadata backup:

vgcfgrestore images

We started the backup and saw this oil painting:
[Screenshot omitted]

Once again we were very sad. It was clear that we could not live like this, because all the VPSs would suffer, which means we would suffer too. What had happened was completely unclear: iostat showed pitiful IOPS and a very high IOWAIT. There were no ideas other than "let's replace the NVMe," but the insight came just in time.

Step-by-step analysis of the situation

Historical background. A few days earlier, a large VPS with 128 GB of RAM had to be created on this server. There seemed to be enough memory, but to be on the safe side we allocated another 32 GB for a swap partition. The VPS was created, completed its task successfully, and the incident was forgotten, but the SWAP partition remained.

Configuration details. On all cloud servers the vm.swappiness parameter was set to the default value of 60. And the SWAP was created on SAS HDD RAID1.
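Both facts can be checked on a node in a few seconds. The sketch below reads the kernel's own files; the helper names swap_summary and show_swap_config are hypothetical, and the optional file argument exists only so the parser can be tried on sample data.

```shell
#!/bin/bash
# Show the current swappiness and where swap actually lives.
# swap_summary reads a file in /proc/swaps format (sizes are in KiB).

# swap_summary [FILE] -> one line per swap area: device, total and used MiB
swap_summary() {
    awk 'NR > 1 { printf "%s %d MiB total, %d MiB used\n", $1, $3 / 1024, $4 / 1024 }' "${1:-/proc/swaps}"
}

# show_swap_config -> swappiness plus the swap areas of the running system
show_swap_config() {
    echo "vm.swappiness = $(cat /proc/sys/vm/swappiness)"
    swap_summary
}
```

Running show_swap_config lists every active swap area; comparing the device names with the ROTA column of lsblk -d -o NAME,ROTA shows whether any of them sits on rotational storage.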

What happened (our version of events). During the backup, dd produced a large amount of write data, which was placed in RAM buffers before being written to NFS. The kernel, guided by the swappiness policy, moved many pages of VPS memory to the swap area, which sat on the slow HDD RAID1 volume. As a result IOWAIT grew very strongly, not because of NVMe IO but because of HDD RAID1 IO.
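Swap traffic of this kind is visible in the kernel counters even when the NVMe drive looks idle. The following is a minimal sketch that samples the pswpin and pswpout counters from /proc/vmstat; the helper names and the 5-second default interval are assumptions made for illustration.

```shell
#!/bin/bash
# Measure swap traffic over an interval from the pswpin/pswpout
# counters (pages swapped in/out since boot) in /proc/vmstat.

# swap_counters [FILE] -> "pswpin pswpout"
swap_counters() {
    awk 'BEGIN { i = 0; o = 0 }
         $1 == "pswpin"  { i = $2 }
         $1 == "pswpout" { o = $2 }
         END { print i, o }' "${1:-/proc/vmstat}"
}

# swap_rate [SECONDS] -> pages swapped in and out during the interval
swap_rate() {
    secs=${1:-5}
    before=$(swap_counters)
    sleep "$secs"
    after=$(swap_counters)
    set -- $before $after     # $1,$2 = counters before; $3,$4 = counters after
    echo "swapped in: $(( $3 - $1 )) pages, swapped out: $(( $4 - $2 )) pages in ${secs}s"
}
```

Running swap_rate while a backup is in progress shows whether pages are moving to or from swap; the si and so columns of vmstat 1 report the same activity.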

How the problem was solved. The 32 GB swap partition was disabled. This took 16 hours; why SWAP turns off so slowly is a topic you can read about separately. The swappiness setting was changed to 5 across the whole cloud.
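The fix can be sketched roughly as follows. The sysctl.d file name, the helper names and the swap device argument are hypothetical; only the value 5 and the fact that the extra swap was disabled come from the story above.

```shell
#!/bin/bash
# Sketch of the fix: lower swappiness, persist it, disable the extra swap.
# ASSUMPTIONS: the sysctl.d file name and the swap device are placeholders.

# swap_used_kib [FILE] -> KiB of swap still in use, from /proc/meminfo
swap_used_kib() {
    awk '$1 == "SwapTotal:" { t = $2 } $1 == "SwapFree:" { f = $2 } END { print t - f }' "${1:-/proc/meminfo}"
}

# write_swappiness_conf VALUE FILE -> persist the setting across reboots
write_swappiness_conf() {
    printf 'vm.swappiness = %s\n' "$1" > "$2"
}

# apply_fix SWAP_DEVICE -> run as root on the affected node
apply_fix() {
    sysctl -w vm.swappiness=5
    write_swappiness_conf 5 /etc/sysctl.d/99-swappiness.conf
    swapoff "$1" &                        # may run for hours, as it did here
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do  # report progress until swapoff exits
        echo "swap still in use: $(swap_used_kib) KiB"
        sleep 60
    done
}
```

Nothing runs until apply_fix is called with the device of the swap partition to remove; the progress loop simply makes a long swapoff less of a black box.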

How could this have been avoided? First, if the SWAP had been on an SSD RAID or an NVMe device; second, if there had been no NVMe device at all, but a slower one that could not produce such a volume of data. Ironically, the problem happened because the NVMe drive is too fast.

After that, everything started working as before, with zero IOWAIT.

Source: www.habr.com
