ZFS Basics: Storage and Performance

This spring we already covered a few introductory topics, for example how to check the speed of your drives and what RAID is. In the second of those, we even promised to keep studying the performance of various multi-disk topologies in ZFS. ZFS is a next-generation file system that is now being deployed everywhere, from Apple to Ubuntu.

Well, today is the best day to get acquainted with ZFS, curious readers. Just know that in the humble opinion of OpenZFS developer Matt Ahrens, "it's really hard."

But before we get to the numbers (and they will come, I promise) for every way of configuring eight disks under ZFS, we need to talk about how ZFS stores data on disk in general.

Zpool, vdev, and device

This full pool diagram includes three auxiliary vdevs, one of each class, and four RAIDz2 storage vdevs

There is usually no reason to build a pool out of mismatched vdev types and sizes, but nothing stops you from doing so if you want to.

To really understand the ZFS file system, you need to look closely at its actual structure. First, ZFS unifies the traditional layers of volume management and file system. Second, it uses a transactional copy-on-write mechanism. These features mean that the system is structurally very different from conventional file systems and RAID arrays. The first set of basic building blocks to understand is the storage pool (zpool), the virtual device (vdev), and the real device (device).

zpool

The zpool storage pool is the topmost ZFS structure. Each pool contains one or more virtual devices. In turn, each of those contains one or more real devices (device). Pools are self-contained units. One physical computer may contain two or more separate pools, but each is completely independent of the others. Pools cannot share virtual devices.

ZFS redundancy lives at the virtual device level, not at the pool level. There is absolutely no redundancy at the pool level: if any storage vdev or SPECIAL vdev is lost, the entire pool is lost with it.

Modern storage pools can survive the loss of a CACHE or LOG virtual device, although they may lose a small amount of dirty data if they lose a LOG vdev during a power outage or system crash.

There is a common misconception that ZFS "stripes" data across the whole pool. This is not true. A zpool is not a funny-looking RAID0 at all; it is a funny-looking JBOD with a complex, variable distribution mechanism.

For the most part, writes are distributed among the available virtual devices according to their free space, so in theory they will all fill up at the same time. In more recent versions of ZFS, current vdev utilization is also taken into account: if one virtual device is significantly busier than another (for example, because of read load), it will be temporarily skipped for writes even though it has the highest ratio of free space.

The utilization-awareness mechanism built into modern ZFS write allocation can reduce latency and increase throughput during periods of unusually high load, but it is not carte blanche for arbitrarily mixing slow HDDs and fast SSDs in one pool. Such a mismatched pool will still operate at the speed of its slowest device, that is, as if it were composed entirely of such devices.

vdev

Each storage pool consists of one or more virtual devices (vdevs). In turn, each vdev consists of one or more real devices. Most virtual devices are used for plain data storage, but there are several auxiliary vdev classes, including CACHE, LOG, and SPECIAL. Each of these vdev types can have one of five topologies: single device, RAIDz1, RAIDz2, RAIDz3, or mirror.

RAIDz1, RAIDz2, and RAIDz3 are special varieties of what old-timers would call double (diagonal) parity RAID. The 1, 2, and 3 refer to how many parity blocks are allocated to each data stripe. Instead of dedicating separate disks to parity, RAIDz virtual devices distribute that parity semi-evenly across the disks. A RAIDz array can lose as many disks as it has parity blocks; if it loses one more, it fails and takes the storage pool down with it.

In mirror virtual devices (mirror vdevs), each block is stored on every device in the vdev. Although two-wide mirrors are the most common, a mirror can contain any arbitrary number of devices; three-wide mirrors are often used in larger setups for better read performance and fault tolerance. A mirror vdev can survive any failure as long as at least one device in the vdev keeps working.

Single-device vdevs are inherently dangerous. Such a virtual device will not survive a single failure, and if it is used as a storage or SPECIAL vdev, its failure destroys the entire pool. Be very, very careful here.

CACHE, LOG, and SPECIAL vdevs can be created with any of the topologies above, but remember that losing a SPECIAL vdev means losing the pool, so a redundant topology is strongly recommended.
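
The survival rules described in this section can be sketched as a small model. This is only an illustration of the rules as stated in the text, not how ZFS itself tracks device state; the names `Vdev`, `tolerated_failures`, and `pool_survives` are hypothetical.

```python
from dataclasses import dataclass


def tolerated_failures(topology: str, n_devices: int) -> int:
    """How many member devices a vdev of this topology can lose."""
    if topology == "single":
        return 0
    if topology == "mirror":
        # A mirror survives as long as one device keeps working.
        return n_devices - 1
    if topology in ("raidz1", "raidz2", "raidz3"):
        # RAIDz tolerates as many losses as it has parity blocks.
        return int(topology[-1])
    raise ValueError(f"unknown topology: {topology}")


@dataclass
class Vdev:
    vdev_class: str        # "storage", "special", "log", or "cache"
    topology: str
    n_devices: int
    failed_devices: int = 0

    def alive(self) -> bool:
        return self.failed_devices <= tolerated_failures(self.topology, self.n_devices)


def pool_survives(vdevs: list[Vdev]) -> bool:
    # No pool-level redundancy: losing any storage or SPECIAL vdev loses the pool.
    # Losing a CACHE or LOG vdev is not fatal to the pool.
    return all(v.alive() for v in vdevs if v.vdev_class in ("storage", "special"))


pool = [Vdev("storage", "raidz2", 8, failed_devices=2),
        Vdev("log", "single", 1, failed_devices=1)]
print(pool_survives(pool))   # True: RAIDz2 tolerates two losses, the LOG loss is not fatal
```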

device

This is probably the easiest term to understand in ZFS: it is literally a random-access block device. Remember that virtual devices are made up of individual devices, while a pool is made up of virtual devices.

Disks, whether magnetic or solid-state, are the most common block devices used as vdev building blocks. However, any device with a descriptor in /dev will do, so entire hardware RAID arrays can be used as individual devices.

A simple raw file is one of the most important alternative block devices a vdev can be built from. Test pools made of sparse files are a very handy way to try out pool commands and to see how much space a pool or virtual device of a given topology provides.

You can create a test pool from sparse files in just a few seconds, but don't forget to delete the whole pool and its components afterward.

Say you want to build a server with eight disks and plan to use 10 TB disks (~9300 GiB), but you are not sure which topology best suits your needs. In the example above, we build a test pool from sparse files in seconds, and now we know that a RAIDz2 vdev of eight 10 TB disks provides 50 TiB of usable capacity.
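
As a rough cross-check of that figure, here is a back-of-the-envelope sketch. It assumes the simplest possible model (data disks times disk size) and deliberately ignores whatever overhead ZFS itself subtracts, so it gives an upper bound rather than the number a real pool reports; the function name is hypothetical.

```python
GIB = 2**30
TIB = 2**40


def raidz_naive_capacity_tib(n_disks: int, parity: int, disk_bytes: int) -> float:
    """Upper-bound usable capacity in TiB: (disks - parity) * disk size."""
    if not 0 < parity < n_disks:
        raise ValueError("parity must be between 1 and n_disks - 1")
    return (n_disks - parity) * disk_bytes / TIB


disk = 10 * 10**12                                      # a "10 TB" disk, in bytes
print(round(disk / GIB))                                # 9313 GiB per disk
print(round(raidz_naive_capacity_tib(8, 2, disk), 1))   # 54.6 TiB upper bound
```

The naive bound comes out above the 50 TiB the test pool reports. The gap is overhead this formula does not model, which is exactly why building a throwaway pool from sparse files is the more trustworthy check.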

Another special class of device is SPARE. Hot-spare devices, unlike ordinary devices, belong to the entire pool rather than to a single virtual device. If a device in some vdev in the pool fails and a spare is attached to the pool and available, the spare automatically joins the affected vdev.

Once attached to the affected vdev, the spare begins receiving copies or reconstructions of the data that should be on the missing device. In traditional RAID this is called rebuilding, whereas in ZFS it is called resilvering.

It is important to note that spare devices do not permanently replace failed devices. They are only a temporary stand-in that shortens the time the vdev runs degraded. After the administrator replaces the vdev's failed device, redundancy is restored onto that permanent device, and the SPARE is detached from the vdev and goes back to being a spare for the whole pool.

Datasets, blocks, and sectors

The next set of building blocks on our ZFS journey has less to do with hardware and more to do with how the data itself is organized and stored. We are skipping a few levels here, such as the metaslab, so as not to pile on detail while we keep hold of the overall structure.

Dataset

When we first create a dataset, it shows all of the available pool space. Then we set a quota and change the mount point. Magic!

A zvol is for the most part just a dataset stripped of its file system layer, which we replace here with a perfectly ordinary ext4 file system.

A ZFS dataset is roughly the same as a standard mounted file system. Like a regular file system, at first glance it looks like "just another folder." But also like regular mounted file systems, each ZFS dataset has its own set of basic properties.

First of all, a dataset can have a quota assigned to it. If you run zfs set quota=100G poolname/datasetname, you will not be able to write more than 100 GiB to the mounted folder /poolname/datasetname.

Notice the presence, and absence, of slashes at the beginning of each path? Each dataset has its own place in both the ZFS hierarchy and the system mount hierarchy. There is no leading slash in the ZFS hierarchy: you start with the pool name and then follow the path from one dataset to the next. For example, pool/parent/child is a dataset named child under the parent dataset parent in a pool creatively named pool.

By default, a dataset's mount point is equivalent to its name in the ZFS hierarchy with a leading slash: the pool named pool is mounted at /pool, the dataset parent is mounted at /pool/parent, and the child dataset child is mounted at /pool/parent/child. However, a dataset's system mount point can be changed.

If we specify zfs set mountpoint=/lol pool/parent/child, then the dataset pool/parent/child is mounted on the system as /lol.
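
The naming rule above is simple enough to write down. The following is a minimal sketch of that rule only; `mount_path` is a hypothetical helper, and it does not model how child datasets inherit a changed mount point from their parent.

```python
from typing import Optional


def mount_path(dataset: str, mountpoint: Optional[str] = None) -> str:
    """Where a dataset ends up mounted, per the default rule described above."""
    if dataset.startswith("/"):
        raise ValueError("names in the ZFS hierarchy have no leading slash")
    # An explicitly set mountpoint property overrides the default.
    return mountpoint if mountpoint is not None else "/" + dataset


print(mount_path("pool"))                        # /pool
print(mount_path("pool/parent/child"))           # /pool/parent/child
print(mount_path("pool/parent/child", "/lol"))   # /lol
```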

In addition to datasets, we should mention volumes (zvols). A volume is roughly the same as a dataset, except that it has no file system of its own; it is just a block device. You can, for example, create a zvol named mypool/myzvol, format it with an ext4 file system, and then mount that file system: you now have an ext4 file system, but with all the safety features of ZFS! This may seem silly on a single machine, but it makes a lot of sense as a backend when exporting an iSCSI device.

Blocks

A file is represented by one or more blocks. Each block is stored on a single virtual device. The block size is usually equal to the recordsize parameter, but it can be reduced to 2^ashift if the block contains metadata or a small file.

We really, really are not kidding about the huge performance penalty if you set ashift too low

In a ZFS pool, all data, including metadata, is stored in blocks. The maximum block size for each dataset is defined by the recordsize property. The record size can be changed, but doing so does not change the size or location of any blocks already written to the dataset; it only affects new blocks as they are written.

Unless otherwise specified, the current default record size is 128 KiB. It is a kind of awkward compromise in which performance is not ideal, but it is not terrible in most cases either. recordsize can be set to any value from 4K to 1M (with advanced settings recordsize can be set even larger, but this is rarely a good idea).

Any given block holds the data of only one file; you cannot cram two different files into one block. Each file consists of one or more blocks, depending on its size. If a file is smaller than the record size, it is stored in a smaller block: for example, a block holding a 2 KiB file occupies only one 4 KiB sector on disk.

If a file is large enough to require several blocks, then all of that file's records are recordsize in length, including the last record, most of which may be unused space.
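
The two cases can be put into a few lines of arithmetic. This is a simplified sketch of the rule exactly as described here: it assumes no compression and ignores metadata and RAIDz parity, and `blocks_for_file` is a hypothetical name.

```python
import math


def blocks_for_file(file_bytes: int, recordsize: int = 128 * 1024, ashift: int = 12):
    """Return (number_of_blocks, bytes_allocated) for one uncompressed file."""
    if file_bytes <= 0:
        raise ValueError("file_bytes must be positive")
    sector = 1 << ashift
    if file_bytes <= recordsize:
        # A small file gets a single block, rounded up to whole sectors.
        return 1, math.ceil(file_bytes / sector) * sector
    # A large file is stored in full-size records, including the last one.
    n_blocks = math.ceil(file_bytes / recordsize)
    return n_blocks, n_blocks * recordsize


print(blocks_for_file(2 * 1024))     # (1, 4096): a 2 KiB file in one 4 KiB sector
print(blocks_for_file(300 * 1024))   # (3, 393216): the third record is mostly empty
```

In practice compression usually shrinks that mostly empty tail record, which is one reason the waste tends to matter less than the raw arithmetic suggests.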

zvols do not have a recordsize property; instead they have the equivalent property volblocksize.

Sectors

The last, most basic building block is the sector. It is the smallest physical unit that can be written to or read from the underlying device. For decades, most disks used 512-byte sectors. More recently, most disks are configured for 4 KiB sectors, and some, especially SSDs, have 8 KiB sectors or even larger.

ZFS has a property that lets you set the sector size manually. That property is ashift. Somewhat confusingly, ashift is an exponent of two rather than a size in bytes. For example, ashift=9 means a sector size of 2^9, or 512 bytes.
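
Since ashift is just a binary exponent, converting in either direction is a one-liner. The helper names below are hypothetical and exist only to make the relationship concrete.

```python
def sector_size(ashift: int) -> int:
    """Sector size in bytes for a given ashift (2 ** ashift)."""
    return 1 << ashift


def ashift_for(sector_bytes: int) -> int:
    """The ashift that corresponds to a power-of-two sector size."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return sector_bytes.bit_length() - 1


for a in (9, 12, 13):
    print(a, sector_size(a))   # 9 512, then 12 4096, then 13 8192
```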

ZFS queries the operating system for details about each block device when it is added to a new vdev, and in theory it automatically sets ashift correctly based on that information. Unfortunately, many disks lie about their sector size in order to stay compatible with Windows XP (which could not understand disks with other sector sizes).

This means a ZFS administrator is strongly advised to know the actual sector size of their devices and to set ashift manually. If ashift is set too low, the number of read/write operations increases astronomically. Writing 512-byte "sectors" to a real 4 KiB sector means writing the first "sector", then reading the 4 KiB sector, modifying it with the second 512-byte "sector", writing it back out to a new 4 KiB sector, and so on for every single write.
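
To get a feel for the size of that penalty, here is a rough model. It assumes the worst case described above, where the drive performs a full read-modify-write of a physical sector for every undersized write and never coalesces neighbouring writes; real drives and real workloads will differ, and `write_amplification` is a hypothetical name.

```python
import math


def write_amplification(payload_bytes: int, ashift: int, physical_sector: int) -> float:
    """Bytes physically written per byte of payload, under the naive model."""
    logical = 1 << ashift
    n_writes = math.ceil(payload_bytes / logical)
    if logical >= physical_sector:
        # Each write covers whole physical sectors: nothing is rewritten.
        written = n_writes * logical
    else:
        # Each undersized write rewrites one entire physical sector.
        written = n_writes * physical_sector
    return written / payload_bytes


print(write_amplification(4096, ashift=9, physical_sector=4096))    # 8.0
print(write_amplification(4096, ashift=12, physical_sector=4096))   # 1.0
print(write_amplification(8192, ashift=9, physical_sector=8192))    # 16.0
```

The model counts only bytes written; every one of those rewrites is also preceded by a read of the physical sector, so the real cost is higher still.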

In the real world, this penalty hits Samsung EVO SSDs, which should use ashift=13, but these SSDs lie about their sector size, so the default ends up as ashift=9. Unless an experienced system administrator changes this setting, such an SSD performs slower than a conventional magnetic HDD.

By comparison, there is virtually no penalty for an ashift that is too large. There is no real performance penalty, and the increase in unused space is infinitesimal (or zero with compression enabled). We therefore strongly recommend that even disks that really do use 512-byte sectors be set to ashift=12 or even ashift=13, so you can face the future with confidence.

The ashift property is set per vdev, not per pool as many people mistakenly think, and it cannot be changed once set. If you accidentally flub ashift when adding a new vdev to a pool, you have irrevocably contaminated that pool with a low-performance device, and there is usually no choice but to destroy the pool and start over. Even removing the vdev will not save you from a botched ashift setting!

Copy-on-write mechanism

If a regular file system needs to overwrite data, it modifies each block in place

A copy-on-write file system writes a new version of the block and then unlinks the old version

In the abstract, if we ignore the actual physical location of the blocks, our "data comet" simplifies to a "data worm" that moves from left to right across the map of available space.

Now we can get a good idea of how copy-on-write snapshots work: each block can belong to multiple snapshots, and it persists until all of the associated snapshots are destroyed.

The copy-on-write (CoW) mechanism is the fundamental basis of what makes ZFS such an amazing system. The basic concept is simple: if you ask a traditional file system to change a file, it does exactly what you asked. If you ask a copy-on-write file system to do the same, it says "OK" but lies to you.

Instead, a copy-on-write file system writes a new version of the modified block and then updates the file's metadata to unlink the old block and link the new block you just wrote.

Unlinking the old block and linking the new one happens in a single operation, so it cannot be interrupted: if you lose power after it happens, you have the new version of the file, and if you lose power before, you have the old version. Either way, the file system is never left inconsistent.
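
The idea fits in a toy model. The class below is a hypothetical illustration of the principle only, not of ZFS's on-disk structures: a write never touches the existing block, and the change becomes visible through a single pointer update.

```python
class CowFile:
    """Toy copy-on-write file: blocks are never modified in place."""

    def __init__(self, blocks):
        self.store = dict(enumerate(blocks))   # block id -> contents
        self.pointers = list(self.store)       # file metadata: ordered block ids
        self.next_id = len(self.store)

    def write(self, index, data):
        new_id = self.next_id
        self.next_id += 1
        self.store[new_id] = data              # 1. write the new version elsewhere
        old_id = self.pointers[index]
        self.pointers[index] = new_id          # 2. one pointer swap: unlink old, link new
        return old_id

    def read(self):
        return [self.store[i] for i in self.pointers]


f = CowFile(["A", "B", "C"])
old = f.write(1, "B2")
print(f.read())        # ['A', 'B2', 'C']
print(f.store[old])    # B (the old block is still intact)
```

A crash between step 1 and step 2 leaves the pointers on the old block, so a reader sees either the complete old version or the complete new one.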

Copy-on-write in ZFS happens not only at the file system level but also at the disk management level. This means ZFS is not affected by the write hole (the RAID hole), a phenomenon in which a stripe has been only partially written when the system crashes, leaving the array corrupted after the reboot. Here the stripe is written atomically, the vdev is always consistent, and Bob's your uncle.

ZIL: the ZFS intent log

ZFS handles synchronous writes in a special way: it temporarily but immediately stores them in the ZIL before later writing them out permanently along with asynchronous writes.

Typically, data written to the ZIL is never read again. But it can be, after a system crash

A SLOG, or secondary LOG device, is just a special (and preferably very fast) vdev where the ZIL can be stored separately from main storage.

After a crash, all dirty data in the ZIL is replayed; in this case the ZIL is on the SLOG, so it is replayed from there.

There are two main categories of write operations: synchronous (sync) and asynchronous (async). For most workloads, the vast majority of writes are asynchronous: the file system is allowed to aggregate them and commit them in batches, which reduces fragmentation and greatly increases throughput.

Synchronous writes are another matter entirely. When an application requests a synchronous write, it is telling the file system: "You need to commit this to non-volatile storage right now; until you do, there is nothing else I can do." Synchronous writes must therefore be committed to disk immediately, and if that increases fragmentation or reduces throughput, so be it.

ZFS handles synchronous writes differently from regular file systems: instead of immediately committing them to regular storage, ZFS commits them to a special storage area called the ZFS Intent Log, or ZIL. The trick is that these writes also remain in memory, where they are aggregated along with normal asynchronous write requests, to be flushed to storage later as perfectly normal TXGs (transaction groups).

In normal operation, the ZIL is written to and never read again. When, a few moments later, the records from the ZIL are committed to main storage in ordinary TXGs from RAM, they are detached from the ZIL. The only time anything is read from the ZIL is when the pool is imported.

If ZFS fails (an operating system crash or a power outage) while there is data in the ZIL, that data is read during the next pool import (for example, when the crashed system is restarted). Whatever is in the ZIL is read, aggregated into TXGs, committed to main storage, and then detached from the ZIL during the import process.
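
The life cycle of a synchronous write can be mimicked with three lists. This is a deliberately tiny sketch of the behaviour described above; `ToyPool` and its methods are hypothetical names, and real TXG handling is far more involved.

```python
class ToyPool:
    def __init__(self):
        self.main = []   # data committed to main storage
        self.zil = []    # intent log: persistent, holds sync writes only
        self.ram = []    # pending writes waiting for the next TXG

    def write(self, data, sync=False):
        if sync:
            self.zil.append(data)    # made durable immediately
        self.ram.append(data)        # sync and async writes are both batched in RAM

    def commit_txg(self):
        self.main.extend(self.ram)   # flush the whole batch to main storage
        self.ram.clear()
        self.zil.clear()             # committed records are dropped from the ZIL

    def crash(self):
        self.ram.clear()             # whatever was only in RAM is gone

    def import_pool(self):
        self.main.extend(self.zil)   # replay whatever is still in the ZIL
        self.zil.clear()


pool = ToyPool()
pool.write("async-1")
pool.write("sync-1", sync=True)
pool.crash()
pool.import_pool()
print(pool.main)   # ['sync-1']: only the sync write survived the crash
```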

One of the auxiliary vdev classes is called LOG or SLOG, the secondary LOG device. It has one purpose: to give the pool a separate, preferably much faster and very write-durable, vdev to store the ZIL on, instead of keeping the ZIL on the main storage vdevs. The ZIL itself behaves the same no matter where it is stored, but if the LOG vdev has very high write performance, synchronous writes will be faster.

Adding a LOG vdev to a pool does not and cannot improve asynchronous write performance: even if you force all writes into the ZIL with zfs set sync=always, they will still be committed to main storage in TXGs in the same way and at the same pace as without the log. The only direct performance improvement is in the latency of synchronous writes (because a faster log speeds up sync operations).

However, in an environment that already requires a lot of synchronous writes, a LOG vdev can indirectly speed up asynchronous writes and uncached reads. Offloading ZIL records to a separate LOG vdev means less contention for IOPS on primary storage, which improves the performance of all reads and writes to some degree.

Snapshots

The copy-on-write mechanism is also a necessary foundation for ZFS atomic snapshots and incremental asynchronous replication. The active file system has a pointer tree that marks all of the records holding current data; when you take a snapshot, you simply make a copy of that pointer tree.

When a record is overwritten on the active file system, ZFS first writes the new version of the block to unused space. It then unlinks the old version of the block from the current file system. But if some snapshot refers to the old block, that block still remains unchanged. The old block is not actually reclaimed as free space until every snapshot referencing it has been destroyed!
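
Here is a toy version of that bookkeeping. It is a sketch of the principle under simplifying assumptions (a flat name-to-block mapping instead of a real tree, and a full scan to find unreferenced blocks); `ToyDataset` is a hypothetical name and nothing here reflects ZFS internals.

```python
class ToyDataset:
    def __init__(self):
        self.blocks = {}      # block id -> contents
        self.live = {}        # live pointer "tree": record name -> block id
        self.snapshots = {}   # snapshot name -> copy of a pointer tree
        self.next_id = 0

    def write(self, name, data):
        self.blocks[self.next_id] = data   # new version goes to unused space
        self.live[name] = self.next_id     # the live tree now points at the new block
        self.next_id += 1
        self._reclaim()

    def snapshot(self, snap):
        self.snapshots[snap] = dict(self.live)   # copy pointers, not data

    def destroy(self, snap):
        del self.snapshots[snap]
        self._reclaim()

    def _reclaim(self):
        referenced = set(self.live.values())
        for tree in self.snapshots.values():
            referenced.update(tree.values())
        for block_id in list(self.blocks):
            if block_id not in referenced:
                del self.blocks[block_id]   # nothing points here any more: free it


ds = ToyDataset()
ds.write("file", "v1")
ds.snapshot("@1")
ds.write("file", "v2")
print(sorted(ds.blocks.values()))   # ['v1', 'v2']: the snapshot keeps v1 alive
ds.destroy("@1")
print(sorted(ds.blocks.values()))   # ['v2']: v1 is reclaimed only now
```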

Replication

My Steam library in 2015 was 158 GiB and included 126,927 files. This is pretty close to the optimal situation for rsync: ZFS replication over the network was "only" 750% faster.

On the same network, replicating a single 40 GB Windows 7 virtual machine image file is a completely different story. ZFS replication is 289 times faster than rsync, or "only" 161 times faster if you are savvy enough to call rsync with --inplace.

When a VM image is scaled up, rsync's problems scale with it. 1.9 TiB is not that large for a modern VM image, but it is large enough that ZFS replication is 1,148 times faster than rsync, even with rsync's --inplace argument.

Once you understand how snapshots work, it should be easy to grasp the essence of replication. Since a snapshot is just a tree of pointers to records, it follows that if we zfs send a snapshot, we send both that tree and all of the records associated with it. When we pipe that zfs send into zfs receive on the target, it writes both the actual block contents and the pointer tree that refers to those blocks into the target dataset.

Things get more interesting on the second zfs send. Now we have two systems, each containing poolname/datasetname@1, and you take a new snapshot poolname/datasetname@2. So in the source pool you have datasetname@1 and datasetname@2, and in the target pool so far only the first snapshot, datasetname@1.

Since we have a common snapshot, datasetname@1, between source and target, we can do an incremental zfs send on top of it. When we tell the system zfs send -i poolname/datasetname@1 poolname/datasetname@2, it compares the two pointer trees. Any pointers that exist only in @2 obviously refer to new blocks, so we need the contents of those blocks.

On the remote system, processing the incremental send is just as simple. First we write all of the new records included in the send stream, and then we add the pointers to those blocks. Voila, we have @2 on the new system!
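
The comparison of two pointer trees can be sketched with plain dictionaries. This is a toy illustration that assumes both sides use the same block ids and that a snapshot is a flat name-to-block mapping; `incremental_send` and `receive` are hypothetical functions and have nothing to do with the real stream format.

```python
def incremental_send(blocks, snap_from, snap_to):
    """Build a stream with only the blocks snap_to references and snap_from does not."""
    known = set(snap_from.values())
    new_ids = {b for b in snap_to.values() if b not in known}
    return {"pointers": dict(snap_to),
            "blocks": {b: blocks[b] for b in new_ids}}


def receive(target_blocks, stream):
    target_blocks.update(stream["blocks"])   # first write the new records
    return dict(stream["pointers"])          # then add the pointers for the new snapshot


source_blocks = {0: "a", 1: "b", 2: "b-modified"}
snap1 = {"x": 0, "y": 1}
snap2 = {"x": 0, "y": 2}

target_blocks = {0: "a", 1: "b"}             # the target already holds @1
stream = incremental_send(source_blocks, snap1, snap2)
print(stream["blocks"])                      # {2: 'b-modified'}: only the new block travels
target_snap2 = receive(target_blocks, stream)
print({name: target_blocks[b] for name, b in target_snap2.items()})
# {'x': 'a', 'y': 'b-modified'}
```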

ZFS asynchronous incremental replication is a huge improvement over earlier methods that are not snapshot-based, such as rsync. In both cases only changed data is transferred, but rsync must first read all of the data on both sides from disk in order to checksum and compare it. In contrast, ZFS replication reads nothing but the pointer trees, plus any blocks that are not present in the shared snapshot.

Built-in compression

The copy-on-write mechanism also simplifies inline compression. In a traditional file system, compression is problematic: both the old version and the new version of the modified data reside in the same space.

Consider a piece of data in the middle of a file that begins life as a megabyte of zeros, from 0x00000000 and so on: it is very easy to compress it down to a single sector on disk. But what happens if we replace that megabyte of zeros with a megabyte of incompressible data such as JPEG or pseudo-random noise? Suddenly that megabyte of data needs not one but 256 4 KiB sectors, and only one sector was reserved at that spot on disk.

ZFS does not have this problem, because modified records are always written to unused space: the original block occupies only one 4 KiB sector and the new record occupies 256, but that is not a problem. A freshly modified chunk from the "middle" of the file would be written to unused space whether or not its size had changed, so for ZFS this is a perfectly ordinary situation.
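
The one-sector versus 256-sector contrast is easy to reproduce. The sketch below uses zlib from the Python standard library purely as a stand-in compressor (it is not one of the algorithms ZFS uses by default) and assumes 4 KiB sectors; `sectors_needed` is a hypothetical helper.

```python
import math
import os
import zlib

SECTOR = 4096   # 4 KiB sectors, i.e. ashift=12


def sectors_needed(data: bytes) -> int:
    compressed = zlib.compress(data)
    # Keep the raw data when compression does not make it smaller.
    stored = min(len(compressed), len(data))
    return math.ceil(stored / SECTOR)


MIB = 1024 * 1024
print(sectors_needed(bytes(MIB)))        # 1: a megabyte of zeros fits in one sector
print(sectors_needed(os.urandom(MIB)))   # 256: random noise does not compress
```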

Native ZFS compression is disabled by default, and the system offers pluggable algorithms: currently LZ4, gzip (1-9), LZJB, and ZLE.

  • LZ4 is a streaming algorithm that offers extremely fast compression and decompression and a performance win for most use cases, even on fairly slow CPUs.
  • GZIP is a venerable algorithm that all Unix users know and love. It can be run at compression levels 1-9, with compression ratio and CPU usage both increasing as it approaches level 9. The algorithm is well suited to all-text (or otherwise highly compressible) use cases, but elsewhere it often causes CPU bottlenecks; use it with care, especially at the higher levels.
  • LZJB is the original algorithm in ZFS. It is deprecated and should no longer be used; LZ4 surpasses it in every respect.
  • ZLE is zero-level encoding. It does not touch normal data at all, but it does compress large runs of zeros. It is useful for completely incompressible datasets (such as JPEG, MP4, or other already-compressed formats), because it ignores the incompressible data but compresses the unused space in the resulting records.

We recommend LZ4 compression for almost all use cases; the performance penalty when it meets incompressible data is very small, and the performance gain for typical data is significant. Copying a virtual machine image of a fresh Windows installation (a freshly installed OS with no data in it yet) with compression=lz4 went 27% faster than with compression=none in this test from 2015.

ARC: the adaptive replacement cache

ZFS is the only modern file system we know of that uses its own read caching mechanism, rather than relying on the operating system's page cache to keep copies of recently read blocks in RAM.

The native cache is not without its problems: ZFS cannot respond to new memory allocation requests as quickly as the kernel can, so a new malloc() call may fail if it needs RAM that the ARC currently occupies. But there are good reasons for ZFS to use its own cache, at least for now.

All well-known modern operating systems, including MacOS, Windows, Linux, and BSD, use the LRU (Least Recently Used) algorithm to implement the page cache. It is a primitive algorithm that pushes a cached block "up the queue" after each read and pushes blocks "down the queue" as needed to add new cache misses (blocks that had to be read from disk rather than from cache) at the top.

The algorithm usually works fine, but on systems with large working data sets, LRU easily leads to thrashing: evicting frequently needed blocks to make room for blocks that will never be read from cache again.
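
Thrashing is easy to provoke in a few lines. The class below is a minimal textbook LRU written for illustration (not any operating system's actual page cache); the block names are made up.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)        # most recently used goes to the top
        else:
            self.misses += 1
            self.blocks[block] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)   # evict the least recently used block


cache = LRUCache(capacity=4)
for _ in range(10):
    cache.read("hot")             # a block we need again and again
for n in range(4):
    cache.read(f"scan-{n}")       # one-off blocks that are never read again
cache.read("hot")                 # the hot block has already been evicted
print(cache.hits, cache.misses)   # 9 6
```

Four one-off reads were enough to push out a block that had been read ten times in a row, because LRU looks only at recency and keeps no memory of how often a block was used.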

ARC is a much less naive algorithm that can be thought of as a "weighted" cache. Every time a cached block is read, it becomes a little "heavier" and harder to evict, and even after a block is evicted, it is still tracked for a certain period of time. A block that has been evicted but then has to be read back into the cache also becomes "heavier".

The end result of all this is a cache with a typically much higher hit ratio, the ratio between cache hits (reads served from cache) and cache misses (reads served from disk). This is an extremely important statistic: not only are the cache hits themselves served orders of magnitude faster, the cache misses can also be served faster, because the more cache hits there are, the fewer concurrent disk requests there are and the lower the latency for the remaining misses that must be served from disk.

Conclusion

Having covered the basic semantics of ZFS (how copy-on-write works, and the relationships between storage pools, virtual devices, blocks, sectors, and files), we are ready to discuss real-world performance with real numbers.

In the next part, we will look at the actual performance of pools built from mirror vdevs and RAIDz, compared with each other and with the traditional Linux kernel RAID topologies we explored earlier.

At first we wanted to cover only the basics, the ZFS topologies themselves, but after that we will be ready to talk about more advanced ZFS setup and tuning, including the use of auxiliary vdev types such as L2ARC, SLOG, and Special Allocation.

source: www.habr.com
