ZFS Basics: Storage and Performance

This spring we have already covered some introductory topics, such as how to test the speed of your drives and what RAID is. In the second of those pieces we promised to follow up by examining the performance of various multi-disk topologies in ZFS. This is the next-generation filesystem that is now being deployed everywhere, from Apple to Ubuntu.

Well, today is the best day to get acquainted with ZFS, curious readers. Just know that, in the understated words of OpenZFS developer Matt Ahrens, "it's really complicated."

But before we get to the numbers (and they are coming, I promise) for all the options of an eight-disk ZFS configuration, we need to talk about how ZFS stores data on disk in the first place.

Zpool, vdev, and device

This full storage pool diagram includes three support vdevs, one of each class, and four RAIDz2 vdevs.

There is usually no reason to create a pool of mismatched vdev types and sizes, but nothing stops you from doing so if you want to.

To really understand the ZFS filesystem, you need to look closely at its actual structure. First, ZFS merges the traditional volume management and filesystem layers. Second, it uses a transactional copy-on-write mechanism. These features make the system structurally very different from conventional filesystems and RAID arrays. The first set of fundamental building blocks to understand is the storage pool (zpool), the virtual device (vdev), and the real device (device).

zpool

The zpool storage pool is the topmost ZFS structure. Each pool contains one or more virtual devices. Each of those, in turn, contains one or more real devices. Pools are self-contained units. One physical computer may hold two or more separate pools, but each is completely independent of the others. Pools cannot share virtual devices.

ZFS redundancy lives at the virtual device level, not the pool level. There is absolutely no redundancy at the pool level: if any storage vdev or SPECIAL vdev is lost, the entire pool is lost with it.

Modern storage pools can survive the loss of a CACHE or LOG virtual device, although they may lose a small amount of dirty data if they lose the LOG vdev during a power outage or system crash.

There is a common misconception that ZFS "stripes" data across the entire pool. This is not true. A zpool is not a funny-looking RAID0; it is a funny-looking JBOD with a complex, changeable distribution mechanism.

For the most part, writes are distributed among the available virtual devices according to their available free space, so in theory they all fill up at the same time. Later versions of ZFS also take current vdev utilization into account: if one virtual device is significantly busier than another (for example, due to read load), it is temporarily skipped for writes even though it has the highest ratio of free space.
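
The free-space rule above can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name `distribute_writes`, the record-sized units, and the "highest free ratio wins" rule are illustrative inventions, not the actual ZFS allocator, which is considerably more involved and also weighs utilization.

```python
def distribute_writes(vdevs, n_records):
    """Toy model: place each record on the vdev with the highest free ratio.

    vdevs maps a name to (capacity, used), both counted in records.
    Returns a dict of how many records each vdev received.
    """
    state = {name: list(cap_used) for name, cap_used in vdevs.items()}
    placed = {name: 0 for name in vdevs}
    for _ in range(n_records):
        # Free ratio = share of the vdev that is still unallocated.
        name = max(state, key=lambda n: (state[n][0] - state[n][1]) / state[n][0])
        capacity, used = state[name]
        if used >= capacity:
            raise RuntimeError("pool is full")
        state[name][1] += 1
        placed[name] += 1
    return placed


# A vdev twice as large receives twice as many records, so both fill together.
print(distribute_writes({"a": (100, 0), "b": (200, 0)}, 150))
```

In this model a freshly added, empty vdev absorbs all new writes until its free ratio drops to that of the older vdevs, which is the "fill up at the same time" behaviour described above.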

The utilization-awareness mechanism built into modern ZFS write-distribution methods can reduce latency and increase throughput during periods of unusually high load, but it is not carte blanche to mix slow HDDs and fast SSDs carelessly in one pool. Such a mismatched pool will still generally run at the speed of the slowest device, that is, as if it were composed entirely of such devices.

vdev

Each storage pool consists of one or more virtual devices (vdevs). Each vdev, in turn, contains one or more real devices. Most virtual devices are used for plain data storage, but there are several support vdev classes, including CACHE, LOG, and SPECIAL. Each of these vdev types can have one of five topologies: single device, RAIDz1, RAIDz2, RAIDz3, or mirror.

RAIDz1, RAIDz2, and RAIDz3 are special varieties of what old-timers call diagonal parity RAID. The 1, 2, and 3 refer to how many parity blocks are allocated to each data stripe. Instead of dedicating separate disks to parity, RAIDz virtual devices distribute that parity semi-evenly across the disks. A RAIDz array can lose as many disks as it has parity blocks; if it loses one more, it fails and takes the storage pool with it.

In a mirrored virtual device (mirror vdev), each block is stored on every device in the vdev. Although two-wide mirrors are the most common, a mirror can contain any number of devices; three-way mirrors are often used in larger setups for better read performance and fault tolerance. A mirror vdev can survive any failure as long as at least one device in the vdev keeps working.

Single-device vdevs are inherently dangerous. Such a virtual device cannot survive a single failure, and if it is used as a storage or SPECIAL vdev, its failure destroys the entire pool. Be very, very careful here.

CACHE, LOG, and SPECIAL vdevs can be created using any of the topologies above, but remember that losing a SPECIAL vdev means losing the pool, so a redundant topology is strongly recommended.
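
The failure rules for the five topologies can be written down in a few lines. The sketch below is a hypothetical helper, not part of any ZFS tool: `tolerated_failures` and `pool_survives` are invented names, and the model covers only storage and SPECIAL vdevs, since the text notes that losing a CACHE or LOG vdev does not destroy the pool.

```python
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def tolerated_failures(topology, n_devices):
    """How many device failures a single vdev can absorb."""
    if topology == "single":
        return 0
    if topology == "mirror":
        return n_devices - 1          # one healthy copy is enough
    return PARITY[topology]           # RAIDz: as many as it has parity blocks


def pool_survives(vdevs, failures):
    """vdevs: {name: (topology, n_devices)}; failures: {name: failed devices}.

    The pool is lost as soon as any one storage vdev is lost.
    """
    return all(
        failures.get(name, 0) <= tolerated_failures(topology, n)
        for name, (topology, n) in vdevs.items()
    )


pool = {"v0": ("raidz2", 8), "v1": ("mirror", 2)}
print(pool_survives(pool, {"v0": 2, "v1": 1}))   # every vdev is within its limit
print(pool_survives(pool, {"v0": 3}))            # a third loss in the RAIDz2 vdev
```

Note that the check is per vdev: spreading failures across vdevs is survivable, while concentrating one too many in a single vdev is not.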

device

This is probably the easiest term in ZFS to understand: it is literally a random-access block device. Remember that virtual devices are made up of individual devices, and the pool is made up of virtual devices.

Disks, whether magnetic or solid-state, are the most common block devices used as vdev building blocks. However, anything with a descriptor in /dev will work, so entire hardware RAID arrays can be used as individual devices.

A simple raw file is one of the most important alternative block devices from which a vdev can be built. Test pools made of sparse files are a very convenient way to try out pool commands and see how much space is available in a pool or virtual device of a given topology.

You can create a test pool from sparse files in just a few seconds, but don't forget to destroy the entire pool and its components afterward.

Say you want to build an eight-disk server and plan to use 10 TB (~9300 GiB) disks, but you aren't sure which topology best suits your needs. In the example above, we built a test pool from sparse files in seconds, and now we know that a RAIDz2 vdev of eight 10 TB disks provides 50 TiB of usable capacity.
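
Before building a test pool, a back-of-the-envelope estimate is also possible. The following sketch assumes a deliberately naive model (data disks times disk size) and an invented function name, `naive_usable_tib`. It gives only an upper bound: the test pool in the article reports 50 TiB, which is lower, and the difference is overhead that this sketch does not attempt to model.

```python
TIB = 2 ** 40
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def naive_usable_tib(n_disks, disk_bytes, topology):
    """Upper bound on the usable space of one vdev, ignoring all ZFS overhead."""
    if topology == "mirror":
        data_disks = 1                       # every disk holds the same blocks
    else:
        data_disks = n_disks - PARITY[topology]
    return data_disks * disk_bytes / TIB


disk = 10 * 10 ** 12                         # a "10 TB" disk, in bytes
print(round(disk / 2 ** 30))                 # 9313 GiB per disk
print(round(naive_usable_tib(8, disk, "raidz2"), 2))   # 54.57 TiB before overhead
```

This is why the sparse-file test pool is the better tool: it reports what ZFS actually makes available rather than what the arithmetic promises.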

Another special class of device is the SPARE. Hot-spare devices, unlike regular devices, belong to the whole pool rather than to a single virtual device. If a device in some vdev in the pool fails and a spare is attached to the pool and available, it automatically joins the affected vdev.

Once attached to the affected vdev, the spare begins receiving copies or reconstructions of the data that should be on the missing device. In traditional RAID this is called rebuilding; in ZFS it is called resilvering.

It is important to note that spares do not permanently replace failed devices. They are only a temporary stand-in that shortens the window during which the vdev runs degraded. After the administrator replaces the failed device in the vdev, redundancy is restored onto that permanent device, and the SPARE detaches from the vdev and returns to duty as a spare for the whole pool.

Datasets, blocks, and sectors

The next set of building blocks to understand on our ZFS journey has less to do with hardware and more to do with how the data itself is organized and stored. We skip a few layers here, such as the metaslab, so as not to pile up detail while still keeping a grasp of the overall structure.

Datasets

When we first create a dataset, it shows all of the available pool space. Then we set a quota and change the mountpoint. Magic!

A zvol is, for the most part, just a dataset stripped of its filesystem layer, which we replace here with a perfectly ordinary ext4 filesystem.

A ZFS dataset is roughly analogous to a standard mounted filesystem. Like a conventional filesystem, at first glance it looks like "just another folder." But like conventional mounted filesystems, each ZFS dataset has its own set of underlying properties.

First of all, a dataset can have a quota assigned to it. If you run zfs set quota=100G poolname/datasetname, you will not be able to write more than 100 GiB to the mounted folder /poolname/datasetname.

Notice the presence, and absence, of leading slashes in the names above? Each dataset has its own place in both the ZFS hierarchy and the system mount hierarchy. There is no leading slash in the ZFS hierarchy: you start with the pool name and then give the path from one dataset to the next. For example, pool/parent/child names a dataset called child under the parent dataset parent in a pool creatively named pool.

By default, a dataset's mountpoint is equivalent to its name in the ZFS hierarchy, with a leading slash: the pool named pool is mounted at /pool, the dataset parent is mounted at /pool/parent, and the child dataset child is mounted at /pool/parent/child. However, a dataset's system mountpoint can be changed.

If we run zfs set mountpoint=/lol pool/parent/child, the dataset pool/parent/child is mounted on the system as /lol.
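
The naming rule is simple enough to express as a function. This is a minimal sketch with an invented helper, `mountpoint`, and a plain dictionary standing in for explicitly set mountpoint properties; it does not query ZFS and models only the two cases the text describes.

```python
def mountpoint(dataset, overrides=None):
    """Return where a dataset is mounted under the rules described above.

    overrides maps a dataset name to an explicitly set mountpoint.
    """
    if dataset.startswith("/"):
        raise ValueError("names in the ZFS hierarchy have no leading slash")
    overrides = overrides or {}
    # Default: the ZFS name with a leading slash.
    return overrides.get(dataset, "/" + dataset)


print(mountpoint("pool/parent/child"))
print(mountpoint("pool/parent/child", {"pool/parent/child": "/lol"}))
```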

In addition to datasets, we should mention volumes (zvols). A volume is roughly analogous to a dataset, except that it has no filesystem in it; it is just a block device. You can, for example, create a zvol named mypool/myzvol, format it with an ext4 filesystem, and then mount that filesystem. You now have an ext4 filesystem backed by all of the safety features of ZFS! This may seem silly on a single machine, but it makes much more sense as the backend for an iSCSI export.

Blocks

A file is represented by one or more blocks. Each block is stored on a single virtual device. Block size is usually equal to the recordsize parameter, but it can be reduced to 2^ashift if the block holds metadata or a small file.

We really, really are not kidding about the huge performance penalty if you set ashift too low.

In a ZFS pool, all data, including metadata, is stored in blocks. The maximum block size for each dataset is defined by the recordsize property. The record size can be changed, but this does not change the size or location of any blocks already written to the dataset; it applies only to new blocks as they are written.

Unless otherwise specified, the current default record size is 128 KiB. This is an uneasy compromise in which performance is never ideal but is not terrible in most cases either. Recordsize can be set to any value from 4K to 1M (with additional tuning, recordsize can be set even larger, but this is rarely a good idea).

Any given block refers to the data of only one file; you cannot squeeze two different files into one block. Each file consists of one or more blocks, depending on its size. If a file is smaller than the record size, it is stored in a smaller block: for example, a block holding a 2 KiB file occupies just one 4 KiB sector on disk.

If a file is large enough to require several blocks, then all records for that file are recordsize in length, including the last record, most of which may be unused space.
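
The two cases above (a small file in an undersized block, a large file in full-size records) can be checked with a little arithmetic. The helper below, `blocks_for_file`, is a hypothetical sketch that assumes compression is off and ignores metadata and parity.

```python
import math


def blocks_for_file(file_bytes, recordsize=128 * 1024, ashift=12):
    """Return (block_count, bytes_allocated) for one uncompressed file."""
    sector = 2 ** ashift
    if file_bytes <= recordsize:
        # Small file: one undersized block, rounded up to whole sectors.
        allocated = max(1, math.ceil(file_bytes / sector)) * sector
        return 1, allocated
    # Large file: every record, including the last, is recordsize long.
    count = math.ceil(file_bytes / recordsize)
    return count, count * recordsize


print(blocks_for_file(2 * 1024))      # (1, 4096): one 4 KiB sector
print(blocks_for_file(300 * 1024))    # (3, 393216): the third record is mostly slack
```

With compression enabled the slack in the final record largely disappears, which is one reason the article later recommends turning it on.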

Zvols have no recordsize property; instead they have the roughly equivalent property volblocksize.

Sectors

The last, most basic building block is the sector. It is the smallest physical unit that can be written to or read from the underlying device. For several decades, most disks used 512-byte sectors. Nowadays most disks are configured for 4 KiB sectors, and some, especially SSDs, have sectors of 8 KiB or even larger.

ZFS has a property that lets you set the sector size manually. This property is ashift. Somewhat confusingly, ashift is a power of two. For example, ashift=9 means a sector size of 2^9, or 512 bytes.
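
Because ashift is an exponent, converting between it and a sector size is a one-line shift. The two helper names below are illustrative only.

```python
def sector_size(ashift):
    """Sector size in bytes for a given ashift (2 ** ashift)."""
    return 1 << ashift


def ashift_for(sector_bytes):
    """Inverse mapping; the sector size must be an exact power of two."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return sector_bytes.bit_length() - 1


for a in (9, 12, 13):
    print(a, sector_size(a))    # 512, 4096, and 8192 bytes respectively
```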

ZFS asks the operating system for details about each block device when it is added to a new vdev, and in theory sets ashift correctly based on that information. Unfortunately, many drives lie about their sector size in order to remain compatible with Windows XP (which could not understand drives with other sector sizes).

This means that a ZFS administrator is strongly advised to know the actual sector size of their devices and set ashift manually. If ashift is set too low, the number of read/write operations grows astronomically. Writing 512-byte "sectors" to a real 4 KiB sector means writing the first "sector", then reading the 4 KiB sector, modifying it with the second 512-byte "sector", writing it back out to a new 4 KiB sector, and so on for every write.
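
The cost of that read-modify-write cycle can be estimated with a simplified model. The sketch assumes each small logical write is issued on its own, is never coalesced, and always forces a read and a rewrite of the whole physical sector; real devices and caches behave less uniformly. The function names are invented for this illustration.

```python
def physical_io(total_bytes, logical=512, physical=4096):
    """Return (physical_reads, physical_writes) for a sequential write."""
    if logical >= physical:
        # Whole-sector writes: no read-modify-write cycle is needed.
        return 0, -(-total_bytes // physical)
    ops = -(-total_bytes // logical)     # one operation per small "sector"
    return ops, ops                      # each rewrites a full physical sector


def write_amplification(total_bytes, logical=512, physical=4096):
    """Physical bytes written divided by the bytes the caller asked to write."""
    _, writes = physical_io(total_bytes, logical, physical)
    return writes * physical / total_bytes


print(physical_io(4096), write_amplification(4096))        # ashift=9 on a 4 KiB device
print(physical_io(4096, logical=4096), write_amplification(4096, logical=4096))
```

Under these assumptions, filling one 4 KiB sector through 512-byte writes costs eight reads and eight full-sector writes instead of a single write.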

In the real world, this penalty hits Samsung EVO SSDs, which should have ashift=13. These SSDs lie about their sector size, so the default ends up as ashift=9. Unless an experienced system administrator changes this setting, such an SSD runs slower than an ordinary magnetic HDD.

By contrast, there is practically no penalty for setting ashift too high. There is no real performance penalty, and the increase in unused space is infinitesimal (or zero with compression enabled). We therefore strongly recommend that even drives that really do use 512-byte sectors be set to ashift=12 or even ashift=13, to face the future with confidence.

The ashift property is set per vdev, not per pool as many people mistakenly think, and it cannot be changed once set. If you accidentally get ashift wrong when adding a new vdev to a pool, you have irrevocably contaminated that pool with a low-performance device, and there is generally no option but to destroy the pool and start over. Even removing the vdev will not save you from a botched ashift setting!

Copy-on-write mechanism

If a regular filesystem needs to overwrite data, it modifies each block in place.

A copy-on-write filesystem writes a new version of the block and then unlinks the old version.

Speaking abstractly, if we ignore the actual physical location of the blocks, our "data comet" simplifies to a "data worm" moving from left to right across the map of available space.

Now we can get a good idea of how copy-on-write snapshots work: each block can belong to several snapshots and persists until all associated snapshots are destroyed.

The copy-on-write (CoW) mechanism is the fundamental basis of what makes ZFS such an amazing system. The basic concept is simple: if you ask a traditional filesystem to modify a file, it does exactly what you asked. If you ask a copy-on-write filesystem to do the same, it says "okay" but lies to you.

Instead, the copy-on-write filesystem writes a new version of the modified block and then updates the file's metadata to unlink the old block and link the new block you just wrote.

Unlinking the old block and linking the new one happen in a single operation, so it cannot be interrupted: if the power fails after this operation, you have the new version of the file, and if it fails before, you have the old version. Either way, the filesystem stays consistent.
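
The "write new block, then swap one pointer" sequence can be mimicked in memory. The class below, `CowFile`, is a toy with invented names and a simulated crash flag; it illustrates only the ordering of the two steps, not how ZFS lays out blocks or metadata.

```python
class CowFile:
    """Toy copy-on-write file: data blocks are never modified in place."""

    def __init__(self, blocks):
        self.storage = dict(enumerate(blocks))   # block id -> contents
        self.pointers = list(self.storage)       # file metadata: ordered block ids
        self._next_id = len(blocks)

    def overwrite(self, index, data, crash_before_commit=False):
        new_id = self._next_id
        self._next_id += 1
        self.storage[new_id] = data              # 1. new version goes to unused space
        if crash_before_commit:
            return                               # power lost: metadata still points at the old block
        old_id = self.pointers[index]
        self.pointers[index] = new_id            # 2. a single pointer swap is the commit
        del self.storage[old_id]                 # the old block becomes free space

    def read(self):
        return [self.storage[i] for i in self.pointers]


f = CowFile(["a", "b", "c"])
f.overwrite(1, "B", crash_before_commit=True)
print(f.read())     # old version: the interrupted write never became visible
f.overwrite(1, "B2")
print(f.read())     # new version after a completed swap
```

A reader of this file sees either the complete old contents or the complete new contents, never a half-modified block. The orphaned block left behind by the simulated crash is simply unreferenced space in this toy.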

Copy-on-write in ZFS operates not only at the filesystem level but also at the disk management level. This means ZFS is not affected by the RAID hole, a condition in which a stripe is only partially written before the system crashes, leaving the array corrupted after a restart. Here the stripe write is atomic, the vdev is always consistent, and Bob's your uncle.

ZIL: the ZFS intent log

ZFS handles synchronous writes in a special way: it stores them temporarily but immediately in the ZIL before writing them permanently later along with asynchronous writes.

Normally, data written to the ZIL is never read again. But it can be after a system crash.

A SLOG, or secondary LOG device, is just a special (and preferably very fast) vdev where the ZIL can be stored separately from main storage.

After a crash, all dirty data in the ZIL is replayed. In this case the ZIL is on the SLOG, so it is replayed from there.

There are two main categories of write operations: synchronous (sync) and asynchronous (async). For most workloads, the vast majority of writes are asynchronous: the filesystem is allowed to aggregate them and commit them in batches, reducing fragmentation and greatly increasing throughput.

Synchronous writes are a completely different matter. When an application requests a synchronous write, it is telling the filesystem: "You need to commit this to non-volatile storage right now, and until you do, I can't do anything else." Synchronous writes must therefore be committed to disk immediately, and if that increases fragmentation or reduces throughput, so be it.

ZFS handles synchronous writes differently from ordinary filesystems: instead of flushing them to regular storage immediately, ZFS commits them to a special storage area called the ZFS Intent Log, or ZIL. The trick is that these writes also remain in memory, where they are aggregated with ordinary asynchronous write requests and later flushed to storage as perfectly normal TXGs (transaction groups).

In normal operation, the ZIL is written to and never read again. When, a few moments later, the records from the ZIL are committed to main storage in ordinary TXGs from RAM, they are unlinked from the ZIL. The only time anything is read from the ZIL is when the pool is imported.

If ZFS fails (an operating system crash or a power outage) while there is data in the ZIL, that data is read during the next pool import (for example, when the crashed system restarts). Everything in the ZIL is read, aggregated into TXGs, committed to main storage, and then unlinked from the ZIL during the import process.
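
The life cycle of a sync write (logged at once, held in RAM, committed in a TXG, replayed only after a crash) can be traced with a small simulation. `ToyPool` and its three lists are stand-ins invented for this sketch; nothing here reflects real ZFS data structures or timing.

```python
class ToyPool:
    def __init__(self):
        self.main = []    # records committed to main storage
        self.zil = []     # intent log: survives a crash
        self.ram = []     # pending writes: lost in a crash

    def write(self, data, sync=False):
        if sync:
            self.zil.append(data)     # persisted immediately, so the caller may proceed
        self.ram.append(data)         # sync and async writes both wait in RAM for the TXG

    def commit_txg(self):
        self.main.extend(self.ram)    # one aggregated batch goes to main storage
        self.ram.clear()
        self.zil.clear()              # the logged copies are no longer needed

    def crash(self):
        self.ram.clear()              # everything not yet persisted is gone

    def import_pool(self):
        self.ram.extend(self.zil)     # replay: the only time the ZIL is read
        self.commit_txg()


p = ToyPool()
p.write("async-1")
p.write("sync-1", sync=True)
p.crash()
p.import_pool()
print(p.main)     # only the sync write survived the crash
```

The async write is lost because it existed only in RAM, which is exactly the guarantee difference between the two categories.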

One of the support vdev classes is called LOG or SLOG, the secondary LOG device. It has one purpose: to provide the pool with a separate, and preferably much faster, very write-durable vdev to hold the ZIL, instead of keeping the ZIL on the main storage vdevs. The ZIL itself behaves the same regardless of where it is stored, but if the LOG vdev has very high write performance, synchronous writes complete faster.

Adding a LOG vdev to a pool cannot directly improve asynchronous write performance. Even if you force all writes into the ZIL with zfs set sync=always, they are still committed to main storage in TXGs in the same way and at the same pace as without the log. The only direct performance improvement is in synchronous write latency (because the faster log lets sync operations return sooner).

However, in an environment that already requires a lot of synchronous writes, a LOG vdev can indirectly speed up asynchronous writes and uncached reads. Offloading ZIL writes to a separate LOG vdev means less contention for IOPS on primary storage, which improves the performance of all reads and writes to some degree.

Snapshots

The copy-on-write mechanism is also a necessary foundation for ZFS atomic snapshots and incremental asynchronous replication. The live filesystem has a tree of pointers marking all the records that hold current data; when you take a snapshot, you simply make a copy of that pointer tree.

When a record is overwritten in the live filesystem, ZFS first writes the new version of the block to unused space. It then unlinks the old version of the block from the current filesystem. But if any snapshot references the old block, it remains unchanged. The old block is not actually reclaimed as free space until all snapshots referencing it have been destroyed!
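
The reclamation rule (a block is freed only when no pointer tree references it) is easy to demonstrate. `ToyDataset` below is an invented model in which a flat dictionary plays the role of the pointer tree; it is a sketch of the idea, not of the ZFS on-disk format.

```python
class ToyDataset:
    def __init__(self):
        self.blocks = {}       # block id -> data
        self.live = {}         # live pointer tree: record name -> block id
        self.snapshots = {}    # snapshot name -> frozen copy of a pointer tree
        self._next = 0

    def write(self, name, data):
        self.blocks[self._next] = data     # the new version goes to unused space
        self.live[name] = self._next       # the live tree now points at it
        self._next += 1
        self._reclaim()

    def snapshot(self, snap):
        self.snapshots[snap] = dict(self.live)   # copy the pointers, not the data

    def destroy(self, snap):
        del self.snapshots[snap]
        self._reclaim()

    def _reclaim(self):
        referenced = set(self.live.values())
        for tree in self.snapshots.values():
            referenced.update(tree.values())
        for block_id in list(self.blocks):
            if block_id not in referenced:
                del self.blocks[block_id]  # no tree points here any more


d = ToyDataset()
d.write("file", "v1")
d.snapshot("@1")
d.write("file", "v2")
print(sorted(d.blocks.values()))   # both versions are kept while @1 exists
d.destroy("@1")
print(sorted(d.blocks.values()))   # the old block is reclaimed
```

Taking the snapshot copies only the small pointer dictionary, which is why snapshots are cheap to create regardless of how much data they reference.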

Replication

My Steam library in 2015 was 158 GiB and included 126,927 files. This is fairly close to the optimal situation for rsync, yet ZFS replication over the network was still "only" 750% faster.

On the same network, replicating a single 40 GB Windows 7 virtual machine image file is a completely different story. ZFS replication is 289 times faster than rsync, or "only" 161 times faster if you know enough to call rsync with --inplace.

When a VM image scales up, rsync's problems scale with it. 1.9 TiB is not that large for a modern VM image, but it is large enough for ZFS replication to be 1,148 times faster than rsync, even with rsync's --inplace argument.

Once you understand how snapshots work, it is easy to grasp the essence of replication. Since a snapshot is simply a tree of pointers to records, it follows that if we zfs send a snapshot, we send both that tree and all the records associated with it. When we pipe that zfs send into zfs receive on the target, it writes both the actual block contents and the tree of pointers referencing those blocks into the target dataset.

Things get more interesting on the second zfs send. We now have two systems, each containing poolname/datasetname@1, and you take a new snapshot, poolname/datasetname@2. So the source pool has datasetname@1 and datasetname@2, while the target pool so far has only the first snapshot, datasetname@1.

Since we have a common snapshot between source and target, datasetname@1, we can build an incremental zfs send on top of it. When we tell the system zfs send -i poolname/datasetname@1 poolname/datasetname@2, it compares the two pointer trees. Any pointers that exist only in @2 obviously reference new blocks, so we need the contents of those blocks.

On the remote system, processing the incremental send is just as simple. First we write out all the new records included in the send stream, and then we add the pointers to those blocks. Voila, we have @2 on the new system!
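
The pointer-tree comparison behind an incremental send amounts to a set difference. The sketch below uses invented functions, `incremental_send` and `receive`, and plain dictionaries in place of pointer trees; for simplicity the toy stream carries the whole @2 tree, which is an assumption of this model and not a statement about the real stream format.

```python
def incremental_send(blocks, snap_from, snap_to):
    """Build a toy send stream: blocks only snap_to references, plus its tree."""
    new_ids = set(snap_to.values()) - set(snap_from.values())
    return {"blocks": {i: blocks[i] for i in new_ids}, "tree": dict(snap_to)}


def receive(target_blocks, stream):
    """Apply a stream on the target and return the new snapshot's tree."""
    target_blocks.update(stream["blocks"])   # first write out the new records
    return dict(stream["tree"])              # then add the pointers to them


source_blocks = {0: "a", 1: "b", 2: "b-modified"}
snap1 = {"x": 0, "y": 1}
snap2 = {"x": 0, "y": 2}

stream = incremental_send(source_blocks, snap1, snap2)
print(stream["blocks"])                      # only the one changed block travels

target_blocks = {0: "a", 1: "b"}             # the target already holds @1
tree2 = receive(target_blocks, stream)
print([target_blocks[i] for i in tree2.values()])
```

Nothing in this comparison reads the contents of unchanged blocks, which is the source of the speed advantage over tools that must checksum both sides.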

ZFS asynchronous incremental replication is a huge improvement over earlier non-snapshot-based methods such as rsync. In both cases only the changed data is transferred, but rsync must first read all the data from disk on both sides to checksum and compare it. By contrast, ZFS replication reads nothing but the pointer trees, plus any blocks those trees reference that are not present in the common snapshot.

Inline compression

The copy-on-write mechanism also simplifies inline compression. In a traditional filesystem, compression is problematic: both the old version and the new version of the modified data have to occupy the same space.

Consider a chunk of data in the middle of a file that begins life as a megabyte of zeros (0x00000000 and so on): it compresses into a single disk sector very easily. But what happens if we replace that megabyte of zeros with a megabyte of incompressible data such as JPEG or pseudo-random noise? Suddenly this megabyte of data needs not one but 256 4 KiB sectors, and only one sector is reserved at that spot on disk.
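
The sector counts in this example can be reproduced with any general-purpose compressor. The sketch below uses Python's zlib purely as a stand-in (it is not one of the ZFS algorithms) and assumes, as a modelling choice, that a record is stored raw whenever compression does not shrink it.

```python
import math
import random
import zlib

SECTOR = 4096
MIB = 1024 * 1024


def sectors_needed(data):
    """Sectors occupied if the record is stored compressed when that helps."""
    compressed = zlib.compress(data)
    stored = min(len(compressed), len(data))   # keep the raw form if nothing is saved
    return math.ceil(stored / SECTOR)


zeros = bytes(MIB)
# Seeded pseudo-random noise stands in for incompressible data such as JPEG.
noise = random.Random(0).getrandbits(8 * MIB).to_bytes(MIB, "little")

print(sectors_needed(zeros))    # 1
print(sectors_needed(noise))    # 256
```

An in-place filesystem would have to fit the 256-sector record into a one-sector hole, which is the problem the paragraph above describes.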

ZFS has no such problem, because modified records are always written to unused space. The original block occupied just one 4 KiB sector and the new record occupies 256, but that is not a problem: a freshly modified chunk from the "middle" of a file would be written to unused space whether or not its size changed, so for ZFS this is business as usual.

Native ZFS compression is disabled by default, and the system offers pluggable algorithms, currently LZ4, gzip (1-9), LZJB, and ZLE.

  • LZ4 is a streaming algorithm that offers extremely fast compression and decompression and a performance win for most use cases, even on fairly slow CPUs.
  • GZIP is the venerable algorithm that all Unix users know and love. It can be used with compression levels 1-9, with compression ratio and CPU usage increasing as you approach level 9. The algorithm suits all-text (or otherwise highly compressible) use cases, but otherwise often causes CPU bottlenecks; use it with care, especially at the higher levels.
  • LZJB is the original algorithm in ZFS. It is deprecated and should no longer be used; LZ4 beats it on every metric.
  • ZLE is zero-level encoding. It leaves normal data alone but compresses large runs of zeros. It is useful for fully incompressible datasets (such as JPEG, MP4, or other already-compressed formats) because it ignores the incompressible data but compresses the unused space in the final records.

We recommend LZ4 compression for almost every use case; the performance penalty when it encounters incompressible data is very small, and the performance gain for typical data is significant. Copying a virtual machine image of a fresh Windows installation (a freshly installed OS, with no data in it yet) went 27% faster with compression=lz4 than with compression=none in this 2015 test.

ARC: the adaptive replacement cache

ZFS is the only modern filesystem we know of that uses its own read-caching mechanism rather than relying on the operating system's page cache to keep copies of recently read blocks in RAM.

The native cache is not without its problems: ZFS cannot respond to new memory allocation requests as quickly as the kernel can, so a new malloc() call may fail if it needs RAM currently occupied by the ARC. But there are good reasons to use its own cache, at least for now.

All well-known modern operating systems, including MacOS, Windows, Linux, and BSD, use the LRU (Least Recently Used) algorithm to implement the page cache. This is a primitive algorithm that moves a cached block "up the queue" after every read and pushes blocks "down the queue" as needed in order to add new cache misses (blocks that had to be read from disk rather than from cache) at the top.

The algorithm usually works fine, but on systems with large working data sets, LRU easily leads to thrashing: evicting frequently needed blocks to make room for blocks that will never be read from cache again.
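
The thrashing behaviour is easy to reproduce with a minimal LRU cache. The class below is a generic textbook LRU written for this illustration, not the page cache of any particular operating system, and the block names are made up.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()    # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)       # bump to the "top" of the queue
            return
        self.misses += 1                         # would have to be read from disk
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)      # evict from the "bottom"
        self.blocks[block] = True


cache = LRUCache(capacity=4)
hot = ("h0", "h1", "h2")
for _ in range(10):                  # three blocks that are read constantly
    for block in hot:
        cache.read(block)
for i in range(4):                   # one pass over blocks that are read only once
    cache.read(f"scan{i}")

print(cache.hits, cache.misses)
print([b for b in hot if b in cache.blocks])   # the hot blocks have all been evicted
```

A single scan of four one-off blocks pushes out three blocks that had each been read ten times, because LRU looks only at recency and not at how often a block is used.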

The ARC is a much less naive algorithm that can be thought of as a "weighted" cache. Each time a cached block is read, it becomes a little "heavier" and harder to evict, and even after a block is evicted it is still tracked for some period of time. A block that has been evicted but then must be read back into the cache also becomes "heavier".

The end result of all this is a cache with a much higher hit ratio, the ratio between cache hits (reads served from cache) and cache misses (reads served from disk). This is an extremely important statistic: not only are the cache hits themselves served orders of magnitude faster, the cache misses can also be served faster, because the more cache hits there are, the fewer concurrent disk requests there are and the lower the latency for the remaining misses that must be served from disk.

Conclusion

Having studied the basic semantics of ZFS (how copy-on-write works, and the relationships between storage pools, virtual devices, blocks, sectors, and files), we are ready to discuss real performance with real numbers.

In the next part, we will look at the actual performance of pools with mirror and RAIDz vdevs, compared with each other and with the traditional Linux kernel RAID topologies we examined earlier.

At first we wanted to cover only the basics, the ZFS topologies themselves, but after this we will be ready to talk about more advanced ZFS setup and tuning, including the use of support vdev types such as L2ARC, SLOG, and Special Allocation.

Source: www.habr.com
