Hacks for working with a large number of small files

The idea for this article arose spontaneously from a discussion in the comments on the article "Something about inode".


The thing is, the internal specifics of our services come down to storing a large number of small files. At the moment we have hundreds of terabytes of such data. Along the way we stepped on a few obvious and not-so-obvious rakes and got past them successfully.

So I am sharing our experience; maybe it will be useful to someone.

Problem one: “No space left on device”

As described in the article mentioned above, the problem is that the file system still has free blocks, but it has run out of inodes.

You can check the number of used and free inodes with the command df -ih.

I won't retell that article; in short, a disk holds both the data blocks themselves and blocks of metadata, known as inodes (index nodes). Their number is fixed when the file system is created (we are talking about ext2 and its successors) and does not change afterwards. The balance between data blocks and inodes is calculated from average statistics, but in our case, with lots of small files, the balance has to shift towards inodes: there must be more of them.
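The counters that df -i prints can also be read from a script, which is handy for alerting before the inodes run out. A minimal sketch, assuming a Unix system where Python's os.statvfs is available; the function name inode_usage is made up for this illustration.

```python
import os


def inode_usage(path):
    """Return (total, free, used_percent) inode figures for the file system holding `path`."""
    st = os.statvfs(path)
    total = st.f_files  # total number of inodes on the file system
    free = st.f_ffree   # number of free inodes
    used = total - free
    # Some file systems report 0 total inodes; avoid dividing by zero there.
    used_percent = 100.0 * used / total if total else 0.0
    return total, free, used_percent


if __name__ == "__main__":
    total, free, pct = inode_usage("/")
    print(f"inodes: {total} total, {free} free, {pct:.1f}% used")
```

A check like this fits into a monitoring job alongside the usual free-space check, since the two can run out independently.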

Linux already ships options with different ratios, and all of these precomputed settings live in the file /etc/mke2fs.conf.
So when you first create a file system with mke2fs, you can specify the profile you need.

Here are a few examples from that file:

    small = {
        blocksize = 1024
        inode_size = 128
        inode_ratio = 4096
    }

    big = {
        inode_ratio = 32768
    }

    largefile = {
        inode_ratio = 1048576
        blocksize = -1
    }

You can select the desired usage type with the “-T” option when calling mke2fs. You can also set the required parameters by hand if none of the ready-made profiles fits.

More details are in the manual pages for mke2fs.conf and mke2fs.
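To get a feel for what inode_ratio means in practice: mke2fs allocates roughly one inode per inode_ratio bytes of file system space. The sketch below is only that rough arithmetic for the three profiles shown above on a hypothetical 1 TiB volume; the real tool rounds the figure to fit its block groups, so treat the numbers as estimates.

```python
TIB = 1024 ** 4  # hypothetical volume size used for the illustration


def estimated_inodes(fs_size_bytes, inode_ratio):
    """Rough inode count: one inode per `inode_ratio` bytes of space."""
    return fs_size_bytes // inode_ratio


# inode_ratio values taken from the mke2fs.conf excerpts above
PROFILES = {"small": 4096, "big": 32768, "largefile": 1048576}

for name, ratio in PROFILES.items():
    print(f"{name:10s} ~{estimated_inodes(TIB, ratio):>11,d} inodes per TiB")
# small      ~268,435,456 inodes per TiB
# big        ~ 33,554,432 inodes per TiB
# largefile  ~  1,048,576 inodes per TiB
```

If the average file is much smaller than the ratio of the chosen profile, the inodes will run out long before the blocks do.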

One point that the article mentioned above did not cover is that you can also set the data block size. Obviously, a larger block size makes sense for large files and a smaller one for small files.

However, there is an interesting factor worth taking into account: processor architecture.
I once decided that large photo files called for a large block size. This was at home, on a home file storage box from WD built on the ARM architecture. Without hesitation I set the block size to 8k or 16k instead of the standard 4k, having estimated the savings beforehand. Everything was great right up until the storage box itself died while the disk was still alive. After putting the disk into an ordinary computer with an ordinary Intel processor, I got a surprise: unsupported block size. That was that. The data is there, everything is intact, but it cannot be read. i386 and similar processors cannot work with block sizes that do not match the memory page size, which is exactly 4k. In the end the matter was settled with user-space utilities; everything was slow and sad, but the data was saved. If anyone is interested, google the utility name fuseext2. Moral: either think through all the cases in advance, or don't play the hero and stick to the standard settings meant for ordinary home users.

UPD. Following a comment from user berez, I'd like to clarify that on i386 the block size must not exceed 4k, but it does not have to be exactly 4k, i.e. 1k and 2k are acceptable.

So, how did we solve the problems?

First, we hit the problem when a multi-terabyte disk was already full of data, so we could not change the file system configuration.

Second, the fix was needed urgently.

As a result, we concluded that we had to shift the balance by reducing the number of files.
To reduce the number of files, we decided to put them into a common archive. Given our specifics, we put all the files for a certain time period into one archive, and the archiving is done by a cron job every night.

We chose the zip archive format. In the comments on the previous article tar was suggested, but it has one difficulty: it has no table of contents, and the files in it are stored as a stream (it is no accident that “tar” is short for “Tape Archive”, a legacy of tape drives). In other words, to read a file at the end of the archive you have to read the whole archive, because there are no per-file offsets relative to the start of the archive. That makes it a long operation. Zip is much better here: it has exactly such a table of contents with the offset of each file inside the archive, and the access time for a file does not depend on its position. In our case it was also possible to set the compression option to “0”, because all the files had already been compressed with gzip beforehand.
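The two properties that mattered here, storing members without compression and finding any member through the archive's table of contents, can be tried out with Python's standard zipfile module. This is only an illustration of the idea, not the cron job we run; the member names and helper functions are made up.

```python
import io
import zipfile


def build_archive(files):
    """Pack {name: bytes} into an in-memory zip with no compression (ZIP_STORED)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def member_offsets(archive_bytes):
    """Map each member name to the offset of its local header, read from the central directory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {info.filename: info.header_offset for info in zf.infolist()}


def read_member(archive_bytes, name):
    """Read a single member; zipfile seeks to it instead of scanning the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)


if __name__ == "__main__":
    archive = build_archive({"0453/first": b"a" * 10, "0453/last": b"b" * 20})
    print(member_offsets(archive))
    print(read_member(archive, "0453/last"))
```

With already-gzipped payloads, ZIP_STORED avoids spending CPU on a second compression pass that would save almost nothing.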

Clients fetch files through nginx, and under the old API the file name is specified like this, for example:

http://www.server.com/hydra/20170416/0453/3bd24ae7-1df4-4d76-9d28-5b7fcb7fd8e5

To unpack files on the fly, we found and plugged in the nginx-unzip-module (https://github.com/youzee/nginx-unzip-module) and set up two upstreams.

The result was the following configuration.

The two hosts look like this in the settings:

server {
  listen *:8081;

  location / {
    root      /home/filestorage;
  }
}

server {
  listen *:8082;

  location ~ ^/hydra/(\d+)/(\d+)/(.*)$ {
    root      /home/filestorage;
    file_in_unzip_archivefile "/home/filestorage/hydra/$1/$2.zip";
    file_in_unzip_extract "$2/$3";
    file_in_unzip;
  }
}

And the upstream configuration on the front nginx:

upstream storage {
  server server.com:8081;
  server server.com:8082;
}

How it works:

  • The client goes to the front nginx
  • The front nginx tries to serve the file from the first upstream, i.e. directly from the file system
  • If the file is not there, it tries to serve it from the second upstream, which looks for the file inside the archive.
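The mapping that the second server block performs, from a request URI to an archive path plus a member name, can be spelled out in a few lines. The sketch below mirrors the regular expression and the two file_in_unzip_* values from the configuration above; the function name and the dictionary keys are made up for the illustration, and nothing here talks to nginx.

```python
import re

ROOT = "/home/filestorage"
# Same pattern as in the nginx location: /hydra/<digits>/<digits>/<file name>
HYDRA_RE = re.compile(r"^/hydra/(\d+)/(\d+)/(.*)$")


def candidate_locations(uri):
    """Return where a request could be served from, or None if the URI does not match."""
    match = HYDRA_RE.match(uri)
    if match is None:
        return None
    first, second, name = match.groups()
    return {
        # first upstream: the file as it lies on disk
        "plain_file": f"{ROOT}{uri}",
        # second upstream: "/home/filestorage/hydra/$1/$2.zip" and "$2/$3"
        "archive": f"{ROOT}/hydra/{first}/{second}.zip",
        "member": f"{second}/{name}",
    }


if __name__ == "__main__":
    uri = "/hydra/20170416/0453/3bd24ae7-1df4-4d76-9d28-5b7fcb7fd8e5"
    for key, value in candidate_locations(uri).items():
        print(f"{key:10s} {value}")
```

For the example URL this gives the archive /home/filestorage/hydra/20170416/0453.zip and the member 0453/3bd24ae7-1df4-4d76-9d28-5b7fcb7fd8e5.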

Problem two: “No space left on device” again

This is the second problem we ran into, when there are a lot of files in one directory.
We try to create a file, and the system complains that there is no space. We change the file name and try to create it again.

It works.

Checking the inodes turned up nothing: plenty are free.
Checking the space: same story.
We thought there might be too many files in the directory and that there is a limit on that, but again no: Maximum number of files per directory: ~1.3 × 10^20

And besides, the file can be created if you change its name.
Conclusion: the problem is in the file name.

Further digging showed that the problem lies in the hashing algorithm used to build the directory index; with a large number of files, collisions occur, with all the ensuing consequences. You can read more here: https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Hash_Tree_Directories

You can disable this option, but... looking up a file by name may then take unpredictably long, because every entry in the directory has to be scanned.

 tune2fs -O "^dir_index" /dev/sdb3

In general, this is a temporary workaround that may do the job.

Moral: lots of files in a single directory is generally a bad thing. Don't do that.

Usually in such cases nested directories are created from the first letters of the file name or from other parameters, for example dates; in most cases this saves the day.
But a large total number of small files is still bad, even when they are split across directories: then see problem one.
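A minimal sketch of the nested-directory layout, assuming names like the UUID-style one in the example URL, so that the leading characters are spread evenly. The function name sharded_path and the default of two levels with two characters each are arbitrary choices for the illustration.

```python
import os


def sharded_path(root, filename, levels=2, width=2):
    """Place `filename` under nested directories named after its leading characters."""
    parts = [filename[i * width:(i + 1) * width] for i in range(levels)]
    parts = [p for p in parts if p]  # very short names simply get fewer levels
    return os.path.join(root, *parts, filename)


if __name__ == "__main__":
    print(sharded_path("/home/filestorage", "3bd24ae7-1df4-4d76-9d28-5b7fcb7fd8e5"))
    # /home/filestorage/3b/d2/3bd24ae7-1df4-4d76-9d28-5b7fcb7fd8e5
```

With hexadecimal names, two levels of two characters give 65536 leaf directories, which keeps each directory small but does nothing about the total inode count.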

Problem three: how to list the files when there are a lot of them

In our situation, with this many files, sooner or later we run into the question of how to view the contents of a directory.

The standard solution is the ls command.
OK, let's see what happens with 4772098 files:


$ time ls /home/app/express.repository/offercache/ >/dev/null

real	0m30.203s
user	0m28.327s
sys	0m1.876s

30 seconds... that is a bit much. What's more, most of the time goes to processing the files in user space, not to the kernel at all.

But there is a remedy:


$ time find /home/app/express.repository/offercache/ >/dev/null

real	0m3.714s
user	0m1.998s
sys	0m1.717s

3 seconds. 10 times faster.
Hooray!

UPD. An even faster solution, from user berez: disable sorting in ls


$ time ls -U /home/app/express.repository/offercache/ >/dev/null

real	0m2.985s
user	0m1.377s
sys	0m1.608s
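The same idea carries over to scripts: iterate over directory entries as the file system returns them, without building and sorting one huge list first. A minimal sketch using os.scandir from the Python standard library; the helper names are made up, and no timings are claimed for it.

```python
import os


def iter_names(directory):
    """Yield entry names in whatever order the file system returns them (no sorting)."""
    with os.scandir(directory) as entries:
        for entry in entries:
            yield entry.name


def count_entries(directory):
    """Count entries without keeping all the names in memory."""
    return sum(1 for _ in iter_names(directory))


if __name__ == "__main__":
    print(count_entries("."))
```

Because iter_names is a generator, a consumer can start processing the first names while the rest of the directory is still being read.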

Problem four: high LA when working with files

From time to time you need to copy a pile of files from one machine to another. While that runs, the LA (load average) often grows out of all proportion, because everything is bottlenecked by the performance of the disks themselves.

The most sensible thing you would want to do is use SSDs. That really is great. The only question is the cost of multi-terabyte SSDs.

But what if the disks are ordinary ones, the files have to be copied, and this is a production system where overload leads to unhappy exclamations from customers? There are at least two useful tools: nice and ionice.

nice lowers the priority of a process, so the scheduler hands more time slices to other, higher-priority processes.
In our practice it helped to set nice to the maximum value (19 is the lowest priority, -20 (minus 20) is the highest).

ionice adjusts I/O scheduling priority in the same way.
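When the copy is launched from a script, both tools can simply be prepended to the command. The sketch below only builds the argument list; the rsync arguments and the host name backup are placeholders, and the idle I/O class (-c 3) is one possible choice whose effect depends on the I/O scheduler in use.

```python
import shlex


def low_priority(command, niceness=19, io_class=3):
    """Wrap `command` so that it runs with low CPU priority and the given ionice class."""
    return ["nice", "-n", str(niceness), "ionice", "-c", str(io_class)] + list(command)


# Hypothetical copy job; the paths and host are placeholders.
cmd = low_priority(["rsync", "-a", "/home/filestorage/", "backup:/home/filestorage/"])
print(shlex.join(cmd))
# nice -n 19 ionice -c 3 rsync -a /home/filestorage/ backup:/home/filestorage/
```

The resulting list can be passed as is to subprocess.run.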

If you use RAID and it suddenly needs to resync (after an unclean reboot, or when the array has to be rebuilt after a disk replacement), then in some situations it makes sense to lower the sync speed so that other processes can work more or less adequately. The following command will help:


echo 1000 > /proc/sys/dev/raid/speed_limit_max

Problem five: how to synchronize files in real time

We still have the same large number of files, and they need to be backed up to a second server to avoid... Files are written constantly, so to keep losses minimal they have to be copied as quickly as possible.

The standard solution: rsync over SSH.

It is a good option unless you need to run it every few seconds. And there are a lot of files. Even if you copy nothing, you still have to work out somehow what has changed, and comparing several million files costs time and disk load.

In other words, we need to know right away what has to be copied, without running a comparison every time.

To the rescue: lsyncd. Lsyncd is the Live Syncing (Mirror) Daemon. It also works through rsync, but in addition it watches the file system for changes using inotify and fsevents, and starts copying only the files that have appeared or changed.
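The principle, collecting change events and handing only those paths to rsync instead of comparing whole trees, can be sketched as follows. This is not how lsyncd itself is implemented; the class and function names are made up, and the event source (inotify or similar) is left out entirely. The only real interface relied on is rsync's --files-from option, which with "-" reads the list of paths from standard input.

```python
class ChangeBatcher:
    """Collect changed paths between sync runs, dropping duplicates."""

    def __init__(self):
        self._pending = set()

    def record(self, relative_path):
        # Would be called from a file system event handler (not shown here).
        self._pending.add(relative_path)

    def drain(self):
        """Return the pending paths in sorted order and start a new batch."""
        batch = sorted(self._pending)
        self._pending.clear()
        return batch


def rsync_command(src, dest):
    """rsync invocation that copies only the paths listed on its standard input."""
    return ["rsync", "-a", "--files-from=-", src, dest]


if __name__ == "__main__":
    batcher = ChangeBatcher()
    for path in ["20170416/0453/a", "20170416/0453/b", "20170416/0453/a"]:
        batcher.record(path)
    print(batcher.drain())
    print(rsync_command("/home/filestorage/hydra/", "backup:/home/filestorage/hydra/"))
```

A file written several times between two runs ends up in the batch once, so the amount of work follows the number of changed files rather than the size of the tree.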

Problem six: how to find out who is loading the disks

Probably everyone knows this, but still, for completeness: for monitoring the disk subsystem there is the iotop command. It is like top, but shows the processes that use the disks most heavily.


By the way, good old top also lets you see whether there is a problem with the disks or not. Two figures are the most relevant here: Load Average and IOwait.


The first shows how many processes are waiting in the queue to be served; usually a value above 2 means something is already going wrong. While copying to the backup servers we allow up to 6-8; beyond that the situation is considered abnormal.

The second shows how much of the processor's time goes to waiting on disk operations. IOwait > 10% is cause for concern, although on our servers with their specific load profile it stays at 40-50%, and for them that really is normal.
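Both figures can also be collected without top, for example to feed a monitoring system. The sketch below assumes Linux: it computes IOwait from two samples of the aggregate "cpu" line in /proc/stat, whose fields are user, nice, system, idle, iowait, irq, softirq and steal, followed by guest counters that are already included in user. The function names are made up.

```python
import os


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line from /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples of the 'cpu' line."""
    a = [int(x) for x in before.split()[1:9]]  # user .. steal
    b = [int(x) for x in after.split()[1:9]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    # iowait is the fifth counter on the line
    return 100.0 * deltas[4] / total if total else 0.0


def load_average():
    """1, 5 and 15 minute load averages, the same numbers top shows."""
    return os.getloadavg()
```

Typical use is to call read_cpu_line(), sleep for a few seconds, call it again and pass both lines to iowait_percent; the thresholds to alert on are whatever is normal for the given server.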

I'll stop here, although there are surely many points we have not had to deal with ourselves; I will be glad to see comments and descriptions of interesting real-world cases.

Source: www.habr.com
