Durable Data Storage and Linux File APIs

While researching the durability of data storage in cloud systems, I decided to test myself to make sure I understood the basics. I started by reading the NVMe specification to understand what guarantees NVMe disks give about data persistence (that is, guarantees that data will still be available after a system failure). My main conclusion was this: data must be considered corrupted from the moment a write command is issued until the moment the data has actually been written to the storage medium. Yet most programs quite happily use system calls to write data.

In this post I explore the durability mechanisms provided by the Linux file APIs. It seems as though everything should be simple here: a program calls write(), and once that call completes, the data is safely stored on disk. But write() only copies the application's data into the kernel cache, which lives in RAM. To force the system to write the data to disk, you need some additional mechanisms.


Overall, this post is a collection of notes on what I learned about a topic that interests me. To put the most important point very briefly: to get durable data storage you need to call fdatasync() or open files with the O_DSYNC flag. If you want to learn more about what happens to data on its way from code to disk, see the further reading listed at the end of this post.

Peculiarities of the write() function

The write() system call is defined in the IEEE POSIX standard as an attempt to write data to a file descriptor. After write() completes successfully, reads must return exactly the bytes that were written, even if the data is accessed from other processes or threads (see the relevant section of the POSIX standard). In the section on how threads interact with regular file operations, there is a note that if two threads each call these functions, each call must see either all of the specified effects of the other call or none of them. This leads to the conclusion that all file I/O operations must hold a lock on the resource they are working on.

Does this mean that write() is atomic? Technically, yes. Reads must return either all or nothing of what was written by write(). But according to the standard, write() is not required to write everything it was asked to write. It is allowed to write only part of the data. For example, we might have two threads each appending 1024 bytes to a file described by the same file descriptor. From the standard's point of view, an acceptable outcome is one where each write operation appends only a single byte to the file at a time. The operations remain atomic, but after they complete, the data they wrote to the file will be interleaved. There is a very interesting discussion of this topic on Stack Overflow.
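Because write() may accept only part of the buffer, callers usually loop until everything has been handed to the kernel. The following is a minimal sketch of such a loop; the helper name write_all() is hypothetical and not part of any standard API. Note that it only deals with short writes: the data still sits in the kernel cache afterwards, and the loop does nothing to prevent interleaving with other writers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes from buf to fd,
 * retrying after short writes. Returns 0 on success, -1 on error
 * (errno is left as set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte was written */
            return -1;
        }
        p += n;                 /* skip the bytes the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```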

The fsync() and fdatasync() functions

The simplest way to flush data to disk is to call fsync(). This function asks the operating system to transfer all modified blocks from the cache to disk. This includes all of the file's metadata (access time, modification time, and so on). I believe this metadata is rarely needed, so if you know it is not important to you, you can use fdatasync(). The man page for fdatasync() says that this function flushes to disk only as much metadata as is needed for subsequent data reads to be handled correctly. And that is exactly what most applications care about.
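As an illustration of the write-then-flush pattern, here is a minimal sketch in C. The helper name write_durable() and its error-handling policy are assumptions made for this example. It covers only the file's contents; making a newly created file's directory entry durable is a separate step.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path, then ask the kernel to
 * flush the file data (and only the metadata needed to read it back).
 * Returns 0 on success, -1 on any error. */
int write_durable(const char *path, const void *buf, size_t len)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int rc = (n >= 0 && (size_t)n == len) ? 0 : -1;  /* treat short writes as errors */

    if (rc == 0 && fdatasync(fd) != 0)   /* data is only in the cache until this returns */
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Replacing fdatasync() with fsync() in this sketch would additionally flush metadata such as timestamps.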

One problem that can arise here is that these mechanisms do not guarantee that the file can be found after a failure. In particular, when you create a new file, you need to call fsync() on the directory that contains it. Otherwise, after a failure, it may turn out that the file does not exist. The reason is that in UNIX, because of hard links, a file can exist in several directories. So when you call fsync() on a file, there is no way for it to know which directory data also needs to be flushed to disk. It appears that ext4 is able to apply fsync() automatically to the directories containing the relevant files, but this may not be the case for other file systems.
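A sketch of how a newly created file and its directory entry might both be flushed is shown below. The helper names sync_directory() and create_durable() are hypothetical, and the caller is assumed to pass the directory that actually contains file_path.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Call fsync() on a directory so that the entries of files recently
 * created in it are flushed. Returns 0 on success, -1 on error. */
int sync_directory(const char *dir_path)
{
    assert(dir_path != NULL);
    int dir_fd = open(dir_path, O_RDONLY);  /* a directory can be opened read-only */
    if (dir_fd < 0)
        return -1;
    int rc = (fsync(dir_fd) == 0) ? 0 : -1;
    if (close(dir_fd) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: create file_path (which must not exist yet),
 * write buf to it, flush the file, then flush the containing directory. */
int create_durable(const char *dir_path, const char *file_path,
                   const void *buf, size_t len)
{
    assert(dir_path != NULL && file_path != NULL);
    int fd = open(file_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int rc = (n >= 0 && (size_t)n == len) ? 0 : -1;
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    if (rc != 0)
        return -1;

    return sync_directory(dir_path);        /* make the new name itself durable */
}
```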

This mechanism may be implemented differently in different file systems. I used blktrace to learn which disk operations ext4 and XFS use. Both issue ordinary write commands to the disk for both the file contents and the file system journal, flush the cache, and finish with a FUA (Force Unit Access: writing data directly to the disk, bypassing the cache) write to the journal. They probably do this to confirm that the transaction has committed. On drives that do not support FUA, this causes two cache flushes. My experiments showed that fdatasync() is slightly faster than fsync(). blktrace shows that fdatasync() usually writes less data to disk (on ext4, fsync() writes 20 KiB and fdatasync() writes 16 KiB). I also found that XFS is slightly faster than ext4. Here too, blktrace showed that fdatasync() flushes less data to disk (4 KiB on XFS).

Ambiguous situations that arise when using fsync()

I can think of three ambiguous situations involving fsync() that I have run into in practice.

The first happened in 2008. At that time the Firefox 3 interface would freeze when a large number of files were being written to disk. The problem was that the interface implementation used an SQLite database to store its state. After every change in the interface, fsync() was called, which gave good durability guarantees. On the ext3 file system in use at the time, fsync() flushed all "dirty" pages in the system to disk, not just those related to the file in question. This meant that clicking a button in Firefox could trigger megabytes of data being written to a magnetic disk, which could take many seconds. The solution, as far as I understand from the write-up of the problem, was to move the database work to asynchronous background tasks. This means that Firefox had previously been enforcing stricter durability requirements than it really needed, and the behavior of ext3 only made the problem worse.

The second problem occurred in 2009. After a system crash, users of the new ext4 file system found that many recently created files had zero length, which did not happen with the older ext3. In the previous paragraph I described how ext3 flushed too much data to disk, which slowed fsync() down considerably. To improve the situation, ext4 flushes only the dirty pages that belong to the specific file. Data from other files stays in memory much longer than with ext3. This was done to improve performance (by default data stays in this state for 30 seconds; you can tune this with dirty_expire_centisecs). It means that a large amount of data can be irrecoverably lost after a failure. The solution is to use fsync() in applications that need durable storage and need to protect themselves as much as possible from the consequences of failures. fsync() works much more efficiently on ext4 than on ext3. The downside of this approach is that, as before, it slows down some operations, such as installing programs.

The third problem with fsync() surfaced in 2018. Within the PostgreSQL project it was discovered that if fsync() encounters an error, it marks the "dirty" pages as "clean". As a result, subsequent fsync() calls do nothing with those pages. The modified pages therefore stay in memory and are never written to disk. This is a real disaster, because the application will believe that some data has been written to disk when in fact it has not. Such fsync() failures are rare, and an application can do almost nothing about the problem when they happen. These days, when this occurs, PostgreSQL and other applications crash. The paper "Can Applications Recover from fsync Failures?" examines the problem in detail. Currently the best solution is to use Direct I/O with the O_SYNC or O_DSYNC flag. With this approach the system reports errors that occur during specific write operations, but it requires the application to manage its buffers itself.

Opening files with the O_SYNC and O_DSYNC flags

Let's return to the discussion of the Linux mechanisms that provide durable storage. Namely, using the O_SYNC or O_DSYNC flag when opening files with the open() system call. With this approach, each write operation is performed as if every write() call were followed by fsync() or fdatasync(), respectively. The POSIX specification calls these "Synchronized I/O File Integrity Completion" and "Synchronized I/O Data Integrity Completion". The main advantage of this approach is that you only need one system call to ensure data integrity instead of two (for example, write() and fdatasync()). The main disadvantage is that every write through the corresponding file descriptor is synchronized, which can limit how you structure the application code.

Using Direct I/O with the O_DIRECT flag

The open() system call supports the O_DIRECT flag, which is designed to bypass the operating system cache and perform I/O by interacting directly with the disk. In many cases this means that write commands issued by the program are translated directly into commands for the disk. In general, however, this mechanism is not a replacement for fsync() or fdatasync(). The disk itself may defer or cache the write commands. Worse still, in some special cases I/O performed with O_DIRECT is turned into traditional buffered operations. The easiest way to solve this problem is to also open files with the O_DSYNC flag, which means that each write is effectively followed by a call to fdatasync().
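The sketch below combines O_DIRECT with O_DSYNC. Direct I/O generally requires the buffer, the offset, and the length to be aligned; the 4096-byte value used here is an assumption for illustration, and real code should query the device or file system for its requirements. The helper name write_block_direct() is hypothetical. Some file systems reject O_DIRECT entirely, in which case open() fails and the function returns -1.

```c
#define _GNU_SOURCE             /* O_DIRECT is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DIRECT_IO_ALIGN 4096    /* assumed alignment, not queried from the device */

/* Hypothetical helper: write one zero-padded, aligned block containing
 * data at offset 0 of path, bypassing the page cache and waiting for
 * the data to be durable. Returns 0 on success, -1 on error. */
int write_block_direct(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL && len <= DIRECT_IO_ALIGN);

    void *block = NULL;
    if (posix_memalign(&block, DIRECT_IO_ALIGN, DIRECT_IO_ALIGN) != 0)
        return -1;
    memset(block, 0, DIRECT_IO_ALIGN);
    memcpy(block, data, len);

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT | O_DSYNC, 0644);
    if (fd < 0) {
        free(block);
        return -1;
    }

    ssize_t n = pwrite(fd, block, DIRECT_IO_ALIGN, 0);
    int rc = (n == DIRECT_IO_ALIGN) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    free(block);
    return rc;
}
```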

It turns out that XFS recently added a "fast path" for O_DIRECT|O_DSYNC writes. If a block is overwritten using O_DIRECT|O_DSYNC, then XFS, instead of flushing the cache, issues a FUA write command if the device supports it. I verified this using blktrace on a Linux 5.4/Ubuntu 20.04 system. This approach should be more efficient, since it writes the minimum amount of data to disk and uses one operation instead of two (a write and a cache flush). I found a link to the 2018 kernel patch that implements this mechanism. There is some discussion there about applying this optimization to other file systems, but as far as I know, XFS is the only file system that supports it so far.

The sync_file_range() function

Linux has the sync_file_range() system call, which lets you flush only part of a file to disk rather than the whole file. This call starts an asynchronous flush and does not wait for it to complete. However, the man page for sync_file_range() describes the call as "extremely dangerous" and discourages its use. The peculiarities and dangers of sync_file_range() are described very well in a separate write-up. Notably, RocksDB appears to use this call to control when the kernel flushes dirty data to disk, while still using fdatasync() to guarantee durability. The RocksDB code has some interesting comments on the subject. For example, it appears that sync_file_range() does not flush data to disk when used with ZFS. Experience tells me that rarely used code is likely to contain bugs, so I would advise against using this system call unless it is absolutely necessary.

System calls that help ensure data durability

I have come to the conclusion that there are three approaches for performing I/O that ensures data durability. All of them require calling fsync() on the directory in which the file was created. The approaches are:

  1. Calling fdatasync() or fsync() after write() (fdatasync() is preferable).
  2. Working with a file descriptor opened with the O_DSYNC or O_SYNC flag (O_DSYNC is preferable).
  3. Using pwritev2() with the RWF_DSYNC or RWF_SYNC flag (RWF_DSYNC is preferable).

Performance notes

I have not carefully measured the performance of the various mechanisms I examined. The differences I noticed in their speed are very small. This means that I could be wrong, and that under different conditions the same thing may give different results. I will first cover what affects performance more, and then what affects it less.

  1. Overwriting file data is faster than appending data to a file (the performance gain can be 2-100%). Appending data to a file requires additional changes to the file's metadata, even after a fallocate() system call, though the size of this effect may vary. For best performance I recommend calling fallocate() to preallocate the required space. This space should then be explicitly filled with zeros and fsync() called. This ensures that the corresponding blocks in the file system are marked as "allocated" rather than "unallocated". This gives a small (about 2%) performance improvement. In addition, some disks may have a slower first access to a block than subsequent ones. This means that zero-filling the space can lead to a significant (about 100%) performance improvement. In particular, this can happen with AWS EBS disks (this is unofficial information; I could not confirm it). The same goes for GCP Persistent Disk storage (and this is official information, confirmed by tests). Other experts have made the same observation about various disks.
  2. The fewer system calls, the higher the performance (the gain can be about 5%). It looks like calling open() with the O_DSYNC flag or calling pwritev2() with the RWF_SYNC flag is faster than calling fdatasync(). I suspect the reason is that fewer system calls are needed to accomplish the same task (one call instead of two). But the performance difference is very small, so you can ignore it entirely and use whatever fits your application's logic best.
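The preallocate, zero-fill, and flush sequence from the first item could be sketched like this. The example swaps in posix_fallocate(), the portable POSIX wrapper, in place of the Linux-specific fallocate() mentioned above; the helper name preallocate_zeroed() and the 4096-byte chunk size are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserve size bytes for fd, overwrite them with
 * zeros, and flush, so that later overwrites touch blocks that have
 * already been written once. Returns 0 on success, -1 on error. */
int preallocate_zeroed(int fd, off_t size)
{
    assert(size > 0);

    /* posix_fallocate() returns an error number rather than -1. */
    if (posix_fallocate(fd, 0, size) != 0)
        return -1;

    char zeros[4096];
    memset(zeros, 0, sizeof zeros);

    off_t done = 0;
    while (done < size) {
        size_t chunk = sizeof zeros;
        if ((off_t)chunk > size - done)
            chunk = (size_t)(size - done);   /* final, shorter chunk */

        ssize_t n = pwrite(fd, zeros, chunk, done);
        if (n <= 0)
            return -1;
        done += n;                           /* also handles short writes */
    }

    return (fsync(fd) == 0) ? 0 : -1;
}
```

After this call, durable writes into the reserved range are overwrites rather than appends, which is the case the first item describes as faster.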

If you are interested in the topic of durable data storage, here are some useful resources:

  • I/O Access Methods - an overview of the fundamentals of input/output mechanisms.
  • Ensuring data reaches disk - the story of what happens to data on its way from the application to the disk.
  • When should you fsync the containing directory - an answer to the question of when to call fsync() on directories. In short, you need to do this when creating a new file, and the reason for the recommendation is that in Linux there can be many references to the same file.
  • SQL Server on Linux: FUA Internals - a description of how durable storage is implemented in SQL Server on Linux. There are some interesting comparisons between Windows and Linux system calls here. I am almost sure that it was thanks to this material that I learned about the XFS FUA optimization.

Have you ever lost data that you thought was safely stored on disk?


Source: www.habr.com