Data compression in Apache Ignite. Sber's experience

When working with large volumes of data, a shortage of disk space can sometimes become a problem. One way to address it is compression, which lets you store more data on the same hardware. In this article we look at how data compression works in Apache Ignite. The article covers only the on-disk compression mechanisms implemented in the product. Other kinds of data compression (over the network, in memory), whether implemented or not, are out of scope.

So, with persistence mode enabled, Ignite starts writing the following to disk as data in the caches changes:

  1. Cache contents
  2. The write-ahead log (hereafter simply WAL)

A mechanism for compressing the WAL, called WAL compaction, has existed for quite some time. The recently released Apache Ignite 2.8 introduced two more mechanisms that compress data on disk: disk page compression, for compressing cache contents, and WAL page snapshot compression, for compressing some WAL records. All three mechanisms are described in more detail below.

Disk page compression

How it works

First, let us look very briefly at how Ignite stores data. Storage is based on page memory. The page size is set when the node first starts and cannot be changed later; it must also be a power of two and a multiple of the file system block size. Pages are loaded from disk into RAM as needed, and the amount of data on disk can exceed the amount of RAM allocated. If there is not enough room in RAM to load a page from disk, old pages that are no longer in use are evicted from RAM.

Data is stored on disk as follows: a separate file is created for each partition of each cache group, and within that file pages follow one another in ascending index order. The full page identifier consists of the cache group identifier, the partition number, and the index of the page within the file. So, given the full page identifier, we can uniquely determine the file and the offset within that file for every page. You can read more about page memory in the Apache Ignite Wiki article "Ignite Persistent Store - under the hood".
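
To make the addressing concrete, here is a minimal Java sketch of how a full page identifier could map to a file and an offset. The directory and file naming, the class name, and the method names are my own assumptions for illustration and are not taken from the Ignite code base; the sketch also ignores any file header a real page store may have.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Toy model of locating a page on disk from its full identifier. */
final class PageLocator {
    /** Hypothetical layout: one directory per cache group, one file per partition. */
    static Path partitionFile(Path storeRoot, int cacheGroupId, int partition) {
        return storeRoot.resolve("group-" + cacheGroupId).resolve("part-" + partition + ".bin");
    }

    /** Pages have a fixed size and are stored in index order, so the offset is a plain product. */
    static long pageOffset(long pageIndex, int pageSize) {
        return pageIndex * pageSize; // a real store may add a file header; ignored here
    }

    public static void main(String[] args) {
        Path file = partitionFile(Paths.get("db"), 42, 7);
        System.out.println(file + " @ " + pageOffset(3, 4096));
    }
}
```

The important property is that the offset depends only on the page index and the fixed page size, which is exactly what breaks once pages are allowed to have different sizes.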

The disk page compression mechanism, as the name suggests, works at the page level. When it is enabled, data in RAM is processed as is, without any compression, but pages are compressed at the moment they are saved from RAM to disk.

But compressing each page individually does not solve the problem by itself; we still need to reduce the size of the resulting data files. If the page size is no longer fixed, we can no longer write pages into the file one after another, because that creates a number of problems:

  • Using the page index, we can no longer calculate the offset at which the page is located in the file.
  • It is unclear what to do with pages that are not at the end of the file and change their size. If the page shrinks, the space it frees is lost. If the page grows, a new place in the file has to be found for it.
  • If a page is shifted by a number of bytes that is not a multiple of the file system block size, reading or writing it touches one more file system block, which can degrade performance.

To avoid solving these problems at its own level, disk page compression in Apache Ignite relies on a file system mechanism called sparse files. A sparse file is one in which some zero-filled regions can be marked as "holes". No file system blocks are allocated to store these holes, which saves disk space.

Naturally, to free a file system block, the hole must be at least as large as a block, which places an additional restriction on the page size in Apache Ignite: for compression to have any effect, the page size must be strictly larger than the file system block size. If the page size equals the block size, we can never free a single block, because to free it the compressed page would have to occupy 0 bytes. If the page size equals 2 or 4 blocks, we can already free at least one block when the page compresses to no more than 50% or 75% of its original size, respectively.
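
The arithmetic above can be checked with a few lines of Java. This is only a sketch of the block accounting described in the text; the class and method names are hypothetical and the sizes are example values.

```java
/** Block accounting for one page stored in a sparse file (illustrative only). */
final class BlockSavings {
    /** Number of whole file system blocks a page occupies after compression. */
    static int blocksNeeded(int compressedSize, int blockSize) {
        return (compressedSize + blockSize - 1) / blockSize; // round up to whole blocks
    }

    /** Whole blocks that can be turned into a hole for one page. */
    static int blocksFreed(int pageSize, int blockSize, int compressedSize) {
        return pageSize / blockSize - blocksNeeded(compressedSize, blockSize);
    }

    /** Largest compressed size, as a percentage of the page, that still frees one block. */
    static int maxPercentToFreeOneBlock(int pageSize, int blockSize) {
        return 100 * (pageSize - blockSize) / pageSize;
    }
}
```

With a 4 KB block, an 8 KB page has to shrink to 4096 bytes or less before anything is saved, while a 16 KB page already saves a block at 12288 bytes.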

So, the final description of how the mechanism works: when a page is written to disk, an attempt is made to compress it. If the compressed size allows one or more file system blocks to be freed, the page is written in compressed form and a "hole" is made in place of the freed blocks (by the system call fallocate() with the punch hole flag). If the compressed size does not free any blocks, the page is saved as is, uncompressed. All page offsets are calculated exactly as without compression, by multiplying the page index by the page size. No relocation of pages is required. Page offsets, just as without compression, fall on file system block boundaries.
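
The write decision described above can be sketched as a small planning function. This is a simplified model under my own assumptions (hypothetical class and field names, no file header, no actual I/O), not Ignite's implementation; in the real mechanism the hole would be made with fallocate() and the punch hole flag.

```java
/** Plan for writing one page: what to write, and which region to punch out. */
final class PageWritePlan {
    final long fileOffset;   // where the page starts in the partition file
    final int bytesToWrite;  // payload actually written at that offset
    final long holeOffset;   // start of the region to punch out, or -1 if none
    final int holeLength;    // length of the punched region, 0 if none

    private PageWritePlan(long fileOffset, int bytesToWrite, long holeOffset, int holeLength) {
        this.fileOffset = fileOffset;
        this.bytesToWrite = bytesToWrite;
        this.holeOffset = holeOffset;
        this.holeLength = holeLength;
    }

    static PageWritePlan plan(long pageIndex, int pageSize, int blockSize, int compressedSize) {
        // The offset is computed exactly as without compression.
        long offset = pageIndex * pageSize;
        // Compressed data still occupies whole blocks, so round up to the block size.
        int alignedSize = (compressedSize + blockSize - 1) / blockSize * blockSize;
        if (alignedSize >= pageSize) {
            // No block would be freed: store the page as is, uncompressed.
            return new PageWritePlan(offset, pageSize, -1, 0);
        }
        // Write the compressed bytes and punch a hole over the freed blocks.
        return new PageWritePlan(offset, compressedSize, offset + alignedSize, pageSize - alignedSize);
    }
}
```

Note that the hole always starts on a block boundary inside the page's own slot, so neighbouring pages never move.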


In the current implementation, Ignite can work with sparse files only on Linux; accordingly, disk page compression can be enabled only when running Ignite on that operating system.

Compression algorithms available for disk page compression: ZSTD, LZ4, Snappy. There is also an operating mode (SKIP_GARBAGE) in which only the unused space within a page is discarded, without compressing the remaining data, which reduces the CPU load compared with the algorithms listed above.

Performance impact

Unfortunately, I did not take real performance measurements on real test stands, since we do not plan to use this mechanism in production, but we can reason theoretically about where we lose and where we win.

To do so, we need to recall how pages are read and written when they are accessed:

  • When a read operation is performed, the page is first looked up in RAM; if the lookup fails, the page is loaded into RAM from disk by the same thread that performs the read.
  • When a write operation is performed, the page in RAM is marked dirty, but it is not physically saved to disk right away by the thread performing the write. All dirty pages are saved to disk later, during the checkpoint process, in separate threads.

So the impact on read operations is:

  • Positive (disk IO), because fewer file system blocks are read.
  • Negative (CPU), because of the extra work the operating system has to do to handle sparse files. It is also possible that additional IO operations appear implicitly here to maintain the more complex sparse file structure (unfortunately, I am not familiar with all the details of how sparse files work).
  • Negative (CPU), because pages have to be decompressed.

There is no impact on write operations.

The impact on the checkpoint process (everything here mirrors the read operations):

  • Positive (disk IO), because fewer file system blocks are written.
  • Negative (CPU, possibly disk IO), because of working with sparse files.
  • Negative (CPU), because pages have to be compressed.

Which way will the scales tip? It all depends heavily on the environment, but I am inclined to believe that disk page compression will most likely degrade performance on most systems. Moreover, tests on other DBMSs that use a similar approach with sparse files show a performance drop when compression is enabled.

How to enable and configure

As mentioned above, the minimum Apache Ignite version that supports disk page compression is 2.8, and only Linux is supported. Enable and configure it as follows:

  • The ignite-compression module must be on the classpath. By default it ships in the Apache Ignite distribution in the libs/optional directory and is not included in the classpath. You can simply move the directory one level up, into libs; it is then picked up automatically when the node is started via ignite.sh.
  • Persistence must be enabled (via DataRegionConfiguration.setPersistenceEnabled(true)).
  • The page size must be larger than the file system block size (it can be set with DataStorageConfiguration.setPageSize()).
  • For every cache whose data should be compressed, the compression method and (optionally) the compression level must be configured (methods CacheConfiguration.setDiskPageCompression() and CacheConfiguration.setDiskPageCompressionLevel()).
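
Put together, the steps above might look like the following Java configuration. Treat it as a sketch rather than a recommended setup: it needs ignite-core and the ignite-compression module on the classpath, and the cache name, the 8 KB page size (which assumes a 4 KB file system block), and the ZSTD level are example values I chose for illustration.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.DiskPageCompression;
import org.apache.ignite.configuration.IgniteConfiguration;

final class DiskPageCompressionConfig {
    static IgniteConfiguration build() {
        DataStorageConfiguration storage = new DataStorageConfiguration();
        // Page size must exceed the file system block size (assumed 4 KB here).
        storage.setPageSize(8192);
        // Disk page compression only makes sense with persistence enabled.
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // "compressedCache" is a hypothetical cache name.
        CacheConfiguration<Long, String> cache = new CacheConfiguration<>("compressedCache");
        cache.setDiskPageCompression(DiskPageCompression.ZSTD);
        // Optional; example value, check the javadoc for the range valid for each algorithm.
        cache.setDiskPageCompressionLevel(3);

        return new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cache);
    }
}
```

Because compression is configured per cache, caches that do not need it can stay uncompressed in the same node.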

WAL compaction

How it works

What is the WAL and why is it needed? Very briefly: it is a log that contains all events that eventually change the page store. It is needed primarily for recovery after a crash. Before returning control to the user, every operation must first record an event in the WAL, so that after a failure the log can be replayed and all operations for which the user received a successful response can be restored, even if those operations had not yet been reflected in the page store on disk (as described above, the actual write to the page store is done with some delay, in separate threads, by a process called "checkpointing").

WAL records are divided into logical and physical. Logical records are the keys and values themselves. Physical records reflect changes to pages in the page store. While logical records can be useful in some other cases, physical records are needed only for crash recovery, and only those written since the last successful checkpoint. We will not go into detail here about why it works this way; interested readers can refer to the already mentioned Apache Ignite Wiki article "Ignite Persistent Store - under the hood".

There are often several physical records per logical record. For example, a single put into a cache affects several pages in page memory (the page with the data itself, index pages, free-list pages). In some synthetic tests I found that physical records took up 90% of the WAL file. Yet they are needed only for a short time (by default, the interval between checkpoints is 3 minutes). It would be logical to get rid of this data once it is no longer relevant. That is exactly what the WAL compaction mechanism does: it drops the physical records and compresses the remaining logical records with zip, and the file size shrinks very significantly (sometimes tenfold).

Physically, the WAL consists of several segments (10 by default) of a fixed size (64 MB by default), which are overwritten in a circular fashion. As soon as the current segment fills up, the next segment becomes current, and the filled segment is copied to the archive by a separate thread. WAL compaction works with the archived segments. It also runs as a separate thread that tracks checkpoint progress and starts compressing the archive segments whose physical records are no longer needed.
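
The idea of compaction (drop the physical records, zip the rest) can be illustrated with a toy model built on the standard java.util.zip classes. The Record type, the entry name, and the in-memory representation of a segment are my own simplifications and have nothing to do with Ignite's real WAL format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class WalCompactionSketch {
    /** Hypothetical WAL record: a flag plus an opaque payload. */
    static final class Record {
        final boolean physical;
        final byte[] payload;

        Record(boolean physical, byte[] payload) {
            this.physical = physical;
            this.payload = payload;
        }
    }

    /** Keeps only the logical records of an archived segment. */
    static List<Record> dropPhysical(List<Record> segment) {
        List<Record> logical = new ArrayList<>();
        for (Record r : segment) {
            if (!r.physical) {
                logical.add(r);
            }
        }
        return logical;
    }

    /** Drops physical records and zips what is left into a single archive entry. */
    static byte[] compact(List<Record> segment) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.setLevel(Deflater.BEST_SPEED);
            zip.putNextEntry(new ZipEntry("segment.wal")); // hypothetical entry name
            for (Record r : dropPhysical(segment)) {
                zip.write(r.payload);
            }
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Even in this toy form it is clear where the savings come from: most of the bytes are removed before the compressor ever sees them.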


Performance impact

Because WAL compaction runs as a separate thread, it should have no direct impact on the operations being performed. It does, however, put additional background load on the CPU (compression) and the disk (reading each WAL segment from the archive and writing the compressed segments), so if the system is already running at its limit, it will degrade performance.

How to enable and configure

You can enable WAL compaction with the WalCompactionEnabled property in DataStorageConfiguration (DataStorageConfiguration.setWalCompactionEnabled(true)). You can also use DataStorageConfiguration.setWalCompactionLevel() to set the compression level if the default value (BEST_SPEED) does not suit you.
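
In Java this could look roughly as follows. The sketch needs ignite-core on the classpath; the choice of Deflater.BEST_COMPRESSION is just an example of overriding the default level and assumes the level is taken from the java.util.zip.Deflater constants.

```java
import java.util.zip.Deflater;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

final class WalCompactionConfig {
    static IgniteConfiguration build() {
        DataStorageConfiguration storage = new DataStorageConfiguration();
        // The WAL is only written when persistence is enabled.
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        storage.setWalCompactionEnabled(true);
        // Optional: trade CPU for a smaller archive (example value).
        storage.setWalCompactionLevel(Deflater.BEST_COMPRESSION);

        return new IgniteConfiguration().setDataStorageConfiguration(storage);
    }
}
```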

WAL page snapshot compression

How it works

We have already established that WAL records are divided into logical and physical. For every change to every page, a physical WAL record is generated in page memory. Physical records are in turn divided into two subtypes: page snapshot records and delta records. Every time we change something on a page and move it from the clean state to the dirty state, a full copy of that page is saved in the WAL (a page snapshot record). Even if we changed just one byte, the WAL record will be slightly larger than the page. If we change something on a page that is already dirty, a delta record is written to the WAL; it reflects only the changes relative to the previous state of the page, not the whole page.

Since pages are reset from dirty to clean during a checkpoint, right after a checkpoint starts almost all physical records will be page snapshots (because every page is clean right after a checkpoint starts); then, as the next checkpoint approaches, the share of delta records grows, and it is reset again when the next checkpoint starts. Measurements in some synthetic tests showed that the share of page snapshots in the total volume of physical records reaches 2%.
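
The clean/dirty rule that decides between a snapshot and a delta can be modelled in a few lines. This is a toy state machine with hypothetical names, assuming that a checkpoint start marks every page clean; it does not reproduce Ignite's actual record classes.

```java
import java.util.HashSet;
import java.util.Set;

final class PhysicalRecordSketch {
    enum RecordType { PAGE_SNAPSHOT, DELTA }

    /** Pages changed since the last checkpoint started. */
    private final Set<Long> dirtyPages = new HashSet<>();

    /** First change of a clean page logs the full page; later changes log only a delta. */
    RecordType onPageChange(long pageId) {
        return dirtyPages.add(pageId) ? RecordType.PAGE_SNAPSHOT : RecordType.DELTA;
    }

    /** A checkpoint start resets every page to the clean state. */
    void onCheckpointStart() {
        dirtyPages.clear();
    }
}
```

Right after onCheckpointStart() every touched page produces a snapshot again, which is why snapshots dominate the WAL at the beginning of each checkpoint interval.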

The idea behind WAL page snapshot compression is to compress page snapshots using the ready-made page compression tooling (see disk page compression). Records in the WAL are stored sequentially, in append-only mode, and there is no need to align records to file system block boundaries, so here, unlike disk page compression, sparse files are not needed at all; accordingly, this mechanism works not only on Linux. It also does not matter how much we managed to compress the page. Even if we freed just 1 byte, that is already a positive result and we can store the compressed data in the WAL, unlike disk page compression, where the compressed page is stored only if at least one file system block was freed.
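
The "any saving counts" rule can be shown with a small sketch. It uses java.util.zip.Deflater purely as a stand-in compressor available in the JDK (Ignite itself offers ZSTD, LZ4, and Snappy), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.zip.Deflater;

final class WalSnapshotCompressionSketch {
    /**
     * Returns the compressed page image if it is even one byte smaller than the
     * original, otherwise the original page. No block alignment is involved.
     */
    static byte[] compressForWal(byte[] page) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(page);
            deflater.finish();
            // A buffer of page.length bytes is enough: larger output is useless anyway.
            byte[] buf = new byte[page.length];
            int n = deflater.deflate(buf);
            boolean smaller = deflater.finished() && n < page.length;
            return smaller ? Arrays.copyOf(buf, n) : page;
        } finally {
            deflater.end();
        }
    }
}
```

Compare this with the disk page case, where the same compressed image would be thrown away unless it crossed a block boundary.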

Pages compress well, and their share of the total WAL volume is very high, so we can significantly reduce the WAL size without changing the WAL file format. Compressing the logical records as well would require a format change and a loss of compatibility, for example for external consumers that may be interested in logical records, and it would not shrink the file much further.

As with disk page compression, WAL page snapshot compression can use the ZSTD, LZ4, and Snappy compression algorithms, as well as the SKIP_GARBAGE mode.

Performance impact

As you may have noticed, enabling WAL page snapshot compression directly affects only the threads that write data to page memory, that is, the threads that change data in caches. Physical records are read from the WAL only once, when the node is started after a crash (and only if it crashed during a checkpoint).

The threads that change data are affected as follows: we get a negative effect (CPU) because the page must be compressed each time before it is written to disk, and a positive effect (disk IO) because less data is written. So everything is simple here: if system performance is CPU-bound, we get a slight degradation; if it is bound by disk I/O, we get a gain.

Indirectly, the smaller WAL also (positively) affects the threads that move WAL segments to the archive and the WAL compaction threads.

Real performance tests in our environment on synthetic data showed a small gain (throughput increased by 10%-15%, latency decreased by 10%-15%).

How to enable and configure

Minimum Apache Ignite version: 2.8. Enable and configure it as follows:

  • The ignite-compression module must be on the classpath. By default it ships in the Apache Ignite distribution in the libs/optional directory and is not included in the classpath. You can simply move the directory one level up, into libs; it is then picked up automatically when the node is started via ignite.sh.
  • Persistence must be enabled (via DataRegionConfiguration.setPersistenceEnabled(true)).
  • The compression mode must be set with DataStorageConfiguration.setWalPageCompression(); compression is disabled by default (DISABLED mode).
  • Optionally, the compression level can be set with DataStorageConfiguration.setWalPageCompressionLevel(); see the method's javadoc for the values valid for each mode.
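
A Java configuration following these steps might look like this. It is a sketch that needs ignite-core and the ignite-compression module on the classpath; the choice of ZSTD and the level value are examples, so check the javadoc before relying on them.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.DiskPageCompression;
import org.apache.ignite.configuration.IgniteConfiguration;

final class WalPageCompressionConfig {
    static IgniteConfiguration build() {
        DataStorageConfiguration storage = new DataStorageConfiguration();
        // The WAL is only written when persistence is enabled.
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        // Compress page snapshot records in the WAL; disabled by default.
        storage.setWalPageCompression(DiskPageCompression.ZSTD);
        // Optional; example value, valid ranges depend on the chosen mode.
        storage.setWalPageCompressionLevel(3);

        return new IgniteConfiguration().setDataStorageConfiguration(storage);
    }
}
```

Unlike disk page compression, nothing here depends on the page size or on the operating system.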

Conclusion

The data compression mechanisms in Apache Ignite discussed here can be used independently of one another, but any combination of them is also acceptable. Understanding how they work lets you judge how well they suit your tasks in your environment and what you will have to sacrifice by using them. Disk page compression is designed to compress the main storage and can give a medium compression ratio. WAL page snapshot compression gives a medium compression ratio for WAL files and will most likely even improve performance. WAL compaction will not improve performance, but it reduces the size of WAL files as much as possible by removing physical records.

source: www.habr.com
