How to increase read speed from HBase up to 3 times and from HDFS up to 5 times

High performance is one of the key requirements when working with big data. In the data loading department at Sberbank, we pump almost all transactions into our Hadoop-based Data Cloud and therefore deal with really large flows of information. Naturally, we are always looking for ways to improve performance, and now we want to tell you how we managed to patch the HBase RegionServer and the HDFS client, which allowed us to significantly increase the speed of read operations.

However, before moving on to the essence of the improvement, it is worth mentioning the limitations that, in principle, cannot be circumvented if you sit on HDD.

Why HDD and fast Random Access reads are incompatible
As you know, HBase, like many other databases, stores data in blocks of several tens of kilobytes. By default the block size is about 64 KB. Now imagine that we need only 100 bytes and we ask HBase to give us this data by a certain key. Since the block size in HFiles is 64 KB, about 640 times more data will be requested (just for a moment!) than we actually need.

Next, since the request goes through HDFS and its metadata caching mechanism ShortCircuitCache (which allows direct access to files), this already leads to reading 1 MB from disk. However, this can be adjusted with the parameter dfs.client.read.shortcircuit.buffer.size, and in many cases it makes sense to reduce this value, for example to 128 KB.

Let's say we do this. In addition, when we start reading data through the Java API, with functions such as FileChannel.read, and ask the operating system to read the specified amount of data, it reads 2 times more "just in case", i.e. 256 KB in our case. This is because Java does not have an easy way to set the FADV_RANDOM flag to prevent this behavior.

As a result, to get our 100 bytes, about 2600 times more data is read under the hood. It would seem that the solution is obvious: reduce the block size to a kilobyte, set the flag mentioned above and get a great speedup. But the trouble is that by halving the block size we also halve the number of bytes read per unit of time.

Some gain from setting the FADV_RANDOM flag can be obtained, but only with a large number of threads and a block size of 128 KB, and it is at most a few tens of percent.

The tests were carried out on 100 files, each 1 GB in size, located on 10 HDDs.

Let's calculate what we can, in principle, count on at this speed:
Suppose we read from 10 disks at 280 MB/sec, i.e. about 3 million times 100 bytes. But as we remember, the data we need is 2600 times smaller than what is actually read. So we divide 3 million by 2600 and get about 1100 records per second.
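
The estimate above can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the constants are the ones quoted in the text (100 bytes needed, 256 KB actually read, 280 MB/sec of raw throughput).

```java
// Back-of-the-envelope estimate of random-read throughput on HDD.
// All names here are illustrative; the numbers come from the article text.
class ReadAmplification {

    /** How many times more bytes are read from disk than the caller asked for. */
    static long amplification(long bytesReadFromDisk, long bytesNeeded) {
        return bytesReadFromDisk / bytesNeeded;
    }

    /** Useful records per second, given raw disk throughput and bytes read per record. */
    static long recordsPerSecond(long diskBytesPerSecond, long bytesReadPerRecord) {
        return diskBytesPerSecond / bytesReadPerRecord;
    }

    public static void main(String[] args) {
        long needed = 100;                         // bytes the client actually wants
        long readPerRequest = 256 * 1024;          // 128 KB buffer doubled by read-ahead
        long diskThroughput = 280L * 1000 * 1000;  // 280 MB/sec across 10 disks

        System.out.println("amplification: " + amplification(readPerRequest, needed));
        System.out.println("records/sec:   " + recordsPerSecond(diskThroughput, readPerRequest));
    }
}
```

With exact binary kilobytes the program gives slightly different figures than the rounded 2600 and 1100 used in the text, but the order of magnitude is the same.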

Depressing, isn't it? This is the nature of Random Access to data on HDD, regardless of block size. It is the physical limit of random access, and no database can squeeze out more under such conditions.

So how do databases achieve much higher speeds? To answer this question, let's look at what is happening in the following picture:

Here we see that for the first few minutes the speed really is about a thousand records per second. Later, however, because much more is read than was requested, the data ends up in the buff/cache of the operating system (Linux) and the speed rises to a more decent 60 thousand per second.

So from here on we will deal with speeding up access only to data that is in the OS cache or on storage devices of comparable speed, such as SSD/NVMe.

In our case, we will run the tests on a bench of 4 servers, each configured as follows:

CPU: Xeon E5-2680 v4 @ 2.40GHz 64 threads.
Memory: 730 GB.
java version: 1.8.0_111

And here the key point is the amount of data in the tables that needs to be read. If you read data from a table that fits entirely into the HBase cache, it will not even come to reading from the operating system's buff/cache, because HBase by default allocates 40% of its memory to a structure called BlockCache. Essentially this is a ConcurrentHashMap, where the key is the file name plus the block offset, and the value is the actual data at that offset.
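
To make the description of BlockCache concrete, here is a minimal sketch of such a structure. It is not the HBase implementation: the class SimpleBlockCache and its Key type are hypothetical and only illustrate the idea of a concurrent map keyed by file name and block offset.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified stand-in for a block cache: a concurrent map keyed by (file name, offset).
class SimpleBlockCache {

    /** Hypothetical cache key: HFile name plus the block's offset within that file. */
    static final class Key {
        final String fileName;
        final long offset;

        Key(String fileName, long offset) {
            this.fileName = fileName;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Key that = (Key) other;
            return offset == that.offset && fileName.equals(that.fileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(fileName, offset);
        }
    }

    private final ConcurrentMap<Key, byte[]> blocks = new ConcurrentHashMap<>();

    void cacheBlock(String fileName, long offset, byte[] data) {
        blocks.put(new Key(fileName, offset), data);
    }

    /** Returns the cached block, or null on a cache miss. */
    byte[] getBlock(String fileName, long offset) {
        return blocks.get(new Key(fileName, offset));
    }

    public static void main(String[] args) {
        SimpleBlockCache cache = new SimpleBlockCache();
        cache.cacheBlock("hfile-1", 65536L, new byte[64 * 1024]);
        System.out.println("hit:  " + (cache.getBlock("hfile-1", 65536L) != null));
        System.out.println("miss: " + (cache.getBlock("hfile-1", 0L) == null));
    }
}
```

A real block cache also needs size accounting and an eviction policy, which is exactly where the problems discussed in this article come from.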

Thus, when reading only from this structure, we see excellent speed, on the order of a million requests per second. But let's imagine that we cannot allocate hundreds of gigabytes of memory just for database needs, because many other useful things are running on these servers.

For example, in our case the BlockCache on one RS is about 12 GB. We placed two RS on each node, i.e. 96 GB of BlockCache are allocated across all nodes. And there is many times more data: for example, let it be 4 tables of 130 regions each, in which the files are 800 MB in size, compressed with FAST_DIFF, i.e. about 410 GB in total (this is pure data, i.e. without taking the replication factor into account).

Thus, BlockCache is only about 23% of the total data volume, and this is much closer to the real conditions of what is called BigData. And this is where the fun begins, because obviously the fewer cache hits, the worse the performance. After all, on a miss you have to do a lot of work, i.e. go all the way down to calling system functions. This cannot be avoided, so let's look at a completely different aspect: what happens to the data inside the cache?

Let's simplify the situation and assume that we have a cache that fits only 1 object. Here is what will happen when we try to work with a data volume 3 times larger than the cache. We will have to:

1. Put block 1 into the cache
2. Remove block 1 from the cache
3. Put block 2 into the cache
4. Remove block 2 from the cache
5. Put block 3 into the cache

5 actions completed! However, this situation cannot be called normal; in fact, we are forcing HBase to do a lot of completely useless work. It constantly reads data from the OS cache, places it in BlockCache, only to throw it out almost immediately because a new portion of data has arrived. The animation at the beginning of the post shows the essence of the problem: the Garbage Collector is going off the scale, the atmosphere is heating up, and little Greta in distant and hot Sweden is getting upset. And we IT people really do not like it when children are sad, so we start thinking about what we can do about it.
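
The same effect can be reproduced with a toy LRU cache. The sketch below is not HBase code: it uses a plain LinkedHashMap in access order as a stand-in for BlockCache and counts hits, puts and evictions while a set of blocks larger than the cache is scanned cyclically. All names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache used to count useless put/evict work when the data does not fit.
class CacheThrashDemo {

    static final class Stats {
        long hits;
        long puts;
        long evictions;
    }

    /** Scans blocks 0..distinctBlocks-1 in order, 'passes' times, through an LRU cache. */
    static Stats scan(final int capacity, int distinctBlocks, int passes) {
        final Stats stats = new Stats();
        Map<Integer, byte[]> lru = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                boolean evict = size() > capacity;
                if (evict) {
                    stats.evictions++;
                }
                return evict;
            }
        };
        for (int pass = 0; pass < passes; pass++) {
            for (int block = 0; block < distinctBlocks; block++) {
                if (lru.get(block) != null) {
                    stats.hits++;
                } else {
                    lru.put(block, new byte[0]); // miss: "read" the block and cache it
                    stats.puts++;
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        Stats s = scan(1, 3, 1); // the one-slot cache and three blocks from the text
        System.out.println("hits=" + s.hits + " puts=" + s.puts + " evictions=" + s.evictions);
    }
}
```

For the one-slot cache and three blocks this gives three puts and two evictions with no hits, which is the five-step sequence listed above. Increasing the number of passes does not help: as long as the working set is scanned cyclically and does not fit, every access is a miss.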

What if we put into the cache not all blocks, but only a certain percentage of them, so that the cache does not overflow? Let's start by simply adding a few lines of code to the beginning of the function that puts data into BlockCache:

  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
      if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
        return;
      }
    }
...

The point here is the following: the offset is the position of the block in the file, and its last two digits are random and evenly distributed from 00 to 99. Therefore, we let through only those blocks whose offset falls into the range we need, and skip caching for the rest.
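
A quick way to convince yourself that this filter lets through roughly the requested share of blocks is to run it over random offsets. The sketch below assumes uniformly random non-negative offsets, which is a simplification of real HFile block offsets; shouldCache mirrors the condition from the patch, and the rest of the names are hypothetical.

```java
import java.util.Random;

// Checks what share of blocks passes the "offset % 100 < percent" filter.
class OffsetFilterDemo {

    /** Same decision as in the patch: cache the block only if its offset falls in range. */
    static boolean shouldCache(long offset, int cacheDataBlockPercent) {
        return cacheDataBlockPercent == 100 || offset % 100 < cacheDataBlockPercent;
    }

    /** Share (in percent) of random offsets that would be cached. Offsets are an assumption. */
    static double cachedShare(int cacheDataBlockPercent, int samples, long seed) {
        Random random = new Random(seed);
        int cached = 0;
        for (int i = 0; i < samples; i++) {
            long offset = random.nextLong() >>> 1; // drop the sign bit: non-negative offset
            if (shouldCache(offset, cacheDataBlockPercent)) {
                cached++;
            }
        }
        return 100.0 * cached / samples;
    }

    public static void main(String[] args) {
        System.out.printf("cached share at 20%%: %.2f%%%n", cachedShare(20, 1_000_000, 42L));
    }
}
```

Because the decision depends only on the offset, the same block is either always cached or never cached, so repeated reads of a cached block still hit.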

For example, let's set cacheDataBlockPercent = 20 and see what happens:

The result is obvious. From the graphs below it becomes clear why such a speedup occurred: we save a lot of GC resources by not doing the Sisyphean work of placing data in the cache only to immediately throw it down the drain to the Martian dogs:

At the same time, CPU utilization grows, but much less than throughput:

It is also worth noting that the blocks stored in BlockCache are different. Most of them, about 95%, are the data itself. The rest is metadata, such as Bloom filters or LEAF_INDEX, etc. This data is small but very useful, because before accessing the data directly HBase turns to the meta to understand whether it needs to search here any further and, if so, where exactly the block of interest is located.

That is why the code contains the check buf.getBlockType().isData(): thanks to it, the meta is left in the cache in any case.

Now let's increase the load and tighten up the feature a little at the same time. In the first test we set the cutoff percentage to 20 and BlockCache was slightly underutilized. Now let's set it to 23% and add 100 threads every 5 minutes to see at what point saturation occurs:

Here we see that the original version almost immediately hits the ceiling at about 100 thousand requests per second, whereas the patch gives a speedup of up to 300 thousand. At the same time, it is clear that further acceleration is no longer so "free": CPU utilization grows as well.

However, this is not a very elegant solution, since we do not know in advance what percentage of blocks should be cached; it depends on the load profile. Therefore, a mechanism was implemented to adjust this parameter automatically depending on the activity of read operations.

Three options were added to control this:

hbase.lru.cache.heavy.eviction.count.limit - sets how many times the process of evicting data from the cache has to run before we start using the optimization (i.e. skipping blocks). By default it is equal to MAX_INT = 2147483647, which in fact means that the feature will never start working with this value, because the eviction process starts every 5 - 10 seconds (depending on the load) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. However, we can set this parameter to 0 and make the feature work immediately after startup.

There is also a practical use for this parameter. If our load is such that short-term reads (say, during the day) and long-term reads (at night) constantly alternate, we can make sure that the feature is turned on only when long read operations are in progress.

For example, we know that short-term reads usually last about 1 minute. There is no need to start throwing blocks away, since the cache will not have time to become stale, so we can set this parameter to, for example, 10. Then the optimization will start working only when long-term active reading has begun, i.e. after 100 seconds. Thus, during a short-term read all blocks get into the cache and are available (except for those evicted by the standard algorithm). And during long-term reads the feature turns on and we get much higher performance.

hbase.lru.cache.heavy.eviction.mb.size.limit - sets how many megabytes we would like to place into the cache (and, of course, evict) in 10 seconds. The feature will try to reach this value and maintain it. The point is this: if we push gigabytes into the cache, we will have to evict gigabytes, and this, as we saw above, is very expensive. However, you should not set it too low either, as this will cause the block skip mode to exit prematurely. For powerful servers (about 20-40 physical cores) it is optimal to set about 300-400 MB. For the middle class (~10 cores), 200-300 MB. For weak systems (2-5 cores), 50-100 MB may be fine (not tested on such systems).

Let's see how this works: say we set hbase.lru.cache.heavy.eviction.mb.size.limit = 500, there is some load (reading), and every ~10 seconds we calculate how many bytes were evicted using the formula:

Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100;

If 2000 MB were actually evicted, then Overhead is equal to:

2000 * 100 / 500 - 100 = 300%

The algorithm tries to keep the overhead at no more than a few tens of percent, so the feature will reduce the percentage of cached blocks, thereby implementing an auto-tuning mechanism.

However, if the load drops, say only 200 MB are evicted, Overhead becomes negative (the so-called overshooting):

200 * 100 / 500 - 100 = -60%

In this case, on the contrary, the feature will increase the percentage of cached blocks until Overhead becomes positive.
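
The two worked examples above can be checked with a tiny helper. The class and method names are hypothetical; the expression itself follows the Overhead formula given in the text, using integer arithmetic.

```java
// Overhead = freed (MB) * 100 / limit (MB) - 100, as defined in the text.
class EvictionOverhead {

    static int overheadPercent(long freedMb, long limitMb) {
        return (int) (freedMb * 100 / limitMb) - 100;
    }

    public static void main(String[] args) {
        // Positive overhead: more was evicted than the limit allows.
        System.out.println("2000 MB of 500 MB limit: " + overheadPercent(2000, 500) + "%");
        // Negative overhead (overshooting): less was evicted than the limit.
        System.out.println("200 MB of 500 MB limit:  " + overheadPercent(200, 500) + "%");
    }
}
```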

Below is an example of how this looks on real data. There is no need to try to reach 0%, it is impossible. It is very good when the overhead is about 30 - 100%; this helps to avoid a premature exit from the optimization mode during short-term surges.

hbase.lru.cache.heavy.eviction.overhead.coefficient - sets how quickly we would like to get the result. If we know for sure that our reads are mostly long and we do not want to wait, we can increase this coefficient and get high performance sooner.

For example, we set this coefficient = 0.01. This means that Overhead (see above) is multiplied by this number, and the percentage of cached blocks is reduced by the result. Suppose Overhead = 300% and coefficient = 0.01; then the percentage of cached blocks will be reduced by 3%.

Similar "Backpressure" logic is also implemented for negative Overhead (overshooting) values. Since short-term fluctuations in the volume of reads and evictions are always possible, this mechanism helps to avoid a premature exit from the optimization mode. Backpressure has inverted logic: the stronger the overshooting, the more blocks are cached.

Implementation code

        LruBlockCache cache = this.cache.get();
        if (cache == null) {
          break;
        }
        freedSumMb += cache.evict()/1024/1024;
        /*
        * Sometimes we are reading more data than can fit into BlockCache
        * and it is the cause of a high rate of evictions.
        * This in turn leads to heavy Garbage Collector work.
        * So a lot of blocks are put into BlockCache but never read,
        * while spending a lot of CPU resources.
        * Here we analyze how many bytes were freed and
        * decide whether the time has come to reduce the amount of cached blocks.
        * It helps avoid putting too many blocks into BlockCache
        * when evict() is very active, and saves CPU for other jobs.
        * More details: https://issues.apache.org/jira/browse/HBASE-23887
        */

        // First of all we have to control how much time
        // has passed since the previous evict() was launched.
        // It should be almost the same time each period (about 10s),
        // because then we get comparable volumes of freed bytes each time.
        // 10s because this is the default period to run evict() (see above this.wait)
        long stopTime = System.currentTimeMillis();
        if ((stopTime - startTime) > 1000 * 10 - 1) {
          // Here we have to calculate what situation we have got.
          // We have the limit "hbase.lru.cache.heavy.eviction.mb.size.limit"
          // and can calculate the overhead on it.
          // We will use this information to decide
          // how to change the percent of caching blocks.
          freedDataOverheadPercent =
            (int) (freedSumMb * 100 / cache.heavyEvictionMbSizeLimit) - 100;
          if (freedSumMb > cache.heavyEvictionMbSizeLimit) {
            // Now we are in the situation when we are above the limit
            // But maybe we are going to ignore it because it will end quite soon
            heavyEvictionCount++;
            if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
              // This has been going on for a long time and we have to reduce
              // caching now. So we calculate here how many blocks we want to skip.
              // It depends on:
              // 1. Overhead - if the overhead is big we can be more aggressive
              // in reducing the amount of cached blocks.
              // 2. How fast we want to get the result. If we know that our
              // heavy reading lasts for a long time, we don't want to wait and can
              // increase the coefficient and get good performance quite soon.
              // But if we are not sure, we can do it slowly, which could prevent
              // premature exit from this mode. So, when the coefficient is
              // higher we can get better performance when heavy reading is stable.
              // But when reading is changing we can adjust to it and set
              // the coefficient to a lower value.
              int change =
                (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient);
              // But practice shows that reducing by 15% per step is quite enough.
              // We are not greedy (it could lead to premature exit).
              change = Math.min(15, change);
              change = Math.max(0, change); // I think it will never happen but check for sure
              // So this is the key point, here we are reducing % of caching blocks
              cache.cacheDataBlockPercent -= change;
              // If we go down too deep we have to stop here; at least 1% should remain.
              cache.cacheDataBlockPercent = Math.max(1, cache.cacheDataBlockPercent);
            }
          } else {
            // Well, we have got overshooting.
            // Maybe it is just a short-term fluctuation and we can stay in this mode.
            // It helps avoid premature exit during short-term fluctuations.
            // If overshooting is less than 90%, we will try to increase the percent of
            // caching blocks and hope it is enough.
            if (freedSumMb >= cache.heavyEvictionMbSizeLimit * 0.1) {
              // Simple logic: more overshooting - more caching blocks (backpressure)
              int change = (int) (-freedDataOverheadPercent * 0.1 + 1);
              cache.cacheDataBlockPercent += change;
              // But it can't be more than 100%, so check it.
              cache.cacheDataBlockPercent = Math.min(100, cache.cacheDataBlockPercent);
            } else {
              // Looks like heavy reading is over.
              // Just exit from this mode.
              heavyEvictionCount = 0;
              cache.cacheDataBlockPercent = 100;
            }
            }
          }
          LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " +
            "heavy eviction counter: {}, " +
            "current caching DataBlock (%): {}",
            freedSumMb, freedDataOverheadPercent,
            heavyEvictionCount, cache.cacheDataBlockPercent);

          freedSumMb = 0;
          startTime = stopTime;
       }
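
To experiment with this logic outside of HBase, the decision step can be extracted into a small standalone class. This is only a sketch that mirrors the fragment above: HeavyEvictionTuner and its method names are hypothetical, the eviction thread and timing are left out, and the coefficient is taken as a double.

```java
// Standalone sketch of the auto-tuning step: one call per eviction period (~10 s).
class HeavyEvictionTuner {
    private final long limitMb;       // hbase.lru.cache.heavy.eviction.mb.size.limit
    private final int countLimit;     // hbase.lru.cache.heavy.eviction.count.limit
    private final double coefficient; // hbase.lru.cache.heavy.eviction.overhead.coefficient

    private int heavyEvictionCount = 0;
    private int cacheDataBlockPercent = 100;

    HeavyEvictionTuner(long limitMb, int countLimit, double coefficient) {
        this.limitMb = limitMb;
        this.countLimit = countLimit;
        this.coefficient = coefficient;
    }

    /** Feeds the volume freed during the last period; returns the new caching percent. */
    int onEvictionPeriod(long freedMb) {
        int overhead = (int) (freedMb * 100 / limitMb) - 100;
        if (freedMb > limitMb) {
            // Above the limit: after enough heavy periods, cache fewer data blocks.
            heavyEvictionCount++;
            if (heavyEvictionCount > countLimit) {
                int change = (int) (overhead * coefficient);
                change = Math.max(0, Math.min(15, change));
                cacheDataBlockPercent = Math.max(1, cacheDataBlockPercent - change);
            }
        } else if (freedMb >= limitMb * 0.1) {
            // Overshooting, but not by much: backpressure, cache more blocks again.
            int change = (int) (-overhead * 0.1 + 1);
            cacheDataBlockPercent = Math.min(100, cacheDataBlockPercent + change);
        } else {
            // Heavy reading seems to be over: leave the optimization mode.
            heavyEvictionCount = 0;
            cacheDataBlockPercent = 100;
        }
        return cacheDataBlockPercent;
    }

    public static void main(String[] args) {
        HeavyEvictionTuner tuner = new HeavyEvictionTuner(200, 0, 0.01);
        long[] freedMb = {2170, 3763, 3306, 2508};
        for (long freed : freedMb) {
            System.out.println("freed " + freed + " MB -> caching " + tuner.onEvictionPeriod(freed) + "%");
        }
    }
}
```

The values in main are the first four heavy periods from the RegionServer log shown later in this article. A limit of 200 MB and a coefficient of 0.01 are not stated in the text; they are inferred from the overhead and percentage values in that log, and with them the sketch yields the same 91, 76, 61, 50 sequence.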

Now let's look at all this using a real example. We have the following test script:

  1. Start running Scan (25 threads, batch = 100)
  2. After 5 minutes, add multi-gets (25 threads, batch = 5)
  3. After 5 minutes, turn off multi-gets (only the scan remains again)

We do two runs: first with hbase.lru.cache.heavy.eviction.count.limit = 10000 (which effectively disables the feature), and then with limit = 0 (which enables it).

In the logs below we see how the feature turns on and brings the overhead down to 14-71%. From time to time the load decreases, which turns on Backpressure, and HBase caches more blocks again.

RegionServer log
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 2170, ratio 1.09, overhead (%): 985, heavy eviction counter: 1, current caching DataBlock (%): 91 < start
evicted (MB): 3763, ratio 1.08, overhead (%): 1781, heavy eviction counter: 2, current caching DataBlock (%): 76
evicted (MB): 3306, ratio 1.07, overhead (%): 1553, heavy eviction counter: 3, current caching DataBlock (%): 61
evicted (MB): 2508, ratio 1.06, overhead (%): 1154, heavy eviction counter: 4, current caching DataBlock (%): 50
evicted (MB): 1824, ratio 1.04, overhead (%): 812, heavy eviction counter: 5, current caching DataBlock (%): 42
evicted (MB): 1482, ratio 1.03, overhead (%): 641, heavy eviction counter: 6, current caching DataBlock (%): 36
evicted (MB): 1140, ratio 1.01, overhead (%): 470, heavy eviction counter: 7, current caching DataBlock (%): 32
evicted (MB): 913, ratio 1.0, overhead (%): 356, heavy eviction counter: 8, current caching DataBlock (%): 29
evicted (MB): 912, ratio 0.89, overhead (%): 356, heavy eviction counter: 9, current caching DataBlock (%): 26
evicted (MB): 684, ratio 0.76, overhead (%): 242, heavy eviction counter: 10, current caching DataBlock (%): 24
evicted (MB): 684, ratio 0.61, overhead (%): 242, heavy eviction counter: 11, current caching DataBlock (%): 22
evicted (MB): 456, ratio 0.51, overhead (%): 128, heavy eviction counter: 12, current caching DataBlock (%): 21
evicted (MB): 456, ratio 0.42, overhead (%): 128, heavy eviction counter: 13, current caching DataBlock (%): 20
evicted (MB): 456, ratio 0.33, overhead (%): 128, heavy eviction counter: 14, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 15, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 16, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.31, overhead (%): 71, heavy eviction counter: 17, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.3, overhead (%): 14, heavy eviction counter: 18, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.29, overhead (%): 14, heavy eviction counter: 19, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.27, overhead (%): 14, heavy eviction counter: 20, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.25, overhead (%): 14, heavy eviction counter: 21, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.24, overhead (%): 14, heavy eviction counter: 22, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.22, overhead (%): 14, heavy eviction counter: 23, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.21, overhead (%): 14, heavy eviction counter: 24, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.2, overhead (%): 14, heavy eviction counter: 25, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.17, overhead (%): 14, heavy eviction counter: 26, current caching DataBlock (%): 19
evicted (MB): 456, ratio 0.17, overhead (%): 128, heavy eviction counter: 27, current caching DataBlock (%): 18 < added gets (but table the same)
evicted (MB): 456, ratio 0.15, overhead (%): 128, heavy eviction counter: 28, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.13, overhead (%): 71, heavy eviction counter: 29, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.11, overhead (%): 71, heavy eviction counter: 30, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.09, overhead (%): 71, heavy eviction counter: 31, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.08, overhead (%): 14, heavy eviction counter: 32, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.06, overhead (%): 14, heavy eviction counter: 34, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 35, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 36, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.04, overhead (%): 14, heavy eviction counter: 37, current caching DataBlock (%): 17
evicted (MB): 109, ratio 0.04, overhead (%): -46, heavy eviction counter: 37, current caching DataBlock (%): 22 < back pressure
evicted (MB): 798, ratio 0.24, overhead (%): 299, heavy eviction counter: 38, current caching DataBlock (%): 20
evicted (MB): 798, ratio 0.29, overhead (%): 299, heavy eviction counter: 39, current caching DataBlock (%): 18
evicted (MB): 570, ratio 0.27, overhead (%): 185, heavy eviction counter: 40, current caching DataBlock (%): 17
evicted (MB): 456, ratio 0.22, overhead (%): 128, heavy eviction counter: 41, current caching DataBlock (%): 16
evicted (MB): 342, ratio 0.16, overhead (%): 71, heavy eviction counter: 42, current caching DataBlock (%): 16
evicted (MB): 342, ratio 0.11, overhead (%): 71, heavy eviction counter: 43, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.09, overhead (%): 14, heavy eviction counter: 44, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 45, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 46, current caching DataBlock (%): 16
evicted (MB): 222, ratio 0.04, overhead (%): 11, heavy eviction counter: 47, current caching DataBlock (%): 16
evicted (MB): 104, ratio 0.03, overhead (%): -48, heavy eviction counter: 47, current caching DataBlock (%): 21 < gets interrupted
evicted (MB): 684, ratio 0.2, overhead (%): 242, heavy eviction counter: 48, current caching DataBlock (%): 19
evicted (MB): 570, ratio 0.23, overhead (%): 185, heavy eviction counter: 49, current caching DataBlock (%): 18
evicted (MB): 342, ratio 0.22, overhead (%): 71, heavy eviction counter: 50, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.21, overhead (%): 14, heavy eviction counter: 51, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.2, overhead (%): 14, heavy eviction counter: 52, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.18, overhead (%): 14, heavy eviction counter: 53, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.16, overhead (%): 14, heavy eviction counter: 54, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.14, overhead (%): 14, heavy eviction counter: 55, current caching DataBlock (%): 18
evicted (MB): 112, ratio 0.14, overhead (%): -44, heavy eviction counter: 55, current caching DataBlock (%): 23 < back pressure
evicted (MB): 456, ratio 0.26, overhead (%): 128, heavy eviction counter: 56, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.31, overhead (%): 71, heavy eviction counter: 57, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 58, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 59, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 60, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 61, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 62, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 63, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 64, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 65, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 66, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 67, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 68, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 69, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 70, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 71, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 72, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 73, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 74, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 75, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 76, current caching DataBlock (%): 22
evicted (MB): 21, ratio 0.33, overhead (%): -90, heavy eviction counter: 76, current caching DataBlock (%): 32
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100

The scans were needed to show the same process as a graph of the ratio between two cache sections: single (blocks that nobody has requested yet) and multi (data that has been "requested" at least once is stored here):

And finally, here is how the parameters work, shown as a graph. For comparison, the cache was completely turned off at the beginning, then HBase was started with caching and with the start of the optimization delayed by 5 minutes (30 eviction cycles).

The full code can be found in the Pull Request for HBASE-23887 on github.

However, 300 thousand reads per second is not all that can be squeezed out of this hardware under these conditions. The fact is that when you need to access data via HDFS, the ShortCircuitCache mechanism (hereinafter SSC) is used, which allows you to access the data directly, avoiding network interactions.

Profiling showed that although this mechanism gives a big gain, at some point it becomes a bottleneck itself, because almost all heavy operations happen inside a lock, which leads to blocking most of the time.

Having realized this, we found that the problem can be circumvented by creating an array of independent SSCs:

private final ShortCircuitCache[] shortCircuitCache;
...
shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
for (int i = 0; i < this.clientShortCircuitNum; i++)
  this.shortCircuitCache[i] = new ShortCircuitCache(…);

And then work with them, again avoiding intersections by using the last digit of the offset:

public ShortCircuitCache getShortCircuitCache(long idx) {
    return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
}
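The effect of spreading requests over several independently locked caches can be illustrated without HDFS. What follows is a minimal sketch and not the code from the pull request: `LockedCache` is a hypothetical stand-in for a ShortCircuitCache (a map behind a single lock), and `StripedCacheSketch` and `cacheFor` are illustrative names. It uses `Math.floorMod` instead of `%` so that a negative index cannot produce a negative slot; that guard is an addition of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class StripedCacheSketch {
    /** Hypothetical stand-in for one ShortCircuitCache: every operation takes the same lock. */
    static final class LockedCache {
        private final Map<Long, String> entries = new HashMap<>();

        synchronized String fetchOrCreate(long blockId) {
            return entries.computeIfAbsent(blockId, id -> "replica-" + id);
        }

        synchronized int size() {
            return entries.size();
        }
    }

    private final LockedCache[] caches;

    StripedCacheSketch(int cacheCount) {
        caches = new LockedCache[cacheCount];
        for (int i = 0; i < cacheCount; i++) {
            caches[i] = new LockedCache();
        }
    }

    /** Picks a cache by index; threads contend only when they land on the same slot. */
    LockedCache cacheFor(long idx) {
        return caches[(int) Math.floorMod(idx, (long) caches.length)];
    }

    public static void main(String[] args) {
        StripedCacheSketch striped = new StripedCacheSketch(5);
        for (long id = 0; id < 100; id++) {
            striped.cacheFor(id).fetchOrCreate(id);
        }
        for (int slot = 0; slot < 5; slot++) {
            System.out.println("cache " + slot + " holds " + striped.cacheFor(slot).size() + " entries");
        }
    }
}
```

With evenly distributed indices each cache holds about the same share of entries, so a thread blocks only on the roughly one in N requests that hit the same slot as its own.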

Now we can start testing. To do this, we will read files from HDFS with a simple multi-threaded application. Set the parameters:

conf.set("dfs.client.read.shortcircuit", "true");
conf.set("dfs.client.read.shortcircuit.buffer.size", "65536"); // the default is 1 MB, which slows reading down a lot, so it is better to match it to real needs
conf.set("dfs.client.short.circuit.num", num); // from 1 to 10

And simply read the files:

FSDataInputStream in = fileSystem.open(path);
for (int i = 0; i < count; i++) {
    position += 65536;
    if (position > 900000000)
        position = 0L;
    int res = in.read(position, byteBuffer, 0, 65536);
}

This code runs in separate threads, and we increase both the number of files read simultaneously (from 10 to 200, the horizontal axis) and the number of caches (from 1 to 10, the separate curves). The vertical axis shows the speedup obtained by increasing the number of SSCs relative to the case with a single cache.
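The shape of such a multi-threaded read test can be sketched with the standard library alone. This is only an illustration under stated assumptions: it reads a local temporary file through `FileChannel` rather than HDFS through `FSDataInputStream`, so it does not exercise the SSC at all, and `ParallelReadSketch` and `run` are hypothetical names. The thread count and block count below are arbitrary.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelReadSketch {
    static final int BLOCK = 65536;

    /** Each of `threads` workers reads `count` blocks at advancing offsets; returns total bytes read. */
    static long run(Path path, int threads, int count) throws Exception {
        long fileSize = Files.size(path);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> {
                    long bytes = 0;
                    long position = 0;
                    ByteBuffer buf = ByteBuffer.allocate(BLOCK);
                    try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
                        for (int i = 0; i < count; i++) {
                            position += BLOCK;
                            if (position + BLOCK > fileSize) {
                                position = 0; // wrap around, as the original loop does
                            }
                            buf.clear();
                            // Positional read: keep going until the block is full or EOF is hit.
                            while (buf.hasRemaining()) {
                                int n = ch.read(buf, position + buf.position());
                                if (n < 0) {
                                    break;
                                }
                            }
                            bytes += buf.position();
                        }
                    }
                    return bytes;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("read-sketch", ".bin");
        try {
            Files.write(tmp, new byte[8 * BLOCK]);
            long start = System.nanoTime();
            long total = run(tmp, 4, 100);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("read " + total + " bytes in " + millis + " ms");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

To reproduce the kind of measurement described here, the same loop would be timed while varying the number of threads and, on a real HDFS client, the number of caches.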


How to read the graph: executing 100 thousand reads in 64 KB blocks takes 78 seconds with one cache, whereas with 5 caches it takes 16 seconds. That is a speedup of about 5 times (78 / 16 ≈ 4.9). As the graph shows, the effect is not very noticeable with a small number of parallel reads; it starts to play a noticeable role when there are more than 50 threads. It is also noticeable that increasing the number of SSCs to 6 and above gives a much smaller performance gain.

Note 1: since the test results are quite volatile (see below), 3 runs were made and the resulting values were averaged.

Note 2: the performance gain from this tuning is the same for random access, although the access itself is slightly slower.

However, it should be made clear that, unlike the case with HBase, this speedup is not always free. Here we "unlock" the CPU's ability to do more work instead of hanging on locks.


Here you can see that, in general, increasing the number of caches gives a roughly proportional increase in CPU utilization. However, some combinations are slightly more favorable.

For example, let's take a closer look at the setting SSC = 3. The performance gain over the range is about 3.3 times. Below are the results of all three separate runs.


Meanwhile, CPU consumption grows by about 2.8 times. The difference is not very large, but little Greta is already happy and may have time to go to school and attend her lessons.
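One way to read these two figures together is as work done per unit of CPU: a 3.3x speedup bought with a 2.8x rise in CPU consumption. The snippet below is just that arithmetic; `SpeedupPerCpuSketch` and `gainPerCpu` are illustrative names, and treating the ratio as an efficiency measure is an assumption of this sketch rather than a metric from the article.

```java
class SpeedupPerCpuSketch {
    /** Ratio of throughput growth to CPU growth; above 1.0 means more reads per unit of CPU. */
    static double gainPerCpu(double speedup, double cpuGrowth) {
        return speedup / cpuGrowth;
    }

    public static void main(String[] args) {
        // Figures quoted in the text for SSC = 3.
        System.out.println(gainPerCpu(3.3, 2.8));
    }
}
```

A ratio slightly above 1 matches the remark that the difference is not very large but still a gain.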

So this will have a positive effect on any tool that makes bulk accesses to HDFS (for example Spark, etc.), provided that the application code is lightweight (i.e. the bottleneck is on the HDFS client side) and there is spare CPU capacity. To check, let's test what the combined use of the BlockCache optimization and SSC tuning gives for reading from HBase.


It can be seen that under these conditions the effect is not as large as in the refined tests (reading without any processing), but it is quite possible to squeeze out another 80K here. Together, both optimizations give up to a 4x speedup.

A PR was also made for this optimization [HDFS-15202]; it has been merged, and this functionality will be available in future releases.

And finally, it was interesting to compare the read performance of a similar wide-column database, Cassandra, with that of HBase.

To do this, we launched instances of the standard YCSB load-testing utility from two hosts (800 threads in total). On the server side there were 4 instances each of RegionServer and Cassandra on 4 hosts (not the ones running the clients, to avoid their influence). Reads came from tables of the following sizes:

HBase - 300 GB on HDFS (100 GB of pure data)

Cassandra - 250 GB (replication factor = 3)

I.e. the volume was approximately the same (slightly more in HBase).

HBase parameters:

dfs.client.short.circuit.num = 5 (HDFS client optimization)

hbase.lru.cache.heavy.eviction.count.limit = 30 - this means the patch will start working after 30 evictions (~5 minutes)

hbase.lru.cache.heavy.eviction.mb.size.limit = 300 - the target volume of caching and eviction
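How the two hbase.lru.cache.heavy.eviction limits might interact can be sketched as a simple gate. This is a deliberately simplified reading of the descriptions above, not the logic of the actual patch: it assumes an eviction cycle counts as "heavy" when more than mb.size.limit megabytes were evicted, that a lighter cycle resets the counter, and that the optimization switches on only once the counter exceeds count.limit. `HeavyEvictionGateSketch` and `onEvictionCycle` are hypothetical names.

```java
class HeavyEvictionGateSketch {
    private final int countLimit;   // stands for hbase.lru.cache.heavy.eviction.count.limit
    private final long mbSizeLimit; // stands for hbase.lru.cache.heavy.eviction.mb.size.limit
    private int heavyEvictionCount = 0;

    HeavyEvictionGateSketch(int countLimit, long mbSizeLimit) {
        this.countLimit = countLimit;
        this.mbSizeLimit = mbSizeLimit;
    }

    /** Called once per eviction cycle; returns true while the optimization would be active. */
    boolean onEvictionCycle(long evictedMb) {
        if (evictedMb > mbSizeLimit) {
            heavyEvictionCount++;   // another heavy cycle in a row
        } else {
            heavyEvictionCount = 0; // assumption: a light cycle resets the streak
        }
        return heavyEvictionCount > countLimit;
    }

    public static void main(String[] args) {
        HeavyEvictionGateSketch gate = new HeavyEvictionGateSketch(30, 300);
        for (int cycle = 1; cycle <= 32; cycle++) {
            boolean active = gate.onEvictionCycle(500);
            if (cycle >= 30) {
                System.out.println("cycle " + cycle + ": active=" + active);
            }
        }
    }
}
```

Under these assumptions a short burst of evictions never triggers the optimization; only sustained pressure over more than count.limit consecutive cycles does, which fits the "~5 minutes" delay mentioned above.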

The YCSB logs were parsed and compiled into Excel graphs:


As you can see, these optimizations make it possible to even out the performance of these databases under these conditions and reach 450 thousand reads per second.

We hope this information will be useful to someone in the exciting struggle for performance.

Source: www.habr.com
