How to increase read speed from HBase up to 3 times and from HDFS up to 5 times

High performance is one of the key requirements when working with big data. In the data loading department at Sberbank, we pump almost all transactions into our Hadoop-based Data Cloud and therefore deal with really large data flows. Naturally, we are always looking for ways to improve performance, and now we want to tell you how we managed to patch the HBase RegionServer and the HDFS client, thanks to which we were able to significantly increase the speed of read operations.
[Figure]

However, before moving on to the essence of the improvements, it is worth talking about the limitations that, in principle, cannot be bypassed if you sit on HDDs.

Why HDDs and fast Random Access reads are incompatible
As you know, HBase, like many other databases, stores data in blocks of several tens of kilobytes in size. By default it is about 64 KB. Now let's imagine that we need to get only 100 bytes and we ask HBase to give us this data by some key. Since the block size in HFiles is 64 KB, the request will be 640 times larger (just a minute!) than necessary.

Next, since the request goes through HDFS and its metadata caching mechanism ShortCircuitCache (which allows direct access to files), this leads to reading a whole 1 MB from disk. However, this can be adjusted with the parameter dfs.client.read.shortcircuit.buffer.size, and in many cases it makes sense to reduce this value, for example to 126 KB.

Let's say we do that; but in addition, when we start reading data through the java api, with functions like FileChannel.read, and ask the operating system to read the specified amount of data, it reads "just in case" 2 times more, i.e. 256 KB in our case. This is because java has no easy way to set the FADV_RANDOM flag to prevent this behavior.

As a result, to get our 100 bytes, 2600 times more is read under the hood. It would seem the solution is obvious: let's reduce the block size to a kilobyte, set the mentioned flag and obtain a great speedup. But the trouble is that by reducing the block size by 2 times, we also reduce the number of bytes read per unit of time by 2 times.

Some gain from setting the FADV_RANDOM flag can be obtained, but only with high thread counts and with a block size of 128 KB, and even then it is at most a couple of tens of percent:

[Figure]

The tests were run on 100 files, each 1 GB in size, located on 10 HDDs.

Let's calculate what we can, in principle, count on at this speed:
Let's say we read from 10 disks at 280 MB/s, i.e. 3 million times 100 bytes. But as we remember, the data we need is 2600 times less than what is actually read. So, we divide 3 million by 2600 and get 1100 records per second.
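The back-of-the-envelope arithmetic above can be checked with a few lines (a sketch; the class and method names are just for illustration, and the ~2600x factor is the 256 KB actually read per 100-byte request):

```java
// Sketch: estimate the effective random-read rate on HDD given read amplification.
public class RandomReadEstimate {
    // requestedBytes: what the client asked for; actuallyReadBytes: what is touched per request
    static long amplification(int requestedBytes, int actuallyReadBytes) {
        return actuallyReadBytes / requestedBytes;
    }

    static long effectiveRecordsPerSec(long diskBytesPerSec, int requestedBytes, int actuallyReadBytes) {
        // Raw throughput in 100-byte units, divided by the read amplification
        return (diskBytesPerSec / requestedBytes) / amplification(requestedBytes, actuallyReadBytes);
    }

    public static void main(String[] args) {
        long disk = 280L * 1024 * 1024; // ~280 MB/s aggregate from 10 HDDs
        System.out.println(amplification(100, 256 * 1024));                 // 2621, i.e. ~2600x
        System.out.println(effectiveRecordsPerSec(disk, 100, 256 * 1024)); // 1120, i.e. ~1100/s
    }
}
```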

Depressing, isn't it? That is the nature of Random Access to data on HDD — regardless of block size. This is the physical limit of random access, and no database can squeeze out more under such conditions.

How then do databases achieve much higher speeds? To answer this question, let's look at what happens in the following picture:

[Figure]

Here we see that for the first few minutes the speed is really about a thousand records per second. However, later, because far more is read than was requested, the data ends up in the buff/cache of the operating system (linux) and the speed grows to a more decent 60 thousand per second.

Therefore, from here on we will deal with speeding up access only to data that is in the OS cache or sits on comparably fast SSD/NVMe storage devices.

In our case, we will run tests on a bench of 4 servers, each configured as follows:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads.
Memory: 730 GB.
java version: 1.8.0_111

And here the key point is the amount of data in the tables that needs to be read. The fact is that if you read data from a table that fits entirely in the HBase cache, it won't even come to reading from the OS buff/cache, because HBase by default allocates 40% of memory to a structure called BlockCache. Essentially this is a ConcurrentHashMap, where the key is the file name plus the block's offset, and the value is the actual data at that offset.
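Conceptually it can be modeled like this (a simplified sketch, not the real HBase classes — the class and key format here are made up for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of BlockCache: (file name + block offset) -> block bytes.
public class MiniBlockCache {
    private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();

    private static String key(String file, long offset) { return file + "@" + offset; }

    void put(String file, long offset, byte[] data) { cache.put(key(file, offset), data); }

    byte[] get(String file, long offset) { return cache.get(key(file, offset)); }

    public static void main(String[] args) {
        MiniBlockCache bc = new MiniBlockCache();
        bc.put("hfile-1", 65536L, new byte[]{1, 2, 3});
        System.out.println(bc.get("hfile-1", 65536L).length); // 3
        System.out.println(bc.get("hfile-1", 0L));            // null: block not cached
    }
}
```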

So, when reading only from this structure, we see excellent speed, like a million requests per second. But let's imagine that we cannot allocate hundreds of gigabytes of memory just for database needs, because there are plenty of other useful things running on these servers.

For example, in our case the BlockCache volume on one RS is about 12 GB. We landed two RS per node, i.e. 96 GB is allocated for BlockCache across all nodes. And there is many times more data: for example, let it be 4 tables, 130 regions each, in which the files are 800 MB in size, compressed with FAST_DIFF, i.e. a total of 410 GB (this is pure data, i.e. without taking the replication factor into account).

Thus, BlockCache is only about 23% of the total data volume, and this is much closer to the real conditions of what is called BigData. And this is where the fun begins — because obviously, the fewer cache hits, the worse the performance. After all, on a miss you have to do a lot of work — i.e. go down to system calls. However, this cannot be avoided, so let's look at a completely different aspect — what happens to the data inside the cache?

Let's simplify the situation and assume we have a cache that fits only 1 object. Here is an example of what happens when we try to work with a data volume 3 times larger than the cache; we have to:

1. Place block 1 in the cache
2. Remove block 1 from the cache
3. Place block 2 in the cache
4. Remove block 2 from the cache
5. Place block 3 in the cache

5 operations completed! However, this can hardly be called normal; in fact, we are forcing HBase to do a bunch of completely useless work. It constantly reads data from the OS cache, places it in BlockCache, only to throw it away almost immediately because a new portion of data has arrived. The animation at the beginning of the post shows the essence of the problem — the Garbage Collector is going off the scale, the atmosphere is heating up, little Greta in distant and hot Sweden is getting upset. And we IT folks really don't like it when children are sad, so we start thinking about what we can do about it.
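The churn above can be sketched with a toy single-slot cache (a hypothetical class, just to count the wasted put/evict operations):

```java
import java.util.ArrayList;
import java.util.List;

// Toy single-slot cache: every new block evicts the previous one.
public class OneSlotCache {
    private Integer slot = null;
    private final List<String> ops = new ArrayList<>();

    void read(int blockId) {
        if (slot != null && slot == blockId) return; // hit: no work
        if (slot != null) ops.add("evict " + slot);  // miss: pay an eviction first
        slot = blockId;
        ops.add("put " + blockId);
    }

    List<String> ops() { return ops; }

    public static void main(String[] args) {
        OneSlotCache cache = new OneSlotCache();
        for (int block = 1; block <= 3; block++) cache.read(block);
        // 3 distinct blocks through a 1-slot cache -> 5 operations, nothing ever reused
        System.out.println(cache.ops()); // [put 1, evict 1, put 2, evict 2, put 3]
    }
}
```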

What if we put not all blocks into the cache, but only a certain share of them, so that the cache does not overflow? Let's start by simply adding a few lines of code to the beginning of the method that puts data into BlockCache:

  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    // Skip caching a share of the data blocks, based on the last two digits of the block offset
    if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
      if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
        return;
      }
    }
...

The point is this: the offset is the position of the block in the file, and its last digits are randomly and uniformly distributed from 00 to 99. Therefore, we only cache those blocks whose offset falls into the range we need.
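Since the last two decimal digits of the offsets are spread roughly uniformly, `offset % 100` is a cheap sampler. A sketch of the filter (illustrative names; a fixed 64 KB stride here, whereas real HFile blocks vary slightly in size):

```java
// Sketch: what share of blocks passes the "offset % 100 < percent" filter.
public class OffsetSampler {
    static boolean shouldCache(long offset, int cacheDataBlockPercent) {
        return offset % 100 < cacheDataBlockPercent;
    }

    public static void main(String[] args) {
        int percent = 20, blockSize = 64 * 1024, cached = 0, total = 10_000;
        for (int i = 0; i < total; i++) {
            if (shouldCache((long) i * blockSize, percent)) cached++;
        }
        // Roughly `percent`% of blocks get admitted to BlockCache
        System.out.println(100.0 * cached / total); // 20.0
    }
}
```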

For example, let's set cacheDataBlockPercent = 20 and see what happens:

[Figure]

The result is obvious. In the graph below it is clear why such an acceleration occurred — we save a lot of GC resources without doing the Sisyphean work of placing data into the cache only to immediately toss it into the gutter to the Martian dogs:

[Figure]

At the same time, CPU utilization grows, but far less than throughput does:

[Figure]

It is also worth noting that the blocks stored in BlockCache differ. The majority, about 95%, is data itself. The rest is metadata, such as Bloom filters, LEAF_INDEX, etc. This data is small, but very useful, because before accessing the data directly, HBase turns to the meta to understand whether it is necessary to search further here at all and, if so, where exactly the block of interest is located.

That is why in the code we see the check buf.getBlockType().isData(), and thanks to it meta is left in the cache in any case.

Now let's increase the load and crank the feature up a bit at the same time. In the first test we set the cutoff percent = 20 and BlockCache was slightly underutilized. Now let's set it to 23% and add 100 threads every 5 minutes to see at what point saturation occurs:

[Figure]

Here we see that the original version almost immediately hits a ceiling of about 100 thousand requests per second, whereas the patch gives an acceleration of up to 300 thousand. At the same time, it is clear that further acceleration is no longer "free"; CPU utilization is growing too.

However, this is not a very elegant solution, since we do not know in advance what percentage of blocks needs to be cached; it depends on the load profile. Therefore, a mechanism was implemented to automatically adjust this parameter depending on the activity of read operations.

Three options have been added to control this:

hbase.lru.cache.heavy.eviction.count.limit — sets how many times the process of evicting data from the cache should run before we start using the optimization (i.e. skipping blocks). By default it equals MAX_INT = 2147483647, which in effect means the feature will never kick in with this value, because the eviction process starts every 5 - 10 seconds (depending on the load) and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. However, we can set this parameter to 0 and make the feature work immediately after launch.

However, this parameter also carries a useful function. If our load is such that short-term reads (say, during the day) and long-term reads (at night) are constantly interleaved, then we can make sure the feature is enabled only when long read operations are in progress.

For example, we know that short-term reads usually last about 1 minute. There is no need to start skipping blocks, the cache will not have time to become stale, and so we can set this parameter equal to, say, 10. This will lead to the optimization starting to work only when a long-term active read has begun, i.e. after 100 seconds. Thus, if we have a short-term read, all blocks will go into the cache and will remain there (except those evicted by the standard algorithm). And when we do long-term reads, the feature is turned on and we get much higher performance.

hbase.lru.cache.heavy.eviction.mb.size.limit — sets how many megabytes we would like to place into the cache (and, of course, evict) in 10 seconds. The feature will try to reach this value and maintain it. The point is this: if we shove gigabytes into the cache, then we will have to evict gigabytes, and this, as we saw above, is very expensive. However, you should not try to set it too small either, as that would cause premature exit from the block-skipping mode. For powerful servers (about 20-40 physical cores), it is optimal to set about 300-400 MB. For mid-range machines (~10 cores), 200-300 MB. For weak systems (2-5 cores), 50-100 MB may be normal (not tested on these).

Let's look at how this works: suppose we set hbase.lru.cache.heavy.eviction.mb.size.limit = 500, there is some kind of load (reading), and then every ~10 seconds we calculate how many bytes were evicted using the formula:

Overhead (%) = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100;

If in fact 2000 MB were evicted, then Overhead equals:

2000 * 100/500 - 100 = 300%

The algorithm tries to maintain no more than a few tens of percent, so the feature will reduce the percentage of cached blocks, thereby implementing an auto-tuning mechanism.

However, if the load drops, say only 200 MB are evicted, and Overhead becomes negative (so-called overshooting):

200 * 100 / 500 - 100 = -60%

Then, on the contrary, the feature will increase the percentage of cached blocks until Overhead becomes positive.
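Both cases reduce to the same formula; a minimal sketch (the class and method names are illustrative):

```java
// Sketch of the overhead calculation used to steer the auto-tuning:
// positive -> too much eviction (cut the caching percent), negative -> overshooting.
public class EvictionOverhead {
    static int overheadPercent(long freedMb, long limitMb) {
        return (int) (freedMb * 100 / limitMb) - 100;
    }

    public static void main(String[] args) {
        System.out.println(overheadPercent(2000, 500)); // 300: evicting 4x the target
        System.out.println(overheadPercent(200, 500));  // -60: overshooting
    }
}
```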

Below is an example of how this looks on real data. There is no need to try to reach 0%, it is impossible. It is quite good when Overhead is around 30 - 100%; this helps avoid premature exit from the optimization mode during short-term surges.

hbase.lru.cache.heavy.eviction.overhead.coefficient — sets how quickly we want to get the result. If we know for sure that our reads are mostly long and do not want to wait, we can increase this ratio and get high performance faster.

For example, let's set this coefficient = 0.01. This means that Overhead (see above) will be multiplied by this number, and the percentage of cached blocks will be reduced by the result. Suppose Overhead = 300% and coefficient = 0.01, then the percentage of cached blocks will be reduced by 3%.
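In numbers (a sketch mirroring the patch's arithmetic, including its 15%-per-step cap; names are illustrative):

```java
// Sketch: convert overhead into a reduction of the caching percent.
public class BackpressureStep {
    static int reduction(int overheadPercent, double coefficient) {
        int change = (int) (overheadPercent * coefficient);
        change = Math.min(15, change); // practice shows 15% per step is quite enough
        return Math.max(0, change);    // never increase here; that is backpressure's job
    }

    public static void main(String[] args) {
        System.out.println(reduction(300, 0.01));  // 3: caching percent drops by 3
        System.out.println(reduction(3000, 0.01)); // 15: capped, we are not greedy
    }
}
```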

A similar "Backpressure" logic is also implemented for negative Overhead values (overshooting). Since short-term fluctuations in read and eviction volume are always possible, this mechanism allows avoiding premature exit from the optimization mode. Backpressure has an inverted logic: the stronger the overshooting, the more blocks are cached.

[Figure]

Implementation code

        LruBlockCache cache = this.cache.get();
        if (cache == null) {
          break;
        }
        freedSumMb += cache.evict()/1024/1024;
        /*
        * Sometimes we are reading more data than can fit into BlockCache
        * and it is the cause of a high rate of evictions.
        * This in turn leads to heavy Garbage Collector work.
        * So a lot of blocks are put into BlockCache but never read,
        * while spending a lot of CPU resources.
        * Here we will analyze how many bytes were freed and decide
        * whether the time has come to reduce the amount of caching blocks.
        * It helps avoid putting too many blocks into BlockCache
        * when evict() works very actively and saves CPU for other jobs.
        * More details: https://issues.apache.org/jira/browse/HBASE-23887
        */

        // First of all we have to control how much time
        // has passed since the previous evict() was launched.
        // This should be almost the same time (+/- 10s)
        // because we get comparable volumes of freed bytes each time.
        // 10s because this is the default period to run evict() (see above this.wait)
        long stopTime = System.currentTimeMillis();
        if ((stopTime - startTime) > 1000 * 10 - 1) {
          // Here we have to calculate what situation we have got.
          // We have the limit "hbase.lru.cache.heavy.eviction.mb.size.limit"
          // and can calculate the overhead on it.
          // We will use this information to decide
          // how to change the percent of caching blocks.
          freedDataOverheadPercent =
            (int) (freedSumMb * 100 / cache.heavyEvictionMbSizeLimit) - 100;
          if (freedSumMb > cache.heavyEvictionMbSizeLimit) {
            // Now we are in the situation when we are above the limit.
            // But maybe we are going to ignore it because it will end quite soon.
            heavyEvictionCount++;
            if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
              // It has been going on for a long time and we have to reduce the
              // caching blocks now. So we calculate here how many blocks we want to skip.
              // It depends on:
              // 1. Overhead - if the overhead is big we can be more aggressive
              // in reducing the amount of caching blocks.
              // 2. How fast we want to get the result. If we know that our
              // heavy reading will last for a long time, we don't want to wait and can
              // increase the coefficient to get good performance quite soon.
              // But if we are not sure, we can do it slowly and it could prevent
              // premature exit from this mode. So, when the coefficient is
              // higher we can get better performance when heavy reading is stable.
              // But when reading is changing we can adjust to it and set
              // the coefficient to a lower value.
              int change =
                (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient);
              // But practice shows that 15% of reduction is quite enough.
              // We are not greedy (it could lead to premature exit).
              change = Math.min(15, change);
              change = Math.max(0, change); // it should never go negative, but check to be sure
              // So this is the key point: here we reduce the % of caching blocks
              cache.cacheDataBlockPercent -= change;
              // If we go down too deep we have to stop here; at least 1% should remain.
              cache.cacheDataBlockPercent = Math.max(1, cache.cacheDataBlockPercent);
            }
          } else {
            // Well, we have got overshooting.
            // Maybe it is just a short-term fluctuation and we can stay in this mode.
            // It helps avoid premature exit during short-term fluctuations.
            // If overshooting is less than 90%, we will try to increase the percent of
            // caching blocks and hope it is enough.
            if (freedSumMb >= cache.heavyEvictionMbSizeLimit * 0.1) {
              // Simple logic: more overshooting - more caching blocks (backpressure)
              int change = (int) (-freedDataOverheadPercent * 0.1 + 1);
              cache.cacheDataBlockPercent += change;
              // But it can't be more than 100%, so check it.
              cache.cacheDataBlockPercent = Math.min(100, cache.cacheDataBlockPercent);
            } else {
              // Looks like the heavy reading is over.
              // Just exit from this mode.
              heavyEvictionCount = 0;
              cache.cacheDataBlockPercent = 100;
            }
          }
          LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " +
            "heavy eviction counter: {}, " +
            "current caching DataBlock (%): {}",
            freedSumMb, freedDataOverheadPercent,
            heavyEvictionCount, cache.cacheDataBlockPercent);

          freedSumMb = 0;
          startTime = stopTime;
       }

Now let's look at all this using a real example. We have the following test scenario:

  1. Start Scan (25 threads, batch = 100)
  2. After 5 minutes, add multi-gets (25 threads, batch = 100)
  3. After 5 minutes, turn off multi-gets (only scan remains again)

We do two runs: first with hbase.lru.cache.heavy.eviction.count.limit = 10000 (which effectively disables the feature), and then with limit = 0 (enables it).

In the logs below we see how the feature turns on and brings Overhead down to 14-71%. From time to time the load decreases, which turns on Backpressure and HBase caches more blocks again.

RegionServer log
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 2170, ratio 1.09, overhead (%): 985, heavy eviction counter: 1, current caching DataBlock (%): 91 < start
evicted (MB): 3763, ratio 1.08, overhead (%): 1781, heavy eviction counter: 2, current caching DataBlock (%): 76
evicted (MB): 3306, ratio 1.07, overhead (%): 1553, heavy eviction counter: 3, current caching DataBlock (%): 61
evicted (MB): 2508, ratio 1.06, overhead (%): 1154, heavy eviction counter: 4, current caching DataBlock (%): 50
evicted (MB): 1824, ratio 1.04, overhead (%): 812, heavy eviction counter: 5, current caching DataBlock (%): 42
evicted (MB): 1482, ratio 1.03, overhead (%): 641, heavy eviction counter: 6, current caching DataBlock (%): 36
evicted (MB): 1140, ratio 1.01, overhead (%): 470, heavy eviction counter: 7, current caching DataBlock (%): 32
evicted (MB): 913, ratio 1.0, overhead (%): 356, heavy eviction counter: 8, current caching DataBlock (%): 29
evicted (MB): 912, ratio 0.89, overhead (%): 356, heavy eviction counter: 9, current caching DataBlock (%): 26
evicted (MB): 684, ratio 0.76, overhead (%): 242, heavy eviction counter: 10, current caching DataBlock (%): 24
evicted (MB): 684, ratio 0.61, overhead (%): 242, heavy eviction counter: 11, current caching DataBlock (%): 22
evicted (MB): 456, ratio 0.51, overhead (%): 128, heavy eviction counter: 12, current caching DataBlock (%): 21
evicted (MB): 456, ratio 0.42, overhead (%): 128, heavy eviction counter: 13, current caching DataBlock (%): 20
evicted (MB): 456, ratio 0.33, overhead (%): 128, heavy eviction counter: 14, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 15, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 16, current caching DataBlock (%): 19
evicted (MB): 342, ratio 0.31, overhead (%): 71, heavy eviction counter: 17, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.3, overhead (%): 14, heavy eviction counter: 18, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.29, overhead (%): 14, heavy eviction counter: 19, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.27, overhead (%): 14, heavy eviction counter: 20, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.25, overhead (%): 14, heavy eviction counter: 21, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.24, overhead (%): 14, heavy eviction counter: 22, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.22, overhead (%): 14, heavy eviction counter: 23, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.21, overhead (%): 14, heavy eviction counter: 24, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.2, overhead (%): 14, heavy eviction counter: 25, current caching DataBlock (%): 19
evicted (MB): 228, ratio 0.17, overhead (%): 14, heavy eviction counter: 26, current caching DataBlock (%): 19
evicted (MB): 456, ratio 0.17, overhead (%): 128, heavy eviction counter: 27, current caching DataBlock (%): 18 < added gets (but the same table)
evicted (MB): 456, ratio 0.15, overhead (%): 128, heavy eviction counter: 28, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.13, overhead (%): 71, heavy eviction counter: 29, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.11, overhead (%): 71, heavy eviction counter: 30, current caching DataBlock (%): 17
evicted (MB): 342, ratio 0.09, overhead (%): 71, heavy eviction counter: 31, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.08, overhead (%): 14, heavy eviction counter: 32, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 33, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.06, overhead (%): 14, heavy eviction counter: 34, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 35, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 36, current caching DataBlock (%): 17
evicted (MB): 228, ratio 0.04, overhead (%): 14, heavy eviction counter: 37, current caching DataBlock (%): 17
evicted (MB): 109, ratio 0.04, overhead (%): -46, heavy eviction counter: 37, current caching DataBlock (%): 22 < back pressure
evicted (MB): 798, ratio 0.24, overhead (%): 299, heavy eviction counter: 38, current caching DataBlock (%): 20
evicted (MB): 798, ratio 0.29, overhead (%): 299, heavy eviction counter: 39, current caching DataBlock (%): 18
evicted (MB): 570, ratio 0.27, overhead (%): 185, heavy eviction counter: 40, current caching DataBlock (%): 17
evicted (MB): 456, ratio 0.22, overhead (%): 128, heavy eviction counter: 41, current caching DataBlock (%): 16
evicted (MB): 342, ratio 0.16, overhead (%): 71, heavy eviction counter: 42, current caching DataBlock (%): 16
evicted (MB): 342, ratio 0.11, overhead (%): 71, heavy eviction counter: 43, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.09, overhead (%): 14, heavy eviction counter: 44, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.07, overhead (%): 14, heavy eviction counter: 45, current caching DataBlock (%): 16
evicted (MB): 228, ratio 0.05, overhead (%): 14, heavy eviction counter: 46, current caching DataBlock (%): 16
evicted (MB): 222, ratio 0.04, overhead (%): 11, heavy eviction counter: 47, current caching DataBlock (%): 16
evicted (MB): 104, ratio 0.03, overhead (%): -48, heavy eviction counter: 47, current caching DataBlock (%): 21 < interrupted gets
evicted (MB): 684, ratio 0.2, overhead (%): 242, heavy eviction counter: 48, current caching DataBlock (%): 19
evicted (MB): 570, ratio 0.23, overhead (%): 185, heavy eviction counter: 49, current caching DataBlock (%): 18
evicted (MB): 342, ratio 0.22, overhead (%): 71, heavy eviction counter: 50, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.21, overhead (%): 14, heavy eviction counter: 51, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.2, overhead (%): 14, heavy eviction counter: 52, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.18, overhead (%): 14, heavy eviction counter: 53, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.16, overhead (%): 14, heavy eviction counter: 54, current caching DataBlock (%): 18
evicted (MB): 228, ratio 0.14, overhead (%): 14, heavy eviction counter: 55, current caching DataBlock (%): 18
evicted (MB): 112, ratio 0.14, overhead (%): -44, heavy eviction counter: 55, current caching DataBlock (%): 23 < back pressure
evicted (MB): 456, ratio 0.26, overhead (%): 128, heavy eviction counter: 56, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.31, overhead (%): 71, heavy eviction counter: 57, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 58, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 59, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 60, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 61, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 62, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 63, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 64, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 65, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 66, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 67, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 68, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 69, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 70, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 71, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 72, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 73, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 74, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 75, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 76, current caching DataBlock (%): 22
evicted (MB): 21, ratio 0.33, overhead (%): -90, heavy eviction counter: 76, current caching DataBlock (%): 32
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100

The scan is needed in order to show this same process as a graph of the relationship between two sections of the cache — single (where blocks that have never been requested land) and multi (where data "requested" at least once is stored):

[Figure]

And finally, what the operation of the parameters looks like in graph form. For comparison, the cache was completely turned off at the start, then HBase was launched with caching, with the start of the optimization delayed by 5 minutes (30 eviction cycles).

The full code can be found in pull request HBASE-23887 on github.

However, 300 thousand reads per second is not all that can be squeezed out of this hardware under these conditions. The fact is that when you need to access data through HDFS, the ShortCircuitCache mechanism (hereafter SSC) is used, which allows you to access the data directly, avoiding network interaction.

Profiling showed that although this mechanism gives a big win, it also at some point becomes a bottleneck, because almost all heavy operations happen inside a lock, which leads to blocking most of the time.

[Figure]

Having realized this, we understood that the problem can be circumvented by creating an array of independent SSCs:

private final ShortCircuitCache[] shortCircuitCache;
...
shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
for (int i = 0; i < this.clientShortCircuitNum; i++)
  this.shortCircuitCache[i] = new ShortCircuitCache(…);

And then dispatch to them, avoiding overlaps, by the last digits of the block offset:

public ShortCircuitCache getShortCircuitCache(long idx) {
    return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
}
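The dispatch is plain modulo striping. A standalone sketch (illustrative names; plain `Object`s stand in for the real `ShortCircuitCache` instances) showing that a given block id always maps to the same cache instance while ids spread across all of them:

```java
// Sketch: modulo striping of independent caches to reduce lock contention.
public class StripedCaches {
    private final Object[] stripes;

    StripedCaches(int num) {
        stripes = new Object[num];
        for (int i = 0; i < num; i++) stripes[i] = new Object(); // stand-in for ShortCircuitCache
    }

    int stripeFor(long blockId) {
        return (int) (blockId % stripes.length);
    }

    public static void main(String[] args) {
        StripedCaches caches = new StripedCaches(5);
        // The same block id always lands on the same stripe (required for correctness)...
        System.out.println(caches.stripeFor(12345L) == caches.stripeFor(12345L)); // true
        // ...while consecutive ids spread over all 5 stripes.
        for (long id = 0; id < 5; id++) System.out.print(caches.stripeFor(id) + " "); // 0 1 2 3 4
    }
}
```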

Now you can start testing. To do this, we will read files from HDFS with a simple multi-threaded application. Set the parameters:

conf.set("dfs.client.read.shortcircuit", "true");
conf.set("dfs.client.read.shortcircuit.buffer.size", "65536"); // default = 1 MB, which slows reading a lot, so better match it to actual needs
conf.set("dfs.client.short.circuit.num", num); // from 1 to 10

And just read the files:

byte[] byteBuffer = new byte[65536];
long position = 0L;
FSDataInputStream in = fileSystem.open(path);
for (int i = 0; i < count; i++) {
    position += 65536;
    if (position > 900000000L)
        position = 0L;
    // positioned read: does not move the stream's own file pointer
    int res = in.read(position, byteBuffer, 0, 65536);
}
in.close();

This code is executed in separate threads, and we will increase the number of simultaneously read files (from 10 to 200 — horizontal axis) and the number of caches (from 1 to 10 — the series). The vertical axis shows the speedup that increasing the number of SSCs gives relative to the case with only one cache.

[Figure]

How to read the graph: 100 thousand reads in 64 KB blocks with one cache take 78 seconds, while with 5 caches they take 16 seconds. I.e. there is a speedup of ~5 times. As can be seen from the graph, the effect is barely noticeable for a small number of parallel reads; it starts to play a noticeable role when there are more than 50 threads. It is also noticeable that increasing the number of SSCs from 6 and above gives a significantly smaller performance gain.

Note 1: since the test results were quite volatile (see below), 3 runs were carried out and the resulting values were averaged.

Note 2: the performance gain from tuning random access is the same, although the access itself is slightly slower.

However, it is necessary to clarify that, unlike the HBase case, this speedup is not always free. Here we "unlock" the CPU's ability to do more work, instead of hanging on locks.

[Figure]

Here you can see that, in general, an increase in the number of caches gives an approximately proportional increase in CPU utilization. However, there are slightly more winning combinations.

For example, let's look at the setting SSC = 3. The performance gain over the range is about 3.3 times. Below are the results from all three separate runs.

[Figure]

While CPU consumption grows by about 2.8 times. The difference is not very big, but little Greta is already happy and may have time to go to school and take her lessons.

Thus, this will have a positive effect for any tool that uses bulk access to HDFS (for example Spark, etc.), provided that the application code is lightweight (i.e. the bottleneck is on the HDFS client side) and there is free CPU capacity. To check, let's test what effect the combined use of the BlockCache optimization and SSC tuning gives for reading from HBase.

[Figure]

It can be seen that under such conditions the effect is not as big as in the refined tests (reading without any processing), but it is quite possible to squeeze out an extra 80K here. Together, both optimizations give up to 4x speedup.

A PR was also made for this optimization [HDFS-15202], which was merged, and this functionality will be available in future releases.

And finally, it was interesting to compare the read performance of similar big-data databases — Cassandra and HBase.

To do this, we launched instances of the standard YCSB load testing utility from two hosts (800 threads in total). On the server side — 4 instances of RegionServer and Cassandra on 4 hosts (not the ones the clients run on, to avoid their influence). Reads came from tables of size:

HBase — 300 GB on HDFS (100 GB of pure data)

Cassandra — 250 GB (replication factor = 3)

I.e. the volumes were approximately the same (slightly more for HBase).

HBase parameters:

dfs.client.short.circuit.num = 5 (HDFS client optimization)

hbase.lru.cache.heavy.eviction.count.limit = 30 — this means the patch starts working after 30 evictions (~5 minutes)

hbase.lru.cache.heavy.eviction.mb.size.limit = 300 — the target volume of caching and eviction

The YCSB logs were parsed and compiled into Excel graphs:

[Figure]

As you can see, these optimizations make it possible to compare the performance of these databases under these conditions and achieve 450 thousand reads per second.

We hope this information may be useful to someone in the exciting struggle for performance.

source: www.habr.com
