How to speed up reads from HBase by up to 3 times and from HDFS by up to 5 times

High performance is one of the key requirements when working with big data. In the data loading department at Sberbank, we pump almost all transactions into our Hadoop-based Data Cloud and therefore deal with very large volumes of data. Naturally, we are always looking for ways to improve performance, and now we want to tell you how we patched the HBase RegionServer and the HDFS client, which allowed us to significantly speed up read operations.

However, before moving on to the essence of the improvement, it is worth talking about the limitations that, in principle, cannot be circumvented if you sit on HDDs.

Why HDDs and fast Random Access reads are incompatible
As you know, HBase, like many other databases, stores data in blocks several tens of kilobytes in size. By default it is about 64 KB. Now imagine that we need only 100 bytes and we ask HBase to give us this data by a certain key. Since the block size in HFiles is 64 KB, the request reads about 640 times more data (mind you!) than necessary.

Next, since the request goes through HDFS and its metadata caching mechanism ShortCircuitCache (which allows direct access to files), this leads to reading as much as 1 MB from disk. However, this can be adjusted with the dfs.client.read.shortcircuit.buffer.size parameter, and in many cases it makes sense to reduce this value, for example to 128 KB.

Let's say we do this. In addition, when we read data through the Java API, with functions such as FileChannel.read, and ask the operating system to read the specified amount of data, it reads 2 times more "just in case", i.e. 256 KB in our case. This is because Java has no easy way to set the FADV_RANDOM flag to prevent this behavior.

As a result, to get our 100 bytes, about 2600 times more data is read under the hood. The solution seems obvious: reduce the block size to a kilobyte, set the flag mentioned above and gain a great speedup. The trouble is that by reducing the block size by 2 times, we also reduce the number of bytes read per unit of time by 2 times.

Some gain can be obtained from setting the FADV_RANDOM flag, but only with heavy multithreading and a block size of 128 KB, and even then it is at most a couple of tens of percent.


The tests were run on 100 files, each 1 GB in size, located on 10 HDDs.

Let's estimate what we can count on, in principle, at this speed:
Say we read from 10 disks at 280 MB/sec, i.e. about 3 million times 100 bytes per second. But as we remember, the data we need is 2600 times smaller than what is actually read. So we divide 3 million by 2600 and get about 1100 records per second.
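The arithmetic above can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and exact powers of two give slightly different figures (655 and 2621) from the rounded 640 and 2600 used in the text.

```java
// Back-of-the-envelope estimate of random-read throughput on HDD.
// All names here are illustrative; the inputs are the figures quoted in the text.
class ReadAmplification {
    // How many times more bytes are read from disk than were requested.
    static long amplification(long bytesReadFromDisk, long bytesRequested) {
        return bytesReadFromDisk / bytesRequested;
    }

    // Useful records per second for a given raw disk throughput and amplification.
    static long recordsPerSecond(long diskBytesPerSecond, long recordBytes, long amplification) {
        return diskBytesPerSecond / recordBytes / amplification;
    }

    public static void main(String[] args) {
        long recordBytes = 100;                                   // what we actually need
        System.out.println(amplification(64 * 1024, recordBytes));   // HFile block: 655
        System.out.println(amplification(256 * 1024, recordBytes));  // OS read: 2621
        // "3 million x 100 bytes" per second, divided by the rounded factor 2600
        System.out.println(recordsPerSecond(300_000_000L, recordBytes, 2600)); // 1153
    }
}
```

Whichever rounding is used, the result stays in the neighbourhood of a thousand records per second.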

Depressing, isn't it? That is the nature of Random Access to data on HDD, regardless of the block size. This is the physical limit of random access, and no database can squeeze out more under such conditions.

So how do databases achieve much higher speeds? To answer this question, let's look at what happens in the following graph:


Here we see that for the first few minutes the speed really is about a thousand records per second. After that, however, because much more is read than was requested, the data ends up in the buff/cache of the operating system (Linux) and the speed rises to a more decent 60 thousand per second.

So from here on we will deal with speeding up access only to data that is in the OS cache or on SSD/NVMe storage with comparable access speed.

In our case, we run the tests on a bench of 4 servers, each configured as follows:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads.
Memory: 730 GB.
java version: 1.8.0_111

And here is the key point: the amount of data in the tables that has to be read. If you read from a table that fits entirely in the HBase cache, it will not even come to reading from the operating system's buff/cache, because HBase by default allocates 40% of memory to a structure called BlockCache. Essentially this is a ConcurrentHashMap, where the key is the file name plus the offset of the block, and the value is the actual data at that offset.
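To make the "file name + offset" idea concrete, here is a minimal sketch of such a map. It is not HBase's actual LruBlockCache (which also tracks sizes, priorities and eviction); the class SimpleBlockCache and its Key type are hypothetical names used only for illustration.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the BlockCache idea: a concurrent map from
// (HFile name, block offset) to the block's bytes. Hypothetical names throughout.
class SimpleBlockCache {
    static final class Key {
        final String fileName;
        final long offset;

        Key(String fileName, long offset) {
            this.fileName = fileName;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return offset == other.offset && fileName.equals(other.fileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(fileName, offset);
        }
    }

    private final ConcurrentHashMap<Key, byte[]> blocks = new ConcurrentHashMap<>();

    void cacheBlock(String fileName, long offset, byte[] data) {
        blocks.put(new Key(fileName, offset), data);
    }

    // Returns the cached block, or null on a cache miss.
    byte[] getBlock(String fileName, long offset) {
        return blocks.get(new Key(fileName, offset));
    }

    public static void main(String[] args) {
        SimpleBlockCache cache = new SimpleBlockCache();
        cache.cacheBlock("hfile-a", 65536L, new byte[] {1, 2, 3});
        System.out.println(cache.getBlock("hfile-a", 65536L) != null); // hit
        System.out.println(cache.getBlock("hfile-a", 0L) != null);     // miss
    }
}
```

A hit is served straight from heap memory, which is why reads that stay inside this structure are so fast.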

So when reading only from this structure, we see excellent speed, on the order of a million requests per second. But let's imagine that we cannot allocate hundreds of gigabytes of memory just for the needs of the database, because a lot of other useful things run on these servers.

For example, in our case the BlockCache on one RS is about 12 GB. We placed two RS on each node, so 96 GB is allocated to BlockCache across all nodes (12 GB x 2 RS x 4 servers). And there is many times more data: say, 4 tables of 130 regions each, with files of 800 MB compressed with FAST_DIFF, i.e. about 410 GB in total (this is pure data, without taking the replication factor into account).

So the BlockCache covers only about 23% of the total data volume, which is much closer to the real conditions of what is called BigData. And this is where the fun begins, because obviously the fewer cache hits there are, the worse the performance. After all, on a miss you have to do a lot of work, i.e. go down to calling system functions. This cannot be avoided, however, so let's look at a completely different aspect: what happens to the data inside the cache?

Let's simplify the situation and assume we have a cache that fits only 1 object. Here is what happens when we try to work with a data volume 3 times larger than the cache. We have to:

1. Put block 1 into the cache
2. Remove block 1 from the cache
3. Put block 2 into the cache
4. Remove block 2 from the cache
5. Put block 3 into the cache

5 actions done! However, this situation cannot be called normal; in fact, we force HBase to do a pile of completely useless work. It constantly reads data from the OS cache and puts it into the BlockCache, only to throw it out almost immediately because a new portion of data has arrived. The animation at the beginning of the post shows the essence of the problem: the Garbage Collector goes off the scale, the atmosphere heats up, and little Greta in distant and hot Sweden gets upset. And we IT people really don't like it when children are sad, so we started thinking about what we could do about it.
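The five steps above can be reproduced with a toy LRU cache. This sketch uses java.util.LinkedHashMap rather than HBase's LruBlockCache, and the class name TinyCacheChurn is made up; it only counts how many puts and evictions a pass over the data costs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache that counts the work done when the data does not fit.
class TinyCacheChurn {
    // Returns {puts, evictions} after reading `blocks` distinct blocks once each.
    static int[] simulate(final int capacity, int blocks) {
        final int[] counters = new int[2]; // [0] = puts, [1] = evictions
        Map<Integer, byte[]> cache = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                boolean evict = size() > capacity;
                if (evict) {
                    counters[1]++;
                }
                return evict;
            }
        };
        for (int block = 1; block <= blocks; block++) {
            cache.put(block, new byte[0]); // every block is cached, then pushed out
            counters[0]++;
        }
        return counters;
    }

    public static void main(String[] args) {
        int[] result = simulate(1, 3);
        // 3 puts + 2 evictions = the 5 actions from the list above
        System.out.println(result[0] + " puts, " + result[1] + " evictions");
    }
}
```

None of the cached blocks is ever read back, so all five actions are pure overhead.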

What if we put not all blocks into the cache, but only a certain percentage of them, so that the cache does not overflow? Let's start by adding just a few lines of code to the beginning of the function that puts data into the BlockCache:

  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
    if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
      if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
        return;
      }
    }
...

The point is the following: the offset is the position of the block in the file, and its last two digits are random and evenly distributed from 00 to 99. Therefore, we skip all blocks except those that fall into the range we need.
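A quick way to convince yourself that this filter keeps roughly the requested share of blocks is to run it over many offsets. The sketch below assumes uniformly random offsets (real HFile offsets are not generated this way, they only behave similarly in their last two digits), and OffsetFilterDemo is a hypothetical name.

```java
import java.util.Random;

// Checks what fraction of blocks passes the "offset % 100" filter.
class OffsetFilterDemo {
    // Same condition as in cacheBlock(), inverted: true means "cache this block".
    static boolean shouldCache(long offset, int cacheDataBlockPercent) {
        return offset % 100 < cacheDataBlockPercent;
    }

    // Fraction of pseudo-random offsets that would be cached.
    static double cachedFraction(int cacheDataBlockPercent, int samples, long seed) {
        Random rnd = new Random(seed);
        int cached = 0;
        for (int i = 0; i < samples; i++) {
            long offset = rnd.nextInt(Integer.MAX_VALUE); // assumed uniform offsets
            if (shouldCache(offset, cacheDataBlockPercent)) {
                cached++;
            }
        }
        return (double) cached / samples;
    }

    public static void main(String[] args) {
        // Should print a value close to 0.2
        System.out.println(cachedFraction(20, 1_000_000, 42L));
    }
}
```

Because the decision depends only on the offset, the same block is always either cached or skipped, so blocks that do make it into the cache stay useful on repeated reads.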

For example, set cacheDataBlockPercent = 20 and see what happens:


The result is obvious. The graphs below make it clear why this speedup occurs: we save a lot of GC resources by not doing the Sisyphean work of placing data in the cache only to immediately throw it down the drain to the Martian dogs:


At the same time, CPU utilization increases, but much less than throughput does:


It is also worth noting that the blocks stored in the BlockCache are different. Most of them, about 95%, are the data itself. The rest is metadata, such as Bloom filters or LEAF_INDEX and so on. This data is small in volume but very useful, because before accessing the data directly, HBase turns to the meta to understand whether it needs to search here any further and, if so, where exactly the block of interest is located.

That is why the code contains the check buf.getBlockType().isData(): thanks to it, the meta stays in the cache in any case.

Now let's increase the load and tighten the feature a little at the same time. In the first test we set the cutoff percentage to 20 and the BlockCache was slightly underutilized. Now let's set it to 23% and add 100 threads every 5 minutes to see at what point saturation occurs:


Here we see that the original version almost immediately hits a ceiling of about 100 thousand requests per second, whereas the patch gives a speedup of up to 300 thousand. At the same time, it is clear that further acceleration is no longer so "free": CPU utilization grows as well.

However, this is not a very elegant solution, since we do not know in advance what percentage of blocks needs to be cached; it depends on the load profile. Therefore, a mechanism was implemented to adjust this parameter automatically depending on the activity of read operations.

Three options were added to control this:

hbase.lru.cache.heavy.eviction.count.limit - sets how many times the process of evicting data from the cache must run before we start applying the optimization (i.e. skipping blocks). By default it equals MAX_INT = 2147483647, which in practice means the feature never starts working with this value: the eviction process runs every 5 to 10 seconds (depending on the load), and 2147483647 * 10 / 60 / 60 / 24 / 365 = 680 years. However, we can set this parameter to 0 and make the feature work immediately after startup.

There is also a practical use for this parameter. If our load is such that short-term reads (say, during the day) and long-term reads (at night) constantly alternate, we can make sure that the feature is turned on only when long-term read operations are in progress.

For example, we know that short-term reads usually last about 1 minute. There is no need to start throwing out blocks, since the cache will not have time to become stale, so we can set this parameter to, say, 10. As a result, the optimization starts working only once long-term active reading has begun, i.e. after 100 seconds. So if we have a short-term read, all blocks get into the cache and stay available (except those evicted by the standard algorithm). And when we do long-term reads, the feature is turned on and we get much higher performance.
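The two numbers used above (680 years for the default, 100 seconds for a limit of 10) follow from the same multiplication. A tiny sketch, assuming the upper bound of 10 seconds per eviction run; the class name is hypothetical.

```java
// How long it takes before the optimization kicks in for a given count limit,
// assuming one eviction run every 10 seconds (the upper bound from the text).
class EvictionCountLimit {
    static final int EVICTION_PERIOD_SECONDS = 10;

    static long secondsBeforeActivation(long countLimit) {
        return countLimit * EVICTION_PERIOD_SECONDS;
    }

    static long yearsBeforeActivation(long countLimit) {
        return secondsBeforeActivation(countLimit) / 60 / 60 / 24 / 365;
    }

    public static void main(String[] args) {
        System.out.println(yearsBeforeActivation(Integer.MAX_VALUE)); // 680
        System.out.println(secondsBeforeActivation(10));              // 100
    }
}
```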

hbase.lru.cache.heavy.eviction.mb.size.limit - sets how many megabytes we would like to put into the cache (and, of course, evict) in 10 seconds. The feature will try to reach this value and maintain it. The point is this: if we shove gigabytes into the cache, we will have to evict gigabytes, and that, as we saw above, is very expensive. However, you should not set it too low, as this will cause block-skipping mode to exit prematurely. For powerful servers (about 20-40 physical cores), about 300-400 MB is optimal. For the middle class (~10 cores), 200-300 MB. For weak systems (2-5 cores), 50-100 MB may be fine (not tested on these).

Let's see how this works. Say we set hbase.lru.cache.heavy.eviction.mb.size.limit = 500, there is some load (reading), and then every ~10 seconds we calculate how many bytes were evicted using the formula:

Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100;

If 2000 MB were actually evicted, then Overhead equals:

2000 * 100 / 500 - 100 = 300%

The algorithm tries to keep the overhead at no more than a few tens of percent, so the feature will reduce the percentage of cached blocks, thereby implementing an auto-tuning mechanism.

However, if the load drops, say only 200 MB is evicted, then Overhead becomes negative (so-called overshooting):

200 * 100 / 500 - 100 = -60%

In this case, on the contrary, the feature will increase the percentage of cached blocks until Overhead becomes positive.

Below is an example of how this looks on real data. There is no need to try to reach 0%, it is impossible. It is very good when the overhead is about 30 - 100%; this helps avoid a premature exit from optimization mode during short-term surges.

hbase.lru.cache.heavy.eviction.overhead.coefficient - sets how quickly we want to get the result. If we know for certain that our reads are mostly long and we do not want to wait, we can increase this coefficient and reach high performance faster.

For example, we set this coefficient to 0.01. This means that Overhead (see above) is multiplied by this number, and the percentage of cached blocks is reduced by the result. Suppose Overhead = 300% and the coefficient = 0.01; then the percentage of cached blocks is reduced by 3%.

A similar "Backpressure" logic is implemented for negative Overhead (overshooting) values. Since short-term fluctuations in the volume of reads and evictions are always possible, this mechanism helps avoid a premature exit from optimization mode. Backpressure has inverted logic: the stronger the overshooting, the more blocks are cached.


Implementation code

        LruBlockCache cache = this.cache.get();
        if (cache == null) {
          break;
        }
        freedSumMb += cache.evict()/1024/1024;
        /*
        * Sometimes we are reading more data than can fit into BlockCache
        * and it is the cause of a high rate of evictions.
        * This in turn leads to heavy Garbage Collector work.
        * So a lot of blocks are put into BlockCache but never read,
        * while spending a lot of CPU resources.
        * Here we will analyze how many bytes were freed and
        * decide whether the time has come to reduce the amount of cached blocks.
        * It helps avoid putting too many blocks into BlockCache
        * when evict() is very active, and saves CPU for other jobs.
        * More details: https://issues.apache.org/jira/browse/HBASE-23887
        */

        // First of all we have to check how much time
        // has passed since the previous evict() was launched.
        // This should be almost the same time (+/- 10s)
        // because we get comparable volumes of freed bytes each time.
        // 10s because this is the default period to run evict() (see above this.wait)
        long stopTime = System.currentTimeMillis();
        if ((stopTime - startTime) > 1000 * 10 - 1) {
          // Here we have to calculate what situation we have got.
          // We have the limit "hbase.lru.cache.heavy.eviction.mb.size.limit"
          // and can calculate the overhead relative to it.
          // We will use this information to decide
          // how to change the percent of cached blocks.
          freedDataOverheadPercent =
            (int) (freedSumMb * 100 / cache.heavyEvictionMbSizeLimit) - 100;
          if (freedSumMb > cache.heavyEvictionMbSizeLimit) {
            // Now we are in the situation when we are above the limit
            // But maybe we are going to ignore it because it will end quite soon
            heavyEvictionCount++;
            if (heavyEvictionCount > cache.heavyEvictionCountLimit) {
              // It has been going on for a long time and we have to reduce
              // the caching of blocks now. So we calculate here how many blocks we want to skip.
              // It depends on:
              // 1. Overhead - if the overhead is big we can be more aggressive
              // in reducing the amount of cached blocks.
              // 2. How fast we want to get the result. If we know that our
              // heavy reading lasts for a long time, we don't want to wait and can
              // increase the coefficient and get good performance quite soon.
              // But if we aren't sure, we can do it slowly and it could prevent
              // premature exit from this mode. So, when the coefficient is
              // higher we can get better performance when heavy reading is stable.
              // But when reading is changing we can adjust to it and set
              // the coefficient to a lower value.
              int change =
                (int) (freedDataOverheadPercent * cache.heavyEvictionOverheadCoefficient);
              // But practice shows that 15% of reducing is quite enough.
              // We are not greedy (it could lead to premature exit).
              change = Math.min(15, change);
              change = Math.max(0, change); // I think it will never happen but check for sure
              // So this is the key point, here we are reducing % of caching blocks
              cache.cacheDataBlockPercent -= change;
              // If we go down too deep we have to stop here, 1% any way should be.
              cache.cacheDataBlockPercent = Math.max(1, cache.cacheDataBlockPercent);
            }
          } else {
            // Well, we have got overshooting.
            // Maybe it is just a short-term fluctuation and we can stay in this mode.
            // It helps avoid premature exit during short-term fluctuation.
            // If overshooting is less than 90%, we will try to increase the percent of
            // cached blocks and hope it is enough.
            if (freedSumMb >= cache.heavyEvictionMbSizeLimit * 0.1) {
              // Simple logic: more overshooting - more caching blocks (backpressure)
              int change = (int) (-freedDataOverheadPercent * 0.1 + 1);
              cache.cacheDataBlockPercent += change;
              // But it can't be more than 100%, so check it.
              cache.cacheDataBlockPercent = Math.min(100, cache.cacheDataBlockPercent);
            } else {
              // Looks like heavy reading is over.
              // Just exit from this mode.
              heavyEvictionCount = 0;
              cache.cacheDataBlockPercent = 100;
            }
          }
          LOG.info("BlockCache evicted (MB): {}, overhead (%): {}, " +
            "heavy eviction counter: {}, " +
            "current caching DataBlock (%): {}",
            freedSumMb, freedDataOverheadPercent,
            heavyEvictionCount, cache.cacheDataBlockPercent);

          freedSumMb = 0;
          startTime = stopTime;
        }
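To experiment with this logic outside a RegionServer, the decision part of the fragment above can be pulled into a small stand-alone class. This is a sketch, not the code from the patch: HeavyEvictionSimulator is a hypothetical name, the timing check and logging are dropped, and the settings in main (limit of 200 MB, coefficient 0.01, count limit 0) are assumptions inferred from the numbers in the RegionServer log shown further down, not values stated in the text.

```java
// Stand-alone model of the auto-tuning step: one call to step() corresponds
// to one ~10 second eviction period. Hypothetical class, mirrors the fragment above.
class HeavyEvictionSimulator {
    final long mbSizeLimit;           // hbase.lru.cache.heavy.eviction.mb.size.limit
    final int countLimit;             // hbase.lru.cache.heavy.eviction.count.limit
    final double overheadCoefficient; // hbase.lru.cache.heavy.eviction.overhead.coefficient
    int heavyEvictionCount = 0;
    int cacheDataBlockPercent = 100;

    HeavyEvictionSimulator(long mbSizeLimit, int countLimit, double overheadCoefficient) {
        this.mbSizeLimit = mbSizeLimit;
        this.countLimit = countLimit;
        this.overheadCoefficient = overheadCoefficient;
    }

    // freedMb: megabytes evicted during the last period. Returns the new percent.
    int step(long freedMb) {
        int overhead = (int) (freedMb * 100 / mbSizeLimit) - 100;
        if (freedMb > mbSizeLimit) {
            // Above the limit: after countLimit periods start caching fewer blocks.
            heavyEvictionCount++;
            if (heavyEvictionCount > countLimit) {
                int change = (int) (overhead * overheadCoefficient);
                change = Math.max(0, Math.min(15, change));
                cacheDataBlockPercent = Math.max(1, cacheDataBlockPercent - change);
            }
        } else if (freedMb >= mbSizeLimit * 0.1) {
            // Overshooting: backpressure, cache more blocks again.
            int change = (int) (-overhead * 0.1 + 1);
            cacheDataBlockPercent = Math.min(100, cacheDataBlockPercent + change);
        } else {
            // Heavy reading is over: leave the mode.
            heavyEvictionCount = 0;
            cacheDataBlockPercent = 100;
        }
        return cacheDataBlockPercent;
    }

    public static void main(String[] args) {
        HeavyEvictionSimulator sim = new HeavyEvictionSimulator(200, 0, 0.01);
        long[] freedMb = {0, 2170, 3763, 3306, 2508};
        for (long mb : freedMb) {
            System.out.println("evicted (MB): " + mb + ", caching DataBlock (%): " + sim.step(mb));
        }
    }
}
```

With these assumed settings the first steps give 100, 91, 76, 61 and 50 percent, which matches the beginning of the log below.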

Let's look at all this using a real example. We have the following test scenario:

  1. Start doing Scan (25 threads, batch = 100)
  2. After 5 minutes, add multi-gets (25 threads, batch = 100)
  3. After 5 minutes, turn off the multi-gets (only the scan remains again)

We do two runs, first with hbase.lru.cache.heavy.eviction.count.limit = 10000 (which effectively disables the feature), and then with limit = 0 (which enables it).

In the log below we see how the feature turns on and brings the overhead down to 14-71%. From time to time the load decreases, which turns on Backpressure, and HBase caches more blocks again.

Log RegionServer
evicted (MB): 0, piv 0.0, nyiaj siv ua haujlwm (%): -100, hnyav tshem tawm txee: 0, tam sim no caching DataBlock (%): 100
evicted (MB): 0, piv 0.0, nyiaj siv ua haujlwm (%): -100, hnyav tshem tawm txee: 0, tam sim no caching DataBlock (%): 100
evicted (MB): 2170, piv 1.09, nyiaj siv ua haujlwm (%): 985, hnyav tshem tawm txee: 1, tam sim no caching DataBlock (%): 91 < pib
evicted (MB): 3763, piv 1.08, nyiaj siv ua haujlwm (%): 1781, hnyav tshem tawm txee: 2, tam sim no caching DataBlock (%): 76
evicted (MB): 3306, piv 1.07, nyiaj siv ua haujlwm (%): 1553, hnyav tshem tawm txee: 3, tam sim no caching DataBlock (%): 61
evicted (MB): 2508, piv 1.06, nyiaj siv ua haujlwm (%): 1154, hnyav tshem tawm txee: 4, tam sim no caching DataBlock (%): 50
evicted (MB): 1824, piv 1.04, nyiaj siv ua haujlwm (%): 812, hnyav tshem tawm txee: 5, tam sim no caching DataBlock (%): 42
evicted (MB): 1482, piv 1.03, nyiaj siv ua haujlwm (%): 641, hnyav tshem tawm txee: 6, tam sim no caching DataBlock (%): 36
evicted (MB): 1140, piv 1.01, nyiaj siv ua haujlwm (%): 470, hnyav tshem tawm txee: 7, tam sim no caching DataBlock (%): 32
evicted (MB): 913, piv 1.0, nyiaj siv ua haujlwm (%): 356, hnyav tshem tawm txee: 8, tam sim no caching DataBlock (%): 29
evicted (MB): 912, piv 0.89, nyiaj siv ua haujlwm (%): 356, hnyav tshem tawm txee: 9, tam sim no caching DataBlock (%): 26
evicted (MB): 684, piv 0.76, nyiaj siv ua haujlwm (%): 242, hnyav tshem tawm txee: 10, tam sim no caching DataBlock (%): 24
evicted (MB): 684, piv 0.61, nyiaj siv ua haujlwm (%): 242, hnyav tshem tawm txee: 11, tam sim no caching DataBlock (%): 22
evicted (MB): 456, piv 0.51, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 12, tam sim no caching DataBlock (%): 21
evicted (MB): 456, piv 0.42, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 13, tam sim no caching DataBlock (%): 20
evicted (MB): 456, piv 0.33, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 14, tam sim no caching DataBlock (%): 19
evicted (MB): 342, piv 0.33, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 15, tam sim no caching DataBlock (%): 19
evicted (MB): 342, piv 0.32, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 16, tam sim no caching DataBlock (%): 19
evicted (MB): 342, piv 0.31, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 17, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.3, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 18, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.29, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 19, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.27, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 20, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.25, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 21, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.24, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 22, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.22, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 23, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.21, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 24, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.2, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 25, tam sim no caching DataBlock (%): 19
evicted (MB): 228, piv 0.17, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 26, tam sim no caching DataBlock (%): 19
evicted (MB): 456, piv 0.17, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm cov txee: 27, tam sim no caching DataBlock (%): 18 < ntxiv tau txais (tab sis rooj tib yam)
evicted (MB): 456, piv 0.15, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 28, tam sim no caching DataBlock (%): 17
evicted (MB): 342, piv 0.13, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 29, tam sim no caching DataBlock (%): 17
evicted (MB): 342, piv 0.11, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 30, tam sim no caching DataBlock (%): 17
evicted (MB): 342, piv 0.09, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 31, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.08, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 32, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.07, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 33, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.06, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 34, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.05, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 35, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.05, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 36, tam sim no caching DataBlock (%): 17
evicted (MB): 228, piv 0.04, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 37, tam sim no caching DataBlock (%): 17
evicted (MB): 109, piv 0.04, nyiaj siv ua haujlwm (%): -46, hnyav tshem tawm txee: 37, tam sim no caching DataBlock (%): 22 < rov qab siab
evicted (MB): 798, piv 0.24, nyiaj siv ua haujlwm (%): 299, hnyav tshem tawm txee: 38, tam sim no caching DataBlock (%): 20
evicted (MB): 798, piv 0.29, nyiaj siv ua haujlwm (%): 299, hnyav tshem tawm txee: 39, tam sim no caching DataBlock (%): 18
evicted (MB): 570, piv 0.27, nyiaj siv ua haujlwm (%): 185, hnyav tshem tawm txee: 40, tam sim no caching DataBlock (%): 17
evicted (MB): 456, piv 0.22, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 41, tam sim no caching DataBlock (%): 16
evicted (MB): 342, piv 0.16, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 42, tam sim no caching DataBlock (%): 16
evicted (MB): 342, piv 0.11, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 43, tam sim no caching DataBlock (%): 16
evicted (MB): 228, piv 0.09, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 44, tam sim no caching DataBlock (%): 16
evicted (MB): 228, piv 0.07, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 45, tam sim no caching DataBlock (%): 16
evicted (MB): 228, piv 0.05, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 46, tam sim no caching DataBlock (%): 16
evicted (MB): 222, piv 0.04, nyiaj siv ua haujlwm (%): 11, hnyav tshem tawm txee: 47, tam sim no caching DataBlock (%): 16
evicted (MB): 104, piv 0.03, nyiaj siv ua haujlwm (%): -48, hnyav tshem tawm txee: 47, tam sim no caching DataBlock (%): 21 < cuam tshuam tau txais
evicted (MB): 684, piv 0.2, nyiaj siv ua haujlwm (%): 242, hnyav tshem tawm txee: 48, tam sim no caching DataBlock (%): 19
evicted (MB): 570, piv 0.23, nyiaj siv ua haujlwm (%): 185, hnyav tshem tawm txee: 49, tam sim no caching DataBlock (%): 18
evicted (MB): 342, piv 0.22, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 50, tam sim no caching DataBlock (%): 18
evicted (MB): 228, piv 0.21, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 51, tam sim no caching DataBlock (%): 18
evicted (MB): 228, piv 0.2, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 52, tam sim no caching DataBlock (%): 18
evicted (MB): 228, piv 0.18, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 53, tam sim no caching DataBlock (%): 18
evicted (MB): 228, piv 0.16, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 54, tam sim no caching DataBlock (%): 18
evicted (MB): 228, piv 0.14, nyiaj siv ua haujlwm (%): 14, hnyav tshem tawm txee: 55, tam sim no caching DataBlock (%): 18
evicted (MB): 112, piv 0.14, nyiaj siv ua haujlwm (%): -44, hnyav tshem tawm txee: 55, tam sim no caching DataBlock (%): 23 < rov qab siab
evicted (MB): 456, piv 0.26, nyiaj siv ua haujlwm (%): 128, hnyav tshem tawm txee: 56, tam sim no caching DataBlock (%): 22
evicted (MB): 342, piv 0.31, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 57, tam sim no caching DataBlock (%): 22
evicted (MB): 342, piv 0.33, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 58, tam sim no caching DataBlock (%): 22
evicted (MB): 342, piv 0.33, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 59, tam sim no caching DataBlock (%): 22
evicted (MB): 342, piv 0.33, nyiaj siv ua haujlwm (%): 71, hnyav tshem tawm txee: 60, tam sim no caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 61, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 62, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 63, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 64, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 65, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 66, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 67, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 68, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 69, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.32, overhead (%): 71, heavy eviction counter: 70, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 71, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 72, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 73, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 74, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 75, current caching DataBlock (%): 22
evicted (MB): 342, ratio 0.33, overhead (%): 71, heavy eviction counter: 76, current caching DataBlock (%): 22
evicted (MB): 21, ratio 0.33, overhead (%): -90, heavy eviction counter: 76, current caching DataBlock (%): 32
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
evicted (MB): 0, ratio 0.0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
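
The overhead column in these log lines appears to be the evicted volume expressed relative to a configured size limit. Below is a minimal sketch of that arithmetic. The 200 MB limit is an assumption: it is simply the value that reproduces the logged pairs (342 MB gives 71%, 21 MB gives -90%, 0 MB gives -100%). The class and method names are hypothetical and are not taken from the HBase code.

```java
public class EvictionOverhead {
    // Assumed limit in MB; chosen only because it reproduces the logged values.
    static final long ASSUMED_LIMIT_MB = 200;

    // Percentage by which the evicted volume exceeds (positive) or
    // falls short of (negative) the limit, using integer arithmetic.
    static int overheadPercent(long evictedMb, long limitMb) {
        return (int) (evictedMb * 100 / limitMb) - 100;
    }

    public static void main(String[] args) {
        for (long evicted : new long[] {342, 21, 0}) {
            System.out.println("evicted (MB): " + evicted
                + ", overhead (%): " + overheadPercent(evicted, ASSUMED_LIMIT_MB));
        }
    }
}
```

A positive overhead means more was evicted in the cycle than the limit allows, which is the condition under which the heavy eviction counter in the log keeps growing.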

The same process can be seen in a graph of the relationship between the two cache sections: single (blocks that have not been requested again yet) and multi (data "requested" at least once is stored here):

[Figure: sizes of the single and multi cache sections over time]

And finally, here is what the parameters' effect looks like as a graph. For comparison, the cache was switched off completely at the start, then HBase was started with caching enabled and the optimization delayed by 5 minutes (30 eviction cycles).

The full code can be found in Pull Request HBASE-23887 on GitHub.

However, 300 thousand reads per second is not everything that can be squeezed out of this hardware under these conditions. When you access data through HDFS, the ShortCircuitCache mechanism (hereafter SSC) is used, which lets you read the data directly and avoid network interaction.

Profiling showed that although this mechanism gives a large gain, at some point it becomes a bottleneck itself, because almost all of its heavy operations happen inside a lock, so threads spend most of their time blocked.

[Figure: profiling results for ShortCircuitCache]

Having understood this, we realized that the problem can be bypassed by creating an array of independent SSCs:

// An array of independent caches instead of a single shared instance.
private final ShortCircuitCache[] shortCircuitCache;
...
shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
for (int i = 0; i < this.clientShortCircuitNum; i++) {
  this.shortCircuitCache[i] = new ShortCircuitCache(…);
}

We then work with them, choosing a cache by the last digit of the offset, which also rules out overlaps between them:

public ShortCircuitCache getShortCircuitCache(long idx) {
    return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
}
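
The idea of spreading lock contention over several independent caches can be tried in isolation. The following is a minimal sketch and not the HDFS implementation. `StripedCacheDemo` and `LockedCache` are hypothetical stand-ins, and the use of `Math.floorMod` (so that a negative index cannot produce a negative array position) is our own assumption and is not part of the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

public class StripedCacheDemo {
    // Stand-in for one independent cache; every operation takes its own lock.
    static final class LockedCache {
        private final Map<Long, String> entries = new HashMap<>();
        synchronized void put(long key, String value) { entries.put(key, value); }
        synchronized String get(long key) { return entries.get(key); }
        synchronized int size() { return entries.size(); }
    }

    private final LockedCache[] caches;

    StripedCacheDemo(int stripes) {
        caches = new LockedCache[stripes];
        for (int i = 0; i < stripes; i++) {
            caches[i] = new LockedCache();
        }
    }

    // floorMod keeps the array index non-negative even for negative keys.
    LockedCache cacheFor(long idx) {
        return caches[(int) Math.floorMod(idx, (long) caches.length)];
    }

    int stripeSize(int stripe) {
        return caches[stripe].size();
    }

    public static void main(String[] args) {
        StripedCacheDemo demo = new StripedCacheDemo(5);
        for (long key = 0; key < 20; key++) {
            demo.cacheFor(key).put(key, "block-" + key);
        }
        System.out.println(demo.cacheFor(7).get(7));
        System.out.println(demo.stripeSize(0));
    }
}
```

Threads that work with keys mapped to different slots never wait on the same lock, which is the effect the array of SSCs relies on.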

Now we can start testing. To do this, we will read files from HDFS with a simple multi-threaded application. Set the parameters:

conf.set("dfs.client.read.shortcircuit", "true");
conf.set("dfs.client.read.shortcircuit.buffer.size", "65536"); // the default is 1 MB, which slows reads down considerably, so it is better to match it to actual needs
conf.set("dfs.client.short.circuit.num", num); // from 1 to 10

And simply read the files:

FSDataInputStream in = fileSystem.open(path);
for (int i = 0; i < count; i++) {
    position += 65536;
    if (position > 900000000)
        position = 0L;
    int res = in.read(position, byteBuffer, 0, 65536);
}

This code runs in separate threads, and we increase both the number of files read simultaneously (from 10 to 200, the horizontal axis) and the number of caches (from 1 to 10, the plotted series). The vertical axis shows the speedup gained by increasing the number of SSCs relative to the case with a single cache.
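
A harness of this kind could look roughly as follows. This is a minimal sketch, not the test application we used: `ParallelReadHarness` and `runTimed` are hypothetical names, and the read loop is replaced by a placeholder counter so that the example runs without a Hadoop cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

public class ParallelReadHarness {
    // Runs the same task in `threads` threads and returns elapsed milliseconds.
    static long runTimed(int threads, Runnable readLoop) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(readLoop));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface task failures
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicLong reads = new AtomicLong();
        // Placeholder for the HDFS read loop; it only counts iterations.
        Runnable fakeReadLoop = () -> {
            for (int i = 0; i < 1000; i++) {
                reads.incrementAndGet();
            }
        };
        long ms = runTimed(10, fakeReadLoop);
        System.out.println(reads.get() + " simulated reads in " + ms + " ms");
    }
}
```

Repeating the measurement for each combination of thread count and cache count, then dividing the single-cache time by the multi-cache time, gives the speedup plotted on the vertical axis.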

[Figure: read speedup versus number of concurrent reader threads, for 1 to 10 SSC instances]

How to read the graph: 100 thousand reads in 64 KB blocks take 78 seconds with one cache, whereas with 5 caches they take 16 seconds, i.e. a speedup of about 5x. As the graph shows, the effect is barely noticeable with a small number of parallel reads; it starts to matter once there are more than 50 reader threads. It is also noticeable that raising the number of SSCs to 6 and above gives a much smaller additional gain.

Note 1: since the test results are quite volatile (see below), 3 runs were performed and the results were averaged.

Note 2: the performance gain from this setting is the same for random access, although the access itself is slightly slower.

However, it should be made clear that, unlike the HBase case, this speedup does not always come for free. Here we "unlock" the CPU's ability to do more work instead of hanging on locks.

[Figure: CPU utilization versus number of SSC instances]

Here you can see that, in general, increasing the number of caches gives a roughly proportional increase in CPU utilization. However, some combinations are more favorable than others.

For example, let's take a closer look at the setting SSC = 3. The performance gain over the range is about 3.3x. Below are the results of all three separate runs.

[Figure: results of the three separate runs with SSC = 3]

Meanwhile CPU consumption grows by about 2.8x. The difference is not very large, but little Greta is already happy and may even have time to go to school and study.
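
The figures quoted so far can be put side by side with a few lines of arithmetic. This is only an illustration using numbers from the text (78 s and 16 s, 3.3x and 2.8x); `SpeedupMath` and `ratio` are hypothetical names.

```java
public class SpeedupMath {
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // 78 s with one cache versus 16 s with five caches.
        double speedup = ratio(78.0, 16.0);
        // 3.3x throughput gain for 2.8x CPU growth at SSC = 3.
        double gainPerCpu = ratio(3.3, 2.8);
        System.out.printf("speedup: %.2f, gain per unit of CPU growth: %.2f%n",
            speedup, gainPerCpu);
    }
}
```

The first ratio is just under 5, and the second is about 1.18, meaning that with SSC = 3 throughput grows roughly 18% faster than CPU consumption.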

Thus, this will benefit any tool that makes heavy use of HDFS access (for example Spark), provided that the application code is lightweight (i.e. the bottleneck is on the HDFS client side) and there is spare CPU capacity. To check, let's test what effect the combined use of the BlockCache optimization and SSC tuning has on reading from HBase.

[Figure: HBase read throughput with the BlockCache optimization and SSC tuning combined]

It can be seen that under these conditions the effect is not as large as in the synthetic tests (reading without any processing), but it is still possible to squeeze out an additional 80K here. Together, the two optimizations give up to a 4x speedup.

A PR was also created for this optimization [HDFS-15202]; it has been merged, and this functionality will be available in future releases.

Finally, it was interesting to compare the read performance of a similar wide-column database, Cassandra, with that of HBase.

To do this, we launched instances of the standard YCSB load testing utility from two hosts (800 threads in total). On the server side there were 4 instances each of RegionServer and Cassandra on 4 hosts (not the ones where the clients were running, to avoid their influence). Reads came from tables of the following sizes:

HBase - 300 GB on HDFS (100 GB of pure data)

Cassandra - 250 GB (replication factor = 3)

I.e. the volume was approximately the same (slightly more in HBase).

HBase parameters:

dfs.client.short.circuit.num = 5 (HDFS client optimization)

hbase.lru.cache.heavy.eviction.count.limit = 30 - this means the patch starts working after 30 evictions (~5 minutes)

hbase.lru.cache.heavy.eviction.mb.size.limit = 300 - target volume of caching and eviction

The YCSB logs were parsed and compiled into Excel charts:

[Figure: YCSB read throughput, HBase versus Cassandra]

As you can see, these optimizations make it possible to bring the performance of these databases to a comparable level under these conditions and to reach 450 thousand reads per second.

We hope this information will be useful to someone in the course of the exciting struggle for performance.

Source: www.hab.com
