The battle of two yokozuna, or Cassandra vs HBase. The Sberbank team's experience

This is not even a joke: it seems this particular picture captures the essence of these databases quite precisely, and in the end it will be clear why:

[Image: sumo bout, one yokozuna balancing on one leg]

According to the DB-Engines Ranking, the two most popular NoSQL databases are Cassandra (hereinafter CS) and HBase (HB).

[Image: DB-Engines Ranking]

As fate would have it, our data management team at Sberbank has been working closely with HB for a long time. Over that time we have studied its strengths and weaknesses well and learned how to cook it. However, the existence of an alternative in the form of CS kept nagging us with a bit of doubt: had we made the right choice? Moreover, the results of a comparison performed by DataStax said that CS easily beats HB, almost by a knockout score. On the other hand, DataStax is an interested party, and one shouldn't just take their word for it. We were also confused by how little information there was about the test conditions, so we decided to find out for ourselves who is the king of BigData NoSQL, and the results turned out to be very interesting.

However, before moving on to the results of the tests performed, it is necessary to describe the essential aspects of the environment configuration. The point is that CS can be used in a mode that allows data loss: that is, when only one server (node) is responsible for the data of a particular key, and if it fails for some reason, the value of that key is lost. For many tasks this is not critical, but for the banking sector it is the exception rather than the rule. In our case it is important to keep several copies of the data for reliable storage.

Therefore, only the CS operating mode with triple replication was considered, i.e. the keyspace was created with the following parameters:

CREATE KEYSPACE ks WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3};

Next, there are two ways to ensure the required level of consistency. The general rule:
NW + NR > RF

This means that the number of confirmations from nodes on write (NW) plus the number of confirmations from nodes on read (NR) must be greater than the replication factor. In our case RF = 3, so the following options are suitable:
2 + 2 > 3
3 + 1 > 3

Since it is fundamentally important for us to store the data as reliably as possible, the 3 + 1 scheme was chosen. In addition, HB works on a similar principle, i.e. such a comparison will be fairer.
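For illustration, here is a minimal sketch of how the 3 + 1 scheme maps onto consistency levels in the DataStax Java driver 3.x (the contact point is a placeholder and ks.t1 is the test table created later in the article; this is not our actual harness code):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ConsistencySketch {
    public static void main(String[] args) {
        // Placeholder contact point; RF = 3 comes from the keyspace definition above.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks");

        // Write: NW = ALL = 3 acknowledgements.
        session.execute(new SimpleStatement("INSERT INTO t1 (id, title) VALUES (1, 'x')")
                .setConsistencyLevel(ConsistencyLevel.ALL));

        // Read: NR = ONE = 1 acknowledgement, so NW + NR = 3 + 1 > RF = 3.
        session.execute(new SimpleStatement("SELECT * FROM t1 WHERE id = 1")
                .setConsistencyLevel(ConsistencyLevel.ONE));

        cluster.close();
    }
}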

It should be noted that DataStax did the opposite in their study: they set RF = 1 for both CS and HB (for the latter by changing the HDFS settings). This is a really important point, because the impact on CS performance in that case is huge. For example, the figure below shows the growth of the time required to load data into CS:

[Figure: time to load data into CS as the number of writer threads grows, RF = 1 vs RF = 3]

Here we see the following: the more competing threads write data, the longer it takes. That is natural, but what matters is that the performance degradation for RF = 3 is much higher. In other words, if we write into 4 tables with 5 threads each (20 in total), RF = 3 loses roughly 2 times (150 seconds for RF = 3 versus 75 for RF = 1). But if we increase the load by loading data into 8 tables with 5 threads each (40 in total), the loss of RF = 3 is already 2.7 times (375 seconds versus 138).

Perhaps this is part of the secret of the successful load test DataStax ran for CS, because on our stand changing the replication factor from 2 to 3 had no effect on HB. That is, disks are not the bottleneck for HB in our configuration. However, there are plenty of other pitfalls here: our HB build is slightly patched and tuned, the environments are completely different, and so on. It is also worth noting that perhaps we simply don't know how to prepare CS correctly and there are more effective ways to work with it; I hope we will find out in the comments. But first things first.

All tests were performed on a hardware cluster of 4 servers, each with the following configuration:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads
Disks: 12 SATA HDDs
Java version: 1.8.0_111

CS version: 3.11.5

cassandra.yml parameters:
num_tokens: 256
hinted_handoff_enabled: true
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
hints_directory: /data10/cassandra/hints
hints_flush_period_in_ms: 10000
max_hints_file_size_in_mb: 128
batchlog_replay_throttle_in_kb: 1024
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
credentials_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
- /data1/cassandra/data # each dataN directory is a separate disk
- /data2/cassandra/data
- /data3/cassandra/data
- /data4/cassandra/data
- /data5/cassandra/data
- /data6/cassandra/data
- /data7/cassandra/data
- /data8/cassandra/data
commitlog_directory: /data9/cassandra/commitlog
cdc_enabled: false
disk_failure_policy: stop
commit_failure_policy: stop
prepared_statements_cache_size_mb:
thrift_prepared_statements_cache_size_mb:
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
counter_cache_size_in_mb:
counter_cache_save_period: 7200
saved_caches_directory: /data10/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
  parameters:
  - seeds: "*,*"
concurrent_reads: 256 # tried 64 - no noticeable difference
concurrent_writes: 256 # tried 64 - no noticeable difference
concurrent_counter_writes: 256 # tried 64 - no noticeable difference
concurrent_materialized_view_writes: 32
memtable_heap_space_in_mb: 2048 # tried 16 GB - it was slower
memtable_allocation_type: heap_buffers
index_summary_capacity_in_mb:
index_summary_resize_interval_in_minutes: 60
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: *
broadcast_address: *
listen_on_broadcast_address: true
internode_authenticator: org.apache.cassandra.auth.AllowAllInternodeAuthenticator
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: *
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
column_index_cache_size_in_kb: 2
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 1600
sstable_preemptive_open_interval_in_mb: 50
read_request_timeout_in_ms: 100000
range_request_timeout_in_ms: 200000
write_request_timeout_in_ms: 40000
counter_write_request_timeout_in_ms: 100000
cas_contention_timeout_in_ms: 20000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 200000
slow_query_log_timeout_in_ms: 500
cross_node_timeout: false
endpoint_snitch: GossipingPropertyFileSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
server_encryption_options:
  internode_encryption: none
client_encryption_options:
  enabled: false
internode_compression: dc
inter_dc_tcp_nodelay: false
tracetype_query_ttl: 86400
tracetype_repair_ttl: 604800
enable_user_defined_functions: false
enable_scripted_user_defined_functions: false
windows_timer_interval: 1
transparent_data_encryption_options:
  enabled: false
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
batch_size_warn_threshold_in_kb: 200
batch_size_fail_threshold_in_kb: 250
unlogged_batch_across_partitions_warn_threshold:
compaction_large_partition_warning_threshold_mb: 100
gc_warn_threshold_in_ms: 1000
back_pressure_enabled: false
enable_materialized_views: true
enable_sasi_indexes: true

GC settings:

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled

In jvm.options, 16 GB of heap memory was allocated (we also tried 32 GB, no difference was noticed).

The table was created with the command:

CREATE TABLE ks.t1 (id bigint PRIMARY KEY, title text) WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 64};
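Note that 'sstable_compression' / 'chunk_length_kb' is the legacy option spelling; for reference, the same table definition in the newer CS 3.x option names (assuming LZ4 with 64 KB chunks, as above) would look like this:

CREATE TABLE ks.t1 (id bigint PRIMARY KEY, title text) WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 64};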

HB version: 1.2.0-cdh5.14.2 (in the class org.apache.hadoop.hbase.regionserver.HRegion we cut out MetricsRegion, which was causing GC problems when the number of regions on a RegionServer exceeded 1000)

Non-default HBase parameters:
zookeeper.session.timeout: 120000
hbase.rpc.timeout: 2 minute(s)
hbase.client.scanner.timeout.period: 2 minute(s)
hbase.master.handler.count: 10
hbase.regionserver.lease.period, hbase.client.scanner.timeout.period: 2 minute(s)
hbase.regionserver.handler.count: 160
hbase.regionserver.metahandler.count: 30
hbase.regionserver.logroll.period: 4 hour(s)
hbase.regionserver.maxlogs: 200
hbase.hregion.memstore.flush.size: 1 GiB
hbase.hregion.memstore.block.multiplier: 6
hbase.hstore.compactionThreshold: 5
hbase.hstore.blockingStoreFiles: 200
hbase.hregion.majorcompaction: 1 day(s)
HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.regionserver.wal.codec: org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
hbase.master.namespace.init.timeout: 3600000
hbase.regionserver.optionalcacheflushinterval: 18000000
hbase.regionserver.thread.compaction.large: 12
hbase.regionserver.wal.enablecompression: true
hbase.hstore.compaction.max.size: 1073741824
hbase.server.compactchecker.interval.multiplier: 200
Java Configuration Options for HBase RegionServer:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ReservedCodeCacheSize=256m
hbase.snapshot.master.timeoutMillis: 2 minute(s)
hbase.snapshot.region.timeout: 2 minute(s)
hbase.snapshot.master.timeout.millis: 2 minute(s)
HBase REST Server Max Log Size: 100 MiB
HBase REST Server Maximum Log File Backups: 5
HBase Thrift Server Max Log Size: 100 MiB
HBase Thrift Server Maximum Log File Backups: 5
Master Max Log Size: 100 MiB
Master Maximum Log File Backups: 5
RegionServer Max Log Size: 100 MiB
RegionServer Maximum Log File Backups: 5
HBase Active Master Detection Window: 4 minute(s)
dfs.client.hedged.read.threadpool.size: 40
dfs.client.hedged.read.threshold.millis: 10 millisecond(s)
hbase.rest.threads.min: 8
hbase.rest.threads.max: 150
Maximum Process File Descriptors: 180000
hbase.thrift.minWorkerThreads: 200
hbase.master.executor.openregion.threads: 30
hbase.master.executor.closeregion.threads: 30
hbase.master.executor.serverops.threads: 60
hbase.regionserver.thread.compaction.small: 6
hbase.ipc.server.read.threadpool.size: 20
Region Mover Threads: 6
Client Java Heap Size in Bytes: 1 GiB
HBase REST Server Default Group: 3 GiB
HBase Thrift Server Default Group: 3 GiB
Java Heap Size of HBase Master in Bytes: 16 GiB
Java Heap Size of HBase RegionServer in Bytes: 32 GiB

+ ZooKeeper
maxClientCnxns: 601
maxSessionTimeout: 120000
Creating the tables:
hbase org.apache.hadoop.hbase.util.RegionSplitter ns:t1 UniformSplit -c 64 -f cf
alter 'ns:t1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'GZ'}

There is one important point here: the DataStax description does not say how many regions were used when creating the HB tables, although this is critical for large volumes. Therefore, for the tests the value count = 64 was chosen, which allows storing up to 640 GB, i.e. a medium-sized table.
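For reference, the same 64-way pre-split can also be requested directly at table creation time from the HBase shell (an alternative to the RegionSplitter call shown above, not what we actually ran):

create 'ns:t1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'GZ'}, {NUMREGIONS => 64, SPLITALGO => 'UniformSplit'}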

At the time of testing, HBase held 22 thousand tables and 67 thousand regions (this would have been fatal for version 1.2.0 if not for the patch mentioned above).

Now for the code. Since it was not obvious which combinations would favor one database or the other, the tests were run in various configurations. That is, in some tests 4 tables were loaded at the same time (all 4 nodes were used for connections); in others we worked with 8 different tables. In some cases the batch size was 100, in others 200 (the batch parameter - see the code below). The data size per value is 10 bytes or 100 bytes (dataSize). In total, 5 million records were written to and then read from each table each time. At the same time, 5 threads wrote to / read from each table (thread number - thNum), each of which used its own key range (count = 1 million):

if (opType.equals("insert")) {
    for (Long key = count * thNum; key < count * (thNum + 1); key += 0) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < batch; i++) {
            String value = RandomStringUtils.random(dataSize, true, true);
            sb.append("INSERT INTO ")
                    .append(tableName)
                    .append("(id, title) ")
                    .append("VALUES (")
                    .append(key)
                    .append(", '")
                    .append(value)
                    .append("');");
            key++;
        }
        sb.append("APPLY BATCH;");
        final String query = sb.toString();
        session.execute(query);
    }
} else {
    for (Long key = count * thNum; key < count * (thNum + 1); key += 0) {
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(tableName).append(" WHERE id IN (");
        for (int i = 0; i < batch; i++) {
            sb = sb.append(key);
            if (i+1 < batch)
                sb.append(",");
            key++;
        }
        sb = sb.append(");");
        final String query = sb.toString();
        ResultSet rs = session.execute(query);
    }
}
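The snippet above assumes an already opened session and builds CQL strings by hand. As a rough sketch (not what was benchmarked), the same insert could also be expressed with a prepared statement and an unlogged BatchStatement in driver 3.x; session, tableName, count, thNum, batch, dataSize and RandomStringUtils are the same as in the loop above:

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.PreparedStatement;

// Sketch only: bind one prepared statement per row and send it as an UNLOGGED batch.
PreparedStatement ps = session.prepare(
        "INSERT INTO " + tableName + " (id, title) VALUES (?, ?)");
for (long key = count * thNum; key < count * (thNum + 1); ) {
    BatchStatement bs = new BatchStatement(BatchStatement.Type.UNLOGGED);
    for (int i = 0; i < batch; i++) {
        bs.add(ps.bind(key, RandomStringUtils.random(dataSize, true, true)));
        key++;
    }
    session.execute(bs);
}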

Accordingly, similar functionality was implemented for HB:

Configuration conf = getConf();
HTable table = new HTable(conf, keyspace + ":" + tableName);
table.setAutoFlush(false, false);
List<Get> lGet = new ArrayList<>();
List<Put> lPut = new ArrayList<>();
byte[] cf = Bytes.toBytes("cf");
byte[] qf = Bytes.toBytes("value");
if (opType.equals("insert")) {
    for (Long key = count * thNum; key < count * (thNum + 1); key += 0) {
        lPut.clear();
        for (int i = 0; i < batch; i++) {
            Put p = new Put(makeHbaseRowKey(key));
            String value = RandomStringUtils.random(dataSize, true, true);
            p.addColumn(cf, qf, value.getBytes());
            lPut.add(p);
            key++;
        }
        table.put(lPut);
        table.flushCommits();
    }
} else {
    for (Long key = count * thNum; key < count * (thNum + 1); key += 0) {
        lGet.clear();
        for (int i = 0; i < batch; i++) {
            Get g = new Get(makeHbaseRowKey(key));
            lGet.add(g);
            key++;
        }
        Result[] rs = table.get(lGet);
    }
}
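For completeness, the Configuration that the snippet obtains via getConf() presumably boils down to something like the following (the ZooKeeper quorum here is a placeholder, not our actual hosts):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Standard client-side HBase configuration; everything else comes from hbase-site.xml.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "zk-host1,zk-host2,zk-host3");
conf.set("hbase.zookeeper.property.clientPort", "2181");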

Since in HB the client has to take care of uniform data distribution itself, the key salting function looked like this:

public static byte[] makeHbaseRowKey(long key) {
    byte[] nonSaltedRowKey = Bytes.toBytes(key);
    CRC32 crc32 = new CRC32();
    crc32.update(nonSaltedRowKey);
    long crc32Value = crc32.getValue();
    byte[] salt = Arrays.copyOfRange(Bytes.toBytes(crc32Value), 5, 7);
    return ArrayUtils.addAll(salt, nonSaltedRowKey);
}
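To illustrate why this helps: the two CRC-derived salt bytes give neighboring sequential keys essentially unrelated prefixes, so they typically land in different pre-split regions. A quick check (illustration only):

// Illustration only: print the salt prefix for a few consecutive keys.
for (long k = 0; k < 5; k++) {
    byte[] rowKey = makeHbaseRowKey(k);
    System.out.println(k + " -> salt = [" + rowKey[0] + ", " + rowKey[1] + "]");
}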

Now the most interesting part - the results:

[Image: test results, table form]

The same thing in chart form:

[Image: test results, chart form]

The advantage of HB is so striking that there is a suspicion of some kind of bottleneck in the CS setup. However, googling and trying the most obvious parameters (such as concurrent_writes or memtable_heap_space_in_mb) did not speed anything up. At the same time, the logs are clean and do not complain about anything.

The data was distributed evenly across the nodes; the statistics from all nodes were approximately the same.

This is what the table statistics look like on one of the nodes:
Keyspace: ks
Read Count: 9383707
Read Latency: 0.04287025042448576 ms
Write Count: 15462012
Write Latency: 0.1350068438699957 ms
Pending Flushes: 0
Table: t1
SSTable count: 16
Space used (live): 148.59 MiB
Space used (total): 148.59 MiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 5.17 MiB
SSTable Compression Ratio: 0.5720989576459437
Number of partitions (estimate): 3970323
Memtable cell count: 0
Memtable data size: 0 bytes
Memtable off heap memory used: 0 bytes
Memtable switch count: 5
Local read count: 2346045
Local read latency: NaN ms
Local write count: 3865503
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 25
Bloom filter false ratio: 0.00000
Bloom filter space used: 4.57 MiB
Bloom filter off heap memory used: 4.57 MiB
Index summary off heap memory used: 590.02 KiB
Compression metadata off heap memory used: 19.45 KiB
Compacted partition minimum bytes: 36
Compacted partition maximum bytes: 42
Compacted partition mean bytes: 42
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0 bytes

An attempt to reduce the batch size (even down to sending records one by one) had no effect; it only made things worse. It is possible that this really is the maximum performance for CS, since the results obtained are similar to those reported by DataStax - on the order of hundreds of thousands of operations per second. In addition, if we look at resource utilization, we will see that CS uses much more CPU and disk:

[Image: CPU and disk utilization during the test runs for both databases]
The chart shows the utilization during all tests run back to back for both databases.

Regarding HB's strong advantage in reads. Here you can see that for both databases disk utilization during reads is extremely low (the read tests are the final part of the testing cycle for each database; for CS, for example, this is from 15:20 to 15:40). In the case of HB the reason is clear - most of the data sits in memory, in the memstore, and some is cached in the blockcache. As for CS, it is not entirely clear how it works, but disk activity is not visible there either; just in case, an attempt was made to enable the cache with row_cache_size_in_mb = 2048 and set caching = {'keys': 'ALL', 'rows_per_partition': '2000000'}, but that made things even slightly worse.
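For reference, the table-level part of that experiment is a one-line change (row_cache_size_in_mb itself is set in cassandra.yml):

ALTER TABLE ks.t1 WITH caching = {'keys': 'ALL', 'rows_per_partition': '2000000'};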

It is also worth mentioning once again the important point about the number of regions in HB. In our case the value was set to 64. If you reduce it and make it equal to, say, 4, then reads become 2 times slower. The reason is that the memstore fills up faster and files are flushed more often, so on reads more files have to be processed, which is a rather expensive operation for HB. In real-world conditions this can be handled by a well-thought-out pre-splitting and compaction strategy; in particular, we use a self-written utility that collects garbage and compacts HFiles constantly in the background. It is quite possible that for the DataStax tests only 1 region per table was allocated (which is not correct), and this would explain why HB was so much slower in their read tests.
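The utility itself is not published here, but the basic operation it automates can be triggered through the standard client API; a minimal sketch (not our in-house tool) looks roughly like this, with conf being the same Configuration as above:

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Ask HBase to major-compact a table so the number of HFiles per store
// drops back down between load phases.
try (Connection connection = ConnectionFactory.createConnection(conf);
     Admin admin = connection.getAdmin()) {
    admin.majorCompact(TableName.valueOf("ns:t1"));
}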

The following preliminary conclusion can be drawn from this. Assuming that no major mistakes were made during testing, Cassandra looks like a colossus with feet of clay. More precisely, while she balances on one leg, as in the picture at the beginning of the article, she shows relatively good results, but in a fight under equal conditions she loses outright. At the same time, given the low CPU utilization on our hardware, we learned to run two HB RegionServers per host and thereby doubled the throughput. That is, taking resource utilization into account, the situation for CS is even more deplorable.

Of course, these tests are rather synthetic and the amount of data used here is relatively modest. It is possible that if we moved to terabytes the picture would be different, but while we can load terabytes into HB, for CS this turned out to be problematic: it often threw an OperationTimedOutException even at these volumes, although the response-wait parameters had already been increased several times compared to the defaults.

I hope that through joint efforts we will find the bottlenecks of CS, and if we manage to speed it up, I will definitely add information about the final results at the end of the post.

UPD: Thanks to the advice of comrades, I managed to speed up reading. It was:
159 644 ops (4 tables, 5 threads, batch 100).
Added:
.withLoadBalancingPolicy(new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
And played around with the number of threads. The result is as follows:
4 tables, 100 threads, batch = 1 (row by row): ~301k ops
4 tables, 100 threads, batch = 10: ~447k ops
4 tables, 100 threads, batch = 100: ~625k ops

Later I will try the other tuning tips, run a full test cycle, and add the results at the end of the post.

Source: www.habr.com
