Battle of two yokozuna, or Cassandra vs HBase. The Sberbank team's experience

This is not even a joke: this particular picture seems to reflect the essence of these databases most accurately, and by the end it will be clear why:

[Image]

According to the DB-Engines Ranking, the two most popular NoSQL wide-column databases are Cassandra (hereafter CS) and HBase (HB).


By the will of fate, our data loading management team at Sberbank has been working closely with HB for a long time. During this time we have studied its strengths and weaknesses quite well and learned how to cook it. However, the existence of an alternative in the form of CS has always made us torment ourselves with doubts: did we make the right choice? Moreover, the comparison published by DataStax claimed that CS easily beats HB with an almost crushing score. On the other hand, DataStax is an interested party, and you should not take their word for it. We were also puzzled by how little information there was about the test conditions, so we decided to find out for ourselves who is the king of BigData NoSQL, and the results turned out to be very interesting.

However, before moving on to the results of the tests, it is necessary to describe the essential aspects of the environment configuration. The fact is that CS can be used in a mode that allows data loss. That is, only one server (node) is responsible for the data of a given key, and if it fails for some reason, the value of that key is lost. For many tasks this is not critical, but for the banking sector it is the exception rather than the rule. In our case, it is essential to keep several copies of the data for reliable storage.

Therefore, only the CS operating mode with triple replication was considered, i.e. the keyspace was created with the following parameters:

CREATE KEYSPACE ks WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3};

Next, there are two ways to ensure the required level of consistency. The general rule:
NW + NR > RF

This means that the number of acknowledgements from nodes when writing (NW) plus the number of acknowledgements from nodes when reading (NR) must be greater than the replication factor (RF). In our case RF = 3, which means the following options are suitable:
2 + 2 > 3
3 + 1 > 3

Since it is fundamentally important for us to store the data as reliably as possible, the 3+1 scheme was chosen. In addition, HB works on a similar principle, i.e. such a comparison will be fairer.
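
The rule NW + NR > RF can be checked mechanically. The sketch below is illustrative only: the class and method names are hypothetical and are not part of any Cassandra API. It simply enumerates the write/read acknowledgement pairs that are guaranteed to overlap for a given replication factor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Cassandra API): checks the NW + NR > RF rule.
final class QuorumCheck {

    // True when every read is guaranteed to touch at least one replica
    // that acknowledged the latest write.
    static boolean overlaps(int nw, int nr, int rf) {
        return nw >= 1 && nr >= 1 && nw <= rf && nr <= rf && nw + nr > rf;
    }

    // Lists all (NW, NR) pairs that satisfy the rule for the given RF.
    static List<int[]> validPairs(int rf) {
        List<int[]> pairs = new ArrayList<>();
        for (int nw = 1; nw <= rf; nw++) {
            for (int nr = 1; nr <= rf; nr++) {
                if (overlaps(nw, nr, rf)) {
                    pairs.add(new int[] {nw, nr});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] p : validPairs(3)) {
            System.out.println("NW=" + p[0] + " NR=" + p[1]);
        }
    }
}
```

For RF = 3 the list includes both 2+2 and 3+1; the tests here use 3+1, i.e. every write is acknowledged by all three replicas and a read needs only one.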

It should be noted that DataStax did the opposite in their study: they set RF = 1 for both CS and HB (for the latter by changing the HDFS settings). This is a really important aspect, because the effect on CS performance is huge. For example, the picture below shows the increase in the time required to load data into CS:

[Figure: time required to load data into CS for RF = 1 and RF = 3]

Here we see the following: the more competing threads write data, the longer it takes. This is natural, but what matters is that the performance degradation for RF = 3 is significantly higher. In other words, if we write to 4 tables with 5 threads each (20 in total), RF = 3 loses by about 2 times (150 seconds for RF = 3 versus 75 for RF = 1). But if we increase the load by writing to 8 tables with 5 threads each (40 in total), the loss of RF = 3 is already 2.7 times (375 seconds versus 138).
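
The arithmetic behind these figures is easy to reproduce. The sketch below uses only the timings quoted above; the class and method names are made up for illustration. Besides the RF = 3 to RF = 1 ratio, it also shows how each mode reacts to doubling the number of threads.

```java
// Illustrative arithmetic only; the timings are the ones quoted in the text.
final class ReplicationSlowdown {

    // How many times larger the first duration is than the second.
    static double ratio(double seconds, double baselineSeconds) {
        return seconds / baselineSeconds;
    }

    public static void main(String[] args) {
        // RF = 3 versus RF = 1 at the same load.
        System.out.println("20 threads, RF3/RF1: " + ratio(150, 75));
        System.out.println("40 threads, RF3/RF1: " + ratio(375, 138));
        // Growth of the load time when going from 20 to 40 threads.
        System.out.println("RF1, 40 vs 20 threads: " + ratio(138, 75));
        System.out.println("RF3, 40 vs 20 threads: " + ratio(375, 150));
    }
}
```

Doubling the load makes the RF = 1 run about 1.8 times longer, while the RF = 3 run becomes 2.5 times longer, which is why the gap widens from 2 to roughly 2.7.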

Perhaps this is part of the secret behind the successful load testing that DataStax performed for CS, because for HB on our stand changing the replication factor from 2 to 3 had no effect at all. That is, disks are not the bottleneck for HB in our configuration. However, there are many other pitfalls here: our version of HB was slightly patched and tweaked, the environments are completely different, and so on. It is also worth noting that perhaps I simply do not know how to prepare CS properly and there are more effective ways to work with it; I hope we will find out in the comments. But first things first.

All tests were performed on a hardware cluster of 4 servers, each with the following configuration:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads.
Disks: 12 SATA HDDs
Java version: 1.8.0_111

CS version: 3.11.5

cassandra.yml parameters:

num_tokens: 256
hinted_handoff_enabled: true
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
hints_directory: /data10/cassandra/hints
hints_flush_period_in_ms: 10000
max_hints_file_size_in_mb: 128
batchlog_replay_throttle_in_kb: 1024
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
credentials_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
- /data1/cassandra/data # each dataN directory is a separate disk
- /data2/cassandra/data
- /data3/cassandra/data
- /data4/cassandra/data
- /data5/cassandra/data
- /data6/cassandra/data
- /data7/cassandra/data
- /data8/cassandra/data
commitlog_directory: /data9/cassandra/commitlog
cdc_enabled: false
disk_failure_policy: stop
commit_failure_policy: stop
prepared_statements_cache_size_mb:
thrift_prepared_statements_cache_size_mb:
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
counter_cache_size_in_mb:
counter_cache_save_period: 7200
saved_caches_directory: /data10/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
  parameters:
  - seeds: "*,*"
concurrent_reads: 256 # tried 64 - no difference noticed
concurrent_writes: 256 # tried 64 - no difference noticed
concurrent_counter_writes: 256 # tried 64 - no difference noticed
concurrent_materialized_view_writes: 32
memtable_heap_space_in_mb: 2048 # tried 16 GB - it was slower
memtable_allocation_type: heap_buffers
index_summary_capacity_in_mb:
index_summary_resize_interval_in_minutes: 60
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: *
broadcast_address: *
listen_on_broadcast_address: true
internode_authenticator: org.apache.cassandra.auth.AllowAllInternodeAuthenticator
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: *
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
column_index_cache_size_in_kb: 2
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 1600
sstable_preemptive_open_interval_in_mb: 50
read_request_timeout_in_ms: 100000
range_request_timeout_in_ms: 200000
write_request_timeout_in_ms: 40000
counter_write_request_timeout_in_ms: 100000
cas_contention_timeout_in_ms: 20000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 200000
slow_query_log_timeout_in_ms: 500
cross_node_timeout: false
endpoint_snitch: GossipingPropertyFileSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
server_encryption_options:
  internode_encryption: none
client_encryption_options:
  enabled: false
internode_compression: dc
inter_dc_tcp_nodelay: false
tracetype_query_ttl: 86400
tracetype_repair_ttl: 604800
enable_user_defined_functions: false
enable_scripted_user_defined_functions: false
windows_timer_interval: 1
transparent_data_encryption_options:
  enabled: false
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
batch_size_warn_threshold_in_kb: 200
batch_size_fail_threshold_in_kb: 250
unlogged_batch_across_partitions_warn_threshold: 10
compaction_large_partition_warning_threshold_mb: 100
gc_warn_threshold_in_ms: 1000
back_pressure_enabled: false
enable_materialized_views: true
enable_sasi_indexes: true

GC settings:

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled

In jvm.options the heap was set to 16 GB (we also tried 32 GB, no difference was noticed).

The tables were created with the command:

CREATE TABLE ks.t1 (id bigint PRIMARY KEY, title text) WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 64};

HB version: 1.2.0-cdh5.14.2 (in the class org.apache.hadoop.hbase.regionserver.HRegion we removed MetricsRegion, which caused GC problems when the number of regions exceeded 1000 per RegionServer)

Non-default HBase parameters:

zookeeper.session.timeout: 120000
hbase.rpc.timeout: 2 minute(s)
hbase.client.scanner.timeout.period: 2 minute(s)
hbase.master.handler.count: 10
hbase.regionserver.lease.period, hbase.client.scanner.timeout.period: 2 minute(s)
hbase.regionserver.handler.count: 160
hbase.regionserver.metahandler.count: 30
hbase.regionserver.logroll.period: 4 hour(s)
hbase.regionserver.maxlogs: 200
hbase.hregion.memstore.flush.size: 1 GiB
hbase.hregion.memstore.block.multiplier: 6
hbase.hstore.compactionThreshold: 5
hbase.hstore.blockingStoreFiles: 200
hbase.hregion.majorcompaction: 1 day(s)
HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.regionserver.wal.codec: org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
hbase.master.namespace.init.timeout: 3600000
hbase.regionserver.optionalcacheflushinterval: 18000000
hbase.regionserver.thread.compaction.large: 12
hbase.regionserver.wal.enablecompression: true
hbase.hstore.compaction.max.size: 1073741824
hbase.server.compactchecker.interval.multiplier: 200
Java Configuration Options for HBase RegionServer:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ReservedCodeCacheSize=256m
hbase.snapshot.master.timeoutMillis: 2 minute(s)
hbase.snapshot.region.timeout: 2 minute(s)
hbase.snapshot.master.timeout.millis: 2 minute(s)
HBase REST Server Max Log Size: 100 MiB
HBase REST Server Maximum Log File Backups: 5
HBase Thrift Server Max Log Size: 100 MiB
HBase Thrift Server Maximum Log File Backups: 5
Master Max Log Size: 100 MiB
Master Maximum Log File Backups: 5
RegionServer Max Log Size: 100 MiB
RegionServer Maximum Log File Backups: 5
HBase Active Master Detection Window: 4 minute(s)
dfs.client.hedged.read.threadpool.size: 40
dfs.client.hedged.read.threshold.millis: 10 millisecond(s)
hbase.rest.threads.min: 8
hbase.rest.threads.max: 150
Maximum Process File Descriptors: 180000
hbase.thrift.minWorkerThreads: 200
hbase.master.executor.openregion.threads: 30
hbase.master.executor.closeregion.threads: 30
hbase.master.executor.serverops.threads: 60
hbase.regionserver.thread.compaction.small: 6
hbase.ipc.server.read.threadpool.size: 20
Region Mover Threads: 6
Client Java Heap Size in Bytes: 1 GiB
HBase REST Server Default Group: 3 GiB
HBase Thrift Server Default Group: 3 GiB
Java Heap Size of HBase Master in Bytes: 16 GiB
Java Heap Size of HBase RegionServer in Bytes: 32 GiB

+ ZooKeeper
maxClientCnxns: 601
maxSessionTimeout: 120000
Creating tables:
hbase org.apache.hadoop.hbase.util.RegionSplitter ns:t1 UniformSplit -c 64 -f cf
alter 'ns:t1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'GZ'}

There is one important point here: the DataStax description does not say how many regions were used to create the HB tables, although this is critical for large volumes. Therefore, for the tests the number of regions was set to 64, which allows storing up to 640 GB, i.e. a medium-sized table.
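
The 640 GB figure corresponds to about 10 GB per region. The sketch below makes that arithmetic explicit; the 10 GB per-region limit is an assumption (it matches the usual default of hbase.hregion.max.filesize, which the text does not mention), and the class name is hypothetical.

```java
// Illustrative arithmetic; the per-region size limit is an assumption,
// not a value taken from the article.
final class RegionCapacity {

    // Assumed maximum region size in GB before a split is triggered.
    static final long ASSUMED_MAX_REGION_GB = 10;

    // Data volume a pre-split table can hold before its regions start splitting.
    static long capacityGb(int regions) {
        return regions * ASSUMED_MAX_REGION_GB;
    }

    public static void main(String[] args) {
        System.out.println(capacityGb(64) + " GB");
    }
}
```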

At the time of the test, HBase had 22 thousand tables and 67 thousand regions (this would have been lethal for version 1.2.0 if not for the patch mentioned above).

Now for the code. Since it was not clear which configurations were more advantageous for a particular database, the tests were run in various combinations. That is, in some tests 4 tables were loaded simultaneously (all 4 nodes were used for the connection). In other tests we worked with 8 different tables. In some cases the batch size was 100, in others 200 (the batch parameter, see the code below). The data size for the value is 10 bytes or 100 bytes (dataSize). In total, 5 million records were written to and read from each table each time. At the same time, 5 threads wrote to or read from each table (thread number thNum), each of which used its own key range (count = 1 million):

if (opType.equals("insert")) {
    // Each thread owns the range [count * thNum, count * (thNum + 1)); key advances inside the batch loop.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < batch; i++) {
            String value = RandomStringUtils.random(dataSize, true, true);
            sb.append("INSERT INTO ")
                    .append(tableName)
                    .append("(id, title) ")
                    .append("VALUES (")
                    .append(key)
                    .append(", '")
                    .append(value)
                    .append("');");
            key++;
        }
        sb.append("APPLY BATCH;");
        final String query = sb.toString();
        session.execute(query);
    }
} else {
    // Same per-thread key range for reads; key advances inside the batch loop.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(tableName).append(" WHERE id IN (");
        for (int i = 0; i < batch; i++) {
            sb = sb.append(key);
            if (i+1 < batch)
                sb.append(",");
            key++;
        }
        sb = sb.append(");");
        final String query = sb.toString();
        ResultSet rs = session.execute(query);
    }
}
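
To see what this loop actually sends, the key-range and batching logic can be run without a Cassandra session. The following is a dry-run sketch under stated assumptions: `KeyRangeSketch` and `selectQueries` are hypothetical names, the numbers in `main` are scaled down for readability, and unlike the original loop the inner loop also stops at the end of the thread's range (with count = 1 million and batches of 100 or 200 this makes no difference).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Dry-run sketch: builds the batched SELECT statements without a Cassandra session.
final class KeyRangeSketch {

    // Builds one "WHERE id IN (...)" query per batch for the key range owned by thread thNum.
    static List<String> selectQueries(String tableName, long count, int thNum, int batch) {
        List<String> queries = new ArrayList<>();
        long end = count * (thNum + 1); // exclusive upper bound of this thread's range
        for (long key = count * thNum; key < end; ) {
            StringJoiner ids = new StringJoiner(
                    ",", "SELECT * FROM " + tableName + " WHERE id IN (", ");");
            for (int i = 0; i < batch && key < end; i++) {
                ids.add(Long.toString(key++));
            }
            queries.add(ids.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        // Small numbers for readability: 10 keys per thread, thread 2, batches of 4.
        selectQueries("ks.t1", 10, 2, 4).forEach(System.out::println);
    }
}
```

Thread 2 with count = 10 covers keys 20 to 29, so no two threads ever touch the same key.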

Accordingly, similar functionality was implemented for HB:

Configuration conf = getConf();
HTable table = new HTable(conf, keyspace + ":" + tableName);
table.setAutoFlush(false, false);
List<Get> lGet = new ArrayList<>();
List<Put> lPut = new ArrayList<>();
byte[] cf = Bytes.toBytes("cf");
byte[] qf = Bytes.toBytes("value");
if (opType.equals("insert")) {
    // Each thread owns the range [count * thNum, count * (thNum + 1)); key advances inside the batch loop.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lPut.clear();
        for (int i = 0; i < batch; i++) {
            Put p = new Put(makeHbaseRowKey(key));
            String value = RandomStringUtils.random(dataSize, true, true);
            p.addColumn(cf, qf, value.getBytes());
            lPut.add(p);
            key++;
        }
        table.put(lPut);
        table.flushCommits();
    }
} else {
    // Same per-thread key range for reads; key advances inside the batch loop.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lGet.clear();
        for (int i = 0; i < batch; i++) {
            Get g = new Get(makeHbaseRowKey(key));
            lGet.add(g);
            key++;
        }
        Result[] rs = table.get(lGet);
    }
}

Since in HB the client must take care of distributing the data evenly, the key salting function looked like this:

public static byte[] makeHbaseRowKey(long key) {
    byte[] nonSaltedRowKey = Bytes.toBytes(key);
    CRC32 crc32 = new CRC32();
    crc32.update(nonSaltedRowKey);
    long crc32Value = crc32.getValue();
    byte[] salt = Arrays.copyOfRange(Bytes.toBytes(crc32Value), 5, 7);
    return ArrayUtils.addAll(salt, nonSaltedRowKey);
}
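
The effect of the salt can be explored without an HBase dependency. The sketch below is a JDK-only re-implementation of the same idea, not the code used in the tests: `ByteBuffer` stands in for the HBase `Bytes` helper (both use big-endian layout), and the mapping of the first key byte to one of 64 regions is a simplifying assumption about how a uniform pre-split divides the key space. All names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// JDK-only sketch of the salting idea; ByteBuffer replaces HBase's Bytes helper.
final class SaltSketch {

    // Big-endian 8-byte encoding of a long.
    static byte[] toBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Prefixes the 8-byte key with 2 bytes taken from its CRC32 checksum.
    static byte[] saltedKey(long key) {
        byte[] raw = toBytes(key);
        CRC32 crc32 = new CRC32();
        crc32.update(raw);
        byte[] salt = Arrays.copyOfRange(toBytes(crc32.getValue()), 5, 7);
        return ByteBuffer.allocate(salt.length + raw.length).put(salt).put(raw).array();
    }

    // Assumption: 64 uniform regions divide the key space by the first byte, 4 values each.
    static int[] regionHistogram(long keys) {
        int[] hist = new int[64];
        for (long k = 0; k < keys; k++) {
            hist[(saltedKey(k)[0] & 0xFF) / 4]++;
        }
        return hist;
    }

    public static void main(String[] args) {
        // Prints how many of the first 100000 sequential keys land in each assumed region.
        System.out.println(Arrays.toString(regionHistogram(100_000)));
    }
}
```

Without the salt, sequential keys would all start with the same bytes and end up in a single region; with it, the leading bytes depend on the checksum rather than on the key order.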

Now the most interesting part - the results:

[Image: results table]

The same in graph form:

[Image: results chart]

The advantage of HB is so surprising that one suspects some kind of bottleneck in the CS configuration. However, Googling and tuning the most obvious parameters (such as concurrent_writes or memtable_heap_space_in_mb) did not speed things up. At the same time, the logs are clean and do not complain about anything.

The data was distributed evenly across the nodes, and the statistics from all nodes were approximately the same.

Here is what the table statistics look like on one of the nodes:

Keyspace: ks
Read Count: 9383707
Read Latency: 0.04287025042448576 ms
Write Count: 15462012
Write Latency: 0.1350068438699957 ms
Pending Flushes: 0
Table: t1
SSTable count: 16
Space used (live): 148.59 MiB
Space used (total): 148.59 MiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 5.17 MiB
SSTable Compression Ratio: 0.5720989576459437
Number of partitions (estimate): 3970323
Memtable cell count: 0
Memtable data size: 0 bytes
Memtable off heap memory used: 0 bytes
Memtable switch count: 5
Local read count: 2346045
Local read latency: NaN ms
Local write count: 3865503
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 25
Bloom filter false ratio: 0.00000
Bloom filter space used: 4.57 MiB
Bloom filter off heap memory used: 4.57 MiB
Index summary off heap memory used: 590.02 KiB
Compression metadata off heap memory used: 19.45 KiB
Compacted partition minimum bytes: 36
Compacted partition maximum bytes: 42
Compacted partition mean bytes: 42
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0 bytes

Attempts to reduce the batch size (even down to sending records one at a time) had no positive effect; it only got worse. It is possible that this really is the maximum performance for CS, since the results we obtained for CS are similar to those obtained by DataStax: on the order of hundreds of thousands of operations per second. In addition, if we look at resource utilization, we see that CS uses much more CPU and disk:

[Image: CPU and disk utilization]
The figure shows utilization during the run of all tests in sequence for both databases.

As for HB's strong advantage in reads: here you can see that for both databases disk utilization during reads is extremely low (the read tests are the final part of the test cycle for each database, e.g. for CS this is from 15:20 to 15:40). In the case of HB the reason is clear: most of the data sits in memory, in the memstore, and some of it is cached in the blockcache. As for CS, it is not very clear how it works, but disk activity is not visible there either. Just in case, an attempt was made to enable the row cache with row_cache_size_in_mb = 2048 and to set caching = {'keys': 'ALL', 'rows_per_partition': '2000000'}, but that made things slightly worse.

It is also worth repeating an important point about the number of regions in HB. In our case the value was set to 64. If you reduce it to, for example, 4, the read speed drops by a factor of 2. The reason is that the memstore fills up faster, files are flushed more often, and more files have to be processed when reading, which is a rather expensive operation for HB. In real conditions this can be addressed by thinking through a presplitting and compaction strategy; in particular, we use a self-written utility that collects garbage and compacts HFiles continuously in the background. It is quite possible that for the DataStax tests only 2 regions per table were allocated (which is not correct), and that would partly explain why HB was so inferior in their read tests.

The following preliminary conclusions can be drawn from this. Assuming that no significant mistakes were made during testing, Cassandra looks like a colossus with feet of clay. More precisely, while she balances on one leg, as in the picture at the beginning of the article, she shows relatively good results, but in a fight under equal conditions she loses outright. At the same time, given the low CPU utilization on our hardware, we learned to run two HB RegionServers per host and thereby doubled the performance. That is, taking resource utilization into account, the situation for CS is even more deplorable.

Of course, these tests are quite synthetic and the amount of data used here is relatively modest. It is possible that if we switched to terabytes the situation would be different, but while for HB we can load terabytes, for CS this turned out to be problematic. It often threw OperationTimedOutException even at these volumes, although the response timeout parameters had already been increased several times compared to the defaults.

I hope that through joint efforts we will find the bottlenecks of CS, and if we manage to speed it up, I will add the final results at the end of the post.

UPD: Thanks to advice from colleagues, I managed to speed up reads. Before:
159,644 ops (4 tables, 5 threads).
Added:
.withLoadBalancingPolicy(new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
I also played with the number of threads. The result:
4 tables, 100 threads, batch = 1 (one record at a time): about 301 thousand ops
4 tables, 100 threads, batch = 10: about 447 thousand ops
4 tables, 100 threads, batch = 100: about 625 thousand ops

Later I will apply the other tuning tips, run a full test cycle and add the results at the end of the post.

Source: www.habr.com
