Battle of two yokozuna, or Cassandra vs HBase. The Sberbank team's experience

This is not even a joke: this particular picture seems to capture the essence of these databases most accurately, and by the end it will be clear why:

[Image]

According to the DB-Engines Ranking, the two most popular NoSQL column-oriented databases are Cassandra (hereafter CS) and HBase (HB).


By the will of fate, our data loading management team at Sberbank has long been working closely with HB. Over that time we have studied its strengths and weaknesses quite well and learned how to cook it. However, the existence of an alternative in the form of CS always made us torment ourselves a little with doubts: did we make the right choice? Moreover, the results of a comparison performed by DataStax said that CS easily beats HB with an almost crushing score. On the other hand, DataStax is an interested party, and you should not take their word for it. We were also put off by how little information was given about the test conditions, so we decided to find out for ourselves who is the king of BigData NoSql, and the results turned out to be very interesting.

However, before moving on to the results of the tests, it is necessary to describe the significant aspects of the environment configuration. The fact is that CS can be used in a mode that allows data loss. That is, only one server (node) is responsible for the data of a given key, and if it fails for some reason, the value of that key is lost. For many tasks this is not critical, but in the banking sector it is the exception rather than the rule. In our case it is essential to keep several copies of the data for reliable storage.

Therefore, CS was considered only in triple-replication mode, i.e. the keyspace was created with the following parameters:

CREATE KEYSPACE ks WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3};

Next, there are two ways to ensure the required level of consistency. The general rule:
NW + NR > RF

This means that the number of acknowledgements from nodes on write (NW) plus the number of acknowledgements from nodes on read (NR) must be greater than the replication factor. In our case RF = 3, which means the following options are suitable:
2 + 2 > 3
3 + 1 > 3
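
The rule above is easy to check mechanically. Below is a minimal sketch that enumerates every (NW, NR) pair satisfying NW + NR > RF. The class and method names are hypothetical and not part of the test code; in Cassandra terms, 2 + 2 corresponds to QUORUM writes with QUORUM reads at RF = 3, and 3 + 1 to ALL writes with ONE reads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: checks the NW + NR > RF rule.
class QuorumRule {

    /** True when every read set must overlap at least one replica that acknowledged the write. */
    static boolean isStronglyConsistent(int nw, int nr, int rf) {
        return nw + nr > rf;
    }

    /** Lists every (NW, NR) pair with 1 <= NW, NR <= RF that satisfies the rule. */
    static List<int[]> validPairs(int rf) {
        List<int[]> pairs = new ArrayList<>();
        for (int nw = 1; nw <= rf; nw++) {
            for (int nr = 1; nr <= rf; nr++) {
                if (isStronglyConsistent(nw, nr, rf)) {
                    pairs.add(new int[] {nw, nr});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Print all suitable combinations for the replication factor used in the tests.
        for (int[] p : validPairs(3)) {
            System.out.println(p[0] + " + " + p[1] + " > 3");
        }
    }
}
```

Besides the two options listed in the text, the enumeration also includes pairs such as 1 + 3, which favour cheap writes over cheap reads.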

Since it is fundamentally important for us to store data as reliably as possible, the 3+1 scheme was chosen. In addition, HB works on a similar principle, so this makes the comparison fairer.

It should be noted that DataStax did the opposite in their study: they set RF = 1 for both CS and HB (for the latter by changing the HDFS settings). This is a really important aspect, because the impact on CS performance in this case is huge. For example, the picture below shows how the time required to load data into CS grows:

[Image: time to load data into CS at RF=1 and RF=3]

Here we see the following: the more competing threads write data, the longer it takes. This is natural, but what matters is that the performance degradation for RF=3 is significantly higher. In other words, if we write to 4 tables with 5 threads each (20 in total), RF=3 loses by a factor of about 2 (150 seconds for RF=3 versus 75 for RF=1). But if we increase the load by writing to 8 tables with 5 threads each (40 in total), RF=3 already loses by a factor of 2.7 (375 seconds versus 138).

Perhaps this is part of the secret behind the successful load testing that DataStax performed for CS, because for HB on our test bench changing the replication factor from 2 to 3 had no effect. That is, the disks are not the bottleneck for HB in our configuration. However, there are many other pitfalls here: our version of HB was slightly patched and tuned, the environments are completely different, and so on. It is also worth noting that perhaps I simply do not know how to prepare CS properly and there are more effective ways to work with it; I hope we will find out in the comments. But first things first.

All tests were performed on a hardware cluster of 4 servers, each with the following configuration:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads.
Disks: 12 SATA HDDs
Java version: 1.8.0_111

CS version: 3.11.5

cassandra.yml parameters

num_tokens: 256
hinted_handoff_enabled: true
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
hints_directory: /data10/cassandra/hints
hints_flush_period_in_ms: 10000
max_hints_file_size_in_mb: 128
batchlog_replay_throttle_in_kb: 1024
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
credentials_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /data1/cassandra/data # each dataN directory is a separate disk
    - /data2/cassandra/data
    - /data3/cassandra/data
    - /data4/cassandra/data
    - /data5/cassandra/data
    - /data6/cassandra/data
    - /data7/cassandra/data
    - /data8/cassandra/data
commitlog_directory: /data9/cassandra/commitlog
cdc_enabled: false
disk_failure_policy: stop
commit_failure_policy: stop
prepared_statements_cache_size_mb:
thrift_prepared_statements_cache_size_mb:
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
counter_cache_size_in_mb:
counter_cache_save_period: 7200
saved_caches_directory: /data10/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "*,*"
concurrent_reads: 256 # tried 64 - no difference noticed
concurrent_writes: 256 # tried 64 - no difference noticed
concurrent_counter_writes: 256 # tried 64 - no difference noticed
concurrent_materialized_view_writes: 32
memtable_heap_space_in_mb: 2048 # tried 16 GB - it was slower
memtable_allocation_type: heap_buffers
index_summary_capacity_in_mb:
index_summary_resize_interval_in_minutes: 60
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: *
broadcast_address: *
listen_on_broadcast_address: true
internode_authenticator: org.apache.cassandra.auth.AllowAllInternodeAuthenticator
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: *
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
column_index_cache_size_in_kb: 2
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 1600
sstable_preemptive_open_interval_in_mb: 50
read_request_timeout_in_ms: 100000
range_request_timeout_in_ms: 200000
write_request_timeout_in_ms: 40000
counter_write_request_timeout_in_ms: 100000
cas_contention_timeout_in_ms: 20000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 200000
slow_query_log_timeout_in_ms: 500
cross_node_timeout: false
endpoint_snitch: GossipingPropertyFileSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
server_encryption_options:
    internode_encryption: none
client_encryption_options:
    enabled: false
internode_compression: dc
inter_dc_tcp_nodelay: false
tracetype_query_ttl: 86400
tracetype_repair_ttl: 604800
enable_user_defined_functions: false
enable_scripted_user_defined_functions: false
windows_timer_interval: 1
transparent_data_encryption_options:
    enabled: false
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
batch_size_warn_threshold_in_kb: 200
batch_size_fail_threshold_in_kb: 250
unlogged_batch_across_partitions_warn_threshold: 10
compaction_large_partition_warning_threshold_mb: 100
gc_warn_threshold_in_ms: 1000
back_pressure_enabled: false
enable_materialized_views: true
enable_sasi_indexes: true

GC settings:

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled

In jvm.options, 16 GB of memory was allocated (we also tried 32 GB; no difference was noticed).

The tables were created with the command:

CREATE TABLE ks.t1 (id bigint PRIMARY KEY, title text) WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 64};

HB version: 1.2.0-cdh5.14.2 (in the class org.apache.hadoop.hbase.regionserver.HRegion we removed MetricsRegion, which was causing GC when the number of regions on a RegionServer exceeded 1000)

Non-default HBase parameters

zookeeper.session.timeout: 120000
hbase.rpc.timeout: 2 minute(s)
hbase.client.scanner.timeout.period: 2 minute(s)
hbase.master.handler.count: 10
hbase.regionserver.lease.period, hbase.client.scanner.timeout.period: 2 minute(s)
hbase.regionserver.handler.count: 160
hbase.regionserver.metahandler.count: 30
hbase.regionserver.logroll.period: 4 hour(s)
hbase.regionserver.maxlogs: 200
hbase.hregion.memstore.flush.size: 1 GiB
hbase.hregion.memstore.block.multiplier: 6
hbase.hstore.compactionThreshold: 5
hbase.hstore.blockingStoreFiles: 200
hbase.hregion.majorcompaction: 1 day(s)
HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.regionserver.wal.codec: org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
hbase.master.namespace.init.timeout: 3600000
hbase.regionserver.optionalcacheflushinterval: 18000000
hbase.regionserver.thread.compaction.large: 12
hbase.regionserver.wal.enablecompression: true
hbase.hstore.compaction.max.size: 1073741824
hbase.server.compactchecker.interval.multiplier: 200
Java Configuration Options for HBase RegionServer:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ReservedCodeCacheSize=256m
hbase.snapshot.master.timeoutMillis: 2 minute(s)
hbase.snapshot.region.timeout: 2 minute(s)
hbase.snapshot.master.timeout.millis: 2 minute(s)
HBase REST Server Max Log Size: 100 MiB
HBase REST Server Maximum Log File Backups: 5
HBase Thrift Server Max Log Size: 100 MiB
HBase Thrift Server Maximum Log File Backups: 5
Master Max Log Size: 100 MiB
Master Maximum Log File Backups: 5
RegionServer Max Log Size: 100 MiB
RegionServer Maximum Log File Backups: 5
HBase Active Master Detection Window: 4 minute(s)
dfs.client.hedged.read.threadpool.size: 40
dfs.client.hedged.read.threshold.millis: 10 millisecond(s)
hbase.rest.threads.min: 8
hbase.rest.threads.max: 150
Maximum Process File Descriptors: 180000
hbase.thrift.minWorkerThreads: 200
hbase.master.executor.openregion.threads: 30
hbase.master.executor.closeregion.threads: 30
hbase.master.executor.serverops.threads: 60
hbase.regionserver.thread.compaction.small: 6
hbase.ipc.server.read.threadpool.size: 20
Region Mover Threads: 6
Client Java Heap Size in Bytes: 1 GiB
HBase REST Server Default Group: 3 GiB
HBase Thrift Server Default Group: 3 GiB
Java Heap Size of HBase Master in Bytes: 16 GiB
Java Heap Size of HBase RegionServer in Bytes: 32 GiB

+ZooKeeper
maxClientCnxns: 601
maxSessionTimeout: 120000

Creating tables:
hbase org.apache.hadoop.hbase.util.RegionSplitter ns:t1 UniformSplit -c 64 -f cf
alter 'ns:t1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'GZ'}

There is one important point here: the DataStax description does not say how many regions were used to create the HB tables, although this is critical for large volumes. Therefore, for the tests we chose a count of 64, which allows storing up to 640 GB, i.e. a medium-sized table.
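
The 640 GB figure is simple arithmetic. As a rough sketch, assuming a per-region size limit of 10 GB (inferred here from 640 / 64; it also matches the HBase default for hbase.hregion.max.filesize, but the article does not state which value was used), the capacity of a pre-split table can be estimated as follows. The class and method names are illustrative only.

```java
// Hypothetical helper, for illustration only: estimates how much data a
// pre-split table can hold before regions would have to split further.
class RegionCapacity {

    /** Capacity in GB = number of regions * assumed maximum region size in GB. */
    static long capacityGb(int regions, long maxRegionSizeGb) {
        return regions * maxRegionSizeGb;
    }

    public static void main(String[] args) {
        // 64 regions, as in the tests; 10 GB per region is an assumption.
        System.out.println(capacityGb(64, 10) + " GB");
    }
}
```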

At the time of the tests, HBase held 22 thousand tables and 67 thousand regions (this would have been lethal for version 1.2.0 were it not for the patch mentioned above).

Now for the code. Since it was not clear which configurations are more advantageous for a particular database, the tests were run in various combinations. That is, in some tests 4 tables were loaded simultaneously (all 4 nodes were used for connections). In other tests we worked with 8 different tables. In some cases the batch size was 100, in others 200 (the batch parameter, see the code below). The value size was 10 bytes or 100 bytes (dataSize). In total, 5 million records were written to and read from each table each time. At the same time, 5 threads wrote to or read from each table (thread number: thNum), each using its own key range (count = 1 million):

// opType, tableName, count, thNum, batch, dataSize and session are defined by the surrounding test harness (not shown).
// Each thread works on its own key range [count * thNum, count * (thNum + 1)).
if (opType.equals("insert")) {
    // key is advanced inside the inner loop, once per row in the batch
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        // Build one batch of `batch` single-row INSERT statements
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < batch; i++) {
            // Random alphanumeric payload of dataSize characters
            String value = RandomStringUtils.random(dataSize, true, true);
            sb.append("INSERT INTO ")
                    .append(tableName)
                    .append("(id, title) ")
                    .append("VALUES (")
                    .append(key)
                    .append(", '")
                    .append(value)
                    .append("');");
            key++;
        }
        sb.append("APPLY BATCH;");
        final String query = sb.toString();
        session.execute(query);
    }
} else {
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        // Read `batch` rows with a single SELECT ... WHERE id IN (...)
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(tableName).append(" WHERE id IN (");
        for (int i = 0; i < batch; i++) {
            sb.append(key);
            if (i + 1 < batch) {
                sb.append(",");
            }
            key++;
        }
        sb.append(");");
        final String query = sb.toString();
        ResultSet rs = session.execute(query);
    }
}
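
To make the key partitioning explicit, here is a small self-contained sketch of the range arithmetic used by the loops above. The class and method names are hypothetical; only the formulas count * thNum and count * (thNum + 1) come from the test code. Note that the loops advance by whole batches, so the ranges stay disjoint only when batch divides count evenly, which holds for the values used here (1 million keys, batches of 100 or 200).

```java
// Hypothetical helper, for illustration only: the per-thread key ranges
// used by the load test, expressed as standalone functions.
class KeyRanges {

    /** First key (inclusive) owned by thread thNum. */
    static long firstKey(long count, int thNum) {
        return count * thNum;
    }

    /** End of the range (exclusive) owned by thread thNum. */
    static long endKey(long count, int thNum) {
        return count * (thNum + 1);
    }

    /** Number of requests one thread sends, rounding up for a partial last batch. */
    static long batchesPerThread(long count, int batch) {
        return (count + batch - 1) / batch;
    }

    public static void main(String[] args) {
        long count = 1_000_000L; // keys per thread, as in the article
        for (int thNum = 0; thNum < 5; thNum++) {
            System.out.println("thread " + thNum + ": ["
                    + firstKey(count, thNum) + ", " + endKey(count, thNum) + ")");
        }
        System.out.println("batches per thread at batch=100: " + batchesPerThread(count, 100));
    }
}
```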

Accordingly, similar functionality was implemented for HB:

Configuration conf = getConf();
HTable table = new HTable(conf, keyspace + ":" + tableName);
// Buffer puts on the client; they are sent explicitly with flushCommits()
table.setAutoFlush(false, false);
List<Get> lGet = new ArrayList<>();
List<Put> lPut = new ArrayList<>();
byte[] cf = Bytes.toBytes("cf");
byte[] qf = Bytes.toBytes("value");
if (opType.equals("insert")) {
    // key is advanced inside the inner loop, once per row in the batch
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lPut.clear();
        for (int i = 0; i < batch; i++) {
            // Salted row key spreads sequential ids across regions
            Put p = new Put(makeHbaseRowKey(key));
            String value = RandomStringUtils.random(dataSize, true, true);
            p.addColumn(cf, qf, Bytes.toBytes(value));
            lPut.add(p);
            key++;
        }
        table.put(lPut);
        table.flushCommits();
    }
} else {
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lGet.clear();
        for (int i = 0; i < batch; i++) {
            Get g = new Get(makeHbaseRowKey(key));
            lGet.add(g);
            key++;
        }
        // One multi-get request per batch
        Result[] rs = table.get(lGet);
    }
}

Since in HB the client must take care of distributing the data evenly, the key salting function looked like this:

public static byte[] makeHbaseRowKey(long key) {
    byte[] nonSaltedRowKey = Bytes.toBytes(key);
    CRC32 crc32 = new CRC32();
    crc32.update(nonSaltedRowKey);
    long crc32Value = crc32.getValue();
    byte[] salt = Arrays.copyOfRange(Bytes.toBytes(crc32Value), 5, 7);
    return ArrayUtils.addAll(salt, nonSaltedRowKey);
}
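
For readers without the HBase and commons-lang classes at hand, the same salting idea can be sketched with the JDK alone. This is an illustration under stated assumptions, not the code used in the tests: the class name SaltedKey is hypothetical, and ByteBuffer stands in for Bytes.toBytes(long), which also produces 8 big-endian bytes. The salt is taken from bytes 5 and 6 of the 8-byte CRC32 value, so the resulting row key is 2 salt bytes followed by the 8 key bytes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical JDK-only variant of the salting function, for illustration.
class SaltedKey {

    static byte[] makeRowKey(long key) {
        // 8-byte big-endian encoding of the key
        byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(key).array();

        // CRC32 of the key bytes; the value fits in the low 32 bits of a long
        CRC32 crc32 = new CRC32();
        crc32.update(raw);
        byte[] crcBytes = ByteBuffer.allocate(Long.BYTES).putLong(crc32.getValue()).array();

        // Two bytes from the middle of the checksum serve as the salt prefix
        byte[] salt = Arrays.copyOfRange(crcBytes, 5, 7);

        // Row key = salt followed by the original key bytes
        byte[] rowKey = new byte[salt.length + raw.length];
        System.arraycopy(salt, 0, rowKey, 0, salt.length);
        System.arraycopy(raw, 0, rowKey, salt.length, raw.length);
        return rowKey;
    }

    public static void main(String[] args) {
        // Consecutive ids get unrelated prefixes and so land in different regions
        for (long key = 0; key < 3; key++) {
            System.out.println(key + " -> " + Arrays.toString(makeRowKey(key)));
        }
    }
}
```

Because the salt is derived from the key itself, a reader can recompute it from the id alone, which is what allows the point lookups in the read test.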

Now the most interesting part, the results:

[Image: table of test results]

The same thing in graph form:

[Image: chart of test results]

The advantage of HB is so surprising that one suspects some kind of bottleneck in the CS setup. However, Googling and tweaking the most obvious parameters (such as concurrent_writes or memtable_heap_space_in_mb) did not speed things up. At the same time, the logs are clean and do not complain about anything.

The data was distributed evenly across the nodes, and the statistics from all nodes were approximately the same.

This is what the table statistics look like on one of the nodes

Keyspace: ks
Read Count: 9383707
Read Latency: 0.04287025042448576 ms
Write Count: 15462012
Write Latency: 0.1350068438699957 ms
Pending Flushes: 0
Table: t1
SSTable count: 16
Space used (live): 148.59 MiB
Space used (total): 148.59 MiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 5.17 MiB
SSTable Compression Ratio: 0.5720989576459437
Number of partitions (estimate): 3970323
Memtable cell count: 0
Memtable data size: 0 bytes
Memtable off heap memory used: 0 bytes
Memtable switch count: 5
Local read count: 2346045
Local read latency: NaN ms
Local write count: 3865503
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 25
Bloom filter false ratio: 0.00000
Bloom filter space used: 4.57 MiB
Bloom filter off heap memory used: 4.57 MiB
Index summary off heap memory used: 590.02 KiB
Compression metadata off heap memory used: 19.45 KiB
Compacted partition minimum bytes: 36
Compacted partition maximum bytes: 42
Compacted partition mean bytes: 42
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0 bytes

An attempt to reduce the batch size (down to sending records one by one) had no effect; it only made things worse. It is possible that this really is the maximum performance for CS, since the results obtained for CS are similar to those obtained by DataStax: on the order of hundreds of thousands of operations per second. In addition, if we look at resource utilization, we will see that CS uses much more CPU and disk:

[Image: CPU and disk utilization for CS and HB]
The figure shows utilization during the run of all tests in a row for both databases.

Now for HB's powerful read advantage. Here you can see that for both databases disk utilization during reads is extremely low (the read tests are the final part of the test cycle for each database; for CS, for example, this is from 15:20 to 15:40). In the case of HB the reason is clear: most of the data sits in memory, in the memstore, and some is cached in the blockcache. As for CS, it is not very clear how it works, but no disk activity is visible there either. Just in case, an attempt was made to enable the row cache with row_cache_size_in_mb = 2048 and to set caching = {'keys': 'ALL', 'rows_per_partition': '2000000'}, but that made things even slightly worse.

It is also worth mentioning once more an important point about the number of regions in HB. In our case the value was set to 64. If you reduce it to, for example, 4, read speed drops by a factor of 2. The reason is that the memstore fills up faster and files are flushed more often, so more files have to be processed on read, which is a rather expensive operation for HB. In real conditions this can be treated by thinking through a pre-splitting and compaction strategy; in particular, we use a self-written utility that collects garbage and compacts HFiles continuously in the background. It is quite possible that for the DataStax tests only 1 region was allocated per table (which is not correct), and this would go some way towards explaining why HB was so inferior in their read tests.

The following preliminary conclusions can be drawn from this. Assuming that no major mistakes were made during testing, Cassandra looks like a colossus with feet of clay. More precisely, while she balances on one leg, as in the picture at the beginning of the article, she shows relatively good results, but in a fight under the same conditions she loses outright. At the same time, given the low CPU utilization on our hardware, we learned to run two HB RegionServers per host and thereby doubled the performance. That is, once resource utilization is taken into account, the situation for CS is even more deplorable.

Of course, these tests are quite synthetic, and the amount of data used here is relatively modest. It is possible that the situation would be different if we switched to terabytes, but while we can load terabytes into HB, for CS this turned out to be problematic. It often threw an OperationTimedOutException even with these volumes, although the response timeout parameters had already been increased several times over the defaults.

I hope that through joint efforts we will find the bottlenecks of CS, and if we manage to speed it up, I will definitely add information about the final results at the end of the post.

UPD: Thanks to advice from colleagues, I managed to speed up reads. It was:
159 ops (644 tables, 4 threads, batch 5).
Added:
.withLoadBalancingPolicy(new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
I also played with the number of threads. The result is as follows:
4 tables, 100 threads, batch = 1 (one record at a time): 301 ops
4 tables, 100 threads, batch = 10: 447 ops
4 tables, 100 threads, batch = 100: 625 ops

Later I will apply the other tuning tips, run a full test cycle, and add the results at the end of the post.

Source: www.habr.com
