Battle of Two Yokozuna, or Cassandra vs HBase. The Sberbank Team's Experience

This is not even a joke: this picture seems to capture the essence of these two databases most accurately, and by the end it will be clear why:

[Image]

According to the DB-Engines Ranking, two of the most popular NoSQL databases are Cassandra (hereafter CS) and HBase (HB).

[Image: DB-Engines Ranking]

By the will of fate, our data loading management team at Sberbank has been working closely with HB for a long time. Over that time we have studied its strengths and weaknesses well and learned how to cook it. Still, the existence of an alternative in the form of CS kept nagging us with doubt: did we make the right choice? Moreover, the results of the comparison performed by DataStax said that CS easily beats HB with an almost crushing score. On the other hand, DataStax is an interested party, and you should not take their word for it. We were also put off by how little information was given about the test conditions, so we decided to find out for ourselves who is the king of BigData NoSQL, and the results turned out to be very interesting.

However, before moving on to the results of the tests, we need to describe the essential aspects of the environment configuration. The point is that CS can be used in a mode that allows data loss, i.e. when only one server (node) is responsible for the data of a given key, and if that node fails for some reason, the value of that key is lost. For many tasks this is not critical, but in the banking sector this is the exception rather than the rule. In our case it is important to keep several copies of the data for reliable storage.

Therefore only the CS mode with triple replication was considered, i.e. the keyspace was created with the following parameters:

CREATE KEYSPACE ks WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3};

Next, there are two ways to ensure the required level of consistency. The general rule:
NW + NR > RF

This means that the number of acknowledgements from nodes on write (NW) plus the number of acknowledgements from nodes on read (NR) must be greater than the replication factor (RF). In our case RF = 3, so the following options are suitable:
2 + 2 > 3
3 + 1 > 3

Since it is fundamentally important for us to store data as reliably as possible, the 3 + 1 scheme was chosen. In addition, HB works on a similar principle, i.e. such a comparison is fairer.
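The rule NW + NR > RF can be checked mechanically. The following is a minimal sketch, not taken from the test code; the class and method names are hypothetical. It enumerates all (NW, NR) pairs for RF = 3, so besides the two schemes listed above it also prints the more expensive combinations that satisfy the rule. In Cassandra terms the 3 + 1 scheme would correspond to writing at consistency level ALL and reading at ONE, although the article does not state the exact levels used, so treat that mapping as an assumption.

```java
// Minimal sketch of the consistency rule NW + NR > RF.
// QuorumCheck and isStronglyConsistent are illustrative names, not a Cassandra API.
class QuorumCheck {

    /** True when any read set must overlap at least one replica that acknowledged the write. */
    static boolean isStronglyConsistent(int writeAcks, int readAcks, int replicationFactor) {
        return writeAcks + readAcks > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        // Enumerate every (NW, NR) pair for RF = 3 and print those that satisfy the rule.
        for (int nw = 1; nw <= rf; nw++) {
            for (int nr = 1; nr <= rf; nr++) {
                if (isStronglyConsistent(nw, nr, rf)) {
                    System.out.println("NW=" + nw + " NR=" + nr + " satisfies NW + NR > " + rf);
                }
            }
        }
    }
}
```

The 3 + 1 choice puts the cost on writes and keeps reads cheap, at the price of a write failing whenever any single replica is unavailable.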

It should be noted that DataStax did the opposite in their study: they set RF = 1 for both CS and HB (for the latter by changing the HDFS settings). This is a really important point, because the impact on CS performance in this case is huge. For example, the figure below shows the increase in the time required to load data into CS:

[Figure: time to load data into CS, RF=1 vs RF=3]

Here we see the following: the more competing threads write data, the longer it takes. This is natural, but what matters is that the performance degradation for RF=3 is significantly higher. In other words, if we write to 4 tables with 5 threads each (20 in total), RF=3 loses by about a factor of 2 (150 seconds for RF=3 versus 75 for RF=1). But if we increase the load by writing to 8 tables with 5 threads each (40 in total), RF=3 already loses by a factor of 2.7 (375 seconds versus 138).
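The slowdown factors quoted above are simply the ratio of the two load times. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the only inputs are the four timings given in the text.

```java
import java.util.Locale;

// Minimal sketch: slowdown of RF=3 relative to RF=1, computed from the load times in the text.
class ReplicationSlowdown {

    /** Ratio of the RF=3 load time to the RF=1 load time. */
    static double slowdown(double rf3Seconds, double rf1Seconds) {
        return rf3Seconds / rf1Seconds;
    }

    public static void main(String[] args) {
        // 20 writer threads in total: 150 s at RF=3 versus 75 s at RF=1.
        System.out.printf(Locale.ROOT, "20 threads: x%.2f%n", slowdown(150, 75));
        // 40 writer threads in total: 375 s at RF=3 versus 138 s at RF=1.
        System.out.printf(Locale.ROOT, "40 threads: x%.2f%n", slowdown(375, 138));
    }
}
```

The point of the comparison is that the ratio grows with the load, so RF=3 does not merely add a constant overhead.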

Perhaps this is part of the secret behind the successful load testing that DataStax performed for CS, because for HB on our stand changing the replication factor from 2 to 3 had no effect at all. I.e. disks are not the bottleneck for HB in our configuration. However, there are many other pitfalls here: our version of HB was slightly patched and tuned, the environments are completely different, and so on. It is also worth noting that perhaps I simply do not know how to configure CS properly and there are more effective ways of working with it, and I hope we will find out in the comments. But first things first.

All tests were run on a hardware cluster of 4 servers, each with the following configuration:

CPU: Xeon E5-2680 v4 @ 2.40GHz, 64 threads.
Disks: 12 SATA HDDs
Java version: 1.8.0_111

CS version: 3.11.5

cassandra.yml parameters:
num_tokens: 256
hinted_handoff_enabled: true
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
hints_directory: /data10/cassandra/hints
hints_flush_period_in_ms: 10000
max_hints_file_size_in_mb: 128
batchlog_replay_throttle_in_kb: 1024
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
credentials_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /data1/cassandra/data # each dataN directory is a separate disk
    - /data2/cassandra/data
    - /data3/cassandra/data
    - /data4/cassandra/data
    - /data5/cassandra/data
    - /data6/cassandra/data
    - /data7/cassandra/data
    - /data8/cassandra/data
commitlog_directory: /data9/cassandra/commitlog
cdc_enabled: false
disk_failure_policy: stop
commit_failure_policy: stop
prepared_statements_cache_size_mb:
thrift_prepared_statements_cache_size_mb:
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
counter_cache_size_in_mb:
counter_cache_save_period: 7200
saved_caches_directory: /data10/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "*,*"
concurrent_reads: 256 # tried 64, no difference noticed
concurrent_writes: 256 # tried 64, no difference noticed
concurrent_counter_writes: 256 # tried 64, no difference noticed
concurrent_materialized_view_writes: 32
memtable_heap_space_in_mb: 2048 # tried 16 GB, it was slower
memtable_allocation_type: heap_buffers
index_summary_capacity_in_mb:
index_summary_resize_interval_in_minutes: 60
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: *
broadcast_address: *
listen_on_broadcast_address: true
internode_authenticator: org.apache.cassandra.auth.AllowAllInternodeAuthenticator
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: *
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
column_index_cache_size_in_kb: 2
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 1600
sstable_preemptive_open_interval_in_mb: 50
read_request_timeout_in_ms: 100000
range_request_timeout_in_ms: 200000
write_request_timeout_in_ms: 40000
counter_write_request_timeout_in_ms: 100000
cas_contention_timeout_in_ms: 20000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 200000
slow_query_log_timeout_in_ms: 500
cross_node_timeout: false
endpoint_snitch: GossipingPropertyFileSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
server_encryption_options:
    internode_encryption: none
client_encryption_options:
    enabled: false
internode_compression: dc
inter_dc_tcp_nodelay: false
tracetype_query_ttl: 86400
tracetype_repair_ttl: 604800
enable_user_defined_functions: false
enable_scripted_user_defined_functions: false
windows_timer_interval: 1
transparent_data_encryption_options:
    enabled: false
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
batch_size_warn_threshold_in_kb: 200
batch_size_fail_threshold_in_kb: 250
unlogged_batch_across_partitions_warn_threshold: 10
compaction_large_partition_warning_threshold_mb: 100
gc_warn_threshold_in_ms: 1000
back_pressure_enabled: false
enable_materialized_views: true
enable_sasi_indexes: true

GC settings:

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled

The heap size in jvm.options was set to 16 GB (we also tried 32 GB, no difference noticed).

The tables were created with the command:

CREATE TABLE ks.t1 (id bigint PRIMARY KEY, title text) WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 64};

HB version: 1.2.0-cdh5.14.2 (in the class org.apache.hadoop.hbase.regionserver.HRegion we excluded MetricsRegion, which led to GC problems when the number of regions exceeded 1000 per RegionServer)

Non-default HBase parameters:
zookeeper.session.timeout: 120000
hbase.rpc.timeout: 2 minute(s)
hbase.client.scanner.timeout.period: 2 minute(s)
hbase.master.handler.count: 10
hbase.regionserver.lease.period, hbase.client.scanner.timeout.period: 2 minute(s)
hbase.regionserver.handler.count: 160
hbase.regionserver.metahandler.count: 30
hbase.regionserver.logroll.period: 4 hour(s)
hbase.regionserver.maxlogs: 200
hbase.hregion.memstore.flush.size: 1 GiB
hbase.hregion.memstore.block.multiplier: 6
hbase.hstore.compactionThreshold: 5
hbase.hstore.blockingStoreFiles: 200
hbase.hregion.majorcompaction: 1 day(s)
HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.regionserver.wal.codec: org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
hbase.master.namespace.init.timeout: 3600000
hbase.regionserver.optionalcacheflushinterval: 18000000
hbase.regionserver.thread.compaction.large: 12
hbase.regionserver.wal.enablecompression: true
hbase.hstore.compaction.max.size: 1073741824
hbase.server.compactchecker.interval.multiplier: 200
Java Configuration Options for HBase RegionServer:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ReservedCodeCacheSize=256m
hbase.snapshot.master.timeoutMillis: 2 minute(s)
hbase.snapshot.region.timeout: 2 minute(s)
hbase.snapshot.master.timeout.millis: 2 minute(s)
HBase REST Server Max Log Size: 100 MiB
HBase REST Server Maximum Log File Backups: 5
HBase Thrift Server Max Log Size: 100 MiB
HBase Thrift Server Maximum Log File Backups: 5
Master Max Log Size: 100 MiB
Master Maximum Log File Backups: 5
RegionServer Max Log Size: 100 MiB
RegionServer Maximum Log File Backups: 5
HBase Active Master Detection Window: 4 minute(s)
dfs.client.hedged.read.threadpool.size: 40
dfs.client.hedged.read.threshold.millis: 10 millisecond(s)
hbase.rest.threads.min: 8
hbase.rest.threads.max: 150
Maximum Process File Descriptors: 180000
hbase.thrift.minWorkerThreads: 200
hbase.master.executor.openregion.threads: 30
hbase.master.executor.closeregion.threads: 30
hbase.master.executor.serverops.threads: 60
hbase.regionserver.thread.compaction.small: 6
hbase.ipc.server.read.threadpool.size: 20
Region Mover Threads: 6
Client Java Heap Size in Bytes: 1 GiB
HBase REST Server Default Group: 3 GiB
HBase Thrift Server Default Group: 3 GiB
Java Heap Size of HBase Master in Bytes: 16 GiB
Java Heap Size of HBase RegionServer in Bytes: 32 GiB

+ZooKeeper
maxClientCnxns: 601
maxSessionTimeout: 120000
Creating tables:
hbase org.apache.hadoop.hbase.util.RegionSplitter ns:t1 UniformSplit -c 64 -f cf
alter 'ns:t1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'GZ'}

There is an important point here: the DataStax description does not say how many regions were used to create the HB tables, although this is critical for large volumes. Therefore, for the tests the number of regions was set to 64, which allows storing up to 640 GB, i.e. a medium-sized table.
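The 640 GB figure follows from the region count if each region is allowed to grow to about 10 GB. That per-region limit is an assumption inferred from 640 / 64; it matches the usual HBase default for hbase.hregion.max.filesize, but the article does not state the value used. A minimal sketch of the arithmetic, with hypothetical names:

```java
// Minimal sketch: table capacity before regions start splitting.
// The 10 GB per-region limit is an assumption, not a value stated in the article.
class RegionCapacity {

    /** Approximate capacity in GB of a table pre-split into the given number of regions. */
    static long capacityGb(int regions, long maxRegionSizeGb) {
        return regions * maxRegionSizeGb;
    }

    public static void main(String[] args) {
        System.out.println("64 regions -> " + capacityGb(64, 10) + " GB");
        System.out.println(" 4 regions -> " + capacityGb(4, 10) + " GB");
    }
}
```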

At the time of the test, HBase had 22 thousand tables and 67 thousand regions (this would have been fatal for version 1.2.0 if not for the patch mentioned above).

Now for the code. Since it was not clear which configurations were more advantageous for a particular database, the tests were run in various combinations. I.e. in some tests 4 tables were loaded simultaneously (all 4 nodes were used for connections). In other tests we worked with 8 different tables. In some cases the batch size was 100, in others 200 (the batch parameter, see the code below). The data size of the value was 10 bytes or 100 bytes (dataSize). In total, 5 million records were written to and read from each table each time. At the same time, 5 threads wrote to or read from each table (thread number: thNum), each of which used its own key range (count = 1 million):

if (opType.equals("insert")) {
    // Each thread owns the key range [count * thNum, count * (thNum + 1)).
    // The key is advanced inside the inner loop, so the outer loop has no increment.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        // Build one CQL batch containing `batch` single-row inserts.
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < batch; i++) {
            // Random alphanumeric payload of dataSize characters.
            String value = RandomStringUtils.random(dataSize, true, true);
            sb.append("INSERT INTO ")
                    .append(tableName)
                    .append("(id, title) ")
                    .append("VALUES (")
                    .append(key)
                    .append(", '")
                    .append(value)
                    .append("');");
            key++;
        }
        sb.append("APPLY BATCH;");
        final String query = sb.toString();
        session.execute(query);
    }
} else {
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        // Read `batch` consecutive keys with a single IN query.
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(tableName).append(" WHERE id IN (");
        for (int i = 0; i < batch; i++) {
            sb.append(key);
            if (i + 1 < batch) {
                sb.append(",");
            }
            key++;
        }
        sb.append(");");
        final String query = sb.toString();
        ResultSet rs = session.execute(query);
    }
}
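The key-range logic shared by the insert and select branches above can be isolated in a small JDK-only sketch. It shows how thread thNum gets the half-open range [count * thNum, count * (thNum + 1)) and how that range is cut into batches. The class and method names are hypothetical; only count, thNum and batch come from the article. Unlike the original loop, this sketch never steps past the end of the range when count is not a multiple of batch.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of per-thread key ranges and batching, using only the JDK.
class KeyRanges {

    /** First key (inclusive) owned by thread thNum. */
    static long firstKey(long count, int thNum) {
        return count * thNum;
    }

    /** End of the range (exclusive) owned by thread thNum. */
    static long endKey(long count, int thNum) {
        return count * (thNum + 1);
    }

    /** Start key of every batch needed to cover [start, end) with batches of at most `batch` keys. */
    static List<Long> batchStarts(long start, long end, int batch) {
        List<Long> starts = new ArrayList<>();
        for (long key = start; key < end; key += batch) {
            starts.add(key);
        }
        return starts;
    }

    public static void main(String[] args) {
        long count = 1_000_000L; // keys per thread, as in the article
        int threads = 5;         // threads per table, as in the article
        int batch = 100;         // one of the batch sizes used in the tests
        for (int thNum = 0; thNum < threads; thNum++) {
            long start = firstKey(count, thNum);
            long end = endKey(count, thNum);
            System.out.println("thread " + thNum + ": keys [" + start + ", " + end + "), "
                    + batchStarts(start, end, batch).size() + " batches");
        }
    }
}
```

Because the ranges do not overlap, the five threads together cover exactly 5 million distinct keys per table.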

Accordingly, similar functionality was implemented for HB:

Configuration conf = getConf();
HTable table = new HTable(conf, keyspace + ":" + tableName);
// Buffer puts on the client; they are sent explicitly with flushCommits().
table.setAutoFlush(false, false);
List<Get> lGet = new ArrayList<>();
List<Put> lPut = new ArrayList<>();
byte[] cf = Bytes.toBytes("cf");
byte[] qf = Bytes.toBytes("value");
if (opType.equals("insert")) {
    // Each thread owns the key range [count * thNum, count * (thNum + 1)).
    // The key is advanced inside the inner loop, so the outer loop has no increment.
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lPut.clear();
        for (int i = 0; i < batch; i++) {
            Put p = new Put(makeHbaseRowKey(key));
            String value = RandomStringUtils.random(dataSize, true, true);
            p.addColumn(cf, qf, Bytes.toBytes(value));
            lPut.add(p);
            key++;
        }
        // Send `batch` puts at once.
        table.put(lPut);
        table.flushCommits();
    }
} else {
    for (long key = count * thNum; key < count * (thNum + 1); ) {
        lGet.clear();
        for (int i = 0; i < batch; i++) {
            Get g = new Get(makeHbaseRowKey(key));
            lGet.add(g);
            key++;
        }
        // Fetch `batch` rows with a single multi-get.
        Result[] rs = table.get(lGet);
    }
}

Since in HB the client has to take care of distributing the data evenly, the key salting function looked like this:

public static byte[] makeHbaseRowKey(long key) {
    byte[] nonSaltedRowKey = Bytes.toBytes(key);
    CRC32 crc32 = new CRC32();
    crc32.update(nonSaltedRowKey);
    long crc32Value = crc32.getValue();
    byte[] salt = Arrays.copyOfRange(Bytes.toBytes(crc32Value), 5, 7);
    return ArrayUtils.addAll(salt, nonSaltedRowKey);
}
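To see what the salting function above produces without an HBase dependency, here is a JDK-only sketch. It replaces Bytes.toBytes(long) with an equivalent big-endian ByteBuffer encoding and ArrayUtils.addAll with System.arraycopy; the class name and the regionOf helper are hypothetical. The region mapping assumes 64 equal slices of the first-byte space, which is how a UniformSplit pre-split into 64 regions should behave, but treat that as an approximation rather than the exact HBase boundaries.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Minimal JDK-only sketch of the salted row key: 2 salt bytes taken from CRC32(key) + 8 key bytes.
class SaltedKeyDemo {

    /** 8-byte big-endian encoding, the same layout HBase's Bytes.toBytes(long) produces. */
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static byte[] makeRowKey(long key) {
        byte[] raw = toBytes(key);
        CRC32 crc32 = new CRC32();
        crc32.update(raw);
        // Bytes 5 and 6 of the 8-byte CRC value are the two middle bytes of the 32-bit checksum.
        byte[] salt = Arrays.copyOfRange(toBytes(crc32.getValue()), 5, 7);
        byte[] rowKey = new byte[salt.length + raw.length];
        System.arraycopy(salt, 0, rowKey, 0, salt.length);
        System.arraycopy(raw, 0, rowKey, salt.length, raw.length);
        return rowKey;
    }

    /** Assumed region index when the first-byte space is cut into `regions` equal slices. */
    static int regionOf(byte[] rowKey, int regions) {
        return (rowKey[0] & 0xFF) * regions / 256;
    }

    public static void main(String[] args) {
        int regions = 64;
        int[] perRegion = new int[regions];
        // Sequential keys, as in the tests, should spread across regions instead of one hot region.
        for (long key = 0; key < 100_000; key++) {
            perRegion[regionOf(makeRowKey(key), regions)]++;
        }
        int min = Arrays.stream(perRegion).min().getAsInt();
        int max = Arrays.stream(perRegion).max().getAsInt();
        System.out.println("keys per region: min=" + min + ", max=" + max);
    }
}
```

Without the salt prefix, sequential long keys would all start with the same zero bytes and land in a single region.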

Now the most interesting part, the results:

[Image: results table]

The same in graph form:

[Image: results graph]

HB's advantage is so striking that one suspects some kind of bottleneck in the CS setup. However, Googling and tweaking the most obvious parameters (such as concurrent_writes or memtable_heap_space_in_mb) did not speed things up. At the same time, the logs are clean and do not complain about anything.

The data was distributed evenly across the nodes, and the statistics from all nodes were roughly the same.

This is what the table statistics look like on one of the nodes:
Keyspace: ks
Read Count: 9383707
Read Latency: 0.04287025042448576 ms
Write Count: 15462012
Write Latency: 0.1350068438699957 ms
Pending Flushes: 0
Table: t1
SSTable count: 16
Space used (live): 148.59 MiB
Space used (total): 148.59 MiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 5.17 MiB
SSTable Compression Ratio: 0.5720989576459437
Number of partitions (estimate): 3970323
Memtable cell count: 0
Memtable data size: 0 bytes
Memtable off heap memory used: 0 bytes
Memtable switch count: 5
Local read count: 2346045
Local read latency: NaN ms
Local write count: 3865503
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 25
Bloom filter false ratio: 0.00000
Bloom filter space used: 4.57 MiB
Bloom filter off heap memory used: 4.57 MiB
Index summary off heap memory used: 590.02 KB
Compression metadata off heap memory used: 19.45 KB
Compacted partition minimum bytes: 36
Compacted partition maximum bytes: 42
Compacted partition mean bytes: 42
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0 bytes

An attempt to reduce the batch size (down to sending records one at a time) had no effect; it only made things worse. It is possible that this really is the maximum performance for CS, since the results obtained for CS are similar to those obtained by DataStax: on the order of hundreds of thousands of operations per second. In addition, if we look at resource utilization, we see that CS uses much more CPU and disk:

[Figure: resource utilization]
The figure shows utilization during the run of all tests in sequence for both databases.

Now for HB's powerful advantage in reads. Here you can see that for both databases disk utilization during reads is extremely low (the read tests are the final part of the test cycle for each database; for CS, for example, this is from 15:20 to 15:40). In the case of HB the reason is clear: most of the data sits in memory, in the memstore, and some of it is cached in the blockcache. As for CS, it is not very clear how it works, but disk utilization is not visible there either. Just in case, an attempt was made to enable the row cache with row_cache_size_in_mb = 2048 and to set caching = {'keys': 'ALL', 'rows_per_partition': '2000000'}, but that only made things slightly worse.

It is also worth mentioning once more an important point about the number of regions in HB. In our case the value was set to 64. If you reduce it to, say, 4, read speed drops by a factor of 2. The reason is that the memstore fills up faster, files are flushed more often, and more files have to be processed when reading, which is a rather expensive operation for HB. In real conditions this is handled by thinking through the presplitting and compaction strategy; in particular, we use a self-written utility that collects garbage and compacts HFiles constantly in the background. It is quite possible that for the DataStax tests only 1 region per table was allocated (which is incorrect), and this would go some way to explaining why HB was so inferior in their read tests.

The following preliminary conclusions can be drawn. Assuming no major mistakes were made during testing, Cassandra looks like a colossus with feet of clay. More precisely, while she balances on one leg, as in the picture at the beginning of the article, she shows relatively good results, but in a fight under the same conditions she loses outright. At the same time, given the low CPU utilization on our hardware, we learned to deploy two HB RegionServers per host and thereby doubled performance. I.e. once resource utilization is taken into account, the situation for CS is even worse.

Of course, these tests are fairly synthetic and the amount of data used here is relatively modest. It is possible that if we moved to terabytes the situation would be different, but while for HB we can load terabytes, for CS this turned out to be a problem. It often threw OperationTimedOutException even with these volumes, although the response timeout parameters had already been increased several times over the defaults.

I hope that through joint efforts we will find the CS bottlenecks, and if we manage to speed it up, I will definitely add information about the final results at the end of the post.

UPD: Thanks to advice from colleagues, I managed to speed up reading. It was:
159 ops (644 tables, 4 threads, batch 5).
Added:
.withLoadBalancingPolicy(new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
And I played with the number of threads. The results are as follows:
4 tables, 100 threads, batch = 1 (one record at a time): 301 ops
4 tables, 100 threads, batch = 10: 447 ops
4 tables, 100 threads, batch = 100: 625 ops

Later I will apply the other tuning tips, run a full test cycle, and add the results at the end of the post.

Source: www.habr.com
