
Vector is designed to collect, transform, and send log data, metrics, and events.
Written in Rust, it offers high performance and low RAM consumption compared with its analogues. In addition, much attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer and to handle file rotation.
Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.
Vector is a replacement for Filebeat and Logstash; it can act in both roles (receiving and sending logs).
Where Logstash builds the chain as input → filter → output, in Vector it is sources → transforms → sinks.
Examples can be found in the documentation.
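To make the sources → transforms → sinks chain concrete, here is a minimal Python sketch of the routing idea. It is not how Vector is implemented; the names `run_pipeline` and `parse_json` and the list-based sinks are hypothetical and exist only for illustration.

```python
import json
from typing import Callable, Iterable, List

Event = dict
Transform = Callable[[Event], Event]
Sink = Callable[[Event], None]


def parse_json(event: Event) -> Event:
    """Hypothetical transform: replace the raw 'message' string with its JSON fields."""
    fields = json.loads(event["message"])
    rest = {k: v for k, v in event.items() if k != "message"}
    return {**rest, **fields}


def run_pipeline(source: Iterable[Event], transforms: List[Transform], sinks: List[Sink]) -> None:
    """Pass every event through each transform in order, then fan it out to all sinks."""
    for event in source:
        for transform in transforms:
            event = transform(event)
        for sink in sinks:
            sink(event)


# Two in-memory lists stand in for the Clickhouse and Elasticsearch sinks.
clickhouse_rows, elasticsearch_docs = [], []
source = [{"message": '{"response_status": "404", "request_uri": "/"}'}]
run_pipeline(source, [parse_json], [clickhouse_rows.append, elasticsearch_docs.append])
```

After the run, both lists hold the same parsed event, which mirrors the idea of one transformed stream feeding two destinations.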
This guide is a revision of an earlier guide. The original included geoip processing. When testing geoip from an internal network, Vector produced an error.
Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field=«geoip.country_name» rate_limit_secs=30
If you need geoip processing, refer to the original guide.
We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will set up 4 servers, although you can get by with 3.

The scheme looks roughly like this.

Disable Selinux on all your servers
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install the HTTP server emulator and utilities on all servers
As the HTTP server emulator we will use nodejs-stub-server.
nodejs-stub-server has no rpm, so we build one for it. The rpm is built using Copr.
Add the antonpatsev/nodejs-stub-server repository
yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server
Install nodejs-stub-server, Apache benchmark, and the screen terminal multiplexer on all servers
yum -y install stub_http_server screen mc httpd-tools
I adjusted the stub_http_server response time in /var/lib/stub_http_server/stub_http_server.js so that there would be more logs.
var max_sleep = 10;
Start stub_http_server.
systemctl start stub_http_server
systemctl enable stub_http_server

Installing Clickhouse on server 3
ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
First, connect the official repository:
sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
To install the packages, run:
sudo yum install -y clickhouse-server clickhouse-client
Allow clickhouse-server to listen on the network interface in /etc/clickhouse-server/config.xml
<listen_host>0.0.0.0</listen_host>
Change the logging level from trace to debug
debug
Standard compression settings:
min_compress_block_size 65536
max_compress_block_size 1048576
To enable Zstd compression, the advice was not to touch the config but to use DDL instead.

I could not find how to use zstd compression through DDL by searching Google, so I left it as is.
Colleagues who use zstd compression in Clickhouse, please share instructions.
To start the server as a daemon, run:
service clickhouse-server start
Now let's move on to configuring Clickhouse
Connect to Clickhouse
clickhouse-client -h 172.26.10.109 -m
172.26.10.109 is the IP of the server where Clickhouse is installed.
Create the vector database
CREATE DATABASE vector;
Check that the database exists.
show databases;
Create the vector.logs table.
/* This is the table where logs are stored as-is */
CREATE TABLE vector.logs
(
`node_name` String,
`timestamp` DateTime,
`server_name` String,
`user_id` String,
`request_full` String,
`request_user_agent` String,
`request_http_host` String,
`request_uri` String,
`request_scheme` String,
`request_method` String,
`request_length` UInt64,
`request_time` Float32,
`request_referrer` String,
`response_status` UInt16,
`response_body_bytes_sent` UInt64,
`response_content_type` String,
`remote_addr` IPv4,
`remote_port` UInt32,
`remote_user` String,
`upstream_addr` IPv4,
`upstream_port` UInt32,
`upstream_bytes_received` UInt64,
`upstream_bytes_sent` UInt64,
`upstream_cache_status` String,
`upstream_connect_time` Float32,
`upstream_header_time` Float32,
`upstream_response_length` UInt64,
`upstream_response_time` Float32,
`upstream_status` UInt16,
`upstream_content_type` String,
INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;
Check that the table was created. Launch clickhouse-client and run a query.
Switch to the vector database.
use vector;
Ok.
0 rows in set. Elapsed: 0.001 sec.
Look at the tables.
show tables;
┌─name─┐
│ logs │
└──────┘

Installing elasticsearch on server 4 to send the same data to Elasticsearch for comparison with Clickhouse
Add the public rpm key
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
Create 2 repos:
/etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
/etc/yum.repos.d/kibana.repo
[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
Install elasticsearch and kibana
yum install -y kibana elasticsearch
Since it will run as a single instance, add the following to /etc/elasticsearch/elasticsearch.yml:
discovery.type: single-node
So that Vector can send data to elasticsearch from another server, change network.host.
network.host: 0.0.0.0
To connect to kibana, change the server.host parameter in /etc/kibana/kibana.yml
server.host: "0.0.0.0"
Start elasticsearch and enable it at boot
systemctl enable elasticsearch
systemctl start elasticsearch
and kibana
systemctl enable kibana
systemctl start kibana
Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster of many servers and will not need to do this.
For future indices, update the default template:
curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}'

Installing Vector as a Logstash replacement on server 2
yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen
Set up Vector as a replacement for Logstash. Edit the file /etc/vector/vector.toml
# /etc/vector/vector.toml
data_dir = "/var/lib/vector"
[sources.nginx_input_vector]
# General
type = "vector"
address = "0.0.0.0:9876"
shutdown_timeout_secs = 30
[transforms.nginx_parse_json]
inputs = [ "nginx_input_vector" ]
type = "json_parser"
[transforms.nginx_parse_add_defaults]
inputs = [ "nginx_parse_json" ]
type = "lua"
version = "2"
hooks.process = """
function (event, emit)
function split_first(s, delimiter)
result = {};
for match in (s..delimiter):gmatch("(.-)"..delimiter) do
table.insert(result, match);
end
return result[1];
end
function split_last(s, delimiter)
result = {};
for match in (s..delimiter):gmatch("(.-)"..delimiter) do
table.insert(result, match);
end
return result[#result];
end
event.log.upstream_addr = split_first(split_last(event.log.upstream_addr, ', '), ':')
event.log.upstream_bytes_received = split_last(event.log.upstream_bytes_received, ', ')
event.log.upstream_bytes_sent = split_last(event.log.upstream_bytes_sent, ', ')
event.log.upstream_connect_time = split_last(event.log.upstream_connect_time, ', ')
event.log.upstream_header_time = split_last(event.log.upstream_header_time, ', ')
event.log.upstream_response_length = split_last(event.log.upstream_response_length, ', ')
event.log.upstream_response_time = split_last(event.log.upstream_response_time, ', ')
event.log.upstream_status = split_last(event.log.upstream_status, ', ')
if event.log.upstream_addr == "" then
event.log.upstream_addr = "127.0.0.1"
end
if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
event.log.upstream_bytes_received = "0"
end
if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
event.log.upstream_bytes_sent = "0"
end
if event.log.upstream_cache_status == "" then
event.log.upstream_cache_status = "DISABLED"
end
if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
event.log.upstream_connect_time = "0"
end
if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
event.log.upstream_header_time = "0"
end
if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
event.log.upstream_response_length = "0"
end
if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
event.log.upstream_response_time = "0"
end
if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
event.log.upstream_status = "0"
end
emit(event)
end
"""
[transforms.nginx_parse_remove_fields]
inputs = [ "nginx_parse_add_defaults" ]
type = "remove_fields"
fields = ["data", "file", "host", "source_type"]
[transforms.nginx_parse_coercer]
type = "coercer"
inputs = ["nginx_parse_remove_fields"]
types.request_length = "int"
types.request_time = "float"
types.response_status = "int"
types.response_body_bytes_sent = "int"
types.remote_port = "int"
types.upstream_bytes_received = "int"
types.upstream_bytes_sent = "int"
types.upstream_connect_time = "float"
types.upstream_header_time = "float"
types.upstream_response_length = "int"
types.upstream_response_time = "float"
types.upstream_status = "int"
types.timestamp = "timestamp"
[sinks.nginx_output_clickhouse]
inputs = ["nginx_parse_coercer"]
type = "clickhouse"
database = "vector"
healthcheck = true
host = "http://172.26.10.109:8123" # Clickhouse address
table = "logs"
encoding.timestamp_format = "unix"
buffer.type = "disk"
buffer.max_size = 104900000
buffer.when_full = "block"
request.in_flight_limit = 20
[sinks.elasticsearch]
type = "elasticsearch"
inputs = ["nginx_parse_coercer"]
compression = "none"
healthcheck = true
# 172.26.10.116 is the server where elasticsearch is installed
host = "http://172.26.10.116:9200"
index = "vector-%Y-%m-%d"
You can adjust the transforms.nginx_parse_add_defaults section.
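The nginx_parse_add_defaults transform in the config above keeps only the last value of each comma-separated upstream_* field and substitutes a default when that value is empty or "-". A rough Python equivalent of that logic, shown for a subset of the fields, may make it easier to adapt. The function name `add_defaults` is hypothetical, and Python's `str.split` is used in place of the Lua pattern matching, which is assumed to behave the same for the ", " and ":" delimiters.

```python
def split_last(value: str, delimiter: str = ", ") -> str:
    """Return the last item of a delimiter-separated string."""
    return value.split(delimiter)[-1]


def split_first(value: str, delimiter: str) -> str:
    """Return the first item of a delimiter-separated string."""
    return value.split(delimiter)[0]


def add_defaults(log: dict) -> dict:
    """Keep the last upstream value and replace "-" or "" with a default."""
    out = dict(log)
    # The address also drops its ":port" suffix; an empty address becomes 127.0.0.1.
    out["upstream_addr"] = split_first(split_last(log["upstream_addr"]), ":") or "127.0.0.1"
    for key in ("upstream_bytes_received", "upstream_status"):
        last = split_last(log[key])
        out[key] = "0" if last in ("-", "") else last
    return out


sample = {
    "upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443",
    "upstream_bytes_received": "-, -, 123",
    "upstream_status": "502, 502, 200",
}
normalized = add_defaults(sample)
```

For the sample event, only the values of the last upstream attempt survive, which is what the Clickhouse columns (one IPv4 address, one status) can hold.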
The original author uses these configs for a small CDN, where the upstream_* fields can contain several values.
For example:
"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"
If this is not your situation, this section can be simplified.
Create the systemd service settings in /etc/systemd/system/vector.service
# /etc/systemd/system/vector.service
[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target
[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector
[Install]
WantedBy=multi-user.target
After creating the tables, you can start Vector
systemctl enable vector
systemctl start vector
Vector logs can be viewed like this:
journalctl -f -u vector
The log should contain entries like these
INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server), the first server
On the server with nginx you need to disable ipv6, because the logs table in Clickhouse uses the IPv4 type for the upstream_addr field, since I do not use ipv6 inside the network. If ipv6 is not disabled, there will be errors:
DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)
Perhaps readers will add ipv6 support.
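When debugging this error, it can help to check which upstream_addr values would fit an IPv4 column at all. The following is a small sketch using Python's standard `ipaddress` module; the helper name `is_ipv4` is hypothetical and the sample values are illustrative, not taken from real logs.

```python
import ipaddress


def is_ipv4(value: str) -> bool:
    """Return True if the string is a valid dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False


# Illustrative values: an IPv4 address, an IPv6 loopback, and an empty string.
checks = {addr: is_ipv4(addr) for addr in ("172.26.10.106", "::1", "")}
```

Anything that fails this check would need either a default value (as the Lua transform does for empty strings) or a different column type.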
Create the file /etc/sysctl.d/98-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Apply the settings
sysctl --system
Install nginx.
Add the nginx repository file /etc/yum.repos.d/nginx.repo
[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
Install the nginx package
yum install -y nginx
First, configure the log format in Nginx in the file /etc/nginx/nginx.conf
user nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically
# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
# provides the configuration file context in which the directives that affect connection processing are specified.
events {
# determines how much clients will be served per worker
# max clients = worker_connections * worker_processes
# max clients is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
# optimized to serve many clients with each thread, essential for linux -- for testing environment
use epoll;
# accept as many connections as possible, may flood worker connections if set too low -- for testing environment
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format vector escape=json
'{'
'"node_name":"nginx-vector",'
'"timestamp":"$time_iso8601",'
'"server_name":"$server_name",'
'"request_full": "$request",'
'"request_user_agent":"$http_user_agent",'
'"request_http_host":"$http_host",'
'"request_uri":"$request_uri",'
'"request_scheme": "$scheme",'
'"request_method":"$request_method",'
'"request_length":"$request_length",'
'"request_time": "$request_time",'
'"request_referrer":"$http_referer",'
'"response_status": "$status",'
'"response_body_bytes_sent":"$body_bytes_sent",'
'"response_content_type":"$sent_http_content_type",'
'"remote_addr": "$remote_addr",'
'"remote_port": "$remote_port",'
'"remote_user": "$remote_user",'
'"upstream_addr": "$upstream_addr",'
'"upstream_bytes_received": "$upstream_bytes_received",'
'"upstream_bytes_sent": "$upstream_bytes_sent",'
'"upstream_cache_status":"$upstream_cache_status",'
'"upstream_connect_time":"$upstream_connect_time",'
'"upstream_header_time":"$upstream_header_time",'
'"upstream_response_length":"$upstream_response_length",'
'"upstream_response_time":"$upstream_response_time",'
'"upstream_status": "$upstream_status",'
'"upstream_content_type":"$upstream_http_content_type"'
'}';
access_log /var/log/nginx/access.log main;
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
So as not to break your current configuration, Nginx allows you to have several access_log directives
access_log /var/log/nginx/access.log main; # Standard log
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
Do not forget to add a logrotate rule for the new log (if the log file name does not end with .log)
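The vector log format writes every value as a quoted string, so numeric fields have to be converted after the JSON is parsed; in this setup the json_parser and coercer transforms on the Vector server do that. As a rough illustration of the same two steps, here is a Python sketch. The function name `parse_and_coerce` is hypothetical, the sample line is shortened, and only a subset of the coercer's fields is covered.

```python
import json

# Subset of the field types listed in the coercer transform of the server config.
COERCIONS = {
    "request_length": int,
    "request_time": float,
    "response_status": int,
    "remote_port": int,
}


def parse_and_coerce(line: str) -> dict:
    """Parse one JSON access-log line and convert the known numeric fields."""
    event = json.loads(line)
    for field, cast in COERCIONS.items():
        if field in event:
            event[field] = cast(event[field])
    return event


# A shortened line in the shape produced by the "vector" log format.
line = (
    '{"node_name":"nginx-vector","request_length":"66",'
    '"request_time": "0.028","response_status": "404","remote_port": "45886"}'
)
event = parse_and_coerce(line)
```

Fields that are not listed in `COERCIONS`, such as node_name, stay as strings.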
Remove default.conf from /etc/nginx/conf.d/
rm -f /etc/nginx/conf.d/default.conf
Add the virtual host /etc/nginx/conf.d/vhost1.conf
server {
listen 80;
server_name vhost1;
location / {
proxy_pass http://172.26.10.106:8080;
}
}
Add the virtual host /etc/nginx/conf.d/vhost2.conf
server {
listen 80;
server_name vhost2;
location / {
proxy_pass http://172.26.10.108:8080;
}
}
Add the virtual host /etc/nginx/conf.d/vhost3.conf
server {
listen 80;
server_name vhost3;
location / {
proxy_pass http://172.26.10.109:8080;
}
}
Add the virtual host /etc/nginx/conf.d/vhost4.conf
server {
listen 80;
server_name vhost4;
location / {
proxy_pass http://172.26.10.116:8080;
}
}
Add the virtual hosts (172.26.10.106 is the IP of the server where nginx is installed) to the /etc/hosts file on all servers:
172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4
Once everything is ready, run
nginx -t
systemctl restart nginx
Now install Vector itself
yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm
Create the systemd settings file /etc/systemd/system/vector.service
[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target
[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector
[Install]
WantedBy=multi-user.target
Then configure the Filebeat replacement in /etc/vector/vector.toml. The IP address 172.26.10.108 is the IP address of the log server (Vector-Server)
data_dir = "/var/lib/vector"
[sources.nginx_file]
type = "file"
include = [ "/var/log/nginx/access.json.log" ]
start_at_beginning = false
fingerprinting.strategy = "device_and_inode"
[sinks.nginx_output_vector]
type = "vector"
inputs = [ "nginx_file" ]
address = "172.26.10.108:9876"
Do not forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.
usermod -a -G adm vector
Start the vector service
systemctl enable vector
systemctl start vector
Vector logs can be viewed like this:
journalctl -f -u vector
The log should contain an entry like this
INFO vector::topology::builder: Healthcheck: Passed.

Stress testing
Testing is done with Apache benchmark.
The httpd-tools package was installed on all servers.
We run the tests with Apache benchmark from 4 different servers inside screen. First start the screen terminal multiplexer, then start the Apache benchmark test.
From the first server
while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done
From the second server
while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done
From the third server
while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done
From the fourth server
while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Check the data in Clickhouse
Connect to Clickhouse
clickhouse-client -h 172.26.10.109 -m
Run an SQL query
SELECT * FROM vector.logs;
┌─node_name────┬─timestamp───────────┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬─remote_addr───┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │ 66             │ 0.028        │                  │ 404             │ 27                       │                       │ 172.26.10.106 │ 45886       │             │ 172.26.10.106 │ 0             │ 109                     │ 97                  │ DISABLED              │ 0                     │ 0.025                │ 27                       │ 0.029                  │ 404             │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────┘
Find out the size of the tables in Clickhouse
select concat(database, '.', table) as table,
formatReadableSize(sum(bytes)) as size,
sum(rows) as rows,
max(modification_time) as latest_modification,
sum(bytes) as bytes_size,
any(engine) as engine,
formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;
Find out how much space the logs take up in Clickhouse.

The size of the logs table is 857.19 MB.

The size of the same data in the Elasticsearch index is 4.5 GB.
With no data settings specified in the Vector parameters, Clickhouse takes 4500/857.19 ≈ 5.25 times less space than Elasticsearch.
In Vector, the compression field is used by default.
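The storage comparison is simple arithmetic, reproduced here as a short Python check. It assumes 1 GB = 1000 MB, as the 4500 in the division implies; the helper name `size_ratio` is hypothetical.

```python
def size_ratio(larger_mb: float, smaller_mb: float) -> float:
    """Return how many times larger the first size is than the second."""
    return larger_mb / smaller_mb


clickhouse_mb = 857.19
elasticsearch_mb = 4.5 * 1000  # assumption: decimal gigabytes, 1 GB = 1000 MB
ratio = size_ratio(elasticsearch_mb, clickhouse_mb)
print(f"Elasticsearch uses about {ratio:.2f}x the space of Clickhouse")
```

The result rounds to 5.25 (5.24 if truncated to two decimals), so for this data set Clickhouse needs roughly a fifth of the disk space.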
Source: www.habr.com
