Written in Rust, it stands out for high performance and low RAM consumption compared with its alternatives. Considerable attention is also paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer and to handle file rotation.
Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.
Vector is a replacement for filebeat and logstash; it can act in both roles (receiving and sending logs). More details are available on the website.
Whereas in Logstash the chain is built as input → filter → output, in Vector it is sources → transforms → sinks.
Examples can be found in the documentation.
This guide is a revised version of the instructions by Vyacheslav Rakhinsky. The original instructions include geoip processing. When geoip was tested from an internal network, Vector reported an error.
Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30
If you need geoip processing, refer to the original instructions by Vyacheslav Rakhinsky.
We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into ClickHouse and separately into Elasticsearch. We will set up 4 servers, although you could get by with 3.
The scheme looks like this.
Disable SELinux on all your servers
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
Install an HTTP server emulator and utilities on all servers
ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:
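A check along the lines of the one given in the ClickHouse installation documentation might look like the sketch below. It assumes a Linux host that exposes CPU flags in /proc/cpuinfo; the `check_sse42` wrapper and its optional path argument are illustrative additions, not part of the original instructions.

```shell
#!/bin/sh
# Report whether the CPU flags listed in a cpuinfo file include sse4_2.
# The optional first argument (default: /proc/cpuinfo) is a hypothetical
# convenience so the check can be tried against a saved copy of the file.
check_sse42() {
  cpuinfo="${1:-/proc/cpuinfo}"
  if grep -q sse4_2 "$cpuinfo"; then
    echo "SSE 4.2 supported"
  else
    echo "SSE 4.2 not supported"
  fi
}

check_sse42
```

If the second message is printed, the prebuilt ClickHouse packages are unlikely to run on that machine.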
Configure Elasticsearch for single-node mode with 1 shard and 0 replicas. Most likely you will have a cluster of many servers and will not need to do this.
After creating the tables, you can start Vector
systemctl enable vector
systemctl start vector
Vector logs can be viewed like this:
journalctl -f -u vector
The log should contain entries like these
INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.
On the client (web server), the 1st server
On the server with nginx you need to disable IPv6, because the logs table in ClickHouse uses the IPv4 type for the upstream_addr field, since I do not use IPv6 inside the network. If IPv6 is not disabled, there will be errors:
DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)
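One common way to switch IPv6 off on a Linux host is through the `disable_ipv6` sysctl keys. The sketch below only prints the settings; the helper name `ipv6_disable_settings` and the file name in the comment are assumptions made for illustration, and the author may have disabled IPv6 differently.

```shell
#!/bin/sh
# Print the sysctl settings that disable IPv6 on all existing interfaces
# ("all") and on interfaces created later ("default").
ipv6_disable_settings() {
  printf '%s\n' \
    'net.ipv6.conf.all.disable_ipv6 = 1' \
    'net.ipv6.conf.default.disable_ipv6 = 1'
}

# Show what would be written. To apply it, run as root, for example:
#   ipv6_disable_settings > /etc/sysctl.d/70-disable-ipv6.conf   # hypothetical file name
#   sysctl --system
ipv6_disable_settings
```

After applying the settings, reload nginx and confirm that new log lines carry IPv4 upstream addresses.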
First, configure the log format in Nginx in the file /etc/nginx/nginx.conf
user nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically
# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
# provides the configuration file context in which the directives that affect connection processing are specified.
events {
# determines how much clients will be served per worker
# max clients = worker_connections * worker_processes
# max clients is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
# optimized to serve many clients with each thread, essential for linux -- for testing environment
use epoll;
# accept as many connections as possible, may flood worker connections if set too low -- for testing environment
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format vector escape=json
'{'
'"node_name":"nginx-vector",'
'"timestamp":"$time_iso8601",'
'"server_name":"$server_name",'
'"request_full": "$request",'
'"request_user_agent":"$http_user_agent",'
'"request_http_host":"$http_host",'
'"request_uri":"$request_uri",'
'"request_scheme": "$scheme",'
'"request_method":"$request_method",'
'"request_length":"$request_length",'
'"request_time": "$request_time",'
'"request_referrer":"$http_referer",'
'"response_status": "$status",'
'"response_body_bytes_sent":"$body_bytes_sent",'
'"response_content_type":"$sent_http_content_type",'
'"remote_addr": "$remote_addr",'
'"remote_port": "$remote_port",'
'"remote_user": "$remote_user",'
'"upstream_addr": "$upstream_addr",'
'"upstream_bytes_received": "$upstream_bytes_received",'
'"upstream_bytes_sent": "$upstream_bytes_sent",'
'"upstream_cache_status":"$upstream_cache_status",'
'"upstream_connect_time":"$upstream_connect_time",'
'"upstream_header_time":"$upstream_header_time",'
'"upstream_response_length":"$upstream_response_length",'
'"upstream_response_time":"$upstream_response_time",'
'"upstream_status": "$upstream_status",'
'"upstream_content_type":"$upstream_http_content_type"'
'}';
access_log /var/log/nginx/access.log main;
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
So as not to break your current configuration, Nginx lets you have several access_log directives
access_log /var/log/nginx/access.log main; # Standard log
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
Do not forget to add a logrotate rule for the new logs (if the log file name does not end with .log)
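Whether an extra rule is needed depends on the file name. The sketch below assumes the stock nginx logrotate rule covers `/var/log/nginx/*.log`, which is what the nginx package usually ships but is worth confirming in /etc/logrotate.d/nginx; the function name `covered_by_default_rule` is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) when the given path is a file directly inside
# /var/log/nginx/ whose name ends with .log, i.e. when it would match the
# assumed stock logrotate pattern /var/log/nginx/*.log.
covered_by_default_rule() {
  name="${1#/var/log/nginx/}"
  [ "$name" != "$1" ] || return 1   # path is not under /var/log/nginx/
  case "$name" in
    */*)   return 1 ;;              # file sits in a subdirectory
    *.log) return 0 ;;
    *)     return 1 ;;
  esac
}

if covered_by_default_rule /var/log/nginx/access.json.log; then
  echo "already rotated by the existing rule"
else
  echo "add a separate logrotate rule"
fi
```

With the access.json.log name used in this guide, the existing rule should already apply under that assumption.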
Next, configure Vector as the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the address of the log server (Vector-Server)
data_dir = "/var/lib/vector"
[sources.nginx_file]
type = "file"
include = [ "/var/log/nginx/access.json.log" ]
start_at_beginning = false
fingerprinting.strategy = "device_and_inode"
[sinks.nginx_output_vector]
type = "vector"
inputs = [ "nginx_file" ]
address = "172.26.10.108:9876"
Do not forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.
usermod -a -G adm vector
Start the Vector service
systemctl enable vector
systemctl start vector
Vector logs can be viewed like this:
journalctl -f -u vector
The log should contain an entry like this
INFO vector::topology::builder: Healthcheck: Passed.
Stress test
We run the test using Apache benchmark.
The httpd-tools package was installed on all servers
We start testing with Apache benchmark from 4 different servers inside screen. First we launch the screen terminal multiplexer, and then we start testing with Apache benchmark. You can read about working with screen in the article.
From the 1st server
while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done
From the 2nd server
while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done
From the 3rd server
while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done
From the 4th server
while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done
-- Size, row count and last modification time of each table, largest first
select concat(database, '.', table) as table,
formatReadableSize(sum(bytes)) as size,
sum(rows) as rows,
max(modification_time) as latest_modification,
sum(bytes) as bytes_size,
any(engine) as engine,
formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;
Let's find out how much space the logs take up in ClickHouse.
The size of the logs table is 857.19 MB.
The size of the same data in the Elasticsearch index is 4.5 GB.
If you leave the Vector parameters at their defaults, the data takes 4500/857.19 ≈ 5.25 times less space in ClickHouse than in Elasticsearch.
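The ratio can be recomputed from the two sizes quoted above. This is a minimal sketch that follows the text in treating 4.5 GB as 4500 MB; the variable names are illustrative.

```shell
#!/bin/sh
# Sizes taken from the measurements above, both expressed in MB.
elasticsearch_mb=4500
clickhouse_mb=857.19

# awk does the floating-point division and rounds to two decimals.
ratio="$(awk -v es="$elasticsearch_mb" -v ch="$clickhouse_mb" \
  'BEGIN { printf "%.2f", es / ch }')"

echo "ClickHouse uses ${ratio} times less space than Elasticsearch"
```

Rounded to two decimals the quotient is 5.25; truncating instead of rounding gives 5.24.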
In Vector, the compression field is used by default.