Because Vector is written in Rust, it offers high performance and low RAM consumption compared with its counterparts. In addition, much attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer and to handle file rotation.
Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.
Vector can replace both filebeat and logstash, since it can act in either role (receiving and sending logs); more details are available online.
While the chain in Logstash is built as input → filter → output, in Vector it is sources → transforms → sinks.
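To make the sources → transforms → sinks chain concrete, here is a minimal Python sketch of the same routing idea. It is only an illustration of the data flow, not how Vector is implemented; every name in it (source, parse_json, run) is hypothetical.

```python
import json

def source(lines):
    """Source: wrap each raw input line in an event."""
    for line in lines:
        yield {"message": line}

def parse_json(events):
    """Transform: replace each event with the JSON object held in its message."""
    for event in events:
        yield json.loads(event["message"])

def run(lines, transforms, sinks):
    """Route events from the source through every transform to every sink."""
    events = source(lines)
    for transform in transforms:
        events = transform(events)
    for event in events:
        for sink in sinks:
            sink(event)

# Two in-memory "sinks" stand in for Clickhouse and Elasticsearch.
collected_a, collected_b = [], []
run(
    ['{"status": "200"}', '{"status": "404"}'],
    transforms=[parse_json],
    sinks=[collected_a.append, collected_b.append],
)
print(collected_a)
```

Each event reaches every sink, which mirrors how one stream of logs is delivered to more than one destination in the setup described here.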
Examples can be found in the documentation.
This guide is a revised version of the guide by Vjačeslavs Rahinskis. The original guide includes geoip processing. When geoip was tested from an internal network, Vector produced an error.
Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30
If you need to process geoip, refer to the original guide by Vjačeslavs Rahinskis.
We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately to Clickhouse and separately to Elasticsearch. We will set up 4 servers, although 3 servers would be enough.
The layout is roughly as follows.
Disable SELinux on all your servers
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
On all servers, install an HTTP server emulator plus utilities
ClickHouse uses the SSE 4.2 instruction set, so unless otherwise specified, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:
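The ClickHouse documentation performs this check by searching /proc/cpuinfo for the sse4_2 flag. As a hedged illustration, the same check can be sketched in Python; the function name supports_sse42 is hypothetical, and the sketch assumes a Linux x86 host where /proc/cpuinfo lists CPU features on lines starting with "flags".

```python
def supports_sse42(cpuinfo_text):
    """Return True if any 'flags' line in the cpuinfo text lists sse4_2."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is a space-separated flag list.
            _, _, value = line.partition(":")
            if "sse4_2" in value.split():
                return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            text = cpuinfo.read()
    except OSError:
        # Not a Linux host, or /proc is unavailable: treat as "no flags found".
        text = ""
    if supports_sse42(text):
        print("SSE 4.2 supported")
    else:
        print("SSE 4.2 not supported")
```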
INFO vector::topology::builder: Healthcheck: Passed.
On the client (web server) - server 1
On the server with nginx, ipv6 must be disabled, because the logs table in Clickhouse uses the IPv4 type for the upstream_addr field (I do not use ipv6 on my network). If ipv6 is not disabled, errors like this appear:
DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)
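The error above is what an IPv4-typed column reports when it is handed a value that is not an IPv4 address. A minimal Python sketch of the same kind of check follows, for example to spot offending values before they reach the database. The function name is_ipv4 and the sample values are hypothetical, and the sketch assumes a bare address; a real upstream_addr value may carry a port that would have to be stripped first.

```python
import ipaddress

def is_ipv4(value):
    """Return True if value parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ipaddress.AddressValueError:
        return False

# Hypothetical sample values: one IPv4 address and one IPv6 address.
for candidate in ["172.26.10.108", "::1"]:
    print(candidate, is_ipv4(candidate))
```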
First, we need to configure the log format in Nginx, in the file /etc/nginx/nginx.conf.
user nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically
# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
# provides the configuration file context in which the directives that affect connection processing are specified.
events {
# determines how much clients will be served per worker
# max clients = worker_connections * worker_processes
# max clients is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
# optimized to serve many clients with each thread, essential for linux -- for testing environment
use epoll;
# accept as many connections as possible, may flood worker connections if set too low -- for testing environment
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format vector escape=json
'{'
'"node_name":"nginx-vector",'
'"timestamp":"$time_iso8601",'
'"server_name":"$server_name",'
'"request_full": "$request",'
'"request_user_agent":"$http_user_agent",'
'"request_http_host":"$http_host",'
'"request_uri":"$request_uri",'
'"request_scheme": "$scheme",'
'"request_method":"$request_method",'
'"request_length":"$request_length",'
'"request_time": "$request_time",'
'"request_referrer":"$http_referer",'
'"response_status": "$status",'
'"response_body_bytes_sent":"$body_bytes_sent",'
'"response_content_type":"$sent_http_content_type",'
'"remote_addr": "$remote_addr",'
'"remote_port": "$remote_port",'
'"remote_user": "$remote_user",'
'"upstream_addr": "$upstream_addr",'
'"upstream_bytes_received": "$upstream_bytes_received",'
'"upstream_bytes_sent": "$upstream_bytes_sent",'
'"upstream_cache_status":"$upstream_cache_status",'
'"upstream_connect_time":"$upstream_connect_time",'
'"upstream_header_time":"$upstream_header_time",'
'"upstream_response_length":"$upstream_response_length",'
'"upstream_response_time":"$upstream_response_time",'
'"upstream_status": "$upstream_status",'
'"upstream_content_type":"$upstream_http_content_type"'
'}';
access_log /var/log/nginx/access.log main;
access_log /var/log/nginx/access.json.log vector; # new log in json format
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
So as not to break your current configuration, Nginx allows you to have several access_log directives.
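With the vector log format, every line of access.json.log is a single JSON object whose values are all quoted strings. The sketch below shows how such a line could be read back and how numeric fields could be converted; the sample line is made up for illustration and contains only a few of the fields from the format, and parse_access_line is a hypothetical helper.

```python
import json

# Hypothetical sample line, shortened to a few of the fields in the log format.
sample_line = (
    '{"node_name":"nginx-vector","timestamp":"2020-08-05T06:25:31+00:00",'
    '"request_method":"GET","request_uri":"/","request_time":"0.005",'
    '"response_status":"200","remote_addr":"172.26.10.106"}'
)

def parse_access_line(line):
    """Parse one JSON access-log line and convert numeric fields from strings."""
    event = json.loads(line)
    event["response_status"] = int(event["response_status"])
    event["request_time"] = float(event["request_time"])
    return event

event = parse_access_line(sample_line)
print(event["request_method"], event["response_status"], event["request_time"])
```

Because nginx writes every value as a string, some type conversion of this kind is needed somewhere in the pipeline before the data reaches typed columns.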
Next, configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the IP address of the log server (Vector server).
data_dir = "/var/lib/vector"
[sources.nginx_file]
type = "file"
include = [ "/var/log/nginx/access.json.log" ]
start_at_beginning = false
fingerprinting.strategy = "device_and_inode"
[sinks.nginx_output_vector]
type = "vector"
inputs = [ "nginx_file" ]
address = "172.26.10.108:9876"
Do not forget to add the vector user to the required group so that it can read the log files. For example, nginx on centos creates logs with adm group permissions.
usermod -a -G adm vector
Start the vector service
systemctl enable vector
systemctl start vector
Vector logs can be viewed like this:
journalctl -f -u vector
The logs should contain an entry like this
INFO vector::topology::builder: Healthcheck: Passed.
Stress testing
Testing is performed using Apache benchmark.
The httpd-tools package was installed on all servers
We run the Apache benchmark tests from 4 different servers inside screen. First we launch the screen terminal multiplexer, then start the test with Apache benchmark. How to work with screen is described in the article.
From server 1
while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done
From server 2
while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done
From server 3
while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done
From server 4
while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done
select concat(database, '.', table) as table,
formatReadableSize(sum(bytes)) as size,
sum(rows) as rows,
max(modification_time) as latest_modification,
sum(bytes) as bytes_size,
any(engine) as engine,
formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;
Let's find out how much space the logs took up in Clickhouse.
The logs table size is 857.19 MB.
The size of the same data in the Elasticsearch index is 4.5 GB.
If no extra parameters are specified in Vector, the data in Clickhouse takes 4500/857.19 ≈ 5.25 times less space than in Elasticsearch.
In Vector, the compression field is used by default.
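The ratio quoted above can be reproduced with a few lines of Python. The only assumption is the one the text itself makes: 4.5 GB is taken as 4500 MB.

```python
clickhouse_mb = 857.19
elasticsearch_mb = 4.5 * 1000  # 4.5 GB taken as 4500 MB, as in the text

# How many times smaller the Clickhouse table is than the Elasticsearch index.
ratio = elasticsearch_mb / clickhouse_mb
print(f"Clickhouse uses {ratio:.2f} times less space")
```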