Vector is written in Rust and is characterized by high performance and low RAM consumption compared to its analogues. In addition, a lot of attention is paid to correctness-related features, in particular the ability to save unsent events to a buffer on disk and to rotate files.
Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.
Vector is a replacement for filebeat and logstash. It can act in both roles (receiving and sending logs); more details are available on the site.
If in Logstash the chain is built as input → filter → output, then in Vector it is sources → transforms → sinks.
Examples can be found in the documentation.
This guide is a revised version of the guide by Vyacheslav Rakhinsky. The original guide includes geoip processing. When geoip was tested from an internal network, Vector gave an error.
Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30
If you need geoip processing, refer to the original guide by Vyacheslav Rakhinsky.
We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will set up 4 servers, although you can get by with 3.
The setup looks roughly like this.
Disable SELinux on all your servers
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
Install an HTTP server emulator and utilities on all servers
ClickHouse uses the SSE 4.2 instruction set, so unless otherwise specified, support for it in the CPU becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:
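The check usually given for this reads the CPU flags from /proc/cpuinfo. A minimal sketch, wrapped in a helper function; the name `has_sse42` and its optional file argument are illustrative and not part of the original guide:

```shell
# Succeeds if the CPU flags file lists the sse4_2 flag.
# The optional argument exists only to make the check easy to test;
# by default the real /proc/cpuinfo is read.
has_sse42() {
    grep -q sse4_2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_sse42; then
    echo "SSE 4.2 supported"
else
    echo "SSE 4.2 not supported"
fi
```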
Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster of many servers and will not need to do this.
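One way to apply such settings is an index template that sets the shard and replica counts for new indices. The sketch below assumes an Elasticsearch version that still accepts the legacy `_template` endpoint; the URL in `ES_URL`, the template name `default`, and the function names are assumptions made for illustration:

```shell
# Hypothetical endpoint; adjust to where Elasticsearch listens.
ES_URL="${ES_URL:-http://localhost:9200}"

# Prints a legacy index template body: 1 shard, 0 replicas for new indices.
single_node_template() {
    cat <<'EOF'
{
  "index_patterns": ["*"],
  "order": -1,
  "settings": {
    "number_of_shards": "1",
    "number_of_replicas": "0"
  }
}
EOF
}

# Uploads the template through the legacy _template API.
apply_single_node_template() {
    single_node_template | curl -sS -X PUT "$ES_URL/_template/default" \
        -H 'Content-Type: application/json' --data-binary @-
}
```

Calling `apply_single_node_template` on the Elasticsearch host sends the request. Note that the `*` pattern matches every new index, so a narrower pattern may be preferable outside a test setup.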
INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.
On the client (web server) - 1st server
On the server with nginx, you need to disable IPv6, since the logs table in Clickhouse stores the upstream_addr field as IPv4 (I do not use IPv6 inside the network). If IPv6 is not turned off, there will be errors:
DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)
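The text does not show how IPv6 was disabled. One common approach on Linux is a sysctl drop-in file; the sketch below assumes that approach, and the file name and function name are hypothetical:

```shell
# Writes sysctl settings that turn IPv6 off for all interfaces.
# The default target file name is an assumption; any file under
# /etc/sysctl.d/ is picked up the same way.
write_ipv6_disable_conf() {
    cat > "${1:-/etc/sysctl.d/90-disable-ipv6.conf}" <<'EOF'
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
}
```

Run as root, `write_ipv6_disable_conf && sysctl --system` writes the file and reloads the settings.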
First, we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf.
user nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically
# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
# provides the configuration file context in which the directives that affect connection processing are specified.
events {
# determines how much clients will be served per worker
# max clients = worker_connections * worker_processes
# max clients is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
# optimized to serve many clients with each thread, essential for linux -- for testing environment
use epoll;
# accept as many connections as possible, may flood worker connections if set too low -- for testing environment
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format vector escape=json
'{'
'"node_name":"nginx-vector",'
'"timestamp":"$time_iso8601",'
'"server_name":"$server_name",'
'"request_full": "$request",'
'"request_user_agent":"$http_user_agent",'
'"request_http_host":"$http_host",'
'"request_uri":"$request_uri",'
'"request_scheme": "$scheme",'
'"request_method":"$request_method",'
'"request_length":"$request_length",'
'"request_time": "$request_time",'
'"request_referrer":"$http_referer",'
'"response_status": "$status",'
'"response_body_bytes_sent":"$body_bytes_sent",'
'"response_content_type":"$sent_http_content_type",'
'"remote_addr": "$remote_addr",'
'"remote_port": "$remote_port",'
'"remote_user": "$remote_user",'
'"upstream_addr": "$upstream_addr",'
'"upstream_bytes_received": "$upstream_bytes_received",'
'"upstream_bytes_sent": "$upstream_bytes_sent",'
'"upstream_cache_status":"$upstream_cache_status",'
'"upstream_connect_time":"$upstream_connect_time",'
'"upstream_header_time":"$upstream_header_time",'
'"upstream_response_length":"$upstream_response_length",'
'"upstream_response_time":"$upstream_response_time",'
'"upstream_status": "$upstream_status",'
'"upstream_content_type":"$upstream_http_content_type"'
'}';
access_log /var/log/nginx/access.log main;
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
So as not to break your current configuration, Nginx allows you to have several access_log directives
access_log /var/log/nginx/access.log main; # Standard log
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
Don't forget to add a logrotate rule for the new logs (if the log file name does not end with .log)
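As a rough sketch of such a rule, the helper below writes a logrotate entry for a log whose name does not match *.log. The log path `/var/log/nginx/access.json`, the rule file name, the rotation settings, and the function name are all assumptions chosen for illustration; the pid file path matches the nginx.conf above:

```shell
# Writes a logrotate rule for an nginx log whose name does not end in .log.
# Arguments (both with hypothetical defaults): the log path and the rule file.
write_nginx_json_logrotate() {
    log_path="${1:-/var/log/nginx/access.json}"
    rule_file="${2:-/etc/logrotate.d/nginx-json}"
    cat > "$rule_file" <<EOF
$log_path {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        # Ask nginx to reopen its log files after rotation
        /bin/kill -USR1 \$(cat /var/run/nginx.pid 2>/dev/null) 2>/dev/null || true
    endscript
}
EOF
}
```

After writing the rule as root, `logrotate -d /etc/logrotate.d/nginx-json` does a dry run and prints what would be rotated without changing anything.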
Next, configure the Filebeat replacement in the config file /etc/vector/vector.toml. The IP address 172.26.10.108 is the IP address of the log server (Vector-Server)
data_dir = "/var/lib/vector"
[sources.nginx_file]
type = "file"
include = [ "/var/log/nginx/access.json.log" ]
start_at_beginning = false
fingerprinting.strategy = "device_and_inode"
[sinks.nginx_output_vector]
type = "vector"
inputs = [ "nginx_file" ]
address = "172.26.10.108:9876"
Don't forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.
usermod -a -G adm vector
Let's start the vector service
systemctl enable vector
systemctl start vector
The Vector logs can be viewed like this:
journalctl -f -u vector
There should be an entry like this in the logs
INFO vector::topology::builder: Healthcheck: Passed.
Stress testing
We test using Apache benchmark.
The httpd-tools package was installed on all servers
We start testing with Apache benchmark from 4 different servers inside screen. First, we launch the screen terminal multiplexer, and then we start the Apache benchmark test. How to work with screen is described in the article.
From the 1st server
while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done
From the 2nd server
while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done
From the 3rd server
while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done
From the 4th server
while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done
Let's find out how much space the logs took up in Clickhouse.
select concat(database, '.', table) as table,
formatReadableSize(sum(bytes)) as size,
sum(rows) as rows,
max(modification_time) as latest_modification,
sum(bytes) as bytes_size,
any(engine) as engine,
formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;
The size of the logs table is 857.19 MB.
The size of the same data in the Elasticsearch index is 4.5 GB.
If no extra parameters are specified in Vector, the data in Clickhouse takes 4500/857.19 ≈ 5.25 times less space than in Elasticsearch.
In Vector, the compression field is used by default.
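The ratio above can be rechecked with a one-line calculation. The sketch below treats 4.5 GB as 4500 MB, as the text does; the function name `size_ratio` is illustrative. The division comes out at about 5.25 when rounded to two decimals (5.24 if truncated):

```shell
# Ratio of the Elasticsearch index size to the Clickhouse table size.
# Both arguments are in MB; 4.5 GB is taken as 4500 MB, as in the text.
size_ratio() {
    awk -v es="$1" -v ch="$2" 'BEGIN { printf "%.2f\n", es / ch }'
}

size_ratio 4500 857.19
```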