Vector is written in Rust and offers high performance and low RAM consumption compared with its counterparts. In addition, much attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer and to handle file rotation.
Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.
Vector is a replacement for Filebeat and Logstash and can act in both roles (receiving and sending logs); more details are available on the project website.
Where Logstash builds its chain as input → filter → output, in Vector it is sources → transforms → sinks.
Examples can be found in the documentation.
This guide is a revised version of the instructions by Vyacheslav Rakhinsky. The original instructions include geoip processing. When geoip was tested from an internal network, Vector reported an error.
Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30
If anyone needs geoip processing, refer to the original instructions by Vyacheslav Rakhinsky.
We will set up the chain Nginx (access logs) → Vector (Client | Filebeat) → Vector (Server | Logstash) → separately into ClickHouse and separately into Elasticsearch. We will install 4 servers, although you can get by with 3.
The overall scheme is roughly as follows.
Disable SELinux on all your servers
sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
Install the HTTP server emulator and utilities on all servers
ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the CPU becomes an additional system requirement. Here is the command to check whether the current CPU supports SSE 4.2:
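The check is a one-line grep over /proc/cpuinfo, as given in the ClickHouse installation documentation. The sketch below wraps it in a small function; the name check_sse42 and its optional file argument are illustrative additions, not part of the original instructions.

```shell
# check_sse42 is a hypothetical helper name. The optional argument exists
# only so the check can be pointed at a file other than /proc/cpuinfo.
check_sse42() {
  cpuinfo="${1:-/proc/cpuinfo}"
  if grep -q sse4_2 "$cpuinfo" 2>/dev/null; then
    echo "SSE 4.2 supported"
  else
    echo "SSE 4.2 not supported"
  fi
}

check_sse42
```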
Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster of many servers and will not need to do this.
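A minimal sketch of such a setting, assuming Elasticsearch listens on localhost:9200 and that the legacy index template API (_template) is available, as in Elasticsearch 7.x. The template name "default", the catch-all index pattern and the helper names are assumptions made for illustration.

```shell
# Assumed endpoint; adjust to your Elasticsearch node.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the template body: every newly created index gets 1 shard and 0 replicas.
single_node_template() {
  cat <<'EOF'
{
  "index_patterns": ["*"],
  "order": -1,
  "settings": {
    "number_of_shards": "1",
    "number_of_replicas": "0"
  }
}
EOF
}

# Upload the template under the hypothetical name "default".
apply_single_node_template() {
  single_node_template | curl -sS -X PUT "$ES_URL/_template/default" \
    -H 'Content-Type: application/json' --data-binary @-
}
```

Calling apply_single_node_template once on the Elasticsearch server is enough. Index templates apply only when an index is created, so existing indices keep their current settings.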
There should be entries like these in the logs
INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.
On the client (web server) - the 1st server
On the server with nginx you need to disable IPv6, because the logs table in ClickHouse uses the IPv4 type for the upstream_addr field (I do not use IPv6 inside my network). If IPv6 is not disabled, there will be errors:
DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)
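One common way to disable IPv6 on a Linux host is through sysctl. The sketch below is an assumption about how this can be done, not necessarily the author's exact commands; the file name 98-disable-ipv6.conf and the helper name are illustrative.

```shell
# Write sysctl settings that turn IPv6 off on all interfaces.
# The target directory is a parameter so the function can be tried
# against a scratch directory first; the default is /etc/sysctl.d.
write_disable_ipv6_conf() {
  dir="${1:-/etc/sysctl.d}"
  cat > "$dir/98-disable-ipv6.conf" <<'EOF'
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
}

# As root on the nginx server:
#   write_disable_ipv6_conf && sysctl --system
```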
First, we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf
user nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically
# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
# provides the configuration file context in which the directives that affect connection processing are specified.
events {
# determines how much clients will be served per worker
# max clients = worker_connections * worker_processes
# max clients is also limited by the number of socket connections available on the system (~64k)
worker_connections 4000;
# optimized to serve many clients with each thread, essential for linux -- for testing environment
use epoll;
# accept as many connections as possible, may flood worker connections if set too low -- for testing environment
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format vector escape=json
'{'
'"node_name":"nginx-vector",'
'"timestamp":"$time_iso8601",'
'"server_name":"$server_name",'
'"request_full": "$request",'
'"request_user_agent":"$http_user_agent",'
'"request_http_host":"$http_host",'
'"request_uri":"$request_uri",'
'"request_scheme": "$scheme",'
'"request_method":"$request_method",'
'"request_length":"$request_length",'
'"request_time": "$request_time",'
'"request_referrer":"$http_referer",'
'"response_status": "$status",'
'"response_body_bytes_sent":"$body_bytes_sent",'
'"response_content_type":"$sent_http_content_type",'
'"remote_addr": "$remote_addr",'
'"remote_port": "$remote_port",'
'"remote_user": "$remote_user",'
'"upstream_addr": "$upstream_addr",'
'"upstream_bytes_received": "$upstream_bytes_received",'
'"upstream_bytes_sent": "$upstream_bytes_sent",'
'"upstream_cache_status":"$upstream_cache_status",'
'"upstream_connect_time":"$upstream_connect_time",'
'"upstream_header_time":"$upstream_header_time",'
'"upstream_response_length":"$upstream_response_length",'
'"upstream_response_time":"$upstream_response_time",'
'"upstream_status": "$upstream_status",'
'"upstream_content_type":"$upstream_http_content_type"'
'}';
access_log /var/log/nginx/access.log main;
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
So as not to break your current configuration, Nginx allows several access_log directives
access_log /var/log/nginx/access.log main; # Standard log
access_log /var/log/nginx/access.json.log vector; # New log in JSON format
Do not forget to add a logrotate rule for the new logs (if the log file name does not end in .log)
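As a sketch of such a rule, assume a hypothetical log file /var/log/nginx/access.json, a name that the usual *.log pattern would not match. The directives are standard logrotate ones; the rotation schedule, the rule file name and the helper name are assumptions, and the pid file path is the one set in nginx.conf above.

```shell
# Write a logrotate rule for the hypothetical file /var/log/nginx/access.json.
# After rotation nginx is sent USR1 so that it reopens its log files.
write_nginx_json_logrotate() {
  target="${1:-/etc/logrotate.d/nginx-json}"
  cat > "$target" <<'EOF'
/var/log/nginx/access.json {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 "$(cat /var/run/nginx.pid)"
        fi
    endscript
}
EOF
}

# As root: write_nginx_json_logrotate
# Dry run:  logrotate -d /etc/logrotate.d/nginx-json
```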
Remove default.conf from /etc/nginx/conf.d/
rm -f /etc/nginx/conf.d/default.conf
Add the virtual host /etc/nginx/conf.d/vhost1.conf
Then configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the address of the log server (Vector-Server)
data_dir = "/var/lib/vector"
[sources.nginx_file]
type = "file"
include = [ "/var/log/nginx/access.json.log" ]
start_at_beginning = false
fingerprinting.strategy = "device_and_inode"
[sinks.nginx_output_vector]
type = "vector"
inputs = [ "nginx_file" ]
address = "172.26.10.108:9876"
Do not forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.
usermod -a -G adm vector
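A quick way to confirm that the group change took effect is to list the user's groups with id. The helper name below is illustrative.

```shell
# Succeeds if the given user belongs to the given group.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

if user_in_group vector adm; then
  echo "vector is in the adm group"
else
  echo "vector is not in the adm group"
fi
```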
Start the vector service
systemctl enable vector
systemctl start vector
The Vector log can be viewed like this:
journalctl -f -u vector
There should be an entry like this in the logs
INFO vector::topology::builder: Healthcheck: Passed.
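To look for this entry without watching the live log, the journal output can be filtered. A small sketch; the helper name is illustrative, and it only counts matching lines on standard input.

```shell
# Count "Healthcheck: Passed" lines read from standard input.
# grep -c exits non-zero when the count is 0, so the status is normalised.
count_healthchecks() {
  grep -c 'Healthcheck: Passed' || true
}

# Usage on a host running the service:
#   journalctl -u vector --no-pager | count_healthchecks
```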
Stress testing
Testing is performed with Apache benchmark.
The httpd-tools package was installed on all servers.
We start the test with Apache benchmark from 4 different servers inside screen. First we launch the screen terminal multiplexer, and then start the test with Apache benchmark. How to work with screen is described in a separate article.
From the 1st server
while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done
From the 2nd server
while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done
From the 3rd server
while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done
From the 4th server
while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done
select concat(database, '.', table) as table,
formatReadableSize(sum(bytes)) as size,
sum(rows) as rows,
max(modification_time) as latest_modification,
sum(bytes) as bytes_size,
any(engine) as engine,
formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;
The query above shows how much space the logs take up in ClickHouse.
The size of the logs table is 857.19 MB.
The same data in an Elasticsearch index takes 4.5 GB.
With no extra parameters specified in Vector, the data in ClickHouse takes 4500/857.19 ≈ 5.25 times less space than in Elasticsearch.
In Vector, the compression field uses its default value.
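The size ratio quoted above can be reproduced with a one-line calculation. The helper name is illustrative; 4.5 GB is taken as 4500 MB, as in the text, and rounding to two decimals gives 5.25.

```shell
# Print how many times larger the first size is than the second,
# rounded to two decimal places. Both sizes must use the same unit.
size_ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Elasticsearch index size (MB) versus ClickHouse table size (MB).
size_ratio 4500 857.19
```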