The nginx-log-collector utility from Avito for sending nginx logs to ClickHouse

This article covers the nginx-log-collector project, which reads nginx logs and sends them to a ClickHouse cluster. Elasticsearch is the usual choice for logs, but ClickHouse needs fewer resources (disk space, RAM, CPU), writes data faster, and compresses it, which makes the data on disk more compact. The advantages of ClickHouse are shown in 2 slides from the talk "How VK inserts data into ClickHouse from tens of thousands of servers".

To view analytics over the logs, we will build a dashboard for Grafana.

If you are interested, read on.

Install nginx and Grafana in the standard way.

Install the ClickHouse cluster with the Ansible playbook from Denis Proskurin.

Creating the database and tables in ClickHouse

The SQL queries that create the database and tables for nginx-log-collector in ClickHouse are described in this file.

We run each query in turn on every server of the ClickHouse cluster.

Important note. In this line, logs_cluster must be replaced with the name of your cluster from the clickhouse_remote_servers.xml file, where it appears between "remote_servers" and "shard".

ENGINE = Distributed('logs_cluster', 'nginx', 'access_log_shard', rand())
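
The cluster name is the tag that sits directly under "remote_servers" in clickhouse_remote_servers.xml. As a minimal sketch, the snippet below lists those tags from a sample document. The sample XML, its root tag, the host names, and the function name cluster_names are assumptions made for illustration and are not taken from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of clickhouse_remote_servers.xml; the root tag and
# the host names are assumptions made for this sketch.
SAMPLE_XML = """
<yandex>
  <remote_servers>
    <logs_cluster>
      <shard>
        <replica><host>ch1.example</host><port>9000</port></replica>
      </shard>
      <shard>
        <replica><host>ch2.example</host><port>9000</port></replica>
      </shard>
    </logs_cluster>
  </remote_servers>
</yandex>
"""


def cluster_names(xml_text):
    """Return the tag names found directly under <remote_servers>."""
    root = ET.fromstring(xml_text)
    # Some layouts use <remote_servers> itself as the root element.
    if root.tag == "remote_servers":
        remote = root
    else:
        remote = root.find(".//remote_servers")
    if remote is None:
        return []
    return [child.tag for child in remote]


print(cluster_names(SAMPLE_XML))
```

Whatever name this prints for your own file is the value to put in place of logs_cluster in the Distributed engine line.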

Installing and configuring nginx-log-collector-rpm

nginx-log-collector does not ship an rpm. I build one for it here: https://github.com/patsevanton/nginx-log-collector-rpm. The rpm is built with Fedora Copr.

Install the nginx-log-collector-rpm package:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nginx-log-collector-rpm
yum -y install nginx-log-collector
systemctl start nginx-log-collector

Edit the config /etc/nginx-log-collector/config.yaml:

  .......
  upload:
    table: nginx.access_log
    dsn: http://clickhouse-cluster-ip-address:8123/

- tag: "nginx_error:"
  format: error  # access | error
  buffer_size: 1048576
  upload:
    table: nginx.error_log
    dsn: http://clickhouse-cluster-ip-address:8123/

Configuring nginx

General nginx configuration:

user  nginx;
worker_processes  auto;

#error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format avito_json escape=json
                     '{'
                     '"event_datetime": "$time_iso8601", '
                     '"server_name": "$server_name", '
                     '"remote_addr": "$remote_addr", '
                     '"remote_user": "$remote_user", '
                     '"http_x_real_ip": "$http_x_real_ip", '
                     '"status": "$status", '
                     '"scheme": "$scheme", '
                     '"request_method": "$request_method", '
                     '"request_uri": "$request_uri", '
                     '"server_protocol": "$server_protocol", '
                     '"body_bytes_sent": $body_bytes_sent, '
                     '"http_referer": "$http_referer", '
                     '"http_user_agent": "$http_user_agent", '
                     '"request_bytes": "$request_length", '
                     '"request_time": "$request_time", '
                     '"upstream_addr": "$upstream_addr", '
                     '"upstream_response_time": "$upstream_response_time", '
                     '"hostname": "$hostname", '
                     '"host": "$host"'
                     '}';

    access_log     syslog_server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx avito_json; #ClickHouse
    error_log      syslog_server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx_error; #ClickHouse

    #access_log  /var/log/nginx/access.log  main;

    proxy_ignore_client_abort on;
    sendfile        on;
    keepalive_timeout  65;
    include /etc/nginx/conf.d/*.conf;
}
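
To see what the avito_json format produces, here is a minimal sketch that parses one line of it. The sample line and all of its values are made up for illustration, and the helper name parse_access_line is hypothetical. Note that the format emits body_bytes_sent without quotes, so it arrives as a number, while status and request_time are quoted and arrive as strings.

```python
import json

# Hypothetical log line in the avito_json format; every value is made up.
SAMPLE_LINE = (
    '{"event_datetime": "2019-10-01T12:00:00+03:00", '
    '"server_name": "vhost1", '
    '"remote_addr": "192.0.2.10", '
    '"status": "200", '
    '"request_method": "GET", '
    '"request_uri": "/", '
    '"body_bytes_sent": 612, '
    '"request_time": "0.004", '
    '"upstream_response_time": "0.003", '
    '"host": "vhost1"}'
)


def parse_access_line(line):
    """Decode one JSON access-log line into a dict."""
    return json.loads(line)


record = parse_access_line(SAMPLE_LINE)
# Unquoted fields become numbers, quoted fields stay strings.
print(type(record["body_bytes_sent"]).__name__, type(record["status"]).__name__)
```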

A single virtual host:

vhost1.conf:

upstream backend {
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
}

server {
    listen   80;
    server_name vhost1;
    location / {
        proxy_pass http://backend;
    }
}

Add the virtual hosts to the /etc/hosts file:

nginx-server-ip-address vhost1

HTTP server emulator

As an HTTP server emulator we will use nodejs-stub-server from Maxim Ignatenko.

nodejs-stub-server does not ship an rpm. I build one for it here: https://github.com/patsevanton/nodejs-stub-server. The rpm is built with Fedora Copr.

Install the nodejs-stub-server rpm package on the nginx upstream servers:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nodejs-stub-server
yum -y install stub_http_server
systemctl start stub_http_server

Load testing

Testing is done with Apache benchmark.

Install it:

yum install -y httpd-tools

We run the test with Apache benchmark from 5 different servers:

while true; do ab -H "User-Agent: 1server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 2server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 3server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 4server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 5server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done

Configuring Grafana

You will not find a ready-made dashboard on the official Grafana website.

So we will build it by hand.

You can find my saved dashboard here.

You also need to create a variable named table with the value nginx.access_log.

Singlestat Total Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Singlestat Failed Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter AND status NOT IN (200, 201, 401) GROUP BY t

Singlestat Fail Percent:

SELECT
 1 as t, (sum(status = 500 or status = 499)/sum(status = 200 or status = 201 or status = 401))*100 FROM $table
 WHERE $timeFilter GROUP BY t
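
The Fail Percent query divides the number of 500/499 responses by the number of 200/201/401 responses, so it reports a failed-to-successful ratio and not the share of failures among all requests. The sketch below works through the arithmetic for both readings. The counts are hypothetical, and the function names are introduced only for illustration.

```python
def fail_ratio_percent(failed, succeeded):
    """Mirror the query: failed / succeeded * 100."""
    if succeeded == 0:
        return None  # no successful requests, the ratio is undefined
    return 100 * failed / succeeded


def fail_share_percent(failed, total):
    """Alternative reading: failures as a share of all requests."""
    if total == 0:
        return None
    return 100 * failed / total


# Hypothetical counts for one dashboard time range.
failed = 50        # status 500 or 499
succeeded = 950    # status 200, 201 or 401
total = failed + succeeded

print(fail_ratio_percent(failed, succeeded))
print(fail_share_percent(failed, total))
```

The two numbers stay close while failures are rare and drift apart as the failure count grows, which is worth keeping in mind when reading the panel.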

Singlestat Average Response Time:

SELECT
 1, avg(request_time) FROM $table
 WHERE $timeFilter GROUP BY 1

Singlestat Max Response Time:

SELECT
 1 as t, max(request_time) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Count Status:

$columns(status, count(*) as c) from $table

To display data as a pie chart, you need to install the plugin and restart Grafana.

grafana-cli plugins install grafana-piechart-panel
service grafana-server restart

Status TOP 5:

SELECT
    1, /* fake timestamp value */
    status,
    count() AS Reqs /* count requests per status; sum(status) would add up the status codes themselves */
FROM $table
WHERE $timeFilter
GROUP BY status
ORDER BY Reqs desc
LIMIT 5

From here on I give the queries without screenshots:

Count http_user_agent:

$columns(http_user_agent, count(*) c) FROM $table

Good/Bad rate:

$rate(countIf(status = 200) AS good, countIf(status != 200) AS bad) FROM $table

Response timing:

$rate(avg(request_time) as request_time) FROM $table

Upstream response time (response time of the 1st upstream):

$rate(avg(arrayElement(upstream_response_time,1)) as upstream_response_time) FROM $table
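
The query takes element 1 because nginx can report several values in $upstream_response_time, separated by commas and colons, when a request is passed to more than one upstream server. The sketch below shows one way such a value could be split into a list. It is an illustration only and not the collector's actual code, and the function name parse_upstream_times is hypothetical. ClickHouse arrays are indexed from 1, so arrayElement(..., 1) corresponds to index 0 in Python.

```python
import re


def parse_upstream_times(raw):
    """Split an nginx $upstream_response_time value into a list of floats."""
    times = []
    for part in re.split(r"[,:]", raw):
        part = part.strip()
        # nginx writes "-" when no time is available for an attempt.
        if part and part != "-":
            times.append(float(part))
    return times


# Hypothetical value for a request retried on a second upstream server.
times = parse_upstream_times("0.002, 0.120")
print(times)
print(times[0])  # the first upstream, arrayElement(..., 1) in ClickHouse
```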

Table Count Status for all vhosts:

$columns(status, count(*) as c) from $table

General view of the dashboard

Comparing avg() and quantile()

avg()
quantile()
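
To illustrate why the two aggregates can tell different stories, here is a small sketch using Python's statistics module (not ClickHouse). The request times are hypothetical: most requests are fast and a few are slow. The function name summarize is introduced only for this example.

```python
import statistics


def summarize(times):
    """Return the mean, the median and the 99th percentile of a sample."""
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(times, n=100, method="inclusive")
    return {
        "avg": statistics.fmean(times),
        "median": statistics.median(times),
        "p99": cuts[98],
    }


# Hypothetical request times in seconds: 95 fast requests and 5 slow ones.
request_times = [0.010] * 95 + [2.0] * 5

summary = summarize(request_times)
print(summary)
```

In this sample the average lands far above what a typical request takes, while the median stays at the fast value and the high quantile exposes the slow tail. That is the usual argument for plotting quantiles next to, or in place of, the average.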

Conclusion:

I hope the community will join in developing, testing, and using nginx-log-collector.
And if anyone deploys nginx-log-collector, please share how much disk, RAM, and CPU it saved.

Telegram channels:

Milliseconds:

If milliseconds matter to you, please write or vote in this issue.

source: www.habr.com
