The nginx-log-collector utility from Avito for sending nginx logs to ClickHouse

This article covers the nginx-log-collector project, which reads nginx logs and sends them to a ClickHouse cluster. ElasticSearch is the usual choice for logs, but ClickHouse needs fewer resources (disk space, RAM, CPU), writes data faster, and compresses data, so it takes even less space on disk. The advantages of ClickHouse can be seen in 2 slides from the talk "How VK inserts data into ClickHouse from tens of thousands of servers".

To view analytics based on the logs, we will create a dashboard for Grafana.

If you are interested, read on.

Install nginx and Grafana in the standard way.

Install the ClickHouse cluster using the Ansible playbook from Denis Proskurin.

Creating databases and tables in ClickHouse

This file describes the SQL queries that create the databases and tables for nginx-log-collector in ClickHouse.

We run each query in turn on every server in the ClickHouse cluster.

Important note. In the line below, replace logs_cluster with the name of your cluster, taken from the clickhouse_remote_servers.xml file between "remote_servers" and "shard".

ENGINE = Distributed('logs_cluster', 'nginx', 'access_log_shard', rand())

Installing and configuring nginx-log-collector-rpm

nginx-log-collector does not ship an rpm. The repository https://github.com/patsevanton/nginx-log-collector-rpm builds one for it. The rpm is built using Fedora Copr.

Install the nginx-log-collector-rpm package:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nginx-log-collector-rpm
yum -y install nginx-log-collector
systemctl start nginx-log-collector

Edit the config /etc/nginx-log-collector/config.yaml:

  .......
  upload:
    table: nginx.access_log
    dsn: http://clickhouse-cluster-ip-address:8123/

- tag: "nginx_error:"
  format: error  # access | error
  buffer_size: 1048576
  upload:
    table: nginx.error_log
    dsn: http://clickhouse-cluster-ip-address:8123/

Configuring nginx

General nginx config:

user  nginx;
worker_processes  auto;

#error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format avito_json escape=json
                     '{'
                     '"event_datetime": "$time_iso8601", '
                     '"server_name": "$server_name", '
                     '"remote_addr": "$remote_addr", '
                     '"remote_user": "$remote_user", '
                     '"http_x_real_ip": "$http_x_real_ip", '
                     '"status": "$status", '
                     '"scheme": "$scheme", '
                     '"request_method": "$request_method", '
                     '"request_uri": "$request_uri", '
                     '"server_protocol": "$server_protocol", '
                     '"body_bytes_sent": $body_bytes_sent, '
                     '"http_referer": "$http_referer", '
                     '"http_user_agent": "$http_user_agent", '
                     '"request_bytes": "$request_length", '
                     '"request_time": "$request_time", '
                     '"upstream_addr": "$upstream_addr", '
                     '"upstream_response_time": "$upstream_response_time", '
                     '"hostname": "$hostname", '
                     '"host": "$host"'
                     '}';

    access_log     syslog_server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx avito_json; #ClickHouse
    error_log      syslog_server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx_error; #ClickHouse

    #access_log  /var/log/nginx/access.log  main;

    proxy_ignore_client_abort on;
    sendfile        on;
    keepalive_timeout  65;
    include /etc/nginx/conf.d/*.conf;
}
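
Each access log record produced by the avito_json format above is a single JSON object, so it can be checked with any JSON parser before it reaches ClickHouse. The following is a minimal Python sketch, assuming the JSON payload has already been separated from any syslog prefix. The sample line and the parse_access_line helper are hypothetical and only illustrate the shape of the data: most values are quoted strings, while body_bytes_sent is emitted as a bare number.

```python
import json

# Hypothetical, abbreviated sample shaped like the avito_json log_format.
SAMPLE_LINE = (
    '{"event_datetime": "2019-10-01T12:00:00+03:00", '
    '"server_name": "vhost1", '
    '"remote_addr": "10.0.0.5", '
    '"status": "200", '
    '"request_method": "GET", '
    '"request_uri": "/", '
    '"body_bytes_sent": 612, '
    '"request_time": "0.005", '
    '"upstream_response_time": "0.004", '
    '"host": "vhost1"}'
)


def parse_access_line(line: str) -> dict:
    """Parse one JSON access log record and convert two numeric fields."""
    record = json.loads(line)
    # status and request_time are quoted in the log_format, so they
    # arrive as strings and need an explicit conversion.
    record["status"] = int(record["status"])
    record["request_time"] = float(record["request_time"])
    return record


if __name__ == "__main__":
    rec = parse_access_line(SAMPLE_LINE)
    print(rec["status"], rec["body_bytes_sent"], rec["request_time"])
```

A check like this is mainly useful for catching a broken log_format (for example, a missing comma between fields) before pointing nginx at the collector socket.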

One virtual host:

vhost1.conf:

upstream backend {
    server ip-address-of-server-with-stub_http_server:8080;
    server ip-address-of-server-with-stub_http_server:8080;
    server ip-address-of-server-with-stub_http_server:8080;
    server ip-address-of-server-with-stub_http_server:8080;
    server ip-address-of-server-with-stub_http_server:8080;
}

server {
    listen   80;
    server_name vhost1;
    location / {
        proxy_pass http://backend;
    }
}

Add the virtual hosts to the /etc/hosts file:

ip-address-of-nginx-server vhost1

HTTP server emulator

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

nodejs-stub-server does not ship an rpm. The repository https://github.com/patsevanton/nodejs-stub-server builds one for it. The rpm is built using Fedora Copr.

Install the nodejs-stub-server rpm package on the nginx upstream servers:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nodejs-stub-server
yum -y install stub_http_server
systemctl start stub_http_server

Stress testing

We run the tests using Apache benchmark.

Install it:

yum install -y httpd-tools

We start the test with Apache benchmark from 5 different servers:

while true; do ab -H "User-Agent: 1server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 2server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 3server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 4server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 5server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done

Configuring Grafana

You will not find a ready-made dashboard on the official Grafana site.

So we will build it by hand.

You can find my saved dashboard here.

You also need to create a variable named table with the value nginx.access_log.

Singlestat Total Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Singlestat Failed Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter AND status NOT IN (200, 201, 401) GROUP BY t

Singlestat Fail Percent:

SELECT
 1 as t, (sum(status = 500 or status = 499)/sum(status = 200 or status = 201 or status = 401))*100 FROM $table
 WHERE $timeFilter GROUP BY t
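
Note what the query above actually divides: requests with status 500 or 499 over requests with status 200, 201, or 401, so any other status (for example 404) is ignored on both sides of the ratio. The same arithmetic can be sketched in Python. The fail_percent helper and the sample data are hypothetical, and returning None for an empty denominator is a choice made for this sketch, not the behaviour of the ClickHouse query.

```python
from collections import Counter


def fail_percent(statuses):
    """Mirror the panel's ratio: (500 or 499) / (200, 201 or 401) * 100."""
    counts = Counter(statuses)
    failed = counts[500] + counts[499]
    succeeded = counts[200] + counts[201] + counts[401]
    if succeeded == 0:
        # Assumption for this sketch: report "no value" instead of dividing by zero.
        return None
    return failed / succeeded * 100


if __name__ == "__main__":
    # 8 successful requests, 2 failed ones, and a 404 that the ratio ignores.
    sample = [200] * 8 + [500, 499, 404]
    print(fail_percent(sample))
```

Because the denominator counts only successful requests rather than all requests, the value can exceed 100 when failures outnumber successes.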

Singlestat Avg Response Time:

SELECT
 1, avg(request_time) FROM $table
 WHERE $timeFilter GROUP BY 1

Singlestat Max Response Time:

SELECT
 1 as t, max(request_time) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Count Status:

$columns(status, count(*) as c) from $table

To display data as a pie chart, you need to install the plugin and restart Grafana.

grafana-cli plugins install grafana-piechart-panel
service grafana-server restart

Pie TOP 5 Status:

SELECT
    1, /* fake timestamp value */
    status,
    count() AS Reqs /* number of requests per status, not the sum of status codes */
FROM $table
WHERE $timeFilter
GROUP BY status
ORDER BY Reqs desc
LIMIT 5

The remaining queries are given without screenshots:

Count http_user_agent:

$columns(http_user_agent, count(*) c) FROM $table

Good/bad rate:

$rate(countIf(status = 200) AS good, countIf(status != 200) AS bad) FROM $table

Response time:

$rate(avg(request_time) as request_time) FROM $table

Upstream response time (response time of the 1st upstream):

$rate(avg(arrayElement(upstream_response_time,1)) as upstream_response_time) FROM $table
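
The query above takes the first element of upstream_response_time, which suggests the column holds an array. That matches how nginx reports $upstream_response_time: when a request touches several upstream servers, the times are joined with commas and colons, and a "-" can appear where no time was measured. The article does not show how nginx-log-collector performs this conversion, so the following Python sketch, with its hypothetical parse_upstream_times helper, only illustrates one way such a string could be split into a list.

```python
import re


def parse_upstream_times(raw: str) -> list:
    """Split an nginx $upstream_response_time value into a list of floats."""
    times = []
    # nginx separates values for successive upstreams with "," or ":".
    for part in re.split(r"[,:]", raw):
        part = part.strip()
        # Skip empty pieces and the "-" placeholder for a missing time.
        if part and part != "-":
            times.append(float(part))
    return times


if __name__ == "__main__":
    print(parse_upstream_times("0.003, 0.120 : 0.050"))
    print(parse_upstream_times("-"))
```

With a list like this, the panel's arrayElement(upstream_response_time, 1) corresponds to the first entry, that is, the first upstream server that was tried.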

Table Count Status for all vhosts:

$columns(status, count(*) as c) from $table

Overall view of the dashboard

[Dashboard screenshots]

Comparison of avg() and quantile()

avg()
[Screenshot: graph built with avg()]
quantile()
[Screenshot: graph built with quantile()]
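
The difference between the two aggregates can also be seen on a small made-up sample. The sketch below uses Python's statistics module rather than ClickHouse, and the summarize helper and the request times are hypothetical: 99 fast requests and one very slow one. A single outlier pulls the average up noticeably, while the median and the 95th percentile stay at the typical value.

```python
import statistics


def summarize(request_times):
    """Return the mean plus the 50th and 95th percentiles of the sample."""
    mean = statistics.fmean(request_times)
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(request_times, n=100, method="inclusive")
    return {"avg": mean, "p50": cuts[49], "p95": cuts[94]}


if __name__ == "__main__":
    # Hypothetical data: 99 requests at 10 ms and one at 5 s.
    sample = [0.01] * 99 + [5.0]
    print(summarize(sample))
```

On this sample the average is roughly six times the median, which is why a percentile panel is often a better picture of what most requests experience.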

Conclusion:

I hope the community will get involved in developing, testing, and using nginx-log-collector.
And when someone deploys nginx-log-collector, I hope they will share how much disk, RAM, and CPU they saved.

Telegram channels:

Milliseconds:

If milliseconds matter to you, please comment or vote in this issue.

Source: www.habr.com
