nginx-log-collector: a utility from Avito for shipping nginx logs to ClickHouse

This article looks at nginx-log-collector, a project that reads nginx logs and sends them to a ClickHouse cluster. ElasticSearch is the usual choice for logs, but ClickHouse needs fewer resources (disk space, RAM, CPU), writes data faster, and compresses data so that it takes up less space on disk. The advantages of ClickHouse are summed up in two slides from the talk "How VK inserts data into ClickHouse from tens of thousands of servers".

To view analytics on the logs, we will build a Grafana dashboard.

If you are interested, read on.

Install nginx and Grafana in the standard way.

Install the ClickHouse cluster with the ansible-playbook from Denis Proskurin.

Creating the database and tables in ClickHouse

This file describes the SQL queries that create the databases and tables for nginx-log-collector in ClickHouse.

We run each query in turn on every server of the ClickHouse cluster.

Important note. In the line below, logs_cluster must be replaced with the name of your own cluster. In clickhouse_remote_servers.xml it is the element that sits between "remote_servers" and "shard".

ENGINE = Distributed('logs_cluster', 'nginx', 'access_log_shard', rand())
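
To make it clearer where that cluster name lives, here is a minimal Python sketch that lists the elements sitting directly under "remote_servers". The XML excerpt and the host names in it are made up for illustration; your real clickhouse_remote_servers.xml will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of clickhouse_remote_servers.xml; hosts are placeholders.
SAMPLE_XML = """
<yandex>
  <remote_servers>
    <logs_cluster>
      <shard>
        <replica><host>ch-node-1</host><port>9000</port></replica>
      </shard>
      <shard>
        <replica><host>ch-node-2</host><port>9000</port></replica>
      </shard>
    </logs_cluster>
  </remote_servers>
</yandex>
"""


def cluster_names(xml_text):
    """Return the tag names found directly under <remote_servers>."""
    root = ET.fromstring(xml_text.strip())
    remote = root.find("remote_servers")
    if remote is None:
        return []
    return [child.tag for child in remote]


names = cluster_names(SAMPLE_XML)
print(names)
```

Whatever name this prints for your file is the first argument to put into Distributed(...).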

Installing and configuring nginx-log-collector-rpm

nginx-log-collector does not ship an rpm. I build one for it here: https://github.com/patsevanton/nginx-log-collector-rpm. The rpm is built with Fedora Copr.

Install the nginx-log-collector rpm package:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nginx-log-collector-rpm
yum -y install nginx-log-collector
systemctl start nginx-log-collector

Edit the config /etc/nginx-log-collector/config.yaml:

  .......
  upload:
    table: nginx.access_log
    dsn: http://clickhouse-cluster-ip-address:8123/

- tag: "nginx_error:"
  format: error  # access | error
  buffer_size: 1048576
  upload:
    table: nginx.error_log
    dsn: http://clickhouse-cluster-ip-address:8123/
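
Once the collector is running, a quick way to see whether rows are arriving is to ask ClickHouse for a row count over the same HTTP interface the dsn points at. The sketch below only builds such a URL from a dsn and a table name. The host clickhouse.example is a placeholder, and count_query_url is a helper name I made up for this example.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def count_query_url(dsn, table):
    """Build a URL that asks ClickHouse's HTTP interface for a row count."""
    parts = urlsplit(dsn)
    # Percent-encode the SQL so it can travel in the ?query= parameter.
    query = urlencode({"query": f"SELECT count() FROM {table}"}, quote_via=quote)
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/", query, ""))


# Placeholder host; substitute the dsn from config.yaml.
url = count_query_url("http://clickhouse.example:8123/", "nginx.access_log")
print(url)
```

Opening the printed URL with curl or a browser should return a single number if the table exists and the server is reachable.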

Configuring nginx

The general nginx config:

user  nginx;
worker_processes  auto;

#error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format avito_json escape=json
                     '{'
                     '"event_datetime": "$time_iso8601", '
                     '"server_name": "$server_name", '
                     '"remote_addr": "$remote_addr", '
                     '"remote_user": "$remote_user", '
                     '"http_x_real_ip": "$http_x_real_ip", '
                     '"status": "$status", '
                     '"scheme": "$scheme", '
                     '"request_method": "$request_method", '
                     '"request_uri": "$request_uri", '
                     '"server_protocol": "$server_protocol", '
                     '"body_bytes_sent": $body_bytes_sent, '
                     '"http_referer": "$http_referer", '
                     '"http_user_agent": "$http_user_agent", '
                     '"request_bytes": "$request_length", '
                     '"request_time": "$request_time", '
                     '"upstream_addr": "$upstream_addr", '
                     '"upstream_response_time": "$upstream_response_time", '
                     '"hostname": "$hostname", '
                     '"host": "$host"'
                     '}';

    access_log     syslog:server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx avito_json; #ClickHouse
    error_log      syslog:server=unix:/var/run/nginx_log.sock,nohostname,tag=nginx_error; #ClickHouse

    #access_log  /var/log/nginx/access.log  main;

    proxy_ignore_client_abort on;
    sendfile        on;
    keepalive_timeout  65;
    include /etc/nginx/conf.d/*.conf;
}
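
To check what the avito_json format produces, it helps to parse one rendered line by hand. The sketch below uses a made-up payload (all values are invented) and shows a detail that is easy to miss: status and request_time are quoted in the format, so they arrive as strings, while body_bytes_sent is unquoted and arrives as a number.

```python
import json

# Hypothetical payload as the avito_json format would render a single request.
SAMPLE_LINE = (
    '{"event_datetime": "2019-10-01T12:00:00+03:00", "server_name": "vhost1", '
    '"remote_addr": "10.0.0.5", "remote_user": "", "http_x_real_ip": "", '
    '"status": "200", "scheme": "http", "request_method": "GET", '
    '"request_uri": "/", "server_protocol": "HTTP/1.1", '
    '"body_bytes_sent": 612, "http_referer": "", '
    '"http_user_agent": "1server", "request_bytes": "76", '
    '"request_time": "0.005", "upstream_addr": "10.0.0.11:8080", '
    '"upstream_response_time": "0.004", "hostname": "nginx-1", '
    '"host": "vhost1"}'
)

record = json.loads(SAMPLE_LINE)

status = int(record["status"])                # quoted in the format: a string
body_bytes = record["body_bytes_sent"]        # unquoted in the format: a number
request_time = float(record["request_time"])  # quoted in the format: a string

print(status, body_bytes, request_time)
```

If json.loads fails on a real line, the log_format most likely has a missing comma or quote.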

A single virtual host:

vhost1.conf:

upstream backend {
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
    server stub-http-server-ip-address:8080;
}

server {
    listen   80;
    server_name vhost1;
    location / {
        proxy_pass http://backend;
    }
}
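
With no balancing method specified, nginx spreads requests over the servers of an upstream block round-robin. For five equally weighted backends this amounts to cycling through the list, which the sketch below imitates. The addresses are placeholders, and this is an illustration of the idea rather than nginx's actual implementation.

```python
from itertools import cycle, islice

# Placeholder addresses standing in for the five stub servers.
BACKENDS = [f"10.0.0.{n}:8080" for n in range(11, 16)]


def pick_backends(backends, requests):
    """Return the backend chosen for each of the first `requests` requests."""
    return list(islice(cycle(backends), requests))


picks = pick_backends(BACKENDS, 7)
print(picks)
```

After five requests the sequence wraps around, so each stub server should see roughly the same share of the load.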

Add the virtual hosts to /etc/hosts:

nginx-server-ip-address vhost1

HTTP server emulator

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

nodejs-stub-server does not ship an rpm. I build one for it here: https://github.com/patsevanton/nodejs-stub-server. The rpm is built with Fedora Copr.

Install the nodejs-stub-server rpm package on the servers that act as nginx upstreams:

yum -y install yum-plugin-copr
yum copr enable antonpatsev/nodejs-stub-server
yum -y install stub_http_server
systemctl start stub_http_server

Load testing

The load test is run with Apache benchmark.

Install it:

yum install -y httpd-tools

We start the test with Apache benchmark from 5 different servers, one loop per server:

while true; do ab -H "User-Agent: 1server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 2server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 3server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 4server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done
while true; do ab -H "User-Agent: 5server" -c 10 -n 10 -t 10 http://vhost1/; sleep 1; done

Configuring Grafana

You will not find a ready-made dashboard on the official Grafana site.

So we will build it by hand.

You can find my saved dashboard here.

You also need to create a variable named table with the value nginx.access_log.

Singlestat Total Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Singlestat Failed Requests:

SELECT
 1 as t,
 count(*) as c
 FROM $table
 WHERE $timeFilter AND status NOT IN (200, 201, 401) GROUP BY t

Singlestat Fail Percent:

SELECT
 1 as t, (sum(status = 500 or status = 499)/sum(status = 200 or status = 201 or status = 401))*100 FROM $table
 WHERE $timeFilter GROUP BY t
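
It is worth spelling out what this percentage measures: it is the number of 500/499 responses relative to the number of 200/201/401 responses, not relative to all requests. Other statuses, such as 404, fall into neither group. Here is a small Python sketch of the same arithmetic on made-up sample data. Returning None when there are no successful requests is my own choice for the sketch, not something the query does.

```python
def fail_percent(statuses):
    """Failures (500, 499) as a percentage of successes (200, 201, 401)."""
    failed = sum(1 for s in statuses if s in (500, 499))
    ok = sum(1 for s in statuses if s in (200, 201, 401))
    if ok == 0:
        return None  # sketch-only guard against division by zero
    # Same value as (failed / ok) * 100, ordered to keep the float exact here.
    return failed * 100 / ok


# Made-up sample: 100 successes, 5 failures, 10 requests counted in neither group.
sample = [200] * 90 + [201] * 5 + [401] * 5 + [500] * 3 + [499] * 2 + [404] * 10

print(fail_percent(sample))
```

On this sample the 404 responses do not move the result at all, which is something to keep in mind when reading the panel.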

Singlestat Average Response Time:

SELECT
 1, avg(request_time) FROM $table
 WHERE $timeFilter GROUP BY 1

Singlestat Max Response Time:

SELECT
 1 as t, max(request_time) as c
 FROM $table
 WHERE $timeFilter GROUP BY t

Count Status:

$columns(status, count(*) as c) from $table

To display data as a pie chart, you need to install the plugin and restart Grafana.

grafana-cli plugins install grafana-piechart-panel
service grafana-server restart

Pie TOP 5 Status:

SELECT
    1, /* fake timestamp value */
    status,
    count() AS Reqs
FROM $table
WHERE $timeFilter
GROUP BY status
ORDER BY Reqs desc
LIMIT 5
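
The idea behind a top-5 pie is to count requests per status and keep the five largest groups. A minimal offline sketch of that ranking in Python, on made-up status codes, looks like this:

```python
from collections import Counter

# Made-up status codes standing in for rows of the access log table.
statuses = (
    [200] * 50 + [404] * 7 + [500] * 3 + [301] * 4
    + [499] * 2 + [201] * 6 + [401] * 1
)

# Count rows per status, then keep the five most frequent statuses.
top5 = Counter(statuses).most_common(5)

for status, requests in top5:
    print(status, requests)
```

Note that the ranking is by the number of requests, so a rare status with a large numeric code (say, a handful of 500s) does not outrank a frequent 200.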

From here on I give the queries without screenshots:

Count http_user_agent:

$columns(http_user_agent, count(*) c) FROM $table

GoodRate/BadRate:

$rate(countIf(status = 200) AS good, countIf(status != 200) AS bad) FROM $table

Response Timing:

$rate(avg(request_time) as request_time) FROM $table

Upstream response time (the response time of the first upstream):

$rate(avg(arrayElement(upstream_response_time,1)) as upstream_response_time) FROM $table
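
The arrayElement(..., 1) call is needed because a request can pass through several upstreams: nginx then writes several times into $upstream_response_time, separated by commas and colons. ClickHouse arrays are indexed from 1, so element 1 is the first upstream tried. The sketch below mimics that in Python. How nginx-log-collector itself parses the field is an assumption on my part, as is treating "-" as "no value".

```python
import re


def parse_upstream_times(raw):
    """Split an nginx $upstream_response_time string into a list of floats."""
    times = []
    for token in re.split(r"[,:]", raw):
        token = token.strip()
        if token and token != "-":  # assumption: "-" means no time recorded
            times.append(float(token))
    return times


def first_upstream_time(raw):
    """Python counterpart of arrayElement(upstream_response_time, 1)."""
    times = parse_upstream_times(raw)
    return times[0] if times else None


print(first_upstream_time("0.012, 0.034"))
```

If you care about the total time spent in upstreams rather than the first attempt, summing the list would be the analogous sketch.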

Table Count Status for all vhosts:

$columns(status, count(*) as c) from $table

Overall view of the dashboard

Comparing avg() and quantile()

avg()
quantile()
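
Why the two can look so different is easy to show on synthetic numbers: a couple of very slow requests pull the average up, while a quantile below the tail does not move. The sketch below uses made-up response times and a simple nearest-rank quantile. ClickHouse's quantile() is an approximate function, so treat this only as an illustration of the effect.

```python
import statistics


def quantile(values, level):
    """Nearest-rank quantile: the value at position level * len(values)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(level * len(ordered)))
    return ordered[index]


# Made-up response times in seconds: 98 fast requests and 2 very slow ones.
times = [0.010] * 98 + [2.0, 5.0]

mean = statistics.mean(times)
median = quantile(times, 0.5)
p90 = quantile(times, 0.9)

print(mean, median, p90, max(times))
```

Here the mean comes out several times larger than the median, even though 98 of the 100 requests took 10 ms.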

Conclusion:

I hope the community will join in developing, testing, and using nginx-log-collector.
And when someone does roll out nginx-log-collector, I hope they will share how much disk, RAM, and CPU it saved them.

Milliseconds:

If milliseconds matter to you, please write or vote in this issue.

Source: www.habr.com
