Sending Nginx json logs using Vector to Clickhouse and Elasticsearch

Vector is a tool designed to collect, transform and send log data, metrics and events.

→ GitHub

Since it is written in Rust, it offers high performance and low RAM consumption compared to its analogues. In addition, a lot of attention is paid to correctness-related features, in particular the ability to buffer unsent events on disk and to rotate files.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them on to one or more sinks.

Vector is a replacement for filebeat and logstash and can act in both roles (receive and send logs); more details about them on the project site.

If in Logstash the chain is built as input → filter → output, in Vector it is sources → transforms → sinks.
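A minimal sketch of that shape (the names and file path here are illustrative, not the configs used later in this article):

# tail a file, parse every line as JSON, print the events to stdout
[sources.in]
  type    = "file"
  include = [ "/var/log/example.log" ]

[transforms.parse]
  inputs = [ "in" ]
  type   = "json_parser"

[sinks.out]
  inputs   = [ "parse" ]
  type     = "console"
  encoding = "json"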

Examples can be found in the documentation.

This how-to is a reworked version of the instructions from Vyacheslav Rakhinsky. The original instructions include geoip processing. When testing geoip from an internal network, vector threw an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field=«geoip.country_name» rate_limit_secs=30

If you need geoip processing, refer to the original instructions from Vyacheslav Rakhinsky.

We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will set up 4 servers, although you could get by with 3.

The scheme is roughly as follows:

[diagram: Nginx (access logs) → Vector (client) → Vector (server) → Clickhouse / Elasticsearch]

Disable Selinux on all your servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install an HTTP server emulator plus utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

Nodejs-stub-server has no rpm package, so we build one for it. The rpm will be built using Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server mc httpd-tools screen

I tweaked the stub_http_server response time in the file /var/lib/stub_http_server/stub_http_server.js so that there would be more logs.

var max_sleep = 10;

Let's start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server

Installing Clickhouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless specified otherwise, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First of all, you need to add the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the file /etc/clickhouse-server/config.xml

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug (in the same config.xml):

debug

The default compression settings are:

min_compress_block_size  65536
max_compress_block_size  1048576

To activate Zstd compression, I was advised not to touch the config but to use DDL instead.

I could not google how to apply zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in Clickhouse, please share the instructions.
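For reference, ClickHouse does support setting a compression codec per column through DDL; a sketch applied to a single column (syntax from the ClickHouse docs, not from the original article):

ALTER TABLE vector.logs MODIFY COLUMN request_full String CODEC(ZSTD(1));

In CREATE TABLE the codec can be declared inline the same way, e.g. request_full String CODEC(ZSTD(1)).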

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to setting up Clickhouse.

Go into Clickhouse:

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where Clickhouse is installed.

Let's create the vector database:

CREATE DATABASE vector;

Let's check that the database exists:

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as they are */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;
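The table is partitioned by day (toYYYYMMDD(timestamp)), and the TTL clause makes Clickhouse drop rows older than one month on its own. Once data starts arriving, the daily partitions can be inspected with a query along these lines:

SELECT partition, rows, formatReadableSize(bytes_on_disk) AS size
FROM system.parts
WHERE database = 'vector' AND table = 'logs' AND active;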

Let's check that the tables have been created. Start clickhouse-client and run a query.

Switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing Elasticsearch on server 4, to send the same data to Elasticsearch for comparison with Clickhouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Create 2 repo files:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, you need to add the following to /etc/elasticsearch/elasticsearch.yml:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, change network.host.

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in the file /etc/kibana/kibana.yml

server.host: "0.0.0.0"

Enable and start elasticsearch at boot

systemctl enable elasticsearch
systemctl start elasticsearch

and the same for kibana

systemctl enable kibana
systemctl start kibana
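Once both are running, a quick check that Elasticsearch answers on its default port (for a single node, status green or yellow is fine):

curl http://localhost:9200/_cluster/health?pretty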

Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a large number of servers, in which case this is not necessary.

Update the default template for future indices:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 
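To confirm the template was applied, you can read it back:

curl http://localhost:9200/_template/default?pretty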

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's set Vector up as a replacement for Logstash. Edit the file /etc/vector/vector.toml

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    -- split s by delimiter and return the first element
    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    -- split s by delimiter and return the last element
    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    -- nginx can log several comma-separated upstream values when a request is retried;
    -- keep only the last one, and strip the port from the address
    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" #  ŠŠ“рŠµŃ Clickhouse
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adapt the transforms.nginx_parse_add_defaults section.

Since Vyacheslav Rakhinsky uses these configs for a small CDN, the upstream_* fields can carry several values.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If that is not your situation, this section can be simplified.

Let's create the systemd service unit /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Once the tables are created, you can start Vector

systemctl enable vector
systemctl start vector

Vector's logs can be viewed like this:

journalctl -f -u vector

The logs should contain entries like these:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server), server 1

On the server with nginx you need to disable ipv6, because the logs table in Clickhouse uses the upstream_addr field of type IPv4, since I do not use ipv6 inside the network. If ipv6 is not turned off, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps some reader will add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system
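To verify that ipv6 is really off, the counter should read 1:

sysctl net.ipv6.conf.all.disable_ipv6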

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf.

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # new log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
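With escape=json, each request is written to /var/log/nginx/access.json.log as a single JSON object per line, roughly like this (abbreviated, values illustrative):

{"node_name":"nginx-vector","timestamp":"2020-08-07T04:32:42+00:00","server_name":"vhost1","request_full":"GET / HTTP/1.0","request_method":"GET","response_status":"404","remote_addr":"172.26.10.106","upstream_addr":"172.26.10.106:8080", ...}

Note that nginx logs every value as a string, which is why the coercer transform on the Vector server casts them to numbers.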

So as not to break your existing configuration, Nginx allows several access_log directives

access_log  /var/log/nginx/access.log  main;            # standard log
access_log  /var/log/nginx/access.json.log vector;      # new log in json format

Do not forget to add a logrotate rule for the new log (if the log file does not end with .log); see the sketch below.
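For reference, a minimal logrotate rule could look like this (a sketch, path assumed from above; USR1 tells nginx to reopen its log files):

/var/log/nginx/access.json.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    postrotate
        [ -f /var/run/nginx.pid ] && kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}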

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the IP of the server where nginx is installed) to /etc/hosts on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And if everything is ready, then

nginx -t 
systemctl restart nginx

Now let's install vector itself here as well

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Create the systemd settings file /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the IP address of the log server (the Vector server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Do not forget to add the vector user to the appropriate group so that it can read the log files. For example, nginx on centos creates logs owned by the adm group.

usermod -a -G adm vector
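A quick way to verify the vector user can actually read the log (it should print a JSON line, not "Permission denied"):

sudo -u vector head -n 1 /var/log/nginx/access.json.log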

Let's start the vector service

systemctl enable vector
systemctl start vector

Vector's logs can be viewed like this:

journalctl -f -u vector

The logs should contain an entry like this:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

The testing is done using Apache benchmark.

The httpd-tools package has been installed on all servers.

We start the test with Apache benchmark from 4 different servers, inside screen. First we launch the screen terminal multiplexer, then we start the test with Apache benchmark. How to work with screen can be found in this article.

From server 1

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From server 2

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From server 3

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From server 4

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Checking the data in Clickhouse

Go into Clickhouse:

clickhouse-client -h 172.26.10.109 -m

Run an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────┘
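While the benchmark is running, a plain count is an easy way to watch rows arrive, for example:

SELECT count() FROM vector.logs;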

Find out the sizes of the tables in Clickhouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in Clickhouse.

The size of the logs table is 857.19 MB.

The size of the same data in the Elasticsearch index is 4.5 GB.

With the vector parameters left at their defaults, the data in Clickhouse takes 4500 / 857.19 = 5.25 times less space than in Elasticsearch.

Compression is used by default in vector.

Telegram chat on Clickhouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: www.habr.com
