Sending Nginx json logs using Vector to Clickhouse and Elasticsearch


Vector is designed for collecting, transforming, and sending log data, metrics, and events.

→ Github

Written in Rust, it is notable for high performance and low RAM usage compared to its analogs. In addition, much attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer, and to file rotation.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.

Vector is a replacement for filebeat and logstash; it can act in both roles (receive and send logs). More details are on their site.

Where Logstash builds its chain as input → filter → output, in Vector it is sources → transforms → sinks.

Examples can be found in the documentation.
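
As a minimal sketch of that sources → transforms → sinks chain (component names here are arbitrary, and the console sink is only for illustration; field names follow the Vector 0.9 docs):

```toml
# Read a file, parse each line as JSON, print the resulting events to stdout
[sources.in]
  type    = "file"
  include = [ "/var/log/nginx/access.json.log" ]

[transforms.parse]
  inputs = [ "in" ]
  type   = "json_parser"

[sinks.out]
  inputs   = [ "parse" ]
  type     = "console"
  encoding = "json"
```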

This guide is a reworked guide from Vyacheslav Rakhinsky. The original guide includes geoip processing. When testing geoip on an internal network, vector threw an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If someone needs to process geoip, refer to the original guide from Vyacheslav Rakhinsky.

We will set up the combination Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will install 4 servers, although you can get by with 3.


The overall scheme looks something like this.

Disable Selinux on all your servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

We install an HTTP server emulator plus utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server from Maxim Ignatenko

Nodejs-stub-server has no rpm, so an rpm has to be built for it. The rpm is built using Fedora Copr

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server screen mc httpd-tools

I adjusted the stub_http_server response time in the /var/lib/stub_http_server/stub_http_server.js file so that there would be more logs.

var max_sleep = 10;

Start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server

Installing Clickhouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless specified otherwise, processor support for it becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First, connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the file /etc/clickhouse-server/config.xml

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug

debug

Standard compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable Zstd compression, the advice was not to touch the config, but to use DDL instead.


I could not find how to apply zstd compression via DDL anywhere on Google, so I left it as is.

Colleagues who use zstd compression in Clickhouse, please share the instructions.
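
For what it's worth, recent ClickHouse releases do accept a per-column compression codec directly in DDL. A hedged sketch (verify against your server version; the table and column names here are illustrative, not the article's schema):

```sql
-- Sketch only: per-column ZSTD codec in DDL, supported by modern ClickHouse
CREATE TABLE vector.logs_zstd
(
    `timestamp`    DateTime,
    `request_full` String CODEC(ZSTD(1))
)
ENGINE = MergeTree()
ORDER BY timestamp;
```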

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to configuring Clickhouse

Connect to Clickhouse

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where Clickhouse is installed.

Let's create the vector database

CREATE DATABASE vector;

Check that the database exists.

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Check that the table was created. Launch clickhouse-client and run a query.

Switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

List the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing elasticsearch on the 4th server to send the same data to Elasticsearch for comparison with Clickhouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Create 2 repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, add the following to the /etc/elasticsearch/elasticsearch.yml file:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, change network.host.

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in the file /etc/kibana/kibana.yml

server.host: "0.0.0.0"

Start elasticsearch and add it to autostart

systemctl enable elasticsearch
systemctl start elasticsearch

and likewise kibana

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch in single-node mode: 1 shard, 0 replicas. Most likely you have a cluster with a large number of servers, in which case you do not need to do this.

For future indices, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 
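
Before sending, the template body can be sanity-checked locally; a small optional step (assumes python3 is available on the host):

```shell
# Pretty-print the template JSON; json.tool exits non-zero on invalid JSON
echo '{"index_patterns":["*"],"order":-1,"settings":{"number_of_shards":"1","number_of_replicas":"0"}}' \
  | python3 -m json.tool
```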

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's configure Vector as a replacement for Logstash. Edit the file /etc/vector/vector.toml

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_send = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" # Clickhouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section.

Vyacheslav Rakhinsky uses these configs for a small CDN, so the upstream_* fields can contain multiple values.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If that is not your case, this section can be simplified.
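
The idea behind the Lua split_first/split_last helpers can be checked stand-alone; a sketch in plain shell/awk mirroring what they do to such multi-value fields:

```shell
# split_last: take the value of the final upstream attempt
last()  { printf '%s\n' "$1" | awk -F', ' '{ print $NF }'; }
# split_first: take the first element
first() { printf '%s\n' "$1" | awk -F', ' '{ print $1 }'; }

last  "-, -, 123"                                        # value of the last attempt
last  "502, 502, 200"                                    # final upstream status
first "128.66.0.10:443, 128.66.0.11:443" | cut -d: -f1   # first upstream host, port stripped
```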

Create the systemd service settings in /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

After creating the tables, you can start Vector

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain entries like these:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server) - 1st server

On the server with nginx, you need to disable ipv6, since the logs table in clickhouse uses the upstream_addr field of type IPv4 (I don't use ipv6 inside the network). If ipv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps readers will add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system

Install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First of all, we need to configure the Nginx log format in the file /etc/nginx/nginx.conf

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
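
With escape=json, each request produces one JSON object per line in access.json.log. A synthetic, abridged example (values modeled on the sample ClickHouse row shown later in the article; note that nginx emits every value as a string, which is why the coercer transform is needed):

```json
{
  "node_name": "nginx-vector",
  "timestamp": "2020-08-07T04:32:42+00:00",
  "server_name": "vhost1",
  "request_full": "GET / HTTP/1.0",
  "request_method": "GET",
  "request_time": "0.028",
  "response_status": "404",
  "remote_addr": "172.26.10.106",
  "upstream_addr": "172.26.10.106:8080",
  "upstream_status": "404"
}
```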

So as not to break your current configuration, Nginx allows you to have several access_log directives

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in json format

Don't forget to add a logrotate rule for the new log (if the log file name does not end in .log)
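
If the new log is not matched by the stock /etc/logrotate.d/nginx pattern, a minimal hand-written rule could look like this (a sketch; path and retention are illustrative):

```
/var/log/nginx/access.json.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    postrotate
        # Ask nginx to reopen its log files
        [ -f /var/run/nginx.pid ] && kill -USR1 `cat /var/run/nginx.pid`
    endscript
}
```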

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the ip of the server where nginx is installed) to the /etc/hosts file on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And if everything is ready, then

nginx -t 
systemctl restart nginx

Now let's install Vector itself

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Create the systemd settings file /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the address of the log server (Vector-Server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Don't forget to add the vector user to the appropriate group so that it can read the log files. For example, nginx on centos creates logs with adm group rights.

usermod -a -G adm vector

Start the vector service

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain an entry like this:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

We run the test using Apache benchmark.

The httpd-tools package was installed earlier on all servers

We start the test using Apache benchmark from 4 different servers in screen. First, launch the screen terminal multiplexer, and then start the test with Apache benchmark. How to work with screen can be found in an article.

From the 1st server

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the 2nd server

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the 3rd server

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the 4th server

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Let's check the data in Clickhouse

Connect to Clickhouse

clickhouse-client -h 172.26.10.109 -m

Run an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────
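
With the typed columns in place, aggregate queries are where ClickHouse is most useful; an illustrative example (column names per the table created earlier) grouping requests by host and status:

```sql
SELECT
    request_http_host,
    response_status,
    count() AS hits,
    avg(request_time) AS avg_request_time
FROM vector.logs
GROUP BY request_http_host, response_status
ORDER BY hits DESC
LIMIT 10;
```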

Find out the size of the tables in Clickhouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in Clickhouse.


The size of the logs table is 857.19 MB.


The size of the same data in the Elasticsearch index is 4.5 GB.

Without specifying any extra parameters in vector, the data in Clickhouse takes up 4500/857.19 = 5.24 times less space than in Elasticsearch.

In vector, the compression field is used by default.

Telegram chat on Clickhouse
Telegram chat on Elasticsearch
Telegram chat on "Collecting and analyzing system messages"

Source: www.habr.com
