Sending Nginx json logs using Vector to Clickhouse and Elasticsearch

Vector is designed to collect, transform, and ship logs, metrics, and events.

→ github

Being written in Rust, it is characterized by high performance and low RAM consumption compared to its counterparts. In addition, a lot of attention is paid to correctness-related features, in particular the ability to buffer unsent events to disk and to rotate files.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them on to one or more sinks.

Vector is a replacement for filebeat and logstash: it can act in both roles (receiving and sending logs). More details are on their site.

If in Logstash the chain is built as input → filter → output, then in Vector it is sources → transforms → sinks.

Examples can be found in the documentation.

This how-to is a reworked version of the guide by Vyacheslav Rakhinsky. The original guide includes geoip processing. When geoip was tested from an internal network, Vector produced an error:

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If anyone needs geoip processing, refer to the original guide by Vyacheslav Rakhinsky.

We will set up the following chain: Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will use 4 servers, although you can get by with 3.

The scheme looks roughly like this:

(Diagram: Nginx access logs → Vector client → Vector server → Clickhouse and Elasticsearch)

Disable SELinux on all your servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
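If you would rather not reboot right away, you can also switch SELinux to permissive mode for the current session; this is just a stop-gap until the reboot applies the config change above.

setenforce 0
getenforce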

Installing an HTTP server emulator + utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko

Nodejs-stub-server does not have an rpm, so we need to build an rpm for it. The rpm will be built using Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server screen mc httpd-tools

In the /var/lib/stub_http_server/stub_http_server.js file, I adjusted the stub_http_server response time so that there would be more logs.

var max_sleep = 10;

Let's start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server
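To make sure the stub is answering, a quick curl check can be run; port 8080 is an assumption here, matching the port the nginx virtual hosts below proxy to.

curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://localhost:8080/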

Installing Clickhouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless otherwise specified, support for it in the processor used becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First you need to connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the file /etc/clickhouse-server/config.xml

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug:

<level>debug</level>

Standard compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable Zstd compression, it was advised not to touch the config but to use DDL instead.


I could not google how to apply zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in Clickhouse, please share the instructions.

To start the server as a daemon, run:

service clickhouse-server start
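A quick way to confirm that the server is up is its HTTP interface on port 8123, which replies with Ok.:

curl http://localhost:8123/
Ok.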

Now let's move on to configuring Clickhouse

Go into Clickhouse

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where Clickhouse is installed.

Let's create the vector database

CREATE DATABASE vector;

Let's check that the database exists.

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as-is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Let's check that the table has been created. Start clickhouse-client and run a query.

Let's switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘
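Before wiring Vector in, it is worth checking that the table accepts rows in the same JSONEachRow format the clickhouse sink sends over HTTP. A minimal sketch with made-up test values and only a few of the columns (the remaining columns get their defaults):

# insert one test row with a handful of columns; the rest use the column defaults
echo '{"node_name":"test","timestamp":"2020-08-07 00:00:00","server_name":"vhost1","response_status":200}' | \
  clickhouse-client -h 172.26.10.109 --query="INSERT INTO vector.logs (node_name, timestamp, server_name, response_status) FORMAT JSONEachRow"

clickhouse-client -h 172.26.10.109 --query="SELECT count() FROM vector.logs"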

Installing Elasticsearch on the 4th server in order to send the same data to Elasticsearch for comparison with Clickhouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Let's create 2 repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, you need to add the following to the /etc/elasticsearch/elasticsearch.yml file:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, let's change network.host.

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in the file /etc/kibana/kibana.yml

server.host: "0.0.0.0"

Start elasticsearch and enable it at boot

systemctl enable elasticsearch
systemctl start elasticsearch

and kibana

systemctl enable kibana
systemctl start kibana
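Elasticsearch can take a little while to start; the cluster health endpoint is a convenient check that it is up and reachable:

curl -s 'http://localhost:9200/_cluster/health?pretty'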

Configuring Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster with a large number of servers and you will not need to do this.

For future indexes, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 
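To verify that the template has been applied:

curl -s 'http://localhost:9200/_template/default?pretty'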

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's configure Vector as a replacement for Logstash. Edit the file /etc/vector/vector.toml

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" #  Адрес Clickhouse
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section.

Vyacheslav Rakhinsky uses these configs for a small CDN, where there can be several values in upstream_*.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your case, this section can be simplified.
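Before creating the service, it can be useful to run Vector once in the foreground to catch configuration errors (stop it with Ctrl+C); the --config flag just points at the same file it would read by default:

vector --config /etc/vector/vector.toml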

Let's create a service file for systemd, /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

After creating the tables, you can start Vector

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be entries like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server) - the 1st server

On the server with nginx, you need to disable ipv6, since the logs table in clickhouse uses the IPv4 type for the upstream_addr field (I do not use ipv6 inside the network). If ipv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Readers, feel free to add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system
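You can check that the setting took effect:

sysctl net.ipv6.conf.all.disable_ipv6
net.ipv6.conf.all.disable_ipv6 = 1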

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First of all, we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

In order not to break your current configuration, Nginx allows you to have several access_log directives

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in json format

Don't forget to add a logrotate rule for the new log (if the log file name does not end with .log); an example is sketched below.
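A possible sketch of such a rule, assuming the json log were written to /var/log/nginx/access.json (the file name and retention here are arbitrary; the USR1 signal asks nginx to reopen its log files after rotation):

# write a dedicated logrotate rule for the json access log
cat > /etc/logrotate.d/nginx-json <<'EOF'
/var/log/nginx/access.json {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    postrotate
        [ -f /var/run/nginx.pid ] && kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}
EOF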

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the IP of the server where nginx is installed) to the /etc/hosts file on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And if everything is ready:

nginx -t 
systemctl restart nginx
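A quick smoke test that all four virtual hosts resolve and answer through nginx (the stub returns some HTTP status for /, which is enough to prove the chain works):

# print each vhost name and the HTTP status nginx returns for /
for h in vhost1 vhost2 vhost3 vhost4; do
  curl -s -o /dev/null -w "$h %{http_code}\n" http://$h/
done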

Now let's install Vector itself

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Let's create the configuration file for systemd, /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the IP address of the log server (Vector-Server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Don't forget to add the vector user to the required group so that it can read the log files. For example, nginx on centos creates logs with adm group permissions.

usermod -a -G adm vector
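You can verify the access right away; if this prints a permission error, check which group actually owns the log files on your system:

sudo -u vector head -n 1 /var/log/nginx/access.json.log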

Let's start the vector service

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be an entry like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

Testing is performed using Apache benchmark.

The httpd-tools package was already installed on all servers

We start testing using Apache benchmark from 4 different servers inside screen. First we launch the screen terminal multiplexer, and then we start testing with Apache benchmark. How to work with screen is described in this article.

From the 1st server

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the 2nd server

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the 3rd server

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the 4th server

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done
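While the benchmark loops are running, you can watch rows arriving on the ClickHouse side; entries may appear with a short delay because the sink batches inserts:

clickhouse-client -h 172.26.10.109 --query="SELECT count() FROM vector.logs"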

Let's check the data in Clickhouse

Go into Clickhouse

clickhouse-client -h 172.26.10.109 -m

Run an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────

Find out the size of the tables in Clickhouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs take up in Clickhouse.


The log data size is 857.19 MB.
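For comparison, the size of the Elasticsearch indexes can be fetched with the _cat API (vector-* matches the index pattern configured in the sink):

curl -s 'http://172.26.10.116:9200/_cat/indices/vector-*?v&h=index,docs.count,store.size'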


The size of the same data in the Elasticsearch index is 4.5 GB.

The data in Clickhouse thus takes 4500 / 857.19 = 5.24 times less space than in Elasticsearch.

In Vector, the compression field is used by default.

Telegram chat on Clickhouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: www.hab.com
