Sending Nginx JSON logs with Vector to Clickhouse and Elasticsearch


Vector is designed to collect, transform, and send log data, metrics, and events.

→ Github

It is written in Rust and is notable for high performance and low RAM consumption compared to its analogues. In addition, a lot of attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer, and file rotation.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.

Vector is a replacement for filebeat and logstash; it can act in both roles (receiving and sending logs). More details are on their site.

If in Logstash the chain is built as input → filter → output, then in Vector it is sources → transforms → sinks.
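As a rough illustration of that chain (the component names and the file path here are hypothetical, not part of this setup), a minimal vector.toml wiring one source through one transform into one sink might look like:

```toml
# Hypothetical minimal pipeline: source -> transform -> sink
data_dir = "/var/lib/vector"

[sources.my_logs]                # the "input" stage
  type    = "file"
  include = [ "/var/log/example.log" ]

[transforms.my_parser]           # the "filter" stage
  type   = "json_parser"
  inputs = [ "my_logs" ]

[sinks.my_console]               # the "output" stage
  type     = "console"
  inputs   = [ "my_parser" ]
  encoding = "json"
```

Each component references the components it consumes from via `inputs`, which is how the chain is assembled.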

Examples can be found in the documentation.

This guide is a reworked guide from Vyacheslav Rakhinsky. The original guide includes geoip processing. When testing geoip from an internal network, vector returned an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If anyone needs geoip processing, refer to the original guide from Vyacheslav Rakhinsky.

We will set up the combination Nginx (access logs) → Vector (Client | Filebeat) → Vector (Server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will install 4 servers, although you can get by with 3.

The scheme looks something like this:

[Diagram: Nginx (access logs) → Vector (client) → Vector (server) → Clickhouse and Elasticsearch]

Disable SELinux on all your servers:

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Installing an HTTP server emulator + utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server from Maxim Ignatenko.

Nodejs-stub-server does not have an rpm, so I created one for it. The rpm is built using Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository:

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark, and the screen terminal multiplexer on all servers:

yum -y install stub_http_server screen mc httpd-tools

I adjusted the stub_http_server response time in the /var/lib/stub_http_server/stub_http_server.js file so that there would be more logs.

var max_sleep = 10;

Let's start stub_http_server:

systemctl start stub_http_server
systemctl enable stub_http_server

Installing Clickhouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless specified otherwise, support for it in the processor being used becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First you need to connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the file /etc/clickhouse-server/config.xml:

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug:

debug

Standard compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable Zstd compression, the advice was not to touch the config but to use DDL instead.


I could not find in Google how to apply zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in Clickhouse, please share the instructions.
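For what it's worth, ClickHouse does support per-column compression codecs through DDL; here is a sketch (the column choice is illustrative, and the syntax should be verified against your server version):

```sql
-- Declare a codec on a column at creation time:
--   `request_full` String CODEC(ZSTD(1)),
-- or change an existing column:
ALTER TABLE vector.logs
    MODIFY COLUMN `request_full` String CODEC(ZSTD(1));
```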

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to setting up Clickhouse.

Connect to Clickhouse:

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where Clickhouse is installed.

Let's create the vector database:

CREATE DATABASE vector;

Check that the database exists:

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;
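The table above is partitioned by day, and rows expire after a month via TTL. To see how the partitions fill up, a query against the standard system.parts table can be used, for example:

```sql
SELECT partition,
       sum(rows)                          AS rows,
       formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE database = 'vector' AND table = 'logs' AND active
GROUP BY partition
ORDER BY partition;
```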

Check that the tables have been created. Start clickhouse-client and run a query.

Switch to the vector database:

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables:

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing elasticsearch on server 4 to send the same data to Elasticsearch for comparison with Clickhouse

Add the public rpm key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Let's create 2 repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana:

yum install -y kibana elasticsearch

Since this will be a single-node setup, you need to add the following to the /etc/elasticsearch/elasticsearch.yml file:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, let's change network.host:

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in the file /etc/kibana/kibana.yml:

server.host: "0.0.0.0"

Start elasticsearch and add it to autostart:

systemctl enable elasticsearch
systemctl start elasticsearch

And the same for kibana:

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. You will most likely have a cluster of many servers, in which case you do not need to do this.

For future indices, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 

Installing Vector as a replacement for Logstash on server 2:

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's set up Vector as a replacement for Logstash, editing the file /etc/vector/vector.toml:

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_send = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" # Clickhouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section.

Since Vyacheslav Rakhinsky uses these configs for a small CDN, there can be several values in upstream_*.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your case, this section can be simplified.
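To make the Lua hook's intent concrete: it keeps only the last upstream in such comma-separated chains, strips the port from the address, and replaces "-" or empty metrics with defaults. A rough Python re-implementation of that logic (the function and variable names here are mine, for illustration only):

```python
# Rough Python equivalent of the Lua hook above (illustrative only).
def split_last(s, delimiter=", "):
    """Keep only the value reported by the last upstream in the chain."""
    return s.split(delimiter)[-1]

def split_first(s, delimiter=":"):
    """Drop the port from an 'ip:port' pair."""
    return s.split(delimiter)[0]

def normalize(value, default="0"):
    """nginx writes '-' when an upstream metric is absent; map it to a default."""
    return default if value in ("-", "") else value

upstream_addr = "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
upstream_bytes_received = "-, -, 123"

print(split_first(split_last(upstream_addr)))          # 128.66.0.12
print(normalize(split_last(upstream_bytes_received)))  # 123
```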

Let's create the service settings for systemd: /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

After creating the tables, you can run Vector:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain entries like these:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (Web server): 1st server

On the server with nginx you need to disable ipv6, since the logs table in clickhouse uses the field upstream_addr IPv4, because I do not use ipv6 inside the network. If ipv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps some readers will add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings:

sysctl --system

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo:

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package:

yum install -y nginx

First of all, we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf:

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

So as not to break your current configuration, Nginx allows you to have several access_log directives:

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in json format

Don't forget to add a logrotate rule for the new logs (if the log file does not end with .log).
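For example, a minimal logrotate rule might look like this (the path here is a hypothetical non-.log file, and the schedule is illustrative; the USR1 signal makes nginx reopen its log files):

```
/var/log/nginx/access.json {
    daily
    rotate 7
    compress
    missingok
    postrotate
        [ -f /var/run/nginx.pid ] && kill -USR1 `cat /var/run/nginx.pid`
    endscript
}
```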

Remove default.conf from /etc/nginx/conf.d/:

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf:

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf:

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf:

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf:

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the hosts (172.26.10.106 is the ip of the server where nginx is installed) to the /etc/hosts file on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And if everything is fine, then:

nginx -t 
systemctl restart nginx

Now let's install vector itself:

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Create the settings file for systemd: /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. IP address 172.26.10.108 is the IP address of the log server (Vector-Server):

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Don't forget to add the vector user to the required group so that it can read the log files. For example, nginx on centos creates logs with adm group permissions.

usermod -a -G adm vector

Start the vector service:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain an entry like this:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

Testing is performed using Apache benchmark.

The httpd-tools package was installed on all servers.

We run the test using Apache benchmark from 4 different servers in screen. First we start the screen terminal multiplexer, and then we run the test using Apache benchmark. How to work with screen is covered in the article.

From the first server:

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the second server:

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the third server:

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the fourth server:

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Let's check the data in Clickhouse.

Connect to Clickhouse:

clickhouse-client -h 172.26.10.109 -m

Run an SQL query:

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────

Find out the size of the tables in Clickhouse:

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in Clickhouse.


The size of the logs table is 857.19 MB.


The size of the same data in the Elasticsearch index is 4.5GB.

Even without fine-tuning the data in the vector parameters, Clickhouse takes 4500/857.19 = 5.24 times less space than Elasticsearch.

In vector, the compression field was left at its default.

Telegram chat on Clickhouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: www.habr.com
