Sending Nginx JSON logs with Vector to ClickHouse and Elasticsearch

Vector is a tool designed to collect, transform and send log data, metrics and events.

→ Github

Written in Rust, it is characterized by high performance and low RAM consumption compared to its analogues. In addition, much attention is paid to features related to correctness, in particular the ability to save unsent events to an on-disk buffer and to rotate files.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to these messages, and sends them to one or more sinks.

Vector is a replacement for Filebeat and Logstash; it can act in both roles (receiving and sending logs). More details are available on the site.

If in Logstash the chain is built as input → filter → output, then in Vector it is sources → transforms → sinks.
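The sources → transforms → sinks chain can be illustrated with a toy event router. This is only a conceptual sketch in Python, not Vector's implementation; the names `source`, `add_length` and `run_pipeline` are hypothetical, and the two lists merely stand in for sinks such as ClickHouse and Elasticsearch.

```python
def source(lines):
    """Source: turn raw lines into events (plain dicts)."""
    for line in lines:
        yield {"message": line}


def add_length(event):
    """Transform: return a copy of the event with a derived field."""
    return {**event, "length": len(event["message"])}


def run_pipeline(events, transforms, sinks):
    """Route every event through all transforms, then fan out to all sinks."""
    for event in events:
        for transform in transforms:
            event = transform(event)
        for sink in sinks:
            sink(event)


# Two in-memory lists stand in for the real sinks.
clickhouse_like = []
elasticsearch_like = []

run_pipeline(
    source(["GET /", "POST /login"]),
    [add_length],
    [clickhouse_like.append, elasticsearch_like.append],
)
print(clickhouse_like)
```

Every event reaches every sink, which is the behaviour this guide relies on later when the same parsed logs are written both to ClickHouse and to Elasticsearch.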

Examples can be found in the documentation.

This guide is a reworked version of the guide by Vyacheslav Rakhinsky. The original guide includes geoip processing. When geoip was tested from an internal network, Vector reported an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If you need geoip processing, refer to the original guide by Vyacheslav Rakhinsky.

We will configure the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → separately into ClickHouse and separately into Elasticsearch. We will set up 4 servers, although you can get by with 3.

[Figure: diagram of the overall scheme]

Disable SELinux on all your servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install an HTTP server emulator and utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

nodejs-stub-server does not ship an rpm, so one was built for it using Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache Benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server screen mc httpd-tools

I adjusted the stub_http_server response time in the /var/lib/stub_http_server/stub_http_server.js file so that more logs would be produced.

var max_sleep = 10;

Start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server

Installing ClickHouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless otherwise specified, support for it in the processor is an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First, add the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following command:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the /etc/clickhouse-server/config.xml file

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug

<level>debug</level>

Default compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable zstd compression, I was advised not to touch the config but to use DDL instead.

I could not find on Google how to use zstd compression via DDL, so I left it as it is.

Colleagues who use zstd compression in ClickHouse, please share your instructions.
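One possible direction, offered as an untested sketch rather than the author's method: ClickHouse column definitions accept a CODEC clause, and ZSTD is one of the available codecs, so the DDL could take the form of ALTER TABLE ... MODIFY COLUMN statements. The helper `zstd_codec_statements`, the choice of columns and the compression level 1 are all assumptions made for illustration; the script only prints the statements and does not connect to ClickHouse.

```python
def zstd_codec_statements(table, columns, level=1):
    """Build one ALTER TABLE statement per column, attaching a ZSTD codec.

    `columns` maps column names to their existing ClickHouse types,
    because MODIFY COLUMN is written here with the full type.
    """
    return [
        f"ALTER TABLE {table} MODIFY COLUMN `{name}` {col_type} CODEC(ZSTD({level}));"
        for name, col_type in columns.items()
    ]


# Hypothetical selection: the long string columns of vector.logs.
columns = {
    "request_full": "String",
    "request_user_agent": "String",
    "request_uri": "String",
}

for statement in zstd_codec_statements("vector.logs", columns):
    print(statement)
```

The printed statements can be pasted into clickhouse-client. Whether this beats the default compression for these logs would need to be measured with the system.parts query used later in this guide.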

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to configuring ClickHouse

Connect to ClickHouse

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where ClickHouse is installed.

Create the vector database

CREATE DATABASE vector;

Check that the database exists.

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Check that the table has been created. Start clickhouse-client and run a query.

Switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

List the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing Elasticsearch on server 4, to send the same data to Elasticsearch for comparison with ClickHouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Create 2 repo files:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install Elasticsearch and Kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, add the following to the /etc/elasticsearch/elasticsearch.yml file:

discovery.type: single-node

So that Vector can send data to Elasticsearch from another server, change network.host.

network.host: 0.0.0.0

To be able to connect to Kibana, change the server.host parameter in the /etc/kibana/kibana.yml file

server.host: "0.0.0.0"

Enable Elasticsearch at boot and start it

systemctl enable elasticsearch
systemctl start elasticsearch

and Kibana

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster with a large number of servers and will not need to do this.

For future indices, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Configure Vector as a replacement for Logstash by editing the /etc/vector/vector.toml file

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" #  ClickHouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 is the server where Elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You may want to adjust the transforms.nginx_parse_add_defaults section.

Vyacheslav Rakhinsky uses these configs for a small CDN, where the upstream_* fields can contain several values.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your situation, this section can be simplified.
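To see what the Lua helpers do with such multi-valued fields, here is a small Python re-implementation of the same idea: keep only the last upstream's value, strip the port from the address, and replace empty or "-" values with defaults. It is an illustrative sketch covering three of the fields, not a replacement for the transform; `normalize_upstream` is a hypothetical name.

```python
def split_first(value, delimiter):
    """Return the part before the first delimiter."""
    return value.split(delimiter)[0]


def split_last(value, delimiter):
    """Return the part after the last delimiter."""
    return value.split(delimiter)[-1]


def normalize_upstream(record):
    """Keep the last upstream's values and fill in defaults, as the Lua hook does."""
    result = dict(record)

    # Last upstream in the list, without its port.
    addr = split_first(split_last(record["upstream_addr"], ", "), ":")
    result["upstream_addr"] = addr if addr != "" else "127.0.0.1"

    # Numeric fields: last value, with "-" or "" replaced by "0".
    for field in ("upstream_bytes_received", "upstream_status"):
        last = split_last(record[field], ", ")
        result[field] = "0" if last in ("-", "") else last

    return result


sample = {
    "upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443",
    "upstream_bytes_received": "-, -, 123",
    "upstream_status": "502, 502, 200",
}
print(normalize_upstream(sample))
```

With a single upstream per request, the split on ", " is a no-op, which is why the section can be reduced to the default-filling checks in that case.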

Create the systemd service settings in /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Once the tables have been created, you can start Vector

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain entries like these:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server): server 1

On the server with nginx you need to disable IPv6, because the logs table in ClickHouse uses the IPv4 type for the upstream_addr field, since I do not use IPv6 on my network. If IPv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps readers will add IPv6 support.
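The following sketch shows one way such a value can end up invalid. It assumes that nginx writes an IPv6 upstream as a bracketed address such as "[::1]:8080" (an assumption, not taken from this setup's logs) and mimics the split_first(value, ':') step of the Lua transform; `upstream_host` and `is_valid_ipv4` are hypothetical helper names.

```python
import ipaddress


def upstream_host(upstream_addr):
    """Mimic split_first(value, ':') from the Lua transform."""
    return upstream_addr.split(":")[0]


def is_valid_ipv4(value):
    """Return True if the value parses as an IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


for addr in ("172.26.10.106:8080", "[::1]:8080"):
    host = upstream_host(addr)
    print(addr, "->", repr(host), "valid IPv4:", is_valid_ipv4(host))
```

Supporting IPv6 would therefore need both a different column type in the table and a port-stripping step that does not split on the first colon.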

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system

Install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First, we need to configure the log format in nginx in the /etc/nginx/nginx.conf file

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in JSON format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
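Because every value in the vector log format is wrapped in double quotes, each field arrives as a JSON string, including the numeric ones. The sketch below parses a hypothetical, shortened log line (the values are modelled on the query output shown later, and the timezone offset is an assumption) and converts a few fields, which is what the coercer transform on the Vector server does for the full set.

```python
import json

# Hypothetical shortened line in the "vector" log format.
sample_line = (
    '{"node_name":"nginx-vector","timestamp":"2020-08-07T04:32:42+00:00",'
    '"request_length":"66","request_time":"0.028","response_status":"404",'
    '"remote_port":"45886"}'
)

# Target types for a subset of fields, mirroring the coercer settings.
COERCIONS = {
    "request_length": int,
    "request_time": float,
    "response_status": int,
    "remote_port": int,
}


def coerce(record):
    """Convert known numeric fields; leave everything else as a string."""
    return {key: COERCIONS.get(key, str)(value) for key, value in record.items()}


raw = json.loads(sample_line)
typed = coerce(raw)
print(type(raw["request_length"]).__name__, "->", type(typed["request_length"]).__name__)
```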

So as not to break your current configuration, nginx allows you to have several access_log directives

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in JSON format

Do not forget to add a logrotate rule for the new logs (if the log file name does not end with .log)

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts to the /etc/hosts file on all servers (172.26.10.106 is the IP of the server where nginx is installed):

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And once everything is ready:

nginx -t 
systemctl restart nginx

Now install Vector itself

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Create the systemd settings file /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Then configure the Filebeat replacement in the /etc/vector/vector.toml config. 172.26.10.108 is the IP address of the log server (Vector server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Do not forget to add the vector user to the appropriate group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.

usermod -a -G adm vector

Start the vector service

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain an entry like this:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

We run the tests with Apache Benchmark.

The httpd-tools package was installed on all servers.

We start testing with Apache Benchmark from 4 different servers inside screen. First launch the screen terminal multiplexer, then start the Apache Benchmark run. How to work with screen is described in the article.

From server 1

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From server 2

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From server 3

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From server 4

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Check the data in ClickHouse

Connect to ClickHouse

clickhouse-client -h 172.26.10.109 -m

Run an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────

Find out the size of the tables in ClickHouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in ClickHouse.

The size of the logs table is 857.19 MB.

The size of the same data in the Elasticsearch index is 4.5 GB.

Without specifying any extra parameters in Vector, the data in ClickHouse takes 4500/857.19 ≈ 5.25 times less space than in Elasticsearch.
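The ratio can be recomputed from the two reported sizes. The snippet assumes, as the division above does, that 4.5 GB is taken as 4500 MB; the variable names are illustrative.

```python
clickhouse_mb = 857.19           # size of vector.logs reported by system.parts
elasticsearch_mb = 4.5 * 1000    # 4.5 GB index, taken as 4500 MB (assumption)

ratio = elasticsearch_mb / clickhouse_mb
print(f"Elasticsearch uses {ratio:.2f} times more space than ClickHouse")
```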

In Vector, the compression field is left at its default value.

Telegram chat for ClickHouse
Telegram chat for Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: will.com
