Sending Nginx JSON logs using Vector to ClickHouse and Elasticsearch

Vector is designed to collect, transform and send log data, metrics and events.

→ GitHub

Written in Rust, it is characterized by high performance and low RAM consumption compared to its counterparts. In addition, much attention is paid to correctness-related features, in particular the ability to buffer unsent events to an on-disk buffer and to rotate files.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.

Vector is a replacement for Filebeat and Logstash: it can act in both roles (receiving and sending logs); more details on their site.

If in Logstash the chain is built as input → filter → output, in Vector it is sources → transforms → sinks.
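For illustration only — the component names and file path below are invented, not part of this setup — the three stages map one-to-one onto sections of vector.toml:

```toml
# Hypothetical minimal pipeline: read a file, parse each
# line as JSON, print the result to stdout.
[sources.my_source]            # ~ Logstash input
  type    = "file"
  include = ["/var/log/example.log"]

[transforms.my_transform]      # ~ Logstash filter
  inputs = ["my_source"]
  type   = "json_parser"

[sinks.my_sink]                # ~ Logstash output
  inputs   = ["my_transform"]
  type     = "console"
  encoding = "json"
```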

Examples can be found in the documentation.

This guide is an adapted version of the guide by Vyacheslav Rakhinsky. The original instructions include geoip processing. When testing geoip against an internal network, Vector returned an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If anyone needs geoip processing, see the original instructions by Vyacheslav Rakhinsky.

We will configure the chain Nginx (access logs) → Vector (Client | Filebeat) → Vector (Server | Logstash) → separately into ClickHouse and separately into Elasticsearch. We will set up 4 servers, although you could get by with 3.

The scheme looks something like this.

Disable SELinux on all your servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install an HTTP server emulator plus utilities on all servers

As the HTTP server emulator we will use nodejs-stub-server from Maxim Ignatenko

nodejs-stub-server has no rpm, so we will build one for it. The rpm will be built using Fedora Copr

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server screen mc httpd-tools

I adjusted the response time of stub_http_server in the file /var/lib/stub_http_server/stub_http_server.js so that there would be more logs.

var max_sleep = 10;

Let's start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server

Installing ClickHouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First, connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network card in the file /etc/clickhouse-server/config.xml

<listen_host>0.0.0.0</listen_host>

Change the logging level from the default trace to debug

debug

Default compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable Zstd compression, the advice was not to touch the config but to use DDL instead.

I could not find in Google how to apply zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in ClickHouse, please share the instructions.
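For what it is worth, recent ClickHouse releases do accept per-column compression codecs in DDL; something along these lines should work (a sketch only, verify against your server version's documentation):

```sql
-- Sketch: attach a ZSTD codec to a single column via ALTER
-- (the column name is just an example from the table below).
ALTER TABLE vector.logs MODIFY COLUMN request_full String CODEC(ZSTD(1));
```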

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to setting up ClickHouse.

Connect to ClickHouse

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where ClickHouse is installed.

Let's create the vector database.

CREATE DATABASE vector;

Let's check that the database exists.

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as-is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Let's check that the table has been created. Start clickhouse-client and run a query.

Switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing Elasticsearch on server 4 to send the same data to Elasticsearch for comparison with ClickHouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Let's create two repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, add the following to the file /etc/elasticsearch/elasticsearch.yml:

discovery.type: single-node

So that Vector can send data to Elasticsearch from another server, change network.host.

network.host: 0.0.0.0

To connect to Kibana, change the server.host parameter in the file /etc/kibana/kibana.yml

server.host: "0.0.0.0"

Then enable elasticsearch at startup and start it

systemctl enable elasticsearch
systemctl start elasticsearch

and kibana

systemctl enable kibana
systemctl start kibana

We configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster of a large number of servers, in which case you do not need to do this.

For future indexes, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's configure Vector as a replacement for Logstash. Edit the file /etc/vector/vector.toml

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" # ClickHouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section,

since Vyacheslav Rakhinsky uses these configs for a small CDN, where there can be several values in the upstream_* fields.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"
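The Lua helpers above keep only the last of the comma-separated values; the same idea can be sketched in shell for illustration:

```shell
# Mirror of the split_last() Lua helper above: keep only the text
# after the last ", " separator in a comma-separated upstream_* value.
last() { echo "${1##*, }"; }

last "502, 502, 200"
last "-, -, 123"
```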

If this is not your case, this section can be simplified.

Let's create the service definition for systemd: /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Once the tables are created, you can run Vector

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be entries like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server) - server 1

On the server with nginx you need to disable IPv6, since the logs table in ClickHouse uses the IPv4 type for the upstream_addr field (I do not use IPv6 inside the network). If IPv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps one of the readers will add IPv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First, we need to configure the log format in Nginx in the file /etc/nginx/nginx.conf

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New json-format log

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

So as not to break your existing configuration, Nginx lets you have several access_log directives

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New json-format log

Do not forget to add a logrotate rule for the new log (if the log file name does not end with .log)
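A minimal sketch of such a rule — the path, retention and signal here are illustrative; if your distro's stock /etc/logrotate.d/nginx already matches the file, adapt that instead:

```
/var/log/nginx/access.json.log {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        # Ask nginx to reopen its log files
        [ -f /var/run/nginx.pid ] && kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}
```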

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the IP of the server where nginx is installed) to the /etc/hosts file on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And once everything is ready:

nginx -t 
systemctl restart nginx

Now let's install Vector itself

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Let's create the systemd unit /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the address of the log server (Vector-Server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Do not forget to add the vector user to the appropriate group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.

usermod -a -G adm vector

Let's start the vector service

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be an entry like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

We test using Apache benchmark.

The httpd-tools package was installed on all servers earlier

We run the tests using Apache benchmark from 4 different servers, inside screen. First we start the screen terminal multiplexer, then we start the test with Apache benchmark. How to work with screen is covered in this article.

From the first server

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the second server

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the third server

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the fourth server

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Let's check what is in ClickHouse

Connect to ClickHouse

clickhouse-client -h 172.26.10.109 -m

Run an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────

Find out the size of the tables in ClickHouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in ClickHouse.

The size of the logs table is 857.19 MB.

The size of the same data in the Elasticsearch index is 4.5 GB.

Without any special parameters specified in Vector, the data in ClickHouse takes 4500 / 857.19 = 5.24 times less space than in Elasticsearch.

In Vector, the compression field is enabled by default.

Telegram chat on ClickHouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: www.habr.com
