Sending Nginx JSON logs with Vector to ClickHouse and Elasticsearch

Vector is designed to collect, transform and route log data, metrics and events.

→ GitHub

Being written in Rust, it offers high performance and low RAM consumption compared to its analogues. In addition, a lot of attention is paid to correctness-related features, in particular the ability to buffer unsent events on disk and to rotate files.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.

Vector is a replacement for filebeat and logstash; it can act in both roles (receive and send logs). More details on the project website.

Whereas in Logstash the chain is built as input → filter → output, in Vector it is source → transform → sink.

Examples can be found in the documentation.
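As a minimal illustration of that chain, a hypothetical vector.toml could wire a file source through a json_parser transform into a console sink. All names and paths here are illustrative, not part of the setup below:

```toml
# Hypothetical minimal pipeline: source -> transform -> sink
data_dir = "/var/lib/vector"

[sources.my_logs]
  type    = "file"
  include = [ "/var/log/example.log" ]

[transforms.my_parser]
  type   = "json_parser"
  inputs = [ "my_logs" ]

[sinks.my_console]
  type     = "console"
  inputs   = [ "my_parser" ]
  encoding = "json"
```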

This guide is a reworked guide from Vyacheslav Rakhinsky. The original guide includes geoip processing. When testing geoip on an internal network, vector produced an error:

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If anyone needs geoip processing, refer to the original guide from Vyacheslav Rakhinsky.

We will set up the chain Nginx (access logs) → Vector (Client | Filebeat) → Vector (Server | Logstash) → separately to ClickHouse and separately to Elasticsearch. We will install 4 servers, although you could get by with 3.

The scheme looks like this:

[Diagram: Nginx → Vector (client) → Vector (server) → ClickHouse / Elasticsearch]

Disable SELinux on all servers:

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install an HTTP server emulator + utilities on all servers

As an HTTP server emulator we will use nodejs-stub-server from Maxim Ignatenko.

nodejs-stub-server has no rpm package, so an rpm needs to be built for it. The rpm is built using Fedora Copr.

Enable the antonpatsev/nodejs-stub-server repository:

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers:

yum -y install stub_http_server screen mc httpd-tools

I adjusted the stub_http_server response time in /var/lib/stub_http_server/stub_http_server.js so there would be more logs:

var max_sleep = 10;

Let's start stub_http_server:

systemctl start stub_http_server
systemctl enable stub_http_server

Installing ClickHouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the processor becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First you need to connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interfaces in /etc/clickhouse-server/config.xml:

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug:

<level>debug</level>

Standard compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To activate zstd compression, the advice was not to touch the config but to use DDL.

I could not find on Google how to use zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in ClickHouse, please share the instructions.
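For reference, ClickHouse does support per-column compression codecs directly in DDL; a sketch of the syntax, not applied in this guide (verify against your ClickHouse version):

```sql
-- Sketch: a per-column ZSTD codec in CREATE TABLE
CREATE TABLE example
(
    `message` String CODEC(ZSTD(1))
)
ENGINE = MergeTree()
ORDER BY tuple();

-- Sketch: switching an existing column to ZSTD
ALTER TABLE vector.logs MODIFY COLUMN `request_full` String CODEC(ZSTD(1));
```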

To start the server as a daemon, run:

service clickhouse-server start

Let's move on to setting up ClickHouse.

Connect to ClickHouse:

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where ClickHouse is installed.

Let's create the vector database:

CREATE DATABASE vector;

Let's check that the database exists:

show databases;

Create the vector.logs table:

/* This table stores the logs as they are */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Let's make sure the table has been created. Start clickhouse-client and run a query.

Go to the vector database:

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables:

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

Installing Elasticsearch on the 4th server to send the same data to Elasticsearch for comparison with ClickHouse

Add the public rpm key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Let's create 2 repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana:

yum install -y kibana elasticsearch

Since this is a single instance, you need to add the following to /etc/elasticsearch/elasticsearch.yml:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, change network.host:

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in /etc/kibana/kibana.yml:

server.host: "0.0.0.0"

Start elasticsearch and add it to autostart:

systemctl enable elasticsearch
systemctl start elasticsearch

and kibana:

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster of many servers and you do not need to do this.

For future indices, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 

Installing Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's configure Vector as a replacement for Logstash. Edit the file /etc/vector/vector.toml:

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" # ClickHouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section,

since Vyacheslav Rakhinsky uses these configs for a small CDN and there can be multiple values in the upstream_* fields.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your situation, this section can be simplified.
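For readers who find the Lua easier to follow in another language, here is a hypothetical Python sketch of what nginx_parse_add_defaults does with multi-value upstream fields: take the last value from the comma-separated list, strip the port from upstream_addr, and substitute defaults for "-" or empty values.

```python
def split_last(s, delimiter=", "):
    """Take the last element of a delimiter-separated string."""
    return s.split(delimiter)[-1]

def split_first(s, delimiter=":"):
    """Take the first element of a delimiter-separated string."""
    return s.split(delimiter)[0]

def normalize(event):
    """Mimic the nginx_parse_add_defaults Lua hook on a dict of log fields."""
    # Last upstream in the chain, with the port stripped
    event["upstream_addr"] = split_first(split_last(event["upstream_addr"]))
    event["upstream_status"] = split_last(event["upstream_status"])
    # Defaults for empty / "-" values
    if event["upstream_addr"] == "":
        event["upstream_addr"] = "127.0.0.1"
    if event["upstream_status"] in ("-", ""):
        event["upstream_status"] = "0"
    return event

event = {
    "upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443",
    "upstream_status": "502, 502, 200",
}
print(normalize(event))
# upstream_addr -> "128.66.0.12", upstream_status -> "200"
```

The real Lua hook applies the same treatment to the other upstream_* fields as well.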

Let's create a systemd service file /etc/systemd/system/vector.service:

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

After creating the tables, you can start Vector:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be entries like this in the log:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server) - the 1st server

On the server with nginx you need to disable ipv6, since the logs table in clickhouse uses the field `upstream_addr IPv4` (I don't use ipv6 inside the network). If ipv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps one of you readers will add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings:

sysctl --system

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo:

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package:

yum install -y nginx

First, you need to configure the log format in Nginx in /etc/nginx/nginx.conf:

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

So as not to break your current configuration, Nginx allows you to have several access_log directives:

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in json format
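Each line of access.json.log should be a single JSON object; a small hypothetical Python check (the sample line and its values are illustrative):

```python
import json

# An access.json.log line as the "vector" log_format above would write it
# (sample values are illustrative)
line = (
    '{"node_name":"nginx-vector","timestamp":"2020-08-07T04:32:42+00:00",'
    '"request_method":"GET","response_status":"404","remote_addr":"172.26.10.106"}'
)

event = json.loads(line)
# Every value is a string at this point, including response_status
print(event["request_method"], event["response_status"])
```

Note that the log_format quotes every variable, so numeric fields arrive as JSON strings; that is why the coercer transform is needed on the Vector server.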

Don't forget to add a logrotate rule for the new log (if the log file name does not end in .log).
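A hypothetical logrotate rule for the new file could look like this, modeled on the stock nginx rule (the path of the rule file and the rotation policy are assumptions, adjust to taste):

```
# /etc/logrotate.d/nginx-json (assumed path)
/var/log/nginx/access.json.log {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        # tell nginx to reopen its log files
        [ -f /var/run/nginx.pid ] && kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}
```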

Remove default.conf from /etc/nginx/conf.d/:

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf:

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf:

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf:

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf:

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the IP of the server where nginx is installed) to /etc/hosts on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And when everything is ready:

nginx -t 
systemctl restart nginx

Let's install Vector:

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Let's create the systemd service file /etc/systemd/system/vector.service:

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the address of the log server (Vector-Server):

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Don't forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs with adm group permissions.

usermod -a -G adm vector

Let's start the vector service:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be an entry like this in the logs:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

Testing is done using Apache benchmark.

The httpd-tools package was installed on all servers.

We start testing with Apache benchmark from 4 different servers, inside screen: first we start the screen terminal multiplexer, then we start the Apache benchmark run. How to work with screen can be found in the article.

From the 1st server:

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the 2nd server:

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the 3rd server:

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the 4th server:

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Checking the data in ClickHouse

Connect to ClickHouse:

clickhouse-client -h 172.26.10.109 -m

Make an SQL query:

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────

Find out the size of the tables in ClickHouse:

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs take up in ClickHouse.

The logs table size is 857.19 MB.

The size of the same data in the Elasticsearch index is 4.5 GB.

Without any special parameters in vector, the data in ClickHouse takes 4500/857.19 = 5.24 times less space than in Elasticsearch.

In vector, the compression field is enabled by default.

ClickHouse Telegram chat
Elasticsearch Telegram chat
Telegram chat on "Collection and analysis of system messages"

Source: www.habr.com
