Sending Nginx json logs using Vector to Clickhouse and Elasticsearch

Vector is designed to collect, transform and send log data, metrics and events.

→ Github

Since it is written in Rust, it is characterized by high performance and low RAM consumption compared to its analogues. In addition, a lot of attention is paid to correctness-related features, in particular the ability to save unsent events to an on-disk buffer, and to file rotation.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.

Vector is a replacement for filebeat and logstash; it can act in both roles (receiving and sending logs). More details are on their website.

If in Logstash the chain is built as input → filter → output, then in Vector it is sources → transforms → sinks.

Examples can be found in the documentation.
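As a quick illustration of that chain, here is a minimal, self-contained pipeline (a sketch only, not part of this setup; /tmp/vector-minimal.toml is an arbitrary path and option names may differ slightly between Vector versions). It reads JSON lines from stdin, parses them, and prints the parsed events to the console.

# write a tiny sources → transforms → sinks config
cat <<'EOF' > /tmp/vector-minimal.toml
# source: read raw lines from stdin
[sources.demo_input]
  type = "stdin"

# transform: parse each line as JSON
[transforms.demo_parse]
  inputs = [ "demo_input" ]
  type   = "json_parser"

# sink: print the resulting events to the console
[sinks.demo_output]
  inputs   = [ "demo_parse" ]
  type     = "console"
  encoding = "json"
EOF

# feed one event through the pipeline; timeout just stops the demo afterwards
echo '{"status": 200, "uri": "/"}' | timeout 5 vector --config /tmp/vector-minimal.toml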

This how-to is a reworked version of the instructions by Vyacheslav Rakhinsky. The original instructions include geoip processing. When testing geoip from an internal network, vector threw an error.

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field=«geoip.country_name» rate_limit_secs=30

If anyone needs geoip processing, refer to the original instructions from Vyacheslav Rakhinsky.

We will configure the combination Nginx (access logs) → Vector (Client | Filebeat) → Vector (Server | Logstash) → separately into Clickhouse and separately into Elasticsearch. We will set up 4 servers, although you can get by with 3.

The resulting scheme looks something like this:

[Diagram: Nginx (access logs) → Vector (client) → Vector (server) → Clickhouse and Elasticsearch]

Disable Selinux on all servers

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot
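After the reboot you can confirm that SELinux is really off:

getenforce
# expected output: Disabled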

Em li ser hemß serveran emulatorek servera HTTP + karûbar saz dikin

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

Nodejs-stub-server has no rpm, so an rpm needs to be built for it. The rpm will be built with Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark and the screen terminal multiplexer on all servers

yum -y install stub_http_server screen mc httpd-tools

I adjusted the stub_http_server response time in the /var/lib/stub_http_server/stub_http_server.js file to get more logs.

var max_sleep = 10;

Let's start stub_http_server.

systemctl start stub_http_server
systemctl enable stub_http_server
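To make sure the stub is answering, you can poke it with curl; port 8080 is assumed here because it is the port the nginx virtual hosts proxy to later in this guide:

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/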

Installing Clickhouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless specified otherwise, support for it in the processor used becomes an additional system requirement. Here is the command to check whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First you need to connect the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run the following commands:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in the /etc/clickhouse-server/config.xml file

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug

debug

Default compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable zstd compression, the advice was not to touch the config but to use DDL instead.


I could not find in Google how to apply zstd compression via DDL, so I left it as is.

Colleagues who use zstd compression in Clickhouse, please share the instructions.

To start the server as a daemon, run:

service clickhouse-server start
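A quick way to check that the server is up is its HTTP interface on port 8123, the same interface the Vector clickhouse sink will write to; it answers Ok. to an empty request:

curl http://localhost:8123/
# Ok.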

Now let's move on to configuring Clickhouse.

Go to Clickhouse

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP of the server where Clickhouse is installed.

Let's create the vector database.

CREATE DATABASE vector;

Let's check that the database exists.

show databases;

Create the vector.logs table.

/* This is the table where the logs are stored as is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Let's check that the tables have been created. Start clickhouse-client and run a query.

Let's switch to the vector database.

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

Let's look at the tables.

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘
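Optionally, you can smoke-test the table over the HTTP interface on port 8123, the same path the Vector clickhouse sink uses. This is just an illustrative test row with a hypothetical node_name; columns not listed should fall back to their defaults:

echo 'INSERT INTO vector.logs (node_name, timestamp, server_name, remote_addr, response_status) FORMAT JSONEachRow {"node_name":"smoke-test","timestamp":"2020-08-07 04:32:42","server_name":"vhost1","remote_addr":"127.0.0.1","response_status":200}' \
  | curl -s 'http://172.26.10.109:8123/' --data-binary @-

# the row should be visible (it will expire together with the table's 1-month TTL)
clickhouse-client -h 172.26.10.109 --query "SELECT count() FROM vector.logs WHERE node_name = 'smoke-test'"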

Installing Elasticsearch on server 4 to send the same data to Elasticsearch for comparison with Clickhouse

Add the public rpm key

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Let's create 2 repos:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install elasticsearch and kibana

yum install -y kibana elasticsearch

Since it will run as a single instance, you need to add the following to the /etc/elasticsearch/elasticsearch.yml file:

discovery.type: single-node

So that vector can send data to elasticsearch from another server, let's change network.host.

network.host: 0.0.0.0

To connect to kibana, change the server.host parameter in the /etc/kibana/kibana.yml file

server.host: "0.0.0.0"

Enable elasticsearch at startup and start it

systemctl enable elasticsearch
systemctl start elasticsearch

and kibana

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch for single-node mode: 1 shard, 0 replicas. Most likely you will have a cluster with a large number of servers and you do not need to do this.

For future indexes, update the default template:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 
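You can verify that the node is up and that the template has been stored using the standard Elasticsearch APIs:

curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_template/default?pretty'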

Setting up Vector as a replacement for Logstash on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Let's configure Vector as a replacement for Logstash. Edit the /etc/vector/vector.toml file

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" #  АЎрДс Clickhouse
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 - the server where elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section, since Vyacheslav Rakhinsky uses these configs for a small CDN and the upstream_* fields can contain several values.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your case, then this section can be simplified.
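For example, an untested sketch of a simplified [transforms.nginx_parse_add_defaults] block could drop the split_first/split_last helpers, keep only the host part of upstream_addr, and zero-fill the dash placeholders. It is written to a scratch file here just for reference; merge it into /etc/vector/vector.toml by hand:

cat <<'EOF' > /tmp/nginx_parse_add_defaults.simplified.toml
[transforms.nginx_parse_add_defaults]
  inputs  = [ "nginx_parse_json" ]
  type    = "lua"
  version = "2"

  hooks.process = """
  function (event, emit)
    -- single upstream only: keep the host part of "ip:port"
    event.log.upstream_addr = string.match(event.log.upstream_addr or "", "^[^:]*") or ""
    if event.log.upstream_addr == "" or event.log.upstream_addr == "-" then
      event.log.upstream_addr = "127.0.0.1"
    end
    if event.log.upstream_cache_status == "" then
      event.log.upstream_cache_status = "DISABLED"
    end
    -- replace "-" / "" with "0" so the coercer can cast these fields to numbers
    local numeric = { "upstream_bytes_received", "upstream_bytes_sent",
                      "upstream_connect_time", "upstream_header_time",
                      "upstream_response_length", "upstream_response_time",
                      "upstream_status" }
    for _, f in ipairs(numeric) do
      if event.log[f] == "-" or event.log[f] == "" then
        event.log[f] = "0"
      end
    end
    emit(event)
  end
  """
EOF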

Let's create the service configuration for systemd: /etc/systemd/system/vector.service

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

After creating the tables, you can start Vector

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be entries like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.

On the client (web server) - server 1

On the server with nginx you need to disable ipv6, since the logs table in clickhouse uses the upstream_addr IPv4 field (I do not use ipv6 inside the network). If ipv6 is not disabled, there will be errors:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps, dear readers, you will add ipv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings

sysctl --system
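Check that IPv6 is indeed disabled (1 means disabled):

sysctl net.ipv6.conf.all.disable_ipv6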

Let's install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package

yum install -y nginx

First, we need to configure the Nginx log format in the /etc/nginx/nginx.conf file.

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # new log in json format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

Ji bo ku hĂ»n veavakirina xweya heyĂź neƟikĂźnin, Nginx dihĂȘle hĂ»n çend rĂȘwerzĂȘn gihüƟtina_logĂȘ hebin

access_log  /var/log/nginx/access.log  main;            # standard log
access_log  /var/log/nginx/access.json.log vector;      # new log in json format

Do not forget to add a logrotate rule for the new logs (if the log file name does not end in .log).
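If you pick a name that the stock /etc/logrotate.d/nginx rule does not match, a separate drop-in could look roughly like this (a sketch; the retention values and the pid path are illustrative and should be adjusted):

cat <<'EOF' > /etc/logrotate.d/nginx-json
/var/log/nginx/access.json.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        /bin/kill -USR1 $(cat /var/run/nginx.pid 2>/dev/null) 2>/dev/null || true
    endscript
}
EOF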

Remove default.conf from /etc/nginx/conf.d/

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}

Add the virtual hosts (172.26.10.106 is the ip of the server where nginx is installed) to the /etc/hosts file on all servers:

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

And if everything is ready, then

nginx -t 
systemctl restart nginx
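At this point you can already fire a test request and make sure a JSON line lands in the new log:

curl -s http://vhost1/ > /dev/null
tail -n 1 /var/log/nginx/access.json.log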

Now let's install Vector itself

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Let's create a settings file for systemd: /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

And configure the Filebeat replacement in the /etc/vector/vector.toml config. The IP address 172.26.10.108 is the IP address of the log server (Vector-Server).

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"
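Before starting the service it is worth checking that the Vector server port is reachable from this host (a quick sketch using bash's built-in /dev/tcp):

timeout 3 bash -c '</dev/tcp/172.26.10.108/9876' && echo "9876 reachable" || echo "9876 NOT reachable"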

Do not forget to add the vector user to the appropriate group so that it can read the log files. For example, nginx on centos creates logs with adm group rights.

usermod -a -G adm vector

Werin em dest bi karĂ»barĂȘ vektorĂȘ bikin

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

There should be an entry like this in the logs

INFO vector::topology::builder: Healthcheck: Passed.

Stress testing

Testing is performed using the Apache benchmark.

The httpd-tools package was already installed on all servers.

We start testing using the Apache benchmark from 4 different servers in screen. First we start the screen terminal multiplexer, and then we start the Apache benchmark test. How to work with screen can be found in the article.
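For example, each load loop can be started detached in its own screen session (the session name is arbitrary):

screen -dmS ab-vhost1 bash -c 'while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done'
screen -ls    # list sessions; reattach with: screen -r ab-vhost1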

From the 1st server

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the 2nd server

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the 3rd server

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the 4th server

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Let's check the data in Clickhouse

Go to Clickhouse

clickhouse-client -h 172.26.10.109 -m

Make an SQL query

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┮─────────────────────┮─────────────┮─────────┮────────────────┮────────────────────┮───────────────────┮─────────────┮────────────────┮────────────────┮────────────────┮──────────────┮──────────────────┮─────────────────┮──────────────────────────┮───────────────────────┮───────────────┮─────────────┮─────────────┮───────────────┮───────────────┮─────────────────────────┮─────────────────────┮───────────────────────┮───────────────────────┮──────────────────────┮──────────────────────────┮────────────────────────┮─────────────────┮───────────────────────
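Besides the raw rows, an aggregate query gives a quicker picture, for example requests per virtual host and status (run from any host with clickhouse-client):

clickhouse-client -h 172.26.10.109 --query "
  SELECT server_name, response_status, count() AS hits
  FROM vector.logs
  GROUP BY server_name, response_status
  ORDER BY hits DESC
  LIMIT 10"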

Find out the size of the tables in Clickhouse

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's find out how much space the logs took up in Clickhouse.


The size of the logs table is 857.19 MB.


The size of the same data in the Elasticsearch index is 4.5 GB.
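The Elasticsearch side can be checked with the _cat API (adjust the host if needed):

curl -s 'http://172.26.10.116:9200/_cat/indices/vector-*?v&h=index,docs.count,store.size'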

Heke hĂ»n di vektorĂȘ de di pĂźvanan de daneyan diyar nekin, Clickhouse 4500/857.19 = 5.24 carĂź ji Elasticsearch kĂȘmtir digire.

In vector, the compression field is used by default.

Telegram chat on Clickhouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: www.habr.com
