Sending Nginx JSON logs to ClickHouse and Elasticsearch with Vector


Vector is a tool designed to collect, transform, and send log data, metrics, and events.

→ GitHub

It is written in Rust, so it offers high performance and low RAM consumption compared with similar tools. A lot of attention also goes to correctness-related features, in particular the ability to save unsent events to an on-disk buffer and to handle file rotation.

Architecturally, Vector is an event router that receives messages from one or more sources, optionally applies transforms to those messages, and sends them to one or more sinks.

Vector can replace both Filebeat and Logstash, since it can act in both roles (receiving and sending logs). More details are available on the project website.

Where Logstash builds its chain as input → filter → output, Vector uses sources → transforms → sinks.

Examples can be found in the documentation.

This guide is a revised version of the guide by Vyacheslav Rakhinsky. The original guide includes geoip processing. When I tested geoip from an internal network, Vector reported an error:

Aug 05 06:25:31.889 DEBUG transform{name=nginx_parse_rename_fields type=rename_fields}: vector::transforms::rename_fields: Field did not exist field="geoip.country_name" rate_limit_secs=30

If you need geoip processing, refer to the original guide by Vyacheslav Rakhinsky.

We will set up the chain Nginx (access logs) → Vector (client | Filebeat) → Vector (server | Logstash) → ClickHouse and, separately, Elasticsearch. We will use 4 servers, although 3 would be enough.


The plan is as follows.

Disable SELinux on all servers:

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
reboot

Install an HTTP server emulator and utilities on all servers.

As the HTTP server emulator we will use nodejs-stub-server by Maxim Ignatenko.

nodejs-stub-server has no rpm of its own, so a package was created for it. The rpm is built using Fedora Copr.

Add the antonpatsev/nodejs-stub-server repository:

yum -y install yum-plugin-copr epel-release
yes | yum copr enable antonpatsev/nodejs-stub-server

Install nodejs-stub-server, Apache benchmark, and the screen terminal multiplexer on all servers:

yum -y install stub_http_server screen mc httpd-tools

In /var/lib/stub_http_server/stub_http_server.js, I adjusted the stub_http_server response time so that more logs are generated:

var max_sleep = 10;

Start stub_http_server:

systemctl start stub_http_server
systemctl enable stub_http_server

Installing ClickHouse on server 3

ClickHouse uses the SSE 4.2 instruction set, so unless stated otherwise, support for it in the CPU is an additional system requirement. The following command checks whether the current processor supports SSE 4.2:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

First, add the official repository:

sudo yum install -y yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64

To install the packages, run:

sudo yum install -y clickhouse-server clickhouse-client

Allow clickhouse-server to listen on the network interface in /etc/clickhouse-server/config.xml:

<listen_host>0.0.0.0</listen_host>

Change the logging level from trace to debug:

<level>debug</level>

Standard compression settings:

min_compress_block_size  65536
max_compress_block_size  1048576

To enable Zstd compression, I was advised not to touch the config and to use DDL instead.


I could not find out via Google how to use zstd compression through DDL, so I left it as is.

Colleagues who use zstd compression in ClickHouse, please share instructions.
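
One way this can be done, as far as the ClickHouse documentation describes it, is a per-column compression codec set through DDL. The sketch below only prints such a statement; the request_full column and ZSTD level 3 are assumptions chosen for illustration, not settings from this guide, and zstd_codec_ddl is a made-up helper name.

```shell
#!/bin/sh
# Print an ALTER TABLE statement that sets a ZSTD codec on one column.
# Arguments: table, column, column type, ZSTD level (all illustrative).
zstd_codec_ddl() {
    printf 'ALTER TABLE %s MODIFY COLUMN %s %s CODEC(ZSTD(%s));\n' "$1" "$2" "$3" "$4"
}

# Example: the request_full column of vector.logs at ZSTD level 3.
zstd_codec_ddl vector.logs request_full String 3
```

The printed statement can be pasted into a clickhouse-client session. The same CODEC(ZSTD(3)) clause can also follow a column's type in CREATE TABLE. As far as I know, existing data is recompressed only as parts are merged, so the effect on table size may not show immediately.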

To start the server as a daemon, run:

service clickhouse-server start

Now let's move on to configuring ClickHouse.

Connect to ClickHouse:

clickhouse-client -h 172.26.10.109 -m

172.26.10.109 is the IP address of the server where ClickHouse is installed.

Create the vector database:

CREATE DATABASE vector;

Check that the database exists:

show databases;

Create the vector.logs table:

/* This table stores the logs as is */

CREATE TABLE vector.logs
(
    `node_name` String,
    `timestamp` DateTime,
    `server_name` String,
    `user_id` String,
    `request_full` String,
    `request_user_agent` String,
    `request_http_host` String,
    `request_uri` String,
    `request_scheme` String,
    `request_method` String,
    `request_length` UInt64,
    `request_time` Float32,
    `request_referrer` String,
    `response_status` UInt16,
    `response_body_bytes_sent` UInt64,
    `response_content_type` String,
    `remote_addr` IPv4,
    `remote_port` UInt32,
    `remote_user` String,
    `upstream_addr` IPv4,
    `upstream_port` UInt32,
    `upstream_bytes_received` UInt64,
    `upstream_bytes_sent` UInt64,
    `upstream_cache_status` String,
    `upstream_connect_time` Float32,
    `upstream_header_time` Float32,
    `upstream_response_length` UInt64,
    `upstream_response_time` Float32,
    `upstream_status` UInt16,
    `upstream_content_type` String,
    INDEX idx_http_host request_http_host TYPE set(0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1)
SETTINGS index_granularity = 8192;

Check that the table has been created. Start clickhouse-client and run a query.

Switch to the vector database:

use vector;

Ok.

0 rows in set. Elapsed: 0.001 sec.

List the tables:

show tables;

┌─name────────────────┐
│ logs                │
└─────────────────────┘

To send the same data to Elasticsearch for comparison with ClickHouse, install Elasticsearch on the 4th server.

Add the public rpm key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Create 2 repositories:

/etc/yum.repos.d/elasticsearch.repo

[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

/etc/yum.repos.d/kibana.repo

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install Elasticsearch and Kibana:

yum install -y kibana elasticsearch

Since it will run as a single instance, add the following to /etc/elasticsearch/elasticsearch.yml:

discovery.type: single-node

So that Vector can send data to Elasticsearch from another server, change network.host:

network.host: 0.0.0.0

To be able to connect to Kibana, change the server.host parameter in /etc/kibana/kibana.yml:

server.host: "0.0.0.0"

Enable Elasticsearch at boot and start it:

systemctl enable elasticsearch
systemctl start elasticsearch

And Kibana:

systemctl enable kibana
systemctl start kibana

Configure Elasticsearch for single-node mode with 1 shard and 0 replicas. You will most likely have a cluster with many servers, in which case this step is not needed.

Update the default template for future indices:

curl -X PUT http://localhost:9200/_template/default -H 'Content-Type: application/json' -d '{"index_patterns": ["*"],"order": -1,"settings": {"number_of_shards": "1","number_of_replicas": "0"}}' 

Installing Vector as a Logstash replacement on server 2

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm mc httpd-tools screen

Configure Vector as a replacement for Logstash by editing /etc/vector/vector.toml:

# /etc/vector/vector.toml

data_dir = "/var/lib/vector"

[sources.nginx_input_vector]
  # General
  type                          = "vector"
  address                       = "0.0.0.0:9876"
  shutdown_timeout_secs         = 30

[transforms.nginx_parse_json]
  inputs                        = [ "nginx_input_vector" ]
  type                          = "json_parser"

[transforms.nginx_parse_add_defaults]
  inputs                        = [ "nginx_parse_json" ]
  type                          = "lua"
  version                       = "2"

  hooks.process = """
  function (event, emit)

    function split_first(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[1];
    end

    function split_last(s, delimiter)
      result = {};
      for match in (s..delimiter):gmatch("(.-)"..delimiter) do
          table.insert(result, match);
      end
      return result[#result];
    end

    event.log.upstream_addr             = split_first(split_last(event.log.upstream_addr, ', '), ':')
    event.log.upstream_bytes_received   = split_last(event.log.upstream_bytes_received, ', ')
    event.log.upstream_bytes_sent       = split_last(event.log.upstream_bytes_sent, ', ')
    event.log.upstream_connect_time     = split_last(event.log.upstream_connect_time, ', ')
    event.log.upstream_header_time      = split_last(event.log.upstream_header_time, ', ')
    event.log.upstream_response_length  = split_last(event.log.upstream_response_length, ', ')
    event.log.upstream_response_time    = split_last(event.log.upstream_response_time, ', ')
    event.log.upstream_status           = split_last(event.log.upstream_status, ', ')

    if event.log.upstream_addr == "" then
        event.log.upstream_addr = "127.0.0.1"
    end

    if (event.log.upstream_bytes_received == "-" or event.log.upstream_bytes_received == "") then
        event.log.upstream_bytes_received = "0"
    end

    if (event.log.upstream_bytes_sent == "-" or event.log.upstream_bytes_sent == "") then
        event.log.upstream_bytes_sent = "0"
    end

    if event.log.upstream_cache_status == "" then
        event.log.upstream_cache_status = "DISABLED"
    end

    if (event.log.upstream_connect_time == "-" or event.log.upstream_connect_time == "") then
        event.log.upstream_connect_time = "0"
    end

    if (event.log.upstream_header_time == "-" or event.log.upstream_header_time == "") then
        event.log.upstream_header_time = "0"
    end

    if (event.log.upstream_response_length == "-" or event.log.upstream_response_length == "") then
        event.log.upstream_response_length = "0"
    end

    if (event.log.upstream_response_time == "-" or event.log.upstream_response_time == "") then
        event.log.upstream_response_time = "0"
    end

    if (event.log.upstream_status == "-" or event.log.upstream_status == "") then
        event.log.upstream_status = "0"
    end

    emit(event)

  end
  """

[transforms.nginx_parse_remove_fields]
    inputs                              = [ "nginx_parse_add_defaults" ]
    type                                = "remove_fields"
    fields                              = ["data", "file", "host", "source_type"]

[transforms.nginx_parse_coercer]

    type                                = "coercer"
    inputs                              = ["nginx_parse_remove_fields"]

    types.request_length = "int"
    types.request_time = "float"

    types.response_status = "int"
    types.response_body_bytes_sent = "int"

    types.remote_port = "int"

    types.upstream_bytes_received = "int"
    types.upstream_bytes_sent = "int"
    types.upstream_connect_time = "float"
    types.upstream_header_time = "float"
    types.upstream_response_length = "int"
    types.upstream_response_time = "float"
    types.upstream_status = "int"

    types.timestamp = "timestamp"

[sinks.nginx_output_clickhouse]
    inputs   = ["nginx_parse_coercer"]
    type     = "clickhouse"

    database = "vector"
    healthcheck = true
    host = "http://172.26.10.109:8123" # ClickHouse address
    table = "logs"

    encoding.timestamp_format = "unix"

    buffer.type = "disk"
    buffer.max_size = 104900000
    buffer.when_full = "block"

    request.in_flight_limit = 20

[sinks.elasticsearch]
    type = "elasticsearch"
    inputs   = ["nginx_parse_coercer"]
    compression = "none"
    healthcheck = true
    # 172.26.10.116 is the server where Elasticsearch is installed
    host = "http://172.26.10.116:9200" 
    index = "vector-%Y-%m-%d"

You can adjust the transforms.nginx_parse_add_defaults section.

Vyacheslav Rakhinsky uses these configs for a small CDN, where the upstream_* fields can contain several values.

For example:

"upstream_addr": "128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443"
"upstream_bytes_received": "-, -, 123"
"upstream_status": "502, 502, 200"

If this is not your case, this section can be simplified.
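
To make the effect of that section concrete, here is a small shell sketch of what the Lua hook does with multi-valued upstream_* fields: it keeps the last value of each list, strips the port from the address, and replaces "-" or an empty string with a default. The function names are made up for this illustration; the real processing happens in the Lua transform inside Vector.

```shell
#!/bin/sh
# Keep only the last item of a ", "-separated list
# (the role of split_last in the Lua hook).
last_value() {
    printf '%s\n' "$1" | awk -F', ' '{ print $NF }'
}

# Drop the ":port" suffix from an address
# (the role of split_first(..., ':') in the Lua hook).
strip_port() {
    printf '%s\n' "${1%%:*}"
}

# Replace "-" or an empty string with a default value,
# as the Lua hook does for the numeric upstream_* fields.
default_if_missing() {
    case "$1" in
        ''|-) printf '%s\n' "$2" ;;
        *)    printf '%s\n' "$1" ;;
    esac
}

strip_port "$(last_value '128.66.0.10:443, 128.66.0.11:443, 128.66.0.12:443')"  # prints 128.66.0.12
default_if_missing "$(last_value '-, -, 123')" 0                                 # prints 123
default_if_missing "$(last_value '-, -, -')" 0                                   # prints 0
last_value '502, 502, 200'                                                       # prints 200
```

If nginx talks to a single upstream per request, only the default substitution is still needed, which is the simplification mentioned above.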

Create the systemd service file /etc/systemd/system/vector.service:

# /etc/systemd/system/vector.service

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Once the tables have been created, you can start Vector:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain entries like these:

INFO vector::topology::builder: Healthcheck: Passed.
INFO vector::topology::builder: Healthcheck: Passed.
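
Presumably each sink reports its own healthcheck, so with the ClickHouse and Elasticsearch sinks configured above, two such lines are expected. A rough way to count them is sketched below on sample input rather than on a live journal; count_passed_healthchecks is a hypothetical helper name.

```shell
#!/bin/sh
# Count "Healthcheck: Passed." lines arriving on standard input.
# grep -c exits non-zero when nothing matches, hence the "|| true".
count_passed_healthchecks() {
    grep -c 'Healthcheck: Passed\.' || true
}

# Sample input copied from the expected log lines above.
printf '%s\n' \
    'INFO vector::topology::builder: Healthcheck: Passed.' \
    'INFO vector::topology::builder: Healthcheck: Passed.' \
  | count_passed_healthchecks   # prints 2
```

On the Vector server the same function could be fed from `journalctl -u vector --no-pager` instead of the sample lines.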

On the client (web server), the 1st server

On the server with nginx you need to disable IPv6, because the logs table in ClickHouse uses an IPv4 type for the upstream_addr field, as I do not use IPv6 inside the network. If IPv6 is left enabled, you will get errors like:

DB::Exception: Invalid IPv4 value.: (while read the value of key upstream_addr)

Perhaps readers will add IPv6 support.

Create the file /etc/sysctl.d/98-disable-ipv6.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Apply the settings:

sysctl --system

Install nginx.

Add the nginx repository file /etc/yum.repos.d/nginx.repo:

[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true

Install the nginx package:

yum install -y nginx

First, configure the Nginx log format in /etc/nginx/nginx.conf:

user  nginx;
# you must set worker processes based on your CPU cores, nginx does not benefit from setting more than that
worker_processes auto; #some last versions calculate it automatically

# number of file descriptors used for nginx
# the limit for the maximum FDs on the server is usually set by the OS.
# if you don't set FD's then OS settings will be used which is by default 2000
worker_rlimit_nofile 100000;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

# provides the configuration file context in which the directives that affect connection processing are specified.
events {
    # determines how much clients will be served per worker
    # max clients = worker_connections * worker_processes
    # max clients is also limited by the number of socket connections available on the system (~64k)
    worker_connections 4000;

    # optimized to serve many clients with each thread, essential for linux -- for testing environment
    use epoll;

    # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
    multi_accept on;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

log_format vector escape=json
    '{'
        '"node_name":"nginx-vector",'
        '"timestamp":"$time_iso8601",'
        '"server_name":"$server_name",'
        '"request_full": "$request",'
        '"request_user_agent":"$http_user_agent",'
        '"request_http_host":"$http_host",'
        '"request_uri":"$request_uri",'
        '"request_scheme": "$scheme",'
        '"request_method":"$request_method",'
        '"request_length":"$request_length",'
        '"request_time": "$request_time",'
        '"request_referrer":"$http_referer",'
        '"response_status": "$status",'
        '"response_body_bytes_sent":"$body_bytes_sent",'
        '"response_content_type":"$sent_http_content_type",'
        '"remote_addr": "$remote_addr",'
        '"remote_port": "$remote_port",'
        '"remote_user": "$remote_user",'
        '"upstream_addr": "$upstream_addr",'
        '"upstream_bytes_received": "$upstream_bytes_received",'
        '"upstream_bytes_sent": "$upstream_bytes_sent",'
        '"upstream_cache_status":"$upstream_cache_status",'
        '"upstream_connect_time":"$upstream_connect_time",'
        '"upstream_header_time":"$upstream_header_time",'
        '"upstream_response_length":"$upstream_response_length",'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_status": "$upstream_status",'
        '"upstream_content_type":"$upstream_http_content_type"'
    '}';

    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/access.json.log vector;      # New log in JSON format

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

To avoid breaking your current configuration, note that Nginx allows several access_log directives:

access_log  /var/log/nginx/access.log  main;            # Standard log
access_log  /var/log/nginx/access.json.log vector;      # New log in JSON format

Do not forget to add a logrotate rule for the new log (if the log file name does not end in .log).

Remove default.conf from /etc/nginx/conf.d/:

rm -f /etc/nginx/conf.d/default.conf

Add the virtual host /etc/nginx/conf.d/vhost1.conf:

server {
    listen 80;
    server_name vhost1;
    location / {
        proxy_pass http://172.26.10.106:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost2.conf:

server {
    listen 80;
    server_name vhost2;
    location / {
        proxy_pass http://172.26.10.108:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost3.conf:

server {
    listen 80;
    server_name vhost3;
    location / {
        proxy_pass http://172.26.10.109:8080;
    }
}

Add the virtual host /etc/nginx/conf.d/vhost4.conf:

server {
    listen 80;
    server_name vhost4;
    location / {
        proxy_pass http://172.26.10.116:8080;
    }
}
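
Since the four virtual hosts differ only in the server name and the backend address, the files can also be generated with a short script. This is a sketch: the vhost_conf function and the TARGET_DIR variable (defaulting to ./conf.d so the script is harmless to try) are names introduced here, not part of nginx.

```shell
#!/bin/sh
# Print an nginx server block that proxies one virtual host to one backend.
# $1 = server_name, $2 = backend IP (port 8080, as in the configs above).
vhost_conf() {
    cat <<EOF
server {
    listen 80;
    server_name $1;
    location / {
        proxy_pass http://$2:8080;
    }
}
EOF
}

# Write vhost1.conf .. vhost4.conf into TARGET_DIR (hypothetical variable).
TARGET_DIR="${TARGET_DIR:-./conf.d}"
mkdir -p "$TARGET_DIR"
i=1
for backend in 172.26.10.106 172.26.10.108 172.26.10.109 172.26.10.116; do
    vhost_conf "vhost$i" "$backend" > "$TARGET_DIR/vhost$i.conf"
    i=$((i + 1))
done
```

Running it with TARGET_DIR=/etc/nginx/conf.d on the nginx host should produce the same four files as listed above.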

On all servers, add the virtual hosts to /etc/hosts (172.26.10.106 is the IP of the server where nginx is installed):

172.26.10.106 vhost1
172.26.10.106 vhost2
172.26.10.106 vhost3
172.26.10.106 vhost4

Once everything is ready:

nginx -t 
systemctl restart nginx

Now install Vector itself:

yum install -y https://packages.timber.io/vector/0.9.X/vector-x86_64.rpm

Create the systemd unit file /etc/systemd/system/vector.service:

[Unit]
Description=Vector
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStart=/usr/bin/vector
ExecReload=/bin/kill -HUP $MAINPID
Restart=no
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=vector

[Install]
WantedBy=multi-user.target

Then configure the Filebeat replacement in /etc/vector/vector.toml. The IP address 172.26.10.108 is the address of the log server (Vector server):

data_dir = "/var/lib/vector"

[sources.nginx_file]
  type                          = "file"
  include                       = [ "/var/log/nginx/access.json.log" ]
  start_at_beginning            = false
  fingerprinting.strategy       = "device_and_inode"

[sinks.nginx_output_vector]
  type                          = "vector"
  inputs                        = [ "nginx_file" ]

  address                       = "172.26.10.108:9876"

Do not forget to add the vector user to the required group so that it can read the log files. For example, nginx on CentOS creates logs owned by the adm group:

usermod -a -G adm vector

Start the vector service:

systemctl enable vector
systemctl start vector

Vector logs can be viewed like this:

journalctl -f -u vector

The logs should contain an entry like this:

INFO vector::topology::builder: Healthcheck: Passed.

Load testing

Testing is done with Apache benchmark.

The httpd-tools package was installed on all servers.

Start the Apache benchmark test from 4 different servers inside screen: first launch the screen terminal multiplexer, then start the test. How to work with screen is described in a separate article.

From the 1st server:

while true; do ab -H "User-Agent: 1server" -c 100 -n 10 -t 10 http://vhost1/; sleep 1; done

From the 2nd server:

while true; do ab -H "User-Agent: 2server" -c 100 -n 10 -t 10 http://vhost2/; sleep 1; done

From the 3rd server:

while true; do ab -H "User-Agent: 3server" -c 100 -n 10 -t 10 http://vhost3/; sleep 1; done

From the 4th server:

while true; do ab -H "User-Agent: 4server" -c 100 -n 10 -t 10 http://vhost4/; sleep 1; done

Checking the data in ClickHouse

Connect to ClickHouse:

clickhouse-client -h 172.26.10.109 -m

Run an SQL query:

SELECT * FROM vector.logs;

┌─node_name────┬───────────timestamp─┬─server_name─┬─user_id─┬─request_full───┬─request_user_agent─┬─request_http_host─┬─request_uri─┬─request_scheme─┬─request_method─┬─request_length─┬─request_time─┬─request_referrer─┬─response_status─┬─response_body_bytes_sent─┬─response_content_type─┬───remote_addr─┬─remote_port─┬─remote_user─┬─upstream_addr─┬─upstream_port─┬─upstream_bytes_received─┬─upstream_bytes_sent─┬─upstream_cache_status─┬─upstream_connect_time─┬─upstream_header_time─┬─upstream_response_length─┬─upstream_response_time─┬─upstream_status─┬─upstream_content_type─┐
│ nginx-vector │ 2020-08-07 04:32:42 │ vhost1      │         │ GET / HTTP/1.0 │ 1server            │ vhost1            │ /           │ http           │ GET            │             66 │        0.028 │                  │             404 │                       27 │                       │ 172.26.10.106 │       45886 │             │ 172.26.10.106 │             0 │                     109 │                  97 │ DISABLED              │                     0 │                0.025 │                       27 │                  0.029 │             404 │                       │
└──────────────┴─────────────────────┴─────────────┴─────────┴────────────────┴────────────────────┴───────────────────┴─────────────┴────────────────┴────────────────┴────────────────┴──────────────┴──────────────────┴─────────────────┴──────────────────────────┴───────────────────────┴───────────────┴─────────────┴─────────────┴───────────────┴───────────────┴─────────────────────────┴─────────────────────┴───────────────────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────────┴─────────────────┴───────────────────────┘

Find out the size of the tables in ClickHouse:

select concat(database, '.', table)                         as table,
       formatReadableSize(sum(bytes))                       as size,
       sum(rows)                                            as rows,
       max(modification_time)                               as latest_modification,
       sum(bytes)                                           as bytes_size,
       any(engine)                                          as engine,
       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size
from system.parts
where active
group by database, table
order by bytes_size desc;

Let's see how much space the logs take up in ClickHouse.


The logs table takes up 857.19 MB.


The same data in the Elasticsearch index takes up 4.5 GB.

With no extra parameters set in Vector, the data in ClickHouse takes 4500 / 857.19 ≈ 5.25 times less space than in Elasticsearch.
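
The ratio can be rechecked with a one-line calculation. The sketch assumes 4.5 GB is read as 4500 MB, as in the text; size_ratio is a name made up for this example.

```shell
#!/bin/sh
# Ratio of two sizes given in the same unit, rounded to two decimals.
size_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Elasticsearch index size (4500 MB, assumed) against the ClickHouse table (857.19 MB).
size_ratio 4500 857.19   # prints 5.25
```

The exact quotient is about 5.2497, so the figure depends slightly on whether it is rounded or truncated.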

By default, Vector uses the compression field.

Telegram chat on ClickHouse
Telegram chat on Elasticsearch
Telegram chat on "Collection and analysis of system messages"

Source: habr.com
