Aleksey Lizunov, head of the Competence Center for Remote Service Channels of the Directorate of Information Technologies of MKB
As an alternative to the ELK stack (ElasticSearch, Logstash, Kibana), we are researching the use of the ClickHouse database as a data store for logs.
In this article we would like to share our experience of using the ClickHouse database and the preliminary results of the pilot operation. It is worth saying right away that the results were impressive.
Below we describe in detail how our system is configured and which components it consists of. But first I would like to say a few words about this database as a whole, and why it deserves attention. ClickHouse is a high-performance analytical columnar database from Yandex. It is used in Yandex services; originally it was the main data store for Yandex.Metrica. It is open source and free. As a developer, I always wondered how they implemented it, because the data volumes are enormous, and yet the Metrica user interface itself is flexible and fast. The first impression of this database is: "Well, finally! Made for people!" And that holds from the installation process all the way to sending queries.
This database has a very low entry threshold. Even an averagely skilled developer can install it in a few minutes and start using it. Everything works predictably. Even people who are new to Linux can quickly handle the installation and perform the simplest operations. Earlier, the words Big Data, Hadoop, Google BigTable and HDFS made an ordinary developer think of terabytes and petabytes, and of some superhumans who configure and develop these systems. With the arrival of ClickHouse we got a simple, understandable tool that solves a range of tasks that was previously out of reach. All it takes is one fairly average machine and five minutes to install. In other words, we got a database like, say, MySql, but for storing billions of records! A kind of super-archiver with an SQL interface. It feels as if people were handed the weapons of aliens.
About our logging system
To collect information, we use IIS log files of web applications in the standard format (we are also parsing application logs at the moment, but the main goal of the pilot stage is to collect IIS logs).
For various reasons we could not abandon the ELK stack completely, and we continue to use the LogStash and Filebeat components, which have proven themselves well and work reliably and predictably.
The general logging scheme is shown in the figure below:
A feature of writing data to ClickHouse is that records are inserted infrequently (once per second) and in large batches. This is probably the most "problematic" part you run into when you first work with ClickHouse: the scheme becomes a little more complicated.
The plugin for LogStash that inserts data directly into ClickHouse helped a lot here. This component is deployed on the same server as the database itself. Generally speaking this is not recommended, but from a practical standpoint we did it to avoid provisioning a separate server. We did not observe any failures or resource conflicts with the database. It is also worth noting that the plugin has a retry mechanism in case of errors. When an error occurs, the plugin writes the batch that could not be inserted to disk (the file format is convenient: after editing, the corrected batch can easily be inserted with clickhouse-client).
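The batching behaviour described above (rare inserts of large batches) can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation: the `BatchBuffer` class is hypothetical, and its `flush_size` and `idle_flush_time` parameters merely borrow the names of the options used in the Logstash output configuration later in the article.

```python
import time


class BatchBuffer:
    """Hypothetical buffer: collects rows and releases them by size or by idle time."""

    def __init__(self, flush_size=10000, idle_flush_time=1.0, clock=time.monotonic):
        self.flush_size = flush_size            # flush when this many rows are buffered
        self.idle_flush_time = idle_flush_time  # or when this many seconds have passed
        self._clock = clock
        self._rows = []
        self._last_flush = clock()

    def add(self, row):
        """Buffer one row; return a full batch when it is time to flush, else None."""
        self._rows.append(row)
        full = len(self._rows) >= self.flush_size
        idle = self._clock() - self._last_flush >= self.idle_flush_time
        if full or idle:
            batch, self._rows = self._rows, []
            self._last_flush = self._clock()
            return batch
        return None


# Small demonstration: with flush_size=3, seven rows produce two full batches
# and leave one row waiting in the buffer.
buf = BatchBuffer(flush_size=3, idle_flush_time=60.0)
batches = [b for b in (buf.add({"n": i}) for i in range(7)) if b]
```

In a real pipeline each returned batch would become a single INSERT, so the database sees a few large writes instead of thousands of small ones.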
The full list of software used in the scheme is shown in the table:
List of software used
| Name | Description | Distribution link |
|---|---|---|
| NGINX | Reverse proxy to restrict access by ports and organize authorization. Not currently used in the scheme | |
| FileBeat | Transfer of file logs | |
| LogStash | Log collector. Used to collect logs from FileBeat, and also to collect logs from the RabbitMQ queue (for servers located in the DMZ) | |
| Logstash-output-clickhouse | Logstash plugin for transferring logs to the ClickHouse database in batches | /usr/share/logstash/bin/logstash-plugin install logstash-output-clickhouse; /usr/share/logstash/bin/logstash-plugin install logstash-filter-prune; /usr/share/logstash/bin/logstash-plugin install logstash-filter-multiline |
| ClickHouse | Log storage | Note. Since August 2018, "normal" rpm builds for RHEL have been available in the Yandex repository, so you can try using them. At the time of installation we used packages built by Altinity. |
| Grafana | Log visualization. Setting up dashboards | Redhat & Centos (64 Bit) - latest version |
| ClickHouse datasource for Grafana 4.6+ | Plugin for Grafana with a ClickHouse data source | |
| LogStash | Log router from FileBeat to the RabbitMQ queue. Note. Unfortunately, FileBeat cannot output directly to RabbitMQ, so an intermediate link in the form of Logstash is required. | |
| RabbitMQ | Message queue. This is the log buffer in the DMZ | |
| Erlang Runtime (required for RabbitMQ) | Erlang runtime. Required for RabbitMQ to work | |
The configuration of the server running the ClickHouse database is shown in the following table:
| Name | Value | Note |
|---|---|---|
| Configuration | HDD: 40GB; RAM: 8GB; Processor: Core 2 2GHz | Pay attention to the operating recommendations for the ClickHouse database |
| General system software | OS: Red Hat Enterprise Linux Server (Maipo); JRE (Java 8) | |
As you can see, this is an ordinary workstation.
The structure of the table for storing logs is as follows:
log_web.sql
CREATE TABLE log_web (
logdate Date,
logdatetime DateTime CODEC(Delta, LZ4HC),
fld_log_file_name LowCardinality( String ),
fld_server_name LowCardinality( String ),
fld_app_name LowCardinality( String ),
fld_app_module LowCardinality( String ),
fld_website_name LowCardinality( String ),
serverIP LowCardinality( String ),
method LowCardinality( String ),
uriStem String,
uriQuery String,
port UInt32,
username LowCardinality( String ),
clientIP String,
clientRealIP String,
userAgent String,
referer String,
response String,
subresponse String,
win32response String,
timetaken UInt64
, uriQuery__utm_medium String
, uriQuery__utm_source String
, uriQuery__utm_campaign String
, uriQuery__utm_term String
, uriQuery__utm_content String
, uriQuery__yclid String
, uriQuery__region String
) Engine = MergeTree()
PARTITION BY toYYYYMM(logdate)
ORDER BY (fld_app_name, fld_app_module, logdatetime)
SETTINGS index_granularity = 8192;
We use the default partitioning (by month) and the default index granularity. Almost all fields correspond to IIS log entries for HTTP requests. Note that there are separate fields for storing utm tags (they are parsed from the query string field at the stage of insertion into the table).
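To make the utm-tag extraction concrete, here is a minimal Python sketch of the same idea: pick a fixed set of keys out of the query string, prefix them, and join repeated values with commas. It is an analogue written for illustration, not the code that runs in the pipeline (there the work is done by Logstash filters); the function name and the sample query string are assumptions.

```python
from urllib.parse import parse_qs

# Keys that get their own columns in log_web (uriQuery__<key>).
KEYS = ["utm_medium", "utm_source", "utm_campaign", "utm_term",
        "utm_content", "yclid", "region"]


def extract_query_fields(uri_query, prefix="uriQuery__"):
    """Return {prefix + key: value} for the whitelisted keys found in the query string."""
    parsed = parse_qs(uri_query)
    fields = {}
    for key in KEYS:
        values = parsed.get(key)
        if values:
            # Drop duplicate values (keeping order) and join the rest with commas.
            unique = list(dict.fromkeys(values))
            fields[prefix + key] = ",".join(unique)
    return fields


example = extract_query_fields(
    "utm_source=yandex&utm_medium=cpc&page=2&utm_source=yandex"
)
# IIS writes "-" when a request has no query string; nothing is extracted then.
empty = extract_query_fields("-")
```

Keys outside the whitelist (such as `page` above) are ignored, so the table keeps a fixed set of columns no matter what the query string contains.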
In addition, several system fields have been added to the table to store information about systems, components and servers. These fields are described in the table below. We store the logs of several systems in a single table.
| Name | Description | Example |
|---|---|---|
| fld_app_name | Application/system name. Valid values: site1.domain.com (external site 1); site2.domain.com (external site 2); internal-site1.domain.local (internal site 1) | site1.domain.com |
| fld_app_module | System module. Valid values: web (website); svc (website web service); intgr (integration web service); bo (admin, BackOffice) | web |
| fld_website_name | Site name in IIS. Several systems, or even several instances of one system module, can be deployed on one server | web-main |
| fld_server_name | Server name | web1.domain.com |
| fld_log_file_name | Path to the log file on the server | C:\inetpub\logs\LogFiles\W3SVC1\u_ex190711.log |
This makes it possible to build graphs in Grafana efficiently. For example, you can view the requests coming from the frontend of a particular system. This is similar to the site counter in Yandex.Metrica.
Here are some statistics on the use of the database over two months.
Number of records broken down by systems and their components
SELECT
fld_app_name,
fld_app_module,
count(fld_app_name) AS rows_count
FROM log_web
GROUP BY
fld_app_name,
fld_app_module
WITH TOTALS
ORDER BY
fld_app_name ASC,
rows_count DESC
┌─fld_app_name─────┬─fld_app_module─┬─rows_count─┐
│ site1.domain.ru │ web │ 131441 │
│ site2.domain.ru │ web │ 1751081 │
│ site3.domain.ru │ web │ 106887543 │
│ site3.domain.ru │ svc │ 44908603 │
│ site3.domain.ru │ intgr │ 9813911 │
│ site4.domain.ru │ web │ 772095 │
│ site5.domain.ru │ web │ 17037221 │
│ site5.domain.ru │ intgr │ 838559 │
│ site5.domain.ru │ bo │ 7404 │
│ site6.domain.ru │ web │ 595877 │
│ site7.domain.ru │ web │ 27778858 │
└──────────────────┴────────────────┴────────────┘
Totals:
┌─fld_app_name─┬─fld_app_module─┬─rows_count─┐
│ │ │ 210522593 │
└──────────────┴────────────────┴────────────┘
11 rows in set. Elapsed: 4.874 sec. Processed 210.52 million rows, 421.67 MB (43.19 million rows/s., 86.51 MB/s.)
The amount of data on disk
SELECT
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
formatReadableSize(sum(data_compressed_bytes)) AS compressed,
sum(rows) AS total_rows
FROM system.parts
WHERE table = 'log_web'
┌─uncompressed─┬─compressed─┬─total_rows─┐
│ 54.50 GiB │ 4.86 GiB │ 211427094 │
└──────────────┴────────────┴────────────┘
1 rows in set. Elapsed: 0.035 sec.
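A quick back-of-the-envelope calculation on the figures above shows what they mean in practice. The numbers are taken from the query result; the only assumption is that GiB means 1024^3 bytes, which is how formatReadableSize reports sizes.

```python
GIB = 1024 ** 3

uncompressed_gib = 54.50   # from the query result above
compressed_gib = 4.86      # from the query result above
total_rows = 211_427_094   # from the query result above

# Overall compression ratio of the table.
compression_ratio = uncompressed_gib / compressed_gib

# Average number of bytes one log record occupies on disk.
bytes_per_row = compressed_gib * GIB / total_rows

print(f"compression ratio: {compression_ratio:.1f}x")
print(f"compressed bytes per row: {bytes_per_row:.1f}")
```

So the table as a whole compresses roughly elevenfold, and a full HTTP request record, user agent and referer included, takes about 25 bytes on disk.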
Degree of data compression by column
SELECT
name,
formatReadableSize(data_uncompressed_bytes) AS uncompressed,
formatReadableSize(data_compressed_bytes) AS compressed,
data_uncompressed_bytes / data_compressed_bytes AS compress_ratio
FROM system.columns
WHERE table = 'log_web'
┌─name───────────────────┬─uncompressed─┬─compressed─┬─────compress_ratio─┐
│ logdate │ 401.53 MiB │ 1.80 MiB │ 223.16665968777315 │
│ logdatetime │ 803.06 MiB │ 35.91 MiB │ 22.363966401202305 │
│ fld_log_file_name │ 220.66 MiB │ 2.60 MiB │ 84.99905736932571 │
│ fld_server_name │ 201.54 MiB │ 50.63 MiB │ 3.980924816977078 │
│ fld_app_name │ 201.17 MiB │ 969.17 KiB │ 212.55518183686877 │
│ fld_app_module │ 201.17 MiB │ 968.60 KiB │ 212.67805817411906 │
│ fld_website_name │ 201.54 MiB │ 1.24 MiB │ 162.7204926761546 │
│ serverIP │ 201.54 MiB │ 50.25 MiB │ 4.010824061219731 │
│ method │ 201.53 MiB │ 43.64 MiB │ 4.617721053304486 │
│ uriStem │ 5.13 GiB │ 832.51 MiB │ 6.311522291936919 │
│ uriQuery │ 2.58 GiB │ 501.06 MiB │ 5.269731450124478 │
│ port │ 803.06 MiB │ 3.98 MiB │ 201.91673864241824 │
│ username │ 318.08 MiB │ 26.93 MiB │ 11.812513794583598 │
│ clientIP │ 2.35 GiB │ 82.59 MiB │ 29.132328640073343 │
│ clientRealIP │ 2.49 GiB │ 465.05 MiB │ 5.478382297052563 │
│ userAgent │ 18.34 GiB │ 764.08 MiB │ 24.57905114484208 │
│ referer │ 14.71 GiB │ 1.37 GiB │ 10.736792723669906 │
│ response │ 803.06 MiB │ 83.81 MiB │ 9.582334090987247 │
│ subresponse │ 399.87 MiB │ 1.83 MiB │ 218.4831068635027 │
│ win32response │ 407.86 MiB │ 7.41 MiB │ 55.050315514606815 │
│ timetaken │ 1.57 GiB │ 402.06 MiB │ 3.9947395692010637 │
│ uriQuery__utm_medium │ 208.17 MiB │ 12.29 MiB │ 16.936148912472955 │
│ uriQuery__utm_source │ 215.18 MiB │ 13.00 MiB │ 16.548367623199912 │
│ uriQuery__utm_campaign │ 381.46 MiB │ 37.94 MiB │ 10.055156353418509 │
│ uriQuery__utm_term │ 231.82 MiB │ 10.78 MiB │ 21.502540454070672 │
│ uriQuery__utm_content │ 441.34 MiB │ 87.60 MiB │ 5.038260760449327 │
│ uriQuery__yclid │ 216.88 MiB │ 16.58 MiB │ 13.07721335008116 │
│ uriQuery__region │ 204.35 MiB │ 9.49 MiB │ 21.52661903446796 │
└────────────────────────┴──────────────┴────────────┴────────────────────┘
28 rows in set. Elapsed: 0.005 sec.
Description of the components used
FileBeat. Transferring file logs
This component tracks changes to log files on disk and passes the information to LogStash. It is installed on all servers where log files are written (usually IIS). It works in tail mode (that is, it transfers only the records appended to the file), but it can also be configured separately to transfer entire files. This is useful when you need to load data for previous months: just put the log file in a folder and it will be read in full.
When the service is stopped, data is no longer transferred to the storage.
An example configuration looks like this:
filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/W3SVC1/*.log
  exclude_files: ['.gz$','.zip$']
  tail_files: true
  ignore_older: 24h
  fields:
    fld_server_name: "site1.domain.ru"
    fld_app_name: "site1.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"

- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/__Import/access_log-*
  exclude_files: ['.gz$','.zip$']
  tail_files: false
  fields:
    fld_server_name: "site2.domain.ru"
    fld_app_name: "site2.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"
    fld_logformat: "logformat__apache"

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
  reload.period: 2s

output.logstash:
  hosts: ["log.domain.com:5044"]
  ssl.enabled: true
  ssl.certificate_authorities: ["C:/filebeat/certs/ca.pem", "C:/filebeat/certs/ca-issuing.pem"]
  ssl.certificate: "C:/filebeat/certs/site1.domain.ru.cer"
  ssl.key: "C:/filebeat/certs/site1.domain.ru.key"

#================================ Processors =====================================
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
LogStash. Log collector
This component receives log entries from FileBeat (or through the RabbitMQ queue), parses them and inserts them in batches into the ClickHouse database.
Insertion into ClickHouse is done with the Logstash-output-clickhouse plugin. The plugin has a request retry mechanism, but for a planned shutdown it is better to stop the service itself. While it is stopped, messages accumulate in the RabbitMQ queue, so if the stop is going to be long, it is better to stop the Filebeats on the servers as well. In the scheme where RabbitMQ is not used (on the local network Filebeat sends logs directly to Logstash), Filebeats behave acceptably and safely, so for them the unavailability of the output passes without consequences.
An example configuration looks like this:
log_web__filebeat_clickhouse.conf
input {
beats {
port => 5044
type => 'iis'
ssl => true
ssl_certificate_authorities => ["/etc/logstash/certs/ca.cer", "/etc/logstash/certs/ca-issuing.cer"]
ssl_certificate => "/etc/logstash/certs/server.cer"
ssl_key => "/etc/logstash/certs/server-pkcs8.key"
ssl_verify_mode => "peer"
add_field => {
"fld_server_name" => "%{[fields][fld_server_name]}"
"fld_app_name" => "%{[fields][fld_app_name]}"
"fld_app_module" => "%{[fields][fld_app_module]}"
"fld_website_name" => "%{[fields][fld_website_name]}"
"fld_log_file_name" => "%{source}"
"fld_logformat" => "%{[fields][fld_logformat]}"
}
}
rabbitmq {
host => "queue.domain.com"
port => 5671
user => "q-reader"
password => "password"
queue => "web_log"
heartbeat => 30
durable => true
ssl => true
#ssl_certificate_path => "/etc/logstash/certs/server.p12"
#ssl_certificate_password => "password"
add_field => {
"fld_server_name" => "%{[fields][fld_server_name]}"
"fld_app_name" => "%{[fields][fld_app_name]}"
"fld_app_module" => "%{[fields][fld_app_module]}"
"fld_website_name" => "%{[fields][fld_website_name]}"
"fld_log_file_name" => "%{source}"
"fld_logformat" => "%{[fields][fld_logformat]}"
}
}
}
filter {
if [message] =~ "^#" {
drop {}
}
if [fld_logformat] == "logformat__iis_with_xrealip" {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken} %{NOTSPACE:xrealIP} %{NOTSPACE:xforwarderfor}"]
}
} else {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken}"]
}
}
date {
match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
timezone => "Etc/UTC"
remove_field => [ "log_timestamp", "@timestamp" ]
target => [ "log_timestamp2" ]
}
ruby {
code => "tstamp = event.get('log_timestamp2').to_i
event.set('logdatetime', Time.at(tstamp).strftime('%Y-%m-%d %H:%M:%S'))
event.set('logdate', Time.at(tstamp).strftime('%Y-%m-%d'))"
}
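if [bytesSent] {
ruby {
# event.get/event.set is the event API of Logstash 5+ (direct event['field'] access was removed)
code => "event.set('kilobytesSent', event.get('bytesSent').to_i / 1024.0)"
}
}
if [bytesReceived] {
ruby {
code => "event.set('kilobytesReceived', event.get('bytesReceived').to_i / 1024.0)"
}
}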
ruby {
code => "event.set('clientRealIP', event.get('clientIP'))"
}
if [xrealIP] {
ruby {
code => "event.set('clientRealIP', event.get('xrealIP'))"
}
}
if [xforwarderfor] {
ruby {
code => "event.set('clientRealIP', event.get('xforwarderfor'))"
}
}
mutate {
convert => ["bytesSent", "integer"]
convert => ["bytesReceived", "integer"]
convert => ["timetaken", "integer"]
convert => ["port", "integer"]
add_field => {
"clientHostname" => "%{clientIP}"
}
}
useragent {
# field names are case-sensitive: grok above produces "userAgent"
source=> "userAgent"
prefix=> "browser"
}
kv {
source => "uriQuery"
prefix => "uriQuery__"
allow_duplicate_values => false
field_split => "&"
include_keys => [ "utm_medium", "utm_source", "utm_campaign", "utm_term", "utm_content", "yclid", "region" ]
}
mutate {
join => { "uriQuery__utm_source" => "," }
join => { "uriQuery__utm_medium" => "," }
join => { "uriQuery__utm_campaign" => "," }
join => { "uriQuery__utm_term" => "," }
join => { "uriQuery__utm_content" => "," }
join => { "uriQuery__yclid" => "," }
join => { "uriQuery__region" => "," }
}
}
output {
#stdout {codec => rubydebug}
clickhouse {
headers => ["Authorization", "Basic abcdsfks..."]
http_hosts => ["http://127.0.0.1:8123"]
save_dir => "/etc/logstash/tmp"
table => "log_web"
request_tolerance => 1
flush_size => 10000
idle_flush_time => 1
mutations => {
"fld_log_file_name" => "fld_log_file_name"
"fld_server_name" => "fld_server_name"
"fld_app_name" => "fld_app_name"
"fld_app_module" => "fld_app_module"
"fld_website_name" => "fld_website_name"
"logdatetime" => "logdatetime"
"logdate" => "logdate"
"serverIP" => "serverIP"
"method" => "method"
"uriStem" => "uriStem"
"uriQuery" => "uriQuery"
"port" => "port"
"username" => "username"
"clientIP" => "clientIP"
"clientRealIP" => "clientRealIP"
"userAgent" => "userAgent"
"referer" => "referer"
"response" => "response"
"subresponse" => "subresponse"
"win32response" => "win32response"
"timetaken" => "timetaken"
"uriQuery__utm_medium" => "uriQuery__utm_medium"
"uriQuery__utm_source" => "uriQuery__utm_source"
"uriQuery__utm_campaign" => "uriQuery__utm_campaign"
"uriQuery__utm_term" => "uriQuery__utm_term"
"uriQuery__utm_content" => "uriQuery__utm_content"
"uriQuery__yclid" => "uriQuery__yclid"
"uriQuery__region" => "uriQuery__region"
}
}
}
pipelines.yml
# This file is where you define your pipelines. You can define multiple.
# For more information on multiple pipelines, see the documentation:
# https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html
- pipeline.id: log_web__filebeat_clickhouse
  path.config: "/etc/logstash/log_web__filebeat_clickhouse.conf"
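For readers who do not read grok patterns fluently, the default pattern in the filter section above (the one without the X-Real-IP fields) can be approximated in Python as a plain whitespace split. This is a rough sketch for illustration, not a replacement for the pipeline: the function name and the sample log line are made up, and unlike grok it does not validate IP addresses or HTTP methods.

```python
# Field order after the date and time tokens, as in the default grok pattern above.
FIELDS = ["serverIP", "method", "uriStem", "uriQuery", "port", "username",
          "clientIP", "userAgent", "referer", "response", "subresponse",
          "win32response", "timetaken"]


def parse_iis_line(line):
    """Parse one IIS W3C log line into a dict, or return None if it does not fit."""
    if line.startswith("#"):
        return None  # header lines such as "#Fields: ..." are dropped
    tokens = line.split()
    if len(tokens) != len(FIELDS) + 2:  # date + time + the fields above
        return None
    date, time_of_day = tokens[0], tokens[1]
    row = dict(zip(FIELDS, tokens[2:]))
    row["logdate"] = date
    row["logdatetime"] = f"{date} {time_of_day}"
    try:
        # The pipeline converts these two fields to integers as well.
        row["port"] = int(row["port"])
        row["timetaken"] = int(row["timetaken"])
    except ValueError:
        return None
    return row


# Hypothetical sample line in the same field order.
sample = ("2019-07-11 10:15:32 10.0.0.5 GET /api/status - 443 - "
          "192.168.1.20 Mozilla/5.0+(Windows+NT+10.0) - 200 0 0 125")
row = parse_iis_line(sample)
header = parse_iis_line("#Fields: date time s-ip cs-method")
```

The split works because IIS replaces spaces inside field values (for example in the user agent) with plus signs, so every field is a single token.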
ClickHouse. Log storage
The logs of all systems are stored in one table (see the beginning of the article). It is intended for information about requests: all parameters are similar across formats such as IIS, apache and nginx logs. For application logs, which record, for example, errors, informational messages and warnings, a separate table with an appropriate structure will be provided (currently at the design stage).
When designing a table, it is very important to decide on the primary key (the key by which data is sorted when stored). The degree of data compression and the query speed depend on it. In our example, the key is
ORDER BY (fld_app_name, fld_app_module, logdatetime)
That is, the system name, the name of the system component and the date of the event. Initially the event date came first. After moving it to the last position, queries started to run about twice as fast. Changing the primary key requires recreating the table and reloading the data so that ClickHouse re-sorts the data on disk. This is a heavy operation, so it pays to think carefully in advance about what should go into the sort key.
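Why the order of columns in the sort key matters can be seen on a toy model. The Python sketch below sorts the same synthetic rows in two ways and counts how many runs of identical values the application-name column breaks into. It is an illustration of the principle only: the data is random, and ClickHouse's actual storage and index are far more elaborate.

```python
import random


def count_runs(values):
    """Count maximal runs of equal consecutive values."""
    runs = 0
    previous = object()  # sentinel that equals nothing
    for value in values:
        if value != previous:
            runs += 1
            previous = value
    return runs


random.seed(0)
apps = ["site1", "site2", "site3"]
modules = ["web", "svc"]
# (fld_app_name, fld_app_module, logdatetime) with a fake increasing timestamp.
rows = [(random.choice(apps), random.choice(modules), ts) for ts in range(10_000)]

time_first = sorted(rows, key=lambda r: (r[2], r[0], r[1]))
app_first = sorted(rows, key=lambda r: (r[0], r[1], r[2]))

runs_time_first = count_runs(r[0] for r in time_first)
runs_app_first = count_runs(r[0] for r in app_first)
```

With the timestamp first, the application names stay interleaved and the column splits into thousands of short runs. With the application name first, each application becomes one contiguous block, which compresses better and lets a filter on `fld_app_name` read a single range of rows.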
It is also worth noting that the LowCardinality data type appeared relatively recently. Using it drastically reduces the size of compressed data for fields that have low cardinality (few distinct values).
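LowCardinality is a dictionary-encoded type: each distinct string is stored once, and the column itself holds small integer codes. The Python sketch below shows the idea on a toy column. The function names are hypothetical, and the size estimate assumes one byte per code, which is a simplification rather than ClickHouse's on-disk format.

```python
def dictionary_encode(values):
    """Replace each string with an integer index into a list of distinct values."""
    dictionary = []
    index_of = {}
    codes = []
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index_of[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Restore the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


column = ["web", "svc", "web", "web", "intgr", "web", "svc"] * 1000
dictionary, codes = dictionary_encode(column)

raw_size = sum(len(value) for value in column)
# Simplifying assumption: one byte per code plus the dictionary itself.
encoded_size = sum(len(value) for value in dictionary) + len(codes)
```

Even before general-purpose compression is applied, the encoded column in this example is about a third of the raw size, and the effect grows with longer strings such as host names.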
We currently use version 19.6 and plan to try upgrading to the latest version, which brings such useful features as Adaptive Granularity, Skipping indices and the DoubleDelta codec.
By default, the logging level is set to trace during installation. Logs are rotated and archived, but they still grow up to a gigabyte. If there is no need for that, you can set the level to warning, and the log size drops sharply. The logging level is set in the config.xml file:
<!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>warning</level>
Some useful commands
Since the original installation packages are built for Debian, other Linux distributions need to use the packages built by Altinity.
This link has instructions with links to their repository: https://www.altinity.com/blog/2017/12/18/logstash-with-clickhouse
sudo yum search clickhouse-server
sudo yum install clickhouse-server.noarch
1. check the status
sudo systemctl status clickhouse-server
2. stop the server
sudo systemctl stop clickhouse-server
3. start the server
sudo systemctl start clickhouse-server
Start the client for executing queries in multi-line mode (a query is executed after the ";" character)
clickhouse-client --multiline
clickhouse-client --multiline --host 127.0.0.1 --password pa55w0rd
clickhouse-client --multiline --host 127.0.0.1 --port 9440 --secure --user default --password pa55w0rd
If there is an error in a single row, the ClickHouse plugin for Logstash saves the whole batch to the file /tmp/log_web_failed.json
You can fix this file by hand and try to load it into the database manually:
clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /tmp/log_web_failed__fixed.json
sudo mv /etc/logstash/tmp/log_web_failed.json /etc/logstash/tmp/log_web_failed__fixed.json
sudo chown user_dev /etc/logstash/tmp/log_web_failed__fixed.json
sudo clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /etc/logstash/tmp/log_web_failed__fixed.json
sudo mv /etc/logstash/tmp/log_web_failed__fixed.json /etc/logstash/tmp/log_web_failed__fixed_.json
exit the command line
quit;
## TLS setup
https://www.altinity.com/blog/2019/3/5/clickhouse-networking-part-2
openssl s_client -connect log.domain.com:9440 < /dev/null
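Before re-inserting a failed batch by hand, it helps to find out which rows are actually broken. The sketch below is a small Python helper for that, assuming the saved file contains one JSON object per line (which is what the JSONEachRow format in the INSERT commands above expects). The function name and the sample rows are hypothetical.

```python
import json


def split_failed_batch(lines):
    """Split JSONEachRow lines into (valid rows, problems with line numbers)."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            bad.append((number, line, str(exc)))
            continue
        if isinstance(row, dict):
            good.append(row)
        else:
            bad.append((number, line, "not a JSON object"))
    return good, bad


# Hypothetical batch: the second row is malformed JSON.
sample = [
    '{"logdate": "2019-07-11", "method": "GET", "port": 443}',
    '{"logdate": "2019-07-11", "method": "GET", "port": }',
    '',
    '{"logdate": "2019-07-11", "method": "POST", "port": 80}',
]
good, bad = split_failed_batch(sample)
```

In practice you would pass an open file instead of the `sample` list, fix or drop the lines reported in `bad`, and then load the corrected file with clickhouse-client as shown above. Note that this only checks JSON syntax; a row can still be rejected by ClickHouse for a type mismatch.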
LogStash. Log router from FileBeat to the RabbitMQ queue
This component routes the logs coming from FileBeat to the RabbitMQ queue. There are two points to note here:
- Unfortunately, FileBeat has no output plugin for writing directly to RabbitMQ. Judging by the issue on their github, such functionality is not planned. There is a plugin for Kafka, but for certain reasons we cannot use it in our environment.
- There are requirements for collecting logs in the DMZ. According to them, logs must first be put into a queue, and then LogStash reads the entries from the queue from outside.
So it is precisely for servers located in the DMZ that this slightly more complicated scheme has to be used. An example configuration looks like this:
iis_w3c_logs__filebeat_rabbitmq.conf
input {
beats {
port => 5044
type => 'iis'
ssl => true
ssl_certificate_authorities => ["/etc/pki/tls/certs/app/ca.pem", "/etc/pki/tls/certs/app/ca-issuing.pem"]
ssl_certificate => "/etc/pki/tls/certs/app/queue.domain.com.cer"
ssl_key => "/etc/pki/tls/certs/app/queue.domain.com-pkcs8.key"
ssl_verify_mode => "peer"
}
}
output {
#stdout {codec => rubydebug}
rabbitmq {
host => "127.0.0.1"
port => 5672
exchange => "monitor.direct"
exchange_type => "direct"
key => "%{[fields][fld_app_name]}"
user => "q-writer"
password => "password"
ssl => false
}
}
RabbitMQ. Message queue
This component buffers log entries in the DMZ. Writing is done through the Filebeat → LogStash chain. Reading is done from outside the DMZ via LogStash. When operating through RabbitMQ, about 4 thousand messages per second are processed.
Message routing is configured by system name, that is, based on the FileBeat configuration data. All messages go to one queue. If the queuing service is stopped for some reason, this does not lead to the loss of messages: the FileBeats receive connection errors and temporarily suspend sending, and the LogStash that reads from the queue also receives network errors and waits for the connection to be restored. In this case, of course, data is no longer written to the database.
The following commands are used to create and configure the queues:
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare exchange --vhost=/ name=monitor.direct type=direct
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare queue --vhost=/ name=web_log durable=true
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site1.domain.ru"
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site2.domain.ru"
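The routing set up by these commands can be modelled in a few lines of Python. A direct exchange delivers a message to every queue bound with exactly the same routing key as the message. The class below is a toy model for illustration, not a RabbitMQ client; the exchange, queue and key names are the ones from the commands above, and `site9.domain.ru` is a made-up key with no binding.

```python
from collections import defaultdict


class DirectExchange:
    """Toy model of a direct exchange: delivery on exact routing-key match."""

    def __init__(self):
        self._bindings = defaultdict(set)  # routing key -> names of bound queues
        self.queues = defaultdict(list)    # queue name -> delivered messages

    def bind(self, queue, routing_key):
        self._bindings[routing_key].add(queue)

    def publish(self, routing_key, message):
        """Deliver the message to all matching queues; return how many received it."""
        targets = self._bindings.get(routing_key, ())
        for queue in targets:
            self.queues[queue].append(message)
        return len(targets)


exchange = DirectExchange()  # plays the role of monitor.direct
exchange.bind("web_log", "site1.domain.ru")
exchange.bind("web_log", "site2.domain.ru")

delivered = [
    exchange.publish(key, f"log line from {key}")
    for key in ("site1.domain.ru", "site2.domain.ru", "site9.domain.ru")
]
```

The practical consequence is that every new system needs its own binding: a message whose routing key has no binding is not delivered to the queue.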
Grafana. Dashboards
This component is used to visualize monitoring data. It requires the ClickHouse datasource for Grafana 4.6+ plugin. We had to tweak the plugin a little to make SQL filters on the dashboard more efficient.
For example, we use variables, and if a variable is not set in the filter field, we would like the plugin not to generate a WHERE condition of the form ( uriStem = '' AND uriStem != '' ), because in that case ClickHouse still reads the uriStem column. We tried different options and in the end patched the plugin (the $valueIfEmpty macro) so that for an empty value it returns 1, without mentioning the column at all.
Now this query can be used for the graph
$columns(response, count(*) c) from $table where $adhoc
and $valueIfEmpty($fld_app_name, 1, fld_app_name = '$fld_app_name')
and $valueIfEmpty($fld_app_module, 1, fld_app_module = '$fld_app_module')
and $valueIfEmpty($fld_server_name, 1, fld_server_name = '$fld_server_name')
and $valueIfEmpty($uriStem, 1, uriStem like '%$uriStem%')
and $valueIfEmpty($clientRealIP, 1, clientRealIP = '$clientRealIP')
which translates to this SQL (note that the empty uriStem filter, like the other unset filters, has been replaced with just 1)
SELECT
t,
groupArray((response, c)) AS groupArr
FROM (
SELECT
(intDiv(toUInt32(logdatetime), 60) * 60) * 1000 AS t, response,
count(*) AS c FROM default.log_web
WHERE (logdate >= toDate(1565061982)) AND (logdatetime >= toDateTime(1565061982)) AND 1 AND (fld_app_name = 'site1.domain.ru') AND (fld_app_module = 'web') AND 1 AND 1 AND 1
GROUP BY
t, response
ORDER BY
t ASC,
response ASC
)
GROUP BY t ORDER BY t ASC
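The behaviour of the macro can be summarised with a small Python model. This is a sketch of the logic as described in the text, not the plugin's source code: the function names are hypothetical, and values are interpolated without any escaping, which is acceptable only in an illustration.

```python
def value_if_empty(variable_value, default, expression):
    """Return `default` if the dashboard variable is empty, else the filter expression."""
    if variable_value is None or variable_value.strip() == "":
        return str(default)
    return expression


def build_where(filters):
    """Combine per-column equality filters; empty ones collapse to the constant 1."""
    parts = []
    for column, value in filters.items():
        # Illustration only: real code must escape or parameterise the value.
        parts.append(value_if_empty(value, 1, f"{column} = '{value}'"))
    return " AND ".join(parts)


where = build_where({
    "fld_app_name": "site1.domain.ru",
    "fld_app_module": "web",
    "fld_server_name": "",
    "clientRealIP": "",
})
```

Because an unset filter turns into the constant 1, the corresponding column never appears in the query, and a columnar database does not have to read it from disk.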
Conclusion
The appearance of the ClickHouse database has become a landmark event in the market. It was hard to imagine that, completely free of charge, we would suddenly be armed with a powerful and practical tool for working with big data. Of course, as needs grow (for example, sharding and replication across several servers), the scheme will become more complicated. But the first impressions of working with this database are very pleasant. You can see that the product is made "for people."
Compared to ElasticSearch, the cost of storing and processing logs is estimated to be lower. In other words, where the current amount of data would require us to set up a cluster of several machines, with ClickHouse a single low-power machine is enough. Yes, ElasticSearch also has on-disk data compression mechanisms and other features that can significantly reduce resource consumption, but compared to ClickHouse it will still be more expensive.
Without any special optimizations on our part, on default settings, loading data into the database and selecting from it works at an amazing speed. We do not have much data yet (about 200 million records), but the server itself is weak. In the future we may use this tool for other purposes not related to storing logs, for example end-to-end analytics, security, or machine learning.
Finally, a little about the pros and cons.
Cons
- Loading records in large batches. On the one hand this is a feature, but you still have to use additional components to buffer records. This task is not always easy, but it is solvable. And we would like to simplify the scheme.
- Some exotic functionality or new features often break in new versions. This is a cause for concern and reduces the desire to upgrade. For example, the Kafka table engine is a very useful feature that lets you read events directly from Kafka without implementing consumers. But judging by the number of issues on github, we are still cautious about using this engine in production. However, if you avoid sudden moves and stick to the main functionality, it works stably.
Pros
- It is not slow.
- Low entry threshold.
- Open source.
- Free.
- Scales well (sharding/replication out of the box).
- Included in the register of Russian software recommended by the Ministry of Communications.
- Official support from Yandex.
Source: www.habr.com