Alexey Lizunov, head of the competence center for remote service channels, Information Technology Directorate, ICB
As an alternative to the ELK stack (ElasticSearch, Logstash, Kibana), we are researching the use of the ClickHouse database as a data store for logs.
In this article we would like to share our experience with the ClickHouse database and the first results from pilot operation. It is worth noting right away that the results were impressive.
Below we describe in more detail how our system is configured and what components it consists of. But first, a few words about the database as a whole and why it deserves attention. ClickHouse is a high-performance analytical columnar database from Yandex. It is used in Yandex services; initially it was the main data store for Yandex.Metrica. It is open source and free. As a developer, I always wondered how they did it, because the data volumes there are enormous, yet the Metrica user interface itself is very flexible and fast. When you first get to know this database, the impression is: "Well, finally! Made for people! From the installation process to sending queries."
This database has a very low entry threshold. Even a mid-level developer can install it in a few minutes and start using it. Everything works cleanly. Even people new to Linux can quickly handle the installation and perform simple operations. Previously, at the words Big Data, Hadoop, Google BigTable, HDFS, an ordinary developer imagined that these were about terabytes and petabytes, and that some superhumans were setting up and developing such systems. With the arrival of ClickHouse we got a simple, understandable tool that solves a range of tasks that were previously out of reach. All it takes is one fairly average machine and five minutes to install. That is, we got a database like, for example, MySQL, but for storing billions of records! A kind of super-archiver with an SQL language. It is as if people had been handed alien technology.
About our log collection system
To collect information, we use IIS log files of web applications in the standard format (we are also working on parsing application logs, but the main goal at the pilot stage is collecting IIS logs).
For various reasons we could not abandon the ELK stack completely, and we continue to use the LogStash and Filebeat components, which have proven themselves well and work reliably and predictably.
The general logging scheme is shown in the figure below:
A feature of writing data to ClickHouse is infrequent (once per second) insertion of records in large batches. This is apparently the most "problematic" part you run into when working with ClickHouse for the first time: the scheme becomes a little more complicated.
The LogStash plugin that inserts data directly into ClickHouse helped a lot here. This component is deployed on the same server as the database itself. Generally speaking, this is not recommended, but from a practical point of view it saves us from provisioning separate servers while everything fits on one. We have not observed any failures or resource conflicts with the database. In addition, the plugin has a retry mechanism for errors. When an error occurs, the plugin writes to disk the batch of data that could not be inserted (the file format is convenient: after editing, you can easily insert the corrected batch using clickhouse-client).
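The batching behaviour described above can be sketched in Python. This is a minimal illustration and not the plugin's actual code: the `BatchBuffer` class, its `sink` callback, and the `failed_path` file name are hypothetical, while `flush_size` and `idle_flush_time` simply borrow the names of the settings used in the Logstash output configuration later in the article.

```python
import json
import time
from typing import Callable, Dict, List

Row = Dict[str, object]


class BatchBuffer:
    """Collects rows and hands them to `sink` in large batches (illustrative only)."""

    def __init__(self, sink: Callable[[List[Row]], None],
                 flush_size: int = 10000, idle_flush_time: float = 1.0,
                 failed_path: str = "failed_batch.json") -> None:
        self.sink = sink
        self.flush_size = flush_size
        self.idle_flush_time = idle_flush_time
        self.failed_path = failed_path      # hypothetical file name
        self.rows: List[Row] = []
        self.batch_started = time.monotonic()

    def add(self, row: Row) -> None:
        if not self.rows:
            # The idle timer starts with the first row of a new batch.
            self.batch_started = time.monotonic()
        self.rows.append(row)
        waited = time.monotonic() - self.batch_started
        # Simplification: the time limit is only checked when a row arrives;
        # a real buffer would also flush from a background timer.
        if len(self.rows) >= self.flush_size or waited >= self.idle_flush_time:
            self.flush()

    def flush(self) -> None:
        if not self.rows:
            return
        batch, self.rows = self.rows, []
        try:
            self.sink(batch)
        except Exception:
            # Keep the rejected batch on disk, one JSON object per line,
            # so that it can be fixed and re-inserted by hand.
            with open(self.failed_path, "a", encoding="utf-8") as fh:
                for failed_row in batch:
                    fh.write(json.dumps(failed_row) + "\n")
```

Writing the rejected batch as one JSON object per line mirrors the JSONEachRow format used with clickhouse-client in the commands section.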
The complete list of software used in the scheme is presented in the table:
List of software used
Name | Description | Distribution link
--- | --- | ---
NGINX | Reverse proxy for restricting access by port and organizing authorization | Currently not used in the scheme
FileBeat | Transfer of file logs. |
LogStash | Log collector. Used to collect logs from FileBeat, and also to collect logs from the RabbitMQ queue (for servers located in the DMZ). |
Logstash-output-clickhouse | Logstash plugin for transferring logs to the ClickHouse database in batches | /usr/share/logstash/bin/logstash-plugin install logstash-output-clickhouse; /usr/share/logstash/bin/logstash-plugin install logstash-filter-prune; /usr/share/logstash/bin/logstash-plugin install logstash-filter-multiline
ClickHouse | Log storage | Note. Since August 2018, "normal" rpm builds for RHEL have appeared in the Yandex repository, so you can try using them. At the time of installation we used packages built by Altinity.
Grafana | Log visualization. Setting up dashboards | Redhat & Centos (64 Bit), latest version
ClickHouse datasource for Grafana 4.6+ | Grafana plugin with a ClickHouse data source |
LogStash | Log router from FileBeat to the RabbitMQ queue. Note. Unfortunately, FileBeat has no output directly to RabbitMQ, so an intermediate link in the form of Logstash is required. |
RabbitMQ | Message queue. This is the buffer for log entries in the DMZ |
Erlang Runtime (required for RabbitMQ) | Erlang runtime. Required for RabbitMQ to work |
The configuration of the server with the ClickHouse database is given in the following table:
Name | Value | Note
--- | --- | ---
Configuration | HDD: 40GB; RAM: 8GB; Processor: Core 2 2Ghz | Pay attention to the operating recommendations for the ClickHouse database
System-wide software | OS: Red Hat Enterprise Linux Server (Maipo); JRE (Java 8) |
As you can see, this is an ordinary workstation.
The structure of the table for storing logs is as follows:
log_web.sql
CREATE TABLE log_web (
logdate Date,
logdatetime DateTime CODEC(Delta, LZ4HC),
fld_log_file_name LowCardinality( String ),
fld_server_name LowCardinality( String ),
fld_app_name LowCardinality( String ),
fld_app_module LowCardinality( String ),
fld_website_name LowCardinality( String ),
serverIP LowCardinality( String ),
method LowCardinality( String ),
uriStem String,
uriQuery String,
port UInt32,
username LowCardinality( String ),
clientIP String,
clientRealIP String,
userAgent String,
referer String,
response String,
subresponse String,
win32response String,
timetaken UInt64
, uriQuery__utm_medium String
, uriQuery__utm_source String
, uriQuery__utm_campaign String
, uriQuery__utm_term String
, uriQuery__utm_content String
, uriQuery__yclid String
, uriQuery__region String
) Engine = MergeTree()
PARTITION BY toYYYYMM(logdate)
ORDER BY (fld_app_name, fld_app_module, logdatetime)
SETTINGS index_granularity = 8192;
We use the default values for partitioning (by month) and index granularity. All fields practically correspond to IIS log entries for HTTP requests. Note separately that there are dedicated fields for storing utm tags (they are parsed from the query string field at the stage of insertion into the table).
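How the utm tags end up in their own columns can be sketched in Python. This is only an approximation of what the kv and mutate/join filters in the Logstash configuration further below do; the function name `extract_utm` is hypothetical, and unlike the kv filter this sketch URL-decodes the values, which is an assumption made for convenience.

```python
from urllib.parse import parse_qsl

# The keys listed in include_keys of the kv filter.
UTM_KEYS = ("utm_medium", "utm_source", "utm_campaign",
            "utm_term", "utm_content", "yclid", "region")


def extract_utm(uri_query: str, prefix: str = "uriQuery__") -> dict:
    """Return {prefix + key: value} for the tracked keys of a raw query string."""
    values = {}
    for key, value in parse_qsl(uri_query, keep_blank_values=True):
        if key in UTM_KEYS:
            bucket = values.setdefault(prefix + key, [])
            if value not in bucket:      # drop repeated values of the same key
                bucket.append(value)
    # Several different values of one key are joined with a comma.
    return {name: ",".join(vals) for name, vals in values.items()}
```

A query string that carries none of the tracked keys, including the "-" placeholder that IIS writes for an empty query, yields an empty dictionary, so the corresponding columns simply stay empty.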
In addition, several system fields have been added to the table to store information about systems, components, and servers. See the table below for a description of these fields. We store logs for several systems in one table.
Name | Description | Example
--- | --- | ---
fld_app_name | Application/system name. Valid values: site1.domain.com (external site 1), site2.domain.com (external site 2), internal-site1.domain.local (internal site 1) | site1.domain.com
fld_app_module | System module. Valid values: web (website), svc (website web service), intgr (web integration service), bo (administrator, back office) | web
fld_website_name | Site name in IIS. Several systems can be deployed on one server, or even several instances of one system module | web-main
fld_server_name | Server name | web1.domain.com
fld_log_file_name | Path to the log file on the server | C:\inetpub\logs\LogFiles\W3SVC1\u_ex190711.log
This lets you build graphs in Grafana efficiently. For example, you can view requests from the frontend of a particular system. This is similar to a site counter in Yandex.Metrica.
Here are some statistics on database usage over two months.
Number of records by system and component
SELECT
fld_app_name,
fld_app_module,
count(fld_app_name) AS rows_count
FROM log_web
GROUP BY
fld_app_name,
fld_app_module
WITH TOTALS
ORDER BY
fld_app_name ASC,
rows_count DESC
┌─fld_app_name─────┬─fld_app_module─┬─rows_count─┐
│ site1.domain.ru │ web │ 131441 │
│ site2.domain.ru │ web │ 1751081 │
│ site3.domain.ru │ web │ 106887543 │
│ site3.domain.ru │ svc │ 44908603 │
│ site3.domain.ru │ intgr │ 9813911 │
│ site4.domain.ru │ web │ 772095 │
│ site5.domain.ru │ web │ 17037221 │
│ site5.domain.ru │ intgr │ 838559 │
│ site5.domain.ru │ bo │ 7404 │
│ site6.domain.ru │ web │ 595877 │
│ site7.domain.ru │ web │ 27778858 │
└──────────────────┴────────────────┴────────────┘
Totals:
┌─fld_app_name─┬─fld_app_module─┬─rows_count─┐
│ │ │ 210522593 │
└──────────────┴────────────────┴────────────┘
11 rows in set. Elapsed: 4.874 sec. Processed 210.52 million rows, 421.67 MB (43.19 million rows/s., 86.51 MB/s.)
Data volume on disk
SELECT
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
formatReadableSize(sum(data_compressed_bytes)) AS compressed,
sum(rows) AS total_rows
FROM system.parts
WHERE table = 'log_web'
┌─uncompressed─┬─compressed─┬─total_rows─┐
│ 54.50 GiB │ 4.86 GiB │ 211427094 │
└──────────────┴────────────┴────────────┘
1 rows in set. Elapsed: 0.035 sec.
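As a quick sanity check, the figures from the output above can be turned into an overall compression ratio and an average row size. The short Python calculation below uses only the numbers printed by the query; the variable names are illustrative.

```python
GIB = 1024 ** 3

# Figures copied from the system.parts query output above.
uncompressed_gib = 54.50
compressed_gib = 4.86
total_rows = 211_427_094

compression_ratio = uncompressed_gib / compressed_gib
bytes_per_row_on_disk = compressed_gib * GIB / total_rows
bytes_per_row_raw = uncompressed_gib * GIB / total_rows

print(f"compression ratio: {compression_ratio:.1f}x")
print(f"bytes per row on disk: {bytes_per_row_on_disk:.1f}")
print(f"bytes per row uncompressed: {bytes_per_row_raw:.1f}")
```

The data is compressed roughly elevenfold overall, and a single log record takes on the order of a few dozen bytes on disk.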
Data compression ratio by column
SELECT
name,
formatReadableSize(data_uncompressed_bytes) AS uncompressed,
formatReadableSize(data_compressed_bytes) AS compressed,
data_uncompressed_bytes / data_compressed_bytes AS compress_ratio
FROM system.columns
WHERE table = 'log_web'
┌─name───────────────────┬─uncompressed─┬─compressed─┬─────compress_ratio─┐
│ logdate │ 401.53 MiB │ 1.80 MiB │ 223.16665968777315 │
│ logdatetime │ 803.06 MiB │ 35.91 MiB │ 22.363966401202305 │
│ fld_log_file_name │ 220.66 MiB │ 2.60 MiB │ 84.99905736932571 │
│ fld_server_name │ 201.54 MiB │ 50.63 MiB │ 3.980924816977078 │
│ fld_app_name │ 201.17 MiB │ 969.17 KiB │ 212.55518183686877 │
│ fld_app_module │ 201.17 MiB │ 968.60 KiB │ 212.67805817411906 │
│ fld_website_name │ 201.54 MiB │ 1.24 MiB │ 162.7204926761546 │
│ serverIP │ 201.54 MiB │ 50.25 MiB │ 4.010824061219731 │
│ method │ 201.53 MiB │ 43.64 MiB │ 4.617721053304486 │
│ uriStem │ 5.13 GiB │ 832.51 MiB │ 6.311522291936919 │
│ uriQuery │ 2.58 GiB │ 501.06 MiB │ 5.269731450124478 │
│ port │ 803.06 MiB │ 3.98 MiB │ 201.91673864241824 │
│ username │ 318.08 MiB │ 26.93 MiB │ 11.812513794583598 │
│ clientIP │ 2.35 GiB │ 82.59 MiB │ 29.132328640073343 │
│ clientRealIP │ 2.49 GiB │ 465.05 MiB │ 5.478382297052563 │
│ userAgent │ 18.34 GiB │ 764.08 MiB │ 24.57905114484208 │
│ referer │ 14.71 GiB │ 1.37 GiB │ 10.736792723669906 │
│ response │ 803.06 MiB │ 83.81 MiB │ 9.582334090987247 │
│ subresponse │ 399.87 MiB │ 1.83 MiB │ 218.4831068635027 │
│ win32response │ 407.86 MiB │ 7.41 MiB │ 55.050315514606815 │
│ timetaken │ 1.57 GiB │ 402.06 MiB │ 3.9947395692010637 │
│ uriQuery__utm_medium │ 208.17 MiB │ 12.29 MiB │ 16.936148912472955 │
│ uriQuery__utm_source │ 215.18 MiB │ 13.00 MiB │ 16.548367623199912 │
│ uriQuery__utm_campaign │ 381.46 MiB │ 37.94 MiB │ 10.055156353418509 │
│ uriQuery__utm_term │ 231.82 MiB │ 10.78 MiB │ 21.502540454070672 │
│ uriQuery__utm_content │ 441.34 MiB │ 87.60 MiB │ 5.038260760449327 │
│ uriQuery__yclid │ 216.88 MiB │ 16.58 MiB │ 13.07721335008116 │
│ uriQuery__region │ 204.35 MiB │ 9.49 MiB │ 21.52661903446796 │
└────────────────────────┴──────────────┴────────────┴────────────────────┘
28 rows in set. Elapsed: 0.005 sec.
Description of the components used
FileBeat. Transferring file logs
This component watches for changes to log files on disk and passes the information to LogStash. It is installed on every server where log files are written (usually IIS). It works in tail mode (that is, it transfers only the records appended to a file). It can also be configured separately to transfer whole files. This is handy when you need to load data for previous months: just put the log file in the folder and it will be read in full.
When the service is stopped, data stops being transferred to the storage.
An example configuration looks like this:
filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/W3SVC1/*.log
  exclude_files: ['.gz$','.zip$']
  tail_files: true
  ignore_older: 24h
  fields:
    fld_server_name: "site1.domain.ru"
    fld_app_name: "site1.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"
- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/__Import/access_log-*
  exclude_files: ['.gz$','.zip$']
  tail_files: false
  fields:
    fld_server_name: "site2.domain.ru"
    fld_app_name: "site2.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"
    fld_logformat: "logformat__apache"
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
  reload.period: 2s
output.logstash:
  hosts: ["log.domain.com:5044"]
  ssl.enabled: true
  ssl.certificate_authorities: ["C:/filebeat/certs/ca.pem", "C:/filebeat/certs/ca-issuing.pem"]
  ssl.certificate: "C:/filebeat/certs/site1.domain.ru.cer"
  ssl.key: "C:/filebeat/certs/site1.domain.ru.key"
#================================ Processors =====================================
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
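The difference between the two inputs in this configuration (tail_files: true versus tail_files: false) can be illustrated with a small Python sketch. It is not how Filebeat is implemented; the `TailReader` class is hypothetical and only shows the idea of remembering a read offset per file.

```python
import os


class TailReader:
    """Returns only the lines appended to a file since the previous call."""

    def __init__(self, path: str, tail_files: bool = True) -> None:
        self.path = path
        # tail_files=True: start at the current end of the file, so only
        # records written from now on are shipped.
        # tail_files=False: start at the beginning and read the whole file.
        self.offset = os.path.getsize(path) if tail_files else 0

    def read_new_lines(self) -> list:
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        # Consume complete lines only; an unfinished last line is left
        # in place until its newline has been written.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()
```

With the offset starting at zero, dropping an old log file into the watched folder is enough for it to be read in full, which is what the import input relies on.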
LogStash. Log Collector
This component receives log records from FileBeat (or via the RabbitMQ queue), parses them, and inserts them in batches into the ClickHouse database.
For insertion into ClickHouse, the Logstash-output-clickhouse plugin is used. The Logstash plugin has a request retry mechanism, but for a planned shutdown it is better to stop the service itself. While it is stopped, messages accumulate in the RabbitMQ queue, so if the stop is long, it is better to stop the Filebeats on the servers as well. In a scheme where RabbitMQ is not used (on the local network Filebeat sends logs directly to Logstash), the Filebeats behave quite acceptably and safely, so the unavailability of the output has no consequences for them.
An example configuration looks like this:
log_web__filebeat_clickhouse.conf
input {
beats {
port => 5044
type => 'iis'
ssl => true
ssl_certificate_authorities => ["/etc/logstash/certs/ca.cer", "/etc/logstash/certs/ca-issuing.cer"]
ssl_certificate => "/etc/logstash/certs/server.cer"
ssl_key => "/etc/logstash/certs/server-pkcs8.key"
ssl_verify_mode => "peer"
add_field => {
"fld_server_name" => "%{[fields][fld_server_name]}"
"fld_app_name" => "%{[fields][fld_app_name]}"
"fld_app_module" => "%{[fields][fld_app_module]}"
"fld_website_name" => "%{[fields][fld_website_name]}"
"fld_log_file_name" => "%{source}"
"fld_logformat" => "%{[fields][fld_logformat]}"
}
}
rabbitmq {
host => "queue.domain.com"
port => 5671
user => "q-reader"
password => "password"
queue => "web_log"
heartbeat => 30
durable => true
ssl => true
#ssl_certificate_path => "/etc/logstash/certs/server.p12"
#ssl_certificate_password => "password"
add_field => {
"fld_server_name" => "%{[fields][fld_server_name]}"
"fld_app_name" => "%{[fields][fld_app_name]}"
"fld_app_module" => "%{[fields][fld_app_module]}"
"fld_website_name" => "%{[fields][fld_website_name]}"
"fld_log_file_name" => "%{source}"
"fld_logformat" => "%{[fields][fld_logformat]}"
}
}
}
filter {
if [message] =~ "^#" {
drop {}
}
if [fld_logformat] == "logformat__iis_with_xrealip" {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken} %{NOTSPACE:xrealIP} %{NOTSPACE:xforwarderfor}"]
}
} else {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken}"]
}
}
date {
match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
timezone => "Etc/UTC"
remove_field => [ "log_timestamp", "@timestamp" ]
target => [ "log_timestamp2" ]
}
ruby {
code => "tstamp = event.get('log_timestamp2').to_i
event.set('logdatetime', Time.at(tstamp).strftime('%Y-%m-%d %H:%M:%S'))
event.set('logdate', Time.at(tstamp).strftime('%Y-%m-%d'))"
}
if [bytesSent] {
ruby {
code => "event.set('kilobytesSent', event.get('bytesSent').to_i / 1024.0)"
}
}
if [bytesReceived] {
ruby {
code => "event.set('kilobytesReceived', event.get('bytesReceived').to_i / 1024.0)"
}
}
ruby {
code => "event.set('clientRealIP', event.get('clientIP'))"
}
if [xrealIP] {
ruby {
code => "event.set('clientRealIP', event.get('xrealIP'))"
}
}
if [xforwarderfor] {
ruby {
code => "event.set('clientRealIP', event.get('xforwarderfor'))"
}
}
mutate {
convert => ["bytesSent", "integer"]
convert => ["bytesReceived", "integer"]
convert => ["timetaken", "integer"]
convert => ["port", "integer"]
add_field => {
"clientHostname" => "%{clientIP}"
}
}
useragent {
source=> "userAgent"
prefix=> "browser"
}
kv {
source => "uriQuery"
prefix => "uriQuery__"
allow_duplicate_values => false
field_split => "&"
include_keys => [ "utm_medium", "utm_source", "utm_campaign", "utm_term", "utm_content", "yclid", "region" ]
}
mutate {
join => { "uriQuery__utm_source" => "," }
join => { "uriQuery__utm_medium" => "," }
join => { "uriQuery__utm_campaign" => "," }
join => { "uriQuery__utm_term" => "," }
join => { "uriQuery__utm_content" => "," }
join => { "uriQuery__yclid" => "," }
join => { "uriQuery__region" => "," }
}
}
output {
#stdout {codec => rubydebug}
clickhouse {
headers => ["Authorization", "Basic abcdsfks..."]
http_hosts => ["http://127.0.0.1:8123"]
save_dir => "/etc/logstash/tmp"
table => "log_web"
request_tolerance => 1
flush_size => 10000
idle_flush_time => 1
mutations => {
"fld_log_file_name" => "fld_log_file_name"
"fld_server_name" => "fld_server_name"
"fld_app_name" => "fld_app_name"
"fld_app_module" => "fld_app_module"
"fld_website_name" => "fld_website_name"
"logdatetime" => "logdatetime"
"logdate" => "logdate"
"serverIP" => "serverIP"
"method" => "method"
"uriStem" => "uriStem"
"uriQuery" => "uriQuery"
"port" => "port"
"username" => "username"
"clientIP" => "clientIP"
"clientRealIP" => "clientRealIP"
"userAgent" => "userAgent"
"referer" => "referer"
"response" => "response"
"subresponse" => "subresponse"
"win32response" => "win32response"
"timetaken" => "timetaken"
"uriQuery__utm_medium" => "uriQuery__utm_medium"
"uriQuery__utm_source" => "uriQuery__utm_source"
"uriQuery__utm_campaign" => "uriQuery__utm_campaign"
"uriQuery__utm_term" => "uriQuery__utm_term"
"uriQuery__utm_content" => "uriQuery__utm_content"
"uriQuery__yclid" => "uriQuery__yclid"
"uriQuery__region" => "uriQuery__region"
}
}
}
pipelines.yml
# This file is where you define your pipelines. You can define multiple.
# For more information on multiple pipelines, see the documentation:
# https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html
- pipeline.id: log_web__filebeat_clickhouse
  path.config: "/etc/logstash/log_web__filebeat_clickhouse.conf"
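What the filter section of the pipeline does to a single IIS record can be approximated in Python. This is a simplified sketch and not a replacement for the grok patterns: the function name `parse_iis_line` is hypothetical, fields are split on whitespace instead of being validated by type, and the choice to ignore a "-" placeholder in the extra IP fields is an assumption of this sketch.

```python
from datetime import datetime

# Field order after the date and time tokens, as in the grok pattern.
FIELDS = ("serverIP", "method", "uriStem", "uriQuery", "port", "username",
          "clientIP", "userAgent", "referer", "response", "subresponse",
          "win32response", "timetaken")


def parse_iis_line(line: str):
    """Parse one W3C log line into a dict, or return None if it does not fit."""
    if line.startswith("#"):
        return None                      # header lines are dropped
    parts = line.split()
    # date + time + 13 fields, optionally followed by xrealIP and xforwarderfor
    if len(parts) not in (15, 17):
        return None
    event = dict(zip(FIELDS, parts[2:15]))
    ts = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
    event["logdate"] = ts.strftime("%Y-%m-%d")
    event["logdatetime"] = ts.strftime("%Y-%m-%d %H:%M:%S")
    for name in ("port", "timetaken"):
        event[name] = int(event[name])
    # clientRealIP defaults to clientIP and is overridden by the proxy headers.
    event["clientRealIP"] = event["clientIP"]
    if len(parts) == 17:
        for candidate in parts[15:17]:   # xrealIP first, then xforwarderfor
            if candidate != "-":         # assumption: "-" means "not set"
                event["clientRealIP"] = candidate
    return event
```

The resulting dictionary uses the same names as the columns of log_web, which is what the mutations block of the ClickHouse output maps one to one.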
ClickHouse. Log storage
Logs for all systems are stored in one table (see the beginning of the article). It is designed to store information about requests: all parameters are similar across formats such as IIS, Apache, and nginx logs. For application logs, which record, for example, errors, informational messages, and warnings, a separate table with an appropriate structure will be provided (currently at the design stage).
When designing a table, it is very important to decide on the primary key (by which the data will be sorted during storage). The degree of data compression and the query speed depend on it. In our example, the key is
ORDER BY (fld_app_name, fld_app_module, logdatetime)
That is, by the system name, the system component name, and the event date. Initially the event date came first. After moving it to the last position, queries started running about twice as fast. Changing the primary key requires recreating the table and reloading the data so that ClickHouse re-sorts the data on disk. This is a heavy operation, so it is advisable to think carefully in advance about what should go into the sort key.
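Why the order of columns in the sort key matters can be shown with a toy model in Python. This is a deliberately simplified sketch, not a description of ClickHouse internals: rows are sorted by the key and cut into fixed-size granules, and we count how many granules contain rows of one system. The granule size of 4 and the generated rows are made up for illustration.

```python
GRANULE = 4  # tiny stand-in for index_granularity = 8192


def granules_touched(rows, sort_key, app):
    """Count the granules that hold at least one row of the given system."""
    ordered = sorted(rows, key=sort_key)
    touched = set()
    for position, row in enumerate(ordered):
        if row[0] == app:
            touched.add(position // GRANULE)
    return len(touched)


apps = ["site1", "site2", "site3", "site4"]
# Each row is (fld_app_name, logdatetime); systems log in an interleaved way.
rows = [(apps[t % len(apps)], t) for t in range(64)]

time_first = granules_touched(rows, lambda r: (r[1], r[0]), "site1")
app_first = granules_touched(rows, lambda r: (r[0], r[1]), "site1")

print("granules read with time first:", time_first)
print("granules read with system first:", app_first)
```

With the time column first, rows of every system are spread across all granules, so a query for one system has to read all of them; with the system name first, its rows sit next to each other and far fewer granules are read.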
It should also be noted that the LowCardinality data type appeared in relatively recent versions. With it, the size of compressed data shrinks dramatically for fields that have low cardinality (few distinct values).
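The general idea behind storing low-cardinality strings, dictionary encoding, can be sketched in Python. This only illustrates the concept; it is not ClickHouse's storage format, and the function names are hypothetical.

```python
def dictionary_encode(values):
    """Replace repeated strings with small integer indices into a dictionary."""
    dictionary = []      # distinct values in order of first appearance
    positions = {}       # value -> index in the dictionary
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Restore the original column from the dictionary and the indices."""
    return [dictionary[i] for i in indices]
```

For a column such as fld_app_module, which holds only a handful of distinct values across hundreds of millions of rows, almost all of the stored data becomes small integers.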
We currently use version 19.6 and plan to try upgrading to the latest version. It offers great features such as Adaptive Granularity, Skipping indices, and the DoubleDelta codec, for example.
By default, the logging level in the configuration is set to trace during installation. Logs are rotated and archived, but they still grow to a gigabyte. If there is no need for that, you can set the level to warning, and the log size shrinks sharply. The logging settings are specified in the config.xml file:
<!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>warning</level>
Some useful commands
Since the original installation packages are built for Debian, other Linux distributions need the packages built by Altinity.
This link has instructions with links to their repository: https://www.altinity.com/blog/2017/12/18/logstash-with-clickhouse
sudo yum search clickhouse-server
sudo yum install clickhouse-server.noarch
1. check the status
sudo systemctl status clickhouse-server
2. stop the server
sudo systemctl stop clickhouse-server
3. start the server
sudo systemctl start clickhouse-server
Start the client for running queries in multiline mode (a query is executed after ";")
clickhouse-client --multiline
clickhouse-client --multiline --host 127.0.0.1 --password pa55w0rd
clickhouse-client --multiline --host 127.0.0.1 --port 9440 --secure --user default --password pa55w0rd
When a single row contains an error, the ClickHouse plugin for Logstash saves the whole batch to the file /tmp/log_web_failed.json
You can fix this file manually and try to load it into the database by hand:
clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /tmp/log_web_failed__fixed.json
sudo mv /etc/logstash/tmp/log_web_failed.json /etc/logstash/tmp/log_web_failed__fixed.json
sudo chown user_dev /etc/logstash/tmp/log_web_failed__fixed.json
sudo clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /etc/logstash/tmp/log_web_failed__fixed.json
sudo mv /etc/logstash/tmp/log_web_failed__fixed.json /etc/logstash/tmp/log_web_failed__fixed_.json
exit the command line
quit;
## TLS setup
https://www.altinity.com/blog/2019/3/5/clickhouse-networking-part-2
openssl s_client -connect log.domain.com:9440 < /dev/null
LogStash. Log router from FileBeat to the RabbitMQ queue
This component routes logs coming from FileBeat to the RabbitMQ queue. There are two points here:
- Unfortunately, FileBeat has no output plugin for writing directly to RabbitMQ. And judging by the issue on their GitHub, such functionality is not planned. There is a plugin for Kafka, but for certain reasons we cannot use it ourselves.
- There are requirements for collecting logs in the DMZ. According to them, logs must first be placed in a queue, and then LogStash reads records from the queue from outside.
Therefore, specifically for servers located in the DMZ, we have to use this slightly more complicated scheme. An example configuration looks like this:
iis_w3c_logs__filebeat_rabbitmq.conf
input {
beats {
port => 5044
type => 'iis'
ssl => true
ssl_certificate_authorities => ["/etc/pki/tls/certs/app/ca.pem", "/etc/pki/tls/certs/app/ca-issuing.pem"]
ssl_certificate => "/etc/pki/tls/certs/app/queue.domain.com.cer"
ssl_key => "/etc/pki/tls/certs/app/queue.domain.com-pkcs8.key"
ssl_verify_mode => "peer"
}
}
output {
#stdout {codec => rubydebug}
rabbitmq {
host => "127.0.0.1"
port => 5672
exchange => "monitor.direct"
exchange_type => "direct"
key => "%{[fields][fld_app_name]}"
user => "q-writer"
password => "password"
ssl => false
}
}
RabbitMQ. Message queue
This component buffers log entries in the DMZ. Writing is done through the Filebeat → LogStash chain. Reading is done from outside the DMZ through LogStash. When working through RabbitMQ, about 4 thousand messages per second are processed.
Message routing is configured by system name, that is, based on the FileBeat configuration data. All messages go into one queue. If the queue service is stopped for some reason, this does not lead to message loss: the FileBeats receive connection errors and temporarily suspend sending. And LogStash, which reads from the queue, also receives network errors and waits for the connection to be restored. In that case, of course, data is no longer written to the database.
The following commands are used to create and configure the queues:
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare exchange --vhost=/ name=monitor.direct type=direct
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare queue --vhost=/ name=web_log durable=true
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site1.domain.ru"
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site2.domain.ru"
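The effect of these bindings can be illustrated with a tiny in-memory model of a direct exchange in Python. It is a conceptual sketch only and does not talk to RabbitMQ; the `DirectExchange` class is hypothetical, while the queue name and routing keys are taken from the commands above.

```python
from collections import defaultdict


class DirectExchange:
    """In-memory model: a message goes to every queue bound with its exact key."""

    def __init__(self) -> None:
        self.bindings = defaultdict(list)   # routing key -> bound queue names
        self.queues = defaultdict(list)     # queue name -> stored messages

    def bind(self, queue: str, routing_key: str) -> None:
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key: str, message: str) -> int:
        """Deliver the message and return the number of queues that received it."""
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            self.queues[queue].append(message)
        return len(targets)


exchange = DirectExchange()
exchange.bind("web_log", "site1.domain.ru")
exchange.bind("web_log", "site2.domain.ru")
```

Because the Logstash output publishes with the system name as the routing key, logs of a system without a binding are not routed to the queue, so each new system needs its own binding.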
Grafana. Dashboards
This component is used to visualize monitoring data. For this you need to install the ClickHouse datasource plugin for Grafana 4.6+. We had to tweak it slightly to make the handling of SQL filters on the dashboard more efficient.
For example, we use variables, and if they are not set in the filter field, we would like the plugin not to generate a condition in the WHERE clause of the form (uriStem = '' AND uriStem != ''). In that case ClickHouse would still read the uriStem column. So we tried different options and eventually fixed the plugin (the $valueIfEmpty macro) so that for an empty value it returns 1, without mentioning the column itself.
And now you can use this query for the graph
$columns(response, count(*) c) from $table where $adhoc
and $valueIfEmpty($fld_app_name, 1, fld_app_name = '$fld_app_name')
and $valueIfEmpty($fld_app_module, 1, fld_app_module = '$fld_app_module')
and $valueIfEmpty($fld_server_name, 1, fld_server_name = '$fld_server_name')
and $valueIfEmpty($uriStem, 1, uriStem like '%$uriStem%')
and $valueIfEmpty($clientRealIP, 1, clientRealIP = '$clientRealIP')
which is translated into SQL like this (note that the empty uriStem fields have been converted to just 1)
SELECT
t,
groupArray((response, c)) AS groupArr
FROM (
SELECT
(intDiv(toUInt32(logdatetime), 60) * 60) * 1000 AS t, response,
count(*) AS c FROM default.log_web
WHERE (logdate >= toDate(1565061982)) AND (logdatetime >= toDateTime(1565061982)) AND 1 AND (fld_app_name = 'site1.domain.ru') AND (fld_app_module = 'web') AND 1 AND 1 AND 1
GROUP BY
t, response
ORDER BY
t ASC,
response ASC
)
GROUP BY t ORDER BY t ASC
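The behaviour of the $valueIfEmpty macro can be modelled in a few lines of Python. This is a sketch of the idea rather than the plugin's code: the function names are hypothetical, and the string interpolation is for illustration only (values coming from users must be escaped or passed as parameters in real code).

```python
def value_if_empty(value: str, if_empty: str, otherwise: str) -> str:
    """Return `if_empty` when the dashboard variable is blank, else the condition."""
    return if_empty if value == "" else otherwise


def build_where(fld_app_name: str, uri_stem: str) -> str:
    """Build a WHERE fragment that never mentions a column with a blank filter."""
    conditions = [
        value_if_empty(fld_app_name, "1", f"fld_app_name = '{fld_app_name}'"),
        value_if_empty(uri_stem, "1", f"uriStem like '%{uri_stem}%'"),
    ]
    return " AND ".join(conditions)
```

Since a blank filter collapses to the constant 1, the column does not appear in the query at all, so ClickHouse has no reason to read it from disk.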
Conclusion
The appearance of the ClickHouse database has become a landmark event on the market. It was hard to imagine that, in an instant and completely free of charge, we would be armed with a powerful and practical tool for working with big data. Of course, as needs grow (for example, sharding and replication across multiple servers), the scheme will become more complicated. But on first impressions, working with this database is very pleasant. It is clear that the product is made "for people".
Compared with ElasticSearch, the cost of storing and processing logs is, by preliminary estimates, reduced five to ten times. In other words, where the current data volume would require a cluster of several machines, with ClickHouse we need only one low-powered machine. Yes, of course, ElasticSearch also has on-disk data compression mechanisms and other features that can significantly reduce resource consumption, but compared with ClickHouse this is more costly.
Without any special optimizations on our side, with default settings, loading data and querying the database work at an amazing speed. We do not have much data yet (about 200 million records), but the server itself is weak. In the future we may use this tool for other purposes unrelated to log storage. For example, for end-to-end analytics, in security, or in machine learning.
Finally, a little about the pros and cons.
Cons
- Records are loaded in large batches. On the one hand this is a feature, but you still have to use additional components to buffer records. This task is not always simple, but it is solvable. And we would like to simplify the scheme.
- Some exotic or new features often break in new versions. This raises concerns and reduces the desire to upgrade to a new version. For example, the Kafka table engine is a very useful feature that lets you read events directly from Kafka without implementing consumers. But judging by the number of issues on GitHub, we are still wary of using this engine in production. However, if you avoid sudden moves and stick to the core functionality, it works stably.
Pros
- It does not slow down.
- Low entry threshold.
- Open source.
- Free.
- Scalable (sharding/replication out of the box)
- Included in the register of Russian software recommended by the Ministry of Communications.
- Official support from Yandex.
Source: www.habr.com