ClickHouse Database for Humans, or Alien Technologies

Aleksey Lizunov, Head of the Competence Center for Remote Service Channels, Directorate of Information Technologies, MKB


As an alternative to the ELK stack (ElasticSearch, Logstash, Kibana), we are researching the use of the ClickHouse database as a data store for logs.

In this article we would like to talk about our experience with the ClickHouse database and the preliminary results of the pilot operation. It is worth noting right away that the results were impressive.



Below we describe in more detail how our system is configured and what components it consists of. But first I would like to say a little about this database as a whole, and why it deserves attention. ClickHouse is a high-performance analytical columnar database from Yandex. It is used in Yandex services; originally it was the main data store for Yandex.Metrica. The system is open source and free. As a developer, I always wondered how they implemented it, because the data volumes there are enormous. And the Metrica user interface itself is very flexible and fast. The first impression on getting to know this database is: "Well, finally! Made for people!", from the installation process all the way to sending queries.

This database has a very low entry threshold. Even a developer of average skill can install it in a few minutes and start using it. Everything works smoothly. Even people who are new to Linux can quickly handle the installation and perform the simplest operations. Previously, words like Big Data, Hadoop, Google BigTable and HDFS made an ordinary developer think of terabytes and petabytes, and of some superhumans who set up and develop such systems. With the arrival of ClickHouse we got a simple, understandable tool that solves a range of tasks that were previously out of reach. All it takes is one fairly average machine and five minutes to install. That is, we got a database like, for example, MySQL, but one that stores billions of records. A kind of super-storage with the SQL language. It is as if people had been handed the weapons of aliens.

About our logging system

To collect information, we use IIS log files of web applications in the standard format (we are also currently working on parsing application logs, but the main goal at the pilot stage is collecting IIS logs).

For various reasons we could not abandon the ELK stack completely, and we continue to use the LogStash and Filebeat components, which have proven themselves well and work reliably and predictably.

The general logging scheme is shown in the figure below:

[Figure: general logging scheme]

A feature of writing data to ClickHouse is infrequent (once per second) insertion of records in large batches. This appears to be the most "problematic" part you run into when you first work with ClickHouse: the scheme becomes a little more complicated. The plugin for LogStash, which inserts data directly into ClickHouse, helped a lot here. This component is deployed on the same server as the database itself. Generally speaking, this is not recommended, but from a practical point of view it saved us from setting up separate servers for now. We did not observe any failures or resource conflicts with the database. In addition, the plugin has a retry mechanism in case of errors. And when errors do occur, the plugin writes to disk the batch of data that could not be inserted (the file format is convenient: after editing, you can easily insert the corrected batch using clickhouse-client).
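The batching behaviour described above can be sketched roughly as follows. This is not the plugin's code: the class name `BatchBuffer`, the `send` callback and the `retries` parameter are hypothetical, and only the ideas (flush by size or by idle time, retry, write the failed batch to disk one JSON object per line) come from the text.

```python
import json
import time
from typing import Callable, Dict, List

Row = Dict[str, object]


class BatchBuffer:
    """Collects rows and hands them to `send` in large, infrequent batches."""

    def __init__(self, send: Callable[[List[Row]], None], failed_path: str,
                 flush_size: int = 10000, idle_flush_time: float = 1.0,
                 retries: int = 3) -> None:
        self.send = send                  # hypothetical callback doing the INSERT
        self.failed_path = failed_path    # where batches that could not be sent go
        self.flush_size = flush_size
        self.idle_flush_time = idle_flush_time
        self.retries = retries
        self.rows: List[Row] = []
        self.last_flush = time.monotonic()

    def add(self, row: Row) -> None:
        self.rows.append(row)
        too_big = len(self.rows) >= self.flush_size
        too_old = time.monotonic() - self.last_flush >= self.idle_flush_time
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        batch, self.rows = self.rows, []
        self.last_flush = time.monotonic()
        if not batch:
            return
        for _ in range(self.retries):
            try:
                self.send(batch)
                return
            except Exception:
                continue  # a real implementation would wait before retrying
        # Every attempt failed: keep the batch on disk, one JSON object per line
        with open(self.failed_path, "a", encoding="utf-8") as fh:
            for row in batch:
                fh.write(json.dumps(row) + "\n")
```

With `flush_size=10000` and `idle_flush_time=1`, the defaults assumed here mirror the values used in the output configuration later in the article.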

A complete list of the software used in the scheme is given in the table:

List of software used (name, description, distribution link)

NGINX
    Description: Reverse proxy to restrict access by port and to organize authorization. Currently not used in the scheme.
    Distribution link:
        https://nginx.org/ru/download.html
        https://nginx.org/download/nginx-1.16.0.tar.gz

FileBeat
    Description: Transfer of file logs.
    Distribution link:
        https://www.elastic.co/downloads/beats/filebeat (distribution for Windows 64bit)
        https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.3.0-windows-x86_64.zip

LogStash
    Description: Log collector. Used to collect logs from FileBeat, as well as to collect logs from the RabbitMQ queue (for servers located in the DMZ).
    Distribution link:
        https://www.elastic.co/products/logstash
        https://artifacts.elastic.co/downloads/logstash/logstash-7.0.1.rpm

Logstash-output-clickhouse
    Description: Logstash plugin for transferring logs to the ClickHouse database in batches.
    Distribution link:
        https://github.com/mikechris/logstash-output-clickhouse
        /usr/share/logstash/bin/logstash-plugin install logstash-output-clickhouse
        /usr/share/logstash/bin/logstash-plugin install logstash-filter-prune
        /usr/share/logstash/bin/logstash-plugin install logstash-filter-multiline

ClickHouse
    Description: Log storage. https://clickhouse.yandex/docs/ru/
    Distribution link:
        https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-server-19.5.3.8-1.el7.x86_64.rpm
        https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-client-19.5.3.8-1.el7.x86_64.rpm
    Note: Since August 2018, "normal" rpm builds for RHEL have been available in the Yandex repository, so you can try using them. At the time of installation we used packages built by Altinity.

Grafana
    Description: Log visualization. Setting up dashboards.
    Distribution link:
        https://grafana.com/
        https://grafana.com/grafana/download
        Redhat & Centos (64 Bit), latest version

ClickHouse datasource for Grafana 4.6+
    Description: Plugin for Grafana with a ClickHouse data source.
    Distribution link:
        https://grafana.com/plugins/vertamedia-clickhouse-datasource
        https://grafana.com/api/plugins/vertamedia-clickhouse-datasource/versions/1.8.1/download

LogStash
    Description: Log router from FileBeat to the RabbitMQ queue.
    Note: Unfortunately, FileBeat has no output directly to RabbitMQ, so an intermediate link in the form of Logstash is required.
    Distribution link:
        https://www.elastic.co/products/logstash
        https://artifacts.elastic.co/downloads/logstash/logstash-7.0.1.rpm

RabbitMQ
    Description: Message queue. This is the log buffer in the DMZ.
    Distribution link:
        https://www.rabbitmq.com/download.html
        https://github.com/rabbitmq/rabbitmq-server/releases/download/v3.7.14/rabbitmq-server-3.7.14-1.el7.noarch.rpm

Erlang Runtime (required for RabbitMQ)
    Description: Erlang runtime. Required for RabbitMQ to work.
    Distribution link:
        http://www.erlang.org/download.html
        https://www.rabbitmq.com/install-rpm.html#install-erlang http://www.erlang.org/downloads/21.3

The configuration of the server with the ClickHouse database is given in the table below (name, value, note):

Configuration
    Value: HDD: 40GB, RAM: 8GB, Processor: Core 2 2Ghz
    Note: Pay attention to the tips for operating the ClickHouse database (https://clickhouse.yandex/docs/ru/operations/tips/)

General system software
    Value: OS: Red Hat Enterprise Linux Server (Maipo), JRE (Java 8)

As you can see, this is an ordinary workstation.

The structure of the table for storing logs is as follows:

log_web.sql

CREATE TABLE log_web (
  logdate Date,
  logdatetime DateTime CODEC(Delta, LZ4HC),
   
  fld_log_file_name LowCardinality( String ),
  fld_server_name LowCardinality( String ),
  fld_app_name LowCardinality( String ),
  fld_app_module LowCardinality( String ),
  fld_website_name LowCardinality( String ),
 
  serverIP LowCardinality( String ),
  method LowCardinality( String ),
  uriStem String,
  uriQuery String,
  port UInt32,
  username LowCardinality( String ),
  clientIP String,
  clientRealIP String,
  userAgent String,
  referer String,
  response String,
  subresponse String,
  win32response String,
  timetaken UInt64
   
  , uriQuery__utm_medium String
  , uriQuery__utm_source String
  , uriQuery__utm_campaign String
  , uriQuery__utm_term String
  , uriQuery__utm_content String
  , uriQuery__yclid String
  , uriQuery__region String
 
) Engine = MergeTree()
PARTITION BY toYYYYMM(logdate)
ORDER BY (fld_app_name, fld_app_module, logdatetime)
SETTINGS index_granularity = 8192;

We use the default partitioning (by month) and the default index granularity. Practically all fields correspond to the IIS log entries for HTTP requests. Note separately that there are dedicated fields for storing utm tags (they are parsed out of the query string field at the stage of insertion into the table).
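As a rough illustration of how the utm fields are derived from the query string, here is a minimal Python sketch. In our scheme this work is done by the Logstash `kv` filter shown later; the function name `extract_query_fields` is hypothetical, and unlike `kv` this sketch also URL-decodes the values.

```python
from urllib.parse import parse_qs

# Keys that get their own column in log_web
UTM_KEYS = ["utm_medium", "utm_source", "utm_campaign", "utm_term",
            "utm_content", "yclid", "region"]


def extract_query_fields(uri_query: str, prefix: str = "uriQuery__") -> dict:
    """Return {prefix + key: value} for the whitelisted query-string keys."""
    parsed = parse_qs(uri_query, keep_blank_values=True)
    fields = {}
    for key in UTM_KEYS:
        values = parsed.get(key)
        if values:
            # Drop repeated values (keeping order) and join the rest with commas
            unique = list(dict.fromkeys(values))
            fields[prefix + key] = ",".join(unique)
    return fields


print(extract_query_fields("utm_source=yandex&utm_medium=cpc&id=5"))
```

Keys outside the whitelist (such as `id` in the example call) are simply ignored, so the table does not need a column for every possible parameter.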

In addition, several system fields have been added to the table to store information about systems, components and servers. See the table below for a description of these fields. In one table we store logs for several systems.

Name, description, example:

fld_app_name
    Description: Application/system name.
    Valid values:
      • site1.domain.com External site 1
      • site2.domain.com External site 2
      • internal-site1.domain.local Internal site 1
    Example: site1.domain.com

fld_app_module
    Description: System module.
    Valid values:
      • web - Website
      • svc - Website web service
      • intgr - Integration web service
      • bo - Administrator (BackOffice)
    Example: web

fld_website_name
    Description: Site name in IIS. Several systems can be deployed on one server, or even several instances of one system module.
    Example: web-main

fld_server_name
    Description: Server name.
    Example: web1.domain.com

fld_log_file_name
    Description: Path to the log file on the server.
    Example: C:\inetpub\logs\LogFiles\W3SVC1\u_ex190711.log

This makes it possible to build graphs in Grafana efficiently. For example, you can view requests from the frontend of a particular system. This is similar to a site counter in Yandex.Metrica.

Here are some statistics on the use of the database over two months.

Number of records broken down by systems and their components

SELECT
    fld_app_name,
    fld_app_module,
    count(fld_app_name) AS rows_count
FROM log_web
GROUP BY
    fld_app_name,
    fld_app_module
    WITH TOTALS
ORDER BY
    fld_app_name ASC,
    rows_count DESC
 
┌─fld_app_name─────┬─fld_app_module─┬─rows_count─┐
│ site1.domain.ru  │ web            │     131441 │
│ site2.domain.ru  │ web            │    1751081 │
│ site3.domain.ru  │ web            │  106887543 │
│ site3.domain.ru  │ svc            │   44908603 │
│ site3.domain.ru  │ intgr          │    9813911 │
│ site4.domain.ru  │ web            │     772095 │
│ site5.domain.ru  │ web            │   17037221 │
│ site5.domain.ru  │ intgr          │     838559 │
│ site5.domain.ru  │ bo             │       7404 │
│ site6.domain.ru  │ web            │     595877 │
│ site7.domain.ru  │ web            │   27778858 │
└──────────────────┴────────────────┴────────────┘
 
Totals:
┌─fld_app_name─┬─fld_app_module─┬─rows_count─┐
│              │                │  210522593 │
└──────────────┴────────────────┴────────────┘
 
11 rows in set. Elapsed: 4.874 sec. Processed 210.52 million rows, 421.67 MB (43.19 million rows/s., 86.51 MB/s.)

Amount of data on disk

SELECT
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    formatReadableSize(sum(data_compressed_bytes)) AS compressed,
    sum(rows) AS total_rows
FROM system.parts
WHERE table = 'log_web'
 
┌─uncompressed─┬─compressed─┬─total_rows─┐
│ 54.50 GiB    │ 4.86 GiB   │  211427094 │
└──────────────┴────────────┴────────────┘
 
1 rows in set. Elapsed: 0.035 sec.
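To put the figures above in perspective, the overall compression ratio and the average on-disk size of one log record can be worked out directly. The snippet only restates arithmetic on the numbers from the query output; the variable names are ours.

```python
# Values taken from the system.parts query output above
uncompressed_gib = 54.50
compressed_gib = 4.86
total_rows = 211_427_094

ratio = uncompressed_gib / compressed_gib
bytes_per_row = compressed_gib * 1024**3 / total_rows

print(f"compression ratio: {ratio:.1f}x")
print(f"compressed bytes per row: {bytes_per_row:.1f}")
```

That is, the data shrinks by a factor of about 11, and one IIS log record takes roughly 25 bytes on disk.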

Degree of data compression by column

SELECT
    name,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    formatReadableSize(data_compressed_bytes) AS compressed,
    data_uncompressed_bytes / data_compressed_bytes AS compress_ratio
FROM system.columns
WHERE table = 'log_web'
 
┌─name───────────────────┬─uncompressed─┬─compressed─┬─────compress_ratio─┐
│ logdate                │ 401.53 MiB   │ 1.80 MiB   │ 223.16665968777315 │
│ logdatetime            │ 803.06 MiB   │ 35.91 MiB  │ 22.363966401202305 │
│ fld_log_file_name      │ 220.66 MiB   │ 2.60 MiB   │  84.99905736932571 │
│ fld_server_name        │ 201.54 MiB   │ 50.63 MiB  │  3.980924816977078 │
│ fld_app_name           │ 201.17 MiB   │ 969.17 KiB │ 212.55518183686877 │
│ fld_app_module         │ 201.17 MiB   │ 968.60 KiB │ 212.67805817411906 │
│ fld_website_name       │ 201.54 MiB   │ 1.24 MiB   │  162.7204926761546 │
│ serverIP               │ 201.54 MiB   │ 50.25 MiB  │  4.010824061219731 │
│ method                 │ 201.53 MiB   │ 43.64 MiB  │  4.617721053304486 │
│ uriStem                │ 5.13 GiB     │ 832.51 MiB │  6.311522291936919 │
│ uriQuery               │ 2.58 GiB     │ 501.06 MiB │  5.269731450124478 │
│ port                   │ 803.06 MiB   │ 3.98 MiB   │ 201.91673864241824 │
│ username               │ 318.08 MiB   │ 26.93 MiB  │ 11.812513794583598 │
│ clientIP               │ 2.35 GiB     │ 82.59 MiB  │ 29.132328640073343 │
│ clientRealIP           │ 2.49 GiB     │ 465.05 MiB │  5.478382297052563 │
│ userAgent              │ 18.34 GiB    │ 764.08 MiB │  24.57905114484208 │
│ referer                │ 14.71 GiB    │ 1.37 GiB   │ 10.736792723669906 │
│ response               │ 803.06 MiB   │ 83.81 MiB  │  9.582334090987247 │
│ subresponse            │ 399.87 MiB   │ 1.83 MiB   │  218.4831068635027 │
│ win32response          │ 407.86 MiB   │ 7.41 MiB   │ 55.050315514606815 │
│ timetaken              │ 1.57 GiB     │ 402.06 MiB │ 3.9947395692010637 │
│ uriQuery__utm_medium   │ 208.17 MiB   │ 12.29 MiB  │ 16.936148912472955 │
│ uriQuery__utm_source   │ 215.18 MiB   │ 13.00 MiB  │ 16.548367623199912 │
│ uriQuery__utm_campaign │ 381.46 MiB   │ 37.94 MiB  │ 10.055156353418509 │
│ uriQuery__utm_term     │ 231.82 MiB   │ 10.78 MiB  │ 21.502540454070672 │
│ uriQuery__utm_content  │ 441.34 MiB   │ 87.60 MiB  │  5.038260760449327 │
│ uriQuery__yclid        │ 216.88 MiB   │ 16.58 MiB  │  13.07721335008116 │
│ uriQuery__region       │ 204.35 MiB   │ 9.49 MiB   │  21.52661903446796 │
└────────────────────────┴──────────────┴────────────┴────────────────────┘
 
28 rows in set. Elapsed: 0.005 sec.

Description of the components used

FileBeat. Transferring file logs

This component tracks changes to log files on disk and passes the information to LogStash. It is installed on all servers where log files are written (usually IIS). It works in tail mode, i.e. it transfers only the records appended to the file. Separately, it can be configured to transfer entire files. This is useful when you need to load data from previous months: just put the log file in a folder and it will be read in its entirety.

When the service is stopped, data is no longer transferred further to the storage.
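The difference between tail mode and reading a whole file comes down to the offset at which reading starts. A minimal sketch of the idea follows; it is not how FileBeat is implemented, and the helper name `read_new_lines` is hypothetical.

```python
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return the complete lines appended after `offset` and the new offset."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    # Consume only up to the last newline so a half-written line waits for the next pass
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Starting with `offset` equal to the current file size corresponds to tail mode, while starting with `offset = 0` corresponds to importing the whole file.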

An example configuration looks like this:

filebeat.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/W3SVC1/*.log
  exclude_files: ['.gz$','.zip$']
  tail_files: true
  ignore_older: 24h
  fields:
    fld_server_name: "site1.domain.ru"
    fld_app_name: "site1.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"
 
- type: log
  enabled: true
  paths:
    - C:/inetpub/logs/LogFiles/__Import/access_log-*
  exclude_files: ['.gz$','.zip$']
  tail_files: false
  fields:
    fld_server_name: "site2.domain.ru"
    fld_app_name: "site2.domain.ru"
    fld_app_module: "web"
    fld_website_name: "web-main"
    fld_logformat: "logformat__apache"
 
 
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
  reload.period: 2s
 
output.logstash:
  hosts: ["log.domain.com:5044"]
 
  ssl.enabled: true
  ssl.certificate_authorities: ["C:/filebeat/certs/ca.pem", "C:/filebeat/certs/ca-issuing.pem"]
  ssl.certificate: "C:/filebeat/certs/site1.domain.ru.cer"
  ssl.key: "C:/filebeat/certs/site1.domain.ru.key"
 
#================================ Processors =====================================
 
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

LogStash. Log collector

This component is designed to receive log entries from FileBeat (or through the RabbitMQ queue), parse them and insert them in batches into the ClickHouse database.

For insertion into ClickHouse, the Logstash-output-clickhouse plugin is used. The Logstash plugin has a request retry mechanism, but for a planned shutdown it is better to stop the service itself. While it is stopped, messages will accumulate in the RabbitMQ queue, so if the stop is a long one, it is better to stop the Filebeats on the servers as well. In a scheme where RabbitMQ is not used (on the local network, Filebeat sends logs directly to Logstash), the Filebeats behave quite acceptably and safely, so for them the unavailability of the output passes without consequences.

An example configuration looks like this:

log_web__filebeat_clickhouse.conf

input {
 
    beats {
        port => 5044
        type => 'iis'
        ssl => true
        ssl_certificate_authorities => ["/etc/logstash/certs/ca.cer", "/etc/logstash/certs/ca-issuing.cer"]
        ssl_certificate => "/etc/logstash/certs/server.cer"
        ssl_key => "/etc/logstash/certs/server-pkcs8.key"
        ssl_verify_mode => "peer"
 
            add_field => {
                "fld_server_name" => "%{[fields][fld_server_name]}"
                "fld_app_name" => "%{[fields][fld_app_name]}"
                "fld_app_module" => "%{[fields][fld_app_module]}"
                "fld_website_name" => "%{[fields][fld_website_name]}"
                "fld_log_file_name" => "%{source}"
                "fld_logformat" => "%{[fields][fld_logformat]}"
            }
    }
 
    rabbitmq {
        host => "queue.domain.com"
        port => 5671
        user => "q-reader"
        password => "password"
        queue => "web_log"
        heartbeat => 30
        durable => true
        ssl => true
        #ssl_certificate_path => "/etc/logstash/certs/server.p12"
        #ssl_certificate_password => "password"
 
        add_field => {
            "fld_server_name" => "%{[fields][fld_server_name]}"
            "fld_app_name" => "%{[fields][fld_app_name]}"
            "fld_app_module" => "%{[fields][fld_app_module]}"
            "fld_website_name" => "%{[fields][fld_website_name]}"
            "fld_log_file_name" => "%{source}"
            "fld_logformat" => "%{[fields][fld_logformat]}"
        }
    }
 
}
 
filter { 
 
      if [message] =~ "^#" {
        drop {}
      }
 
      if [fld_logformat] == "logformat__iis_with_xrealip" {
     
          grok {
            match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken} %{NOTSPACE:xrealIP} %{NOTSPACE:xforwarderfor}"]
          }
      } else {
   
          grok {
             match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{IP:serverIP} %{WORD:method} %{NOTSPACE:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:timetaken}"]
          }
 
      }
 
      date {
        match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
          timezone => "Etc/UTC"
        remove_field => [ "log_timestamp", "@timestamp" ]
        target => [ "log_timestamp2" ]
      }
 
        ruby {
            code => "tstamp = event.get('log_timestamp2').to_i
                        event.set('logdatetime', Time.at(tstamp).strftime('%Y-%m-%d %H:%M:%S'))
                        event.set('logdate', Time.at(tstamp).strftime('%Y-%m-%d'))"
        }
 
      if [bytesSent] {
        ruby {
          code => "event.set('kilobytesSent', event.get('bytesSent').to_i / 1024.0)"
        }
      }
 
 
      if [bytesReceived] {
        ruby {
          code => "event.set('kilobytesReceived', event.get('bytesReceived').to_i / 1024.0)"
        }
      }
 
   
        ruby {
            code => "event.set('clientRealIP', event.get('clientIP'))"
        }
        if [xrealIP] {
            ruby {
                code => "event.set('clientRealIP', event.get('xrealIP'))"
            }
        }
        if [xforwarderfor] {
            ruby {
                code => "event.set('clientRealIP', event.get('xforwarderfor'))"
            }
        }
 
      mutate {
        convert => ["bytesSent", "integer"]
        convert => ["bytesReceived", "integer"]
        convert => ["timetaken", "integer"] 
        convert => ["port", "integer"]
 
        add_field => {
            "clientHostname" => "%{clientIP}"
        }
      }
 
        useragent {
            source => "userAgent"
            prefix => "browser"
        }
 
        kv {
            source => "uriQuery"
            prefix => "uriQuery__"
            allow_duplicate_values => false
            field_split => "&"
            include_keys => [ "utm_medium", "utm_source", "utm_campaign", "utm_term", "utm_content", "yclid", "region" ]
        }
 
        mutate {
            join => { "uriQuery__utm_source" => "," }
            join => { "uriQuery__utm_medium" => "," }
            join => { "uriQuery__utm_campaign" => "," }
            join => { "uriQuery__utm_term" => "," }
            join => { "uriQuery__utm_content" => "," }
            join => { "uriQuery__yclid" => "," }
            join => { "uriQuery__region" => "," }
        }
 
}
 
output { 
  #stdout {codec => rubydebug}
    clickhouse {
      headers => ["Authorization", "Basic abcdsfks..."]
      http_hosts => ["http://127.0.0.1:8123"]
      save_dir => "/etc/logstash/tmp"
      table => "log_web"
      request_tolerance => 1
      flush_size => 10000
      idle_flush_time => 1
        mutations => {
            "fld_log_file_name" => "fld_log_file_name"
            "fld_server_name" => "fld_server_name"
            "fld_app_name" => "fld_app_name"
            "fld_app_module" => "fld_app_module"
            "fld_website_name" => "fld_website_name"
 
            "logdatetime" => "logdatetime"
            "logdate" => "logdate"
            "serverIP" => "serverIP"
            "method" => "method"
            "uriStem" => "uriStem"
            "uriQuery" => "uriQuery"
            "port" => "port"
            "username" => "username"
            "clientIP" => "clientIP"
            "clientRealIP" => "clientRealIP"
            "userAgent" => "userAgent"
            "referer" => "referer"
            "response" => "response"
            "subresponse" => "subresponse"
            "win32response" => "win32response"
            "timetaken" => "timetaken"
             
            "uriQuery__utm_medium" => "uriQuery__utm_medium"
            "uriQuery__utm_source" => "uriQuery__utm_source"
            "uriQuery__utm_campaign" => "uriQuery__utm_campaign"
            "uriQuery__utm_term" => "uriQuery__utm_term"
            "uriQuery__utm_content" => "uriQuery__utm_content"
            "uriQuery__yclid" => "uriQuery__yclid"
            "uriQuery__region" => "uriQuery__region"
        }
    }
 
}
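To make the filter section easier to follow, here is a small Python sketch of the same parsing steps applied to one IIS log line: skip comment lines, split the fields, derive `logdate` and `logdatetime`, and pick `clientRealIP`. It uses plain string splitting in place of the grok pattern, the function name `parse_iis_line` is hypothetical, and ignoring a "-" placeholder in the extra headers is our assumption rather than something the configuration above does.

```python
from typing import Optional

# Field order after the date and time, matching the grok pattern above
FIELDS = ["serverIP", "method", "uriStem", "uriQuery", "port", "username",
          "clientIP", "userAgent", "referer", "response", "subresponse",
          "win32response", "timetaken"]


def parse_iis_line(line: str) -> Optional[dict]:
    """Parse one W3C-format IIS log line into a dict, or return None."""
    if line.startswith("#"):          # header lines such as "#Fields: ..."
        return None
    parts = line.split()
    if len(parts) < 2 + len(FIELDS):  # not enough fields: cannot be parsed
        return None
    row = dict(zip(FIELDS, parts[2:]))
    row["logdate"] = parts[0]
    row["logdatetime"] = f"{parts[0]} {parts[1]}"
    for name in ("port", "timetaken"):
        row[name] = int(row[name])
    # Optional trailing X-Real-IP and X-Forwarded-For values override clientIP
    row["clientRealIP"] = row["clientIP"]
    for candidate in parts[2 + len(FIELDS):][:2]:
        if candidate != "-":          # assumption: "-" means the header was absent
            row["clientRealIP"] = candidate
    return row
```

A line with fewer fields than expected returns `None` here, which roughly corresponds to a grok parse failure in Logstash.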

pipelines.yml

# This file is where you define your pipelines. You can define multiple.
# For more information on multiple pipelines, see the documentation:
#   https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html
 
- pipeline.id: log_web__filebeat_clickhouse
  path.config: "/etc/logstash/log_web__filebeat_clickhouse.conf"

ClickHouse. Log storage

Logs for all systems are stored in one table (see the beginning of the article). It is intended to store information about requests: all parameters are similar for the different formats, such as IIS, apache and nginx logs. For application logs, which record, for example, errors, informational messages and warnings, a separate table with an appropriate structure will be provided (currently at the design stage).

When designing a table, it is very important to decide on the primary key (by which the data will be sorted during storage). The degree of data compression and the query speed depend on it. In our example, the key is
ORDER BY (fld_app_name, fld_app_module, logdatetime)
That is, by the name of the system, the name of the system component and the date of the event. Initially, the date of the event came first. After it was moved to the last place, queries began to run about twice as fast. Changing the primary key requires recreating the table and reloading the data so that ClickHouse re-sorts the data on disk. This is a heavy operation, so it is worth thinking carefully in advance about what should go into the sort key.
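Why the order of the key affects compression can be shown with a toy experiment. The sketch below uses synthetic data and zlib from the Python standard library as a stand-in for ClickHouse's codecs, so the absolute numbers mean nothing; it only illustrates that a column compresses better when equal values are stored next to each other.

```python
import random
import zlib

random.seed(0)
apps = [f"site{i}.domain.ru" for i in range(1, 8)]
# (app name, event time) pairs arriving interleaved, as in a shared log table
rows = [(random.choice(apps), t) for t in range(20000)]


def app_column_size(ordered_rows) -> int:
    """Compressed size of the app-name column stored in the given row order."""
    column = "\n".join(app for app, _ in ordered_rows).encode()
    return len(zlib.compress(column))


by_time = app_column_size(sorted(rows, key=lambda r: r[1]))
by_app_then_time = app_column_size(sorted(rows, key=lambda r: (r[0], r[1])))
print("sorted by time:", by_time, "bytes")
print("sorted by app, then time:", by_app_then_time, "bytes")
```

With the system name first in the key, the column turns into a few long runs of identical values, which is the situation compression handles best.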

It should also be noted that the LowCardinality data type appeared in relatively recent versions. When it is used, the size of the compressed data shrinks dramatically for fields that have low cardinality (few distinct values).
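The general idea behind such a type is dictionary encoding: each distinct string is stored once, and the column itself holds small integer codes. The sketch below shows only this general technique; it makes no claim about ClickHouse's actual on-disk layout, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each string with the index of its entry in a dictionary."""
    dictionary: List[str] = []
    positions: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        codes.append(positions[value])
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Restore the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


methods = ["GET", "POST", "GET", "GET", "HEAD"]
dictionary, codes = dictionary_encode(methods)
print(dictionary, codes)
```

For a column such as `method` or `fld_app_name`, the dictionary stays tiny no matter how many rows the table has.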

Version 19.6 is currently in use, and we plan to try updating to the latest version. Newer versions have such wonderful features as Adaptive Granularity, Skipping indices and the DoubleDelta codec, for example.

By default, the logging level is set to trace during installation. Logs are rotated and archived, but they still grow to as much as a gigabyte. If there is no need for this, you can set the level to warning, and the log size shrinks sharply. Logging is configured in the config.xml file:

<!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>warning</level>

Some useful commands

Since the original installation packages are built for Debian, for other Linux distributions you need to use the packages built by Altinity.
 
Instructions with links to their repository are available here: https://www.altinity.com/blog/2017/12/18/logstash-with-clickhouse
sudo yum search clickhouse-server
sudo yum install clickhouse-server.noarch
  
1. check the status
sudo systemctl status clickhouse-server
 
2. stop the server
sudo systemctl stop clickhouse-server
 
3. start the server
sudo systemctl start clickhouse-server
 
Start the client in multi-line mode (a query is executed after the ";" character)
clickhouse-client --multiline
clickhouse-client --multiline --host 127.0.0.1 --password pa55w0rd
clickhouse-client --multiline --host 127.0.0.1 --port 9440 --secure --user default --password pa55w0rd
 
If a single row fails, the ClickHouse plugin for Logstash saves the whole batch to the file /tmp/log_web_failed.json
You can fix this file by hand and try to load it into the database manually:
clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /tmp/log_web_failed__fixed.json
 
sudo mv /etc/logstash/tmp/log_web_failed.json /etc/logstash/tmp/log_web_failed__fixed.json
sudo chown user_dev /etc/logstash/tmp/log_web_failed__fixed.json
sudo clickhouse-client --host 127.0.0.1 --password password --query="INSERT INTO log_web FORMAT JSONEachRow" < /etc/logstash/tmp/log_web_failed__fixed.json
sudo mv /etc/logstash/tmp/log_web_failed__fixed.json /etc/logstash/tmp/log_web_failed__fixed_.json
 
exit the command line
quit;
## TLS setup
https://www.altinity.com/blog/2019/3/5/clickhouse-networking-part-2
 
openssl s_client -connect log.domain.com:9440 < /dev/null
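Before reloading a failed batch with the INSERT ... FORMAT JSONEachRow command shown above, it can help to find out which lines of the file are broken. Here is a minimal sketch of such a check; the helper name `find_bad_rows` is hypothetical, and it verifies only that every line is a JSON object, not that the columns match the table.

```python
import json
from typing import List, Tuple


def find_bad_rows(path: str) -> List[Tuple[int, str]]:
    """Return (line number, reason) for every line that is not a JSON object."""
    bad: List[Tuple[int, str]] = []
    with open(path, encoding="utf-8") as fh:
        for number, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # blank lines are skipped by this check
            try:
                row = json.loads(line)
            except json.JSONDecodeError as exc:
                bad.append((number, str(exc)))
                continue
            if not isinstance(row, dict):
                bad.append((number, "row is not a JSON object"))
    return bad
```

For example, `find_bad_rows("/etc/logstash/tmp/log_web_failed__fixed.json")` returns an empty list when every line parses.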

LogStash. Log router from FileBeat to the RabbitMQ queue

This component is used to route logs coming from FileBeat to the RabbitMQ queue. There are two points here:

  1. Unfortunately, FileBeat has no output plugin for writing directly to RabbitMQ. And judging by the issue on their GitHub, such functionality is not planned. There is a plugin for Kafka, but for certain reasons we cannot use it in-house.
  2. There are requirements for collecting logs in the DMZ. Based on them, the logs must first be put into the queue, and then LogStash reads the records from the queue from outside.

Therefore, it is precisely for the case where servers are located in the DMZ that this slightly more complicated scheme has to be used. An example configuration looks like this:

iis_w3c_logs__filebeat_rabbitmq.conf

input {
 
    beats {
        port => 5044
        type => 'iis'
        ssl => true
        ssl_certificate_authorities => ["/etc/pki/tls/certs/app/ca.pem", "/etc/pki/tls/certs/app/ca-issuing.pem"]
        ssl_certificate => "/etc/pki/tls/certs/app/queue.domain.com.cer"
        ssl_key => "/etc/pki/tls/certs/app/queue.domain.com-pkcs8.key"
        ssl_verify_mode => "peer"
    }
 
}
 
output { 
  #stdout {codec => rubydebug}
 
    rabbitmq {
        host => "127.0.0.1"
        port => 5672
        exchange => "monitor.direct"
        exchange_type => "direct"
        key => "%{[fields][fld_app_name]}"
        user => "q-writer"
        password => "password"
        ssl => false
    }
}

RabbitMQ. Message queue

This component is used to buffer log entries in the DMZ. Writing is done through the Filebeat → LogStash pair. Reading is done from outside the DMZ via LogStash. When operating through RabbitMQ, about 4 thousand messages per second are processed.

Message routing is configured by system name, i.e. based on the FileBeat configuration data. All messages go to one queue. If the queue service is stopped for some reason, this does not lead to the loss of messages: the FileBeats receive connection errors and temporarily suspend sending. And the LogStash that reads from the queue also receives network errors and waits for the connection to be restored. In this case the data, of course, is no longer written to the database.

The following commands are used to create and configure the queues:

sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare exchange --vhost=/ name=monitor.direct type=direct
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin declare queue --vhost=/ name=web_log durable=true
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site1.domain.ru"
sudo /usr/local/bin/rabbitmqadmin/rabbitmqadmin --vhost="/" declare binding source="monitor.direct" destination_type="queue" destination="web_log" routing_key="site2.domain.ru"
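The effect of these bindings can be pictured with a toy model of a direct exchange: a message is delivered to every queue bound with exactly the same routing key. The `DirectExchange` class below is purely illustrative and does not talk to RabbitMQ; only the exchange, queue and key names come from the commands above.

```python
from collections import defaultdict
from typing import Dict, List


class DirectExchange:
    """Toy model: delivers a message to queues bound with a matching key."""

    def __init__(self) -> None:
        self.bindings: Dict[str, List[str]] = defaultdict(list)
        self.queues: Dict[str, List[dict]] = defaultdict(list)

    def bind(self, queue: str, routing_key: str) -> None:
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key: str, message: dict) -> int:
        """Deliver the message and return the number of queues that got it."""
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            self.queues[queue].append(message)
        return len(targets)


monitor_direct = DirectExchange()
monitor_direct.bind("web_log", "site1.domain.ru")
monitor_direct.bind("web_log", "site2.domain.ru")
delivered = monitor_direct.publish("site1.domain.ru", {"message": "GET /"})
print(delivered, len(monitor_direct.queues["web_log"]))
```

In this model a message whose key has no binding reaches no queue, which is why each new `fld_app_name` value needs its own binding to `web_log`.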

Grafana. Dashboards

This component is used to visualize monitoring data. For this you need to install the ClickHouse datasource for Grafana 4.6+ plugin. We had to tweak it a little to make the processing of SQL filters on the dashboard more efficient.

For example, we use variables, and if one of them is not set in the filter field, we would like the plugin not to generate a WHERE condition of the form ( uriStem = '' AND uriStem != '' ). In that case ClickHouse would still read the uriStem column. We tried different options and eventually patched the plugin (the $valueIfEmpty macro) so that for an empty value it returns 1, without mentioning the column itself.

And now you can use this query for the graph

$columns(response, count(*) c) from $table where $adhoc
and $valueIfEmpty($fld_app_name, 1, fld_app_name = '$fld_app_name')
and $valueIfEmpty($fld_app_module, 1, fld_app_module = '$fld_app_module')
and $valueIfEmpty($fld_server_name, 1, fld_server_name = '$fld_server_name')
and $valueIfEmpty($uriStem, 1, uriStem like '%$uriStem%')
and $valueIfEmpty($clientRealIP, 1, clientRealIP = '$clientRealIP')

which translates to this SQL (note that the empty fields, such as uriStem, have been converted to just 1)

SELECT
    t,
    groupArray((response, c)) AS groupArr
FROM (
    SELECT
        (intDiv(toUInt32(logdatetime), 60) * 60) * 1000 AS t,
        response,
        count(*) AS c
    FROM default.log_web
    WHERE (logdate >= toDate(1565061982)) AND (logdatetime >= toDateTime(1565061982)) AND 1 AND (fld_app_name = 'site1.domain.ru') AND (fld_app_module = 'web') AND 1 AND 1 AND 1
    GROUP BY
        t, response
    ORDER BY
        t ASC,
        response ASC
)
GROUP BY t ORDER BY t ASC
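The behaviour we wanted from $valueIfEmpty can be sketched in a few lines of Python. This is not the plugin's code (the plugin is not written in Python), the function names are hypothetical, and the sketch does no SQL escaping, so it is an illustration only.

```python
def value_if_empty(value: str, default: str, expression: str) -> str:
    """Return `default` for an empty variable, otherwise the real condition."""
    return default if value == "" else expression


def build_where(filters: dict) -> str:
    """Join one condition per dashboard variable with AND."""
    parts = []
    for column, value in filters.items():
        condition = f"({column} = '{value}')"  # no escaping: illustration only
        parts.append(value_if_empty(value, "1", condition))
    return " AND ".join(parts)


print(build_where({
    "fld_app_name": "site1.domain.ru",
    "fld_app_module": "web",
    "fld_server_name": "",
}))
```

Because an unset variable contributes only the constant 1, the column is never referenced and ClickHouse has no reason to read it.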

Conclusion

The appearance of the ClickHouse database was a landmark event on the market. It was hard to imagine that, completely free of charge, we would in an instant be armed with a powerful and practical tool for working with big data. Of course, as requirements grow (for example, sharding and replication across several servers), the scheme will become more complicated. But on first impressions, working with this database is very pleasant. You can tell that the product is made "for people".

Compared with ElasticSearch, the cost of storing and processing logs is estimated to drop by a factor of five to ten. In other words, if for the current amount of data we would have had to set up a cluster of several machines, with ClickHouse a single low-power machine is enough for us. Yes, ElasticSearch also has on-disk data compression mechanisms and other features that can significantly reduce resource consumption, but compared with ClickHouse it will be more expensive.

Without any special optimization on our part, with default settings, loading data and selecting from the database works at amazing speed. We do not have much data yet (about 200 million records), but the server itself is weak. In the future we may use this tool for other purposes not related to storing logs, for example for end-to-end analytics, in the field of security, or for machine learning.

Finally, a little about the pros and cons.

Cons

  1. Loading records in large batches. On the one hand this is a feature, but you still have to use additional components to buffer records. This task is not always simple, but it is solvable. And we would like to simplify the scheme.
  2. Some exotic functionality or new features often break in new versions. This causes concern and reduces the desire to upgrade to a new version. For example, the Kafka table engine is a very useful feature that lets you read events directly from Kafka without implementing consumers. But judging by the number of issues on GitHub, we are still wary of using this engine in production. However, if you avoid sudden moves to the side and stick to the main functionality, it works stably.

Pros

  1. It does not slow down.
  2. Low entry threshold.
  3. Open source.
  4. Free.
  5. Scales well (sharding/replication out of the box).
  6. Included in the register of Russian software recommended by the Ministry of Communications.
  7. Official support from Yandex is available.

Source: www.habr.com
