Tricks for processing metrics in Kapacitor

These days almost nobody asks why you should collect service metrics. The logical next step is to set up alerting on the collected metrics, so that deviations in the data are reported to a channel convenient for you (mail, Slack, Telegram). At the online booking service Ostrovok.ru, all metrics from our services are written to InfluxDB and displayed in Grafana, and basic alerting is configured there as well. For tasks like "compute something and compare against it", we use Kapacitor.

Kapacitor is the part of the TICK stack that can process metrics from InfluxDB. It can join several measurements together (join), compute something useful from the data it receives, write the result back to InfluxDB, and send an alert to Slack/Telegram/mail.

The whole stack has good, detailed documentation, but there are always useful things that the manuals do not spell out. In this article I decided to collect a number of such useful, non-obvious tips (the basic syntax of TICKscript is described elsewhere) and show how they can be applied, using the solution to one of our problems as an example.

Let's go!

float & int, calculation errors

A standard problem, solved with casts:

var alert_float = 5.0
var alert_int = 10
data|eval(lambda: float("value") > alert_float OR float("value") < float(alert_int))

Using default()

If a tag/field is not filled in, calculation errors will occur:

|default()
        .tag('status', 'empty')
        .field('value', 0)

fill in join (inner vs outer)

By default, join discards points for which there is no data (inner).
With fill('null'), an outer join is performed, after which you need to apply default() and fill in the empty values:

var data = res1
    |join(res2)
        .as('res1', 'res2')
        .fill('null')
    |default()
        .field('res1.value', 0.0)
        .field('res2.value', 100.0)

There is still a nuance here. In the example above, if one of the series (res1 or res2) is empty, the resulting series (data) will also be empty. There are several issues on this topic on GitHub (1633, 1871, 6967); we are waiting for fixes and suffering a little.

Using conditions in calculations (if in lambda)

|eval(lambda: if("value" > 0, TRUE, FALSE))

The last five minutes out of the pipeline's period

For example, you need to compare the values of the last five minutes with the previous week. You can fetch the two sets of data in two separate batches, or extract part of the data from the larger period:

 |where(lambda: duration((unixNano(now()) - unixNano("time"))/1000, 1u) < 5m)
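To make the arithmetic in that filter explicit, here is a small Python sketch of the same check. It is only an illustration under assumptions: the function name is_recent and its parameters are hypothetical, and Kapacitor evaluates the lambda itself. unixNano returns nanoseconds, so dividing by 1000 gives microseconds, which duration(..., 1u) then treats as a duration to compare with 5m.

```python
NANOS_PER_MICRO = 1000
FIVE_MINUTES_US = 5 * 60 * 1_000_000  # 5m expressed in microseconds


def is_recent(now_ns, point_ns, window_us=FIVE_MINUTES_US):
    """Mirror of the lambda: keep a point only if it is younger than the window.

    now_ns and point_ns are Unix timestamps in nanoseconds (hypothetical inputs).
    """
    age_us = (now_ns - point_ns) // NANOS_PER_MICRO  # nanoseconds -> microseconds
    return age_us < window_us
```

A point exactly five minutes old is dropped, because the comparison is strict.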

Another way to get the last five minutes is to use a BarrierNode, which cuts off data older than the specified time:

|barrier()
        .period(5m)

Examples of using Go templates in messages

Templates follow the format of the Go text/template package. Below are some frequently encountered tasks.

if-else

Tidy things up so that the text does not alarm people unnecessarily:

|alert()
    ...
    .message(
        '{{ if eq .Level "OK" }}It is ok now{{ else }}Chief, everything is broken{{end}}'
    )

Two decimal places in the message

Improving the readability of the message:

|alert()
    ...
    .message(
        'now value is {{ index .Fields "value" | printf "%0.2f" }}'
    )

Expanding variables in the message

We put more information into the message to answer the question "Why is it yelling?":

var warnAlert = 10
  |alert()
    ...
    .message(
       'Today value less than '+string(warnAlert)+'%'
    )

Unique alert identifier

This is needed when there is more than one group in the data; otherwise only one alert will be generated:

|alert()
      ...
      .id('{{ index .Tags "myname" }}/{{ index .Tags "myfield" }}')

Custom handlers

The long list of handlers includes exec, which lets you run your own script with the parameters passed in (stdin): creativity and nothing else!

One of our custom handlers is a small Python script for sending notifications to Slack.
At first we wanted to send an image of a graph from an authorization-protected Grafana in the message. Then we wanted to write OK in the thread of the previous alert from the same group instead of as a separate message. A little later we wanted to add to the message the most frequent error over the last X minutes.

A separate topic is communication with other services and any actions triggered by an alert (only if your monitoring works well enough).
An example handler definition, where slack_handler.py is our self-written script:

topic: slack_graph
id: slack_graph.alert
match: level() != INFO AND changed() == TRUE
kind: exec
options:
  prog: /sbin/slack_handler.py
  args: ["-c", "CHANNELID", "--graph", "--search"]
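For orientation, here is a minimal Python sketch of what such an exec handler could look like. It is not the actual slack_handler.py: the function names are hypothetical, and it assumes that Kapacitor passes the alert to the program as a JSON document on stdin containing id, message and level fields. The command-line flags match the args in the handler definition above; what --graph and --search do is left out.

```python
import argparse
import json
import sys


def build_parser():
    """Parse the flags used in the handler definition (-c, --graph, --search)."""
    parser = argparse.ArgumentParser(description="Hypothetical Slack alert handler")
    parser.add_argument("-c", "--channel", required=True, help="Slack channel id")
    parser.add_argument("--graph", action="store_true", help="attach a graph (not implemented here)")
    parser.add_argument("--search", action="store_true", help="look up the previous alert (not implemented here)")
    return parser


def format_alert(raw, channel):
    """Turn the alert JSON (assumed fields: id, message, level) into a message payload."""
    alert = json.loads(raw)
    level = alert.get("level", "UNKNOWN")
    text = f"[{level}] {alert.get('id', '')}: {alert.get('message', '')}"
    return {"channel": channel, "text": text}


def main(argv=None, stdin=None):
    """Read one alert from stdin and print the payload instead of calling Slack."""
    args = build_parser().parse_args(argv)
    stream = stdin if stdin is not None else sys.stdin
    print(json.dumps(format_alert(stream.read(), args.channel)))
    return 0
```

To use it as a script, call main() under the usual __main__ guard and replace the print with a real call to the Slack API.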

How to debug?

Option with log output

|log()
      .level('ERROR')
      .prefix('something')

View (CLI): kapacitor -url host-or-ip:9092 logs lvl=error

Option with httpOut

Shows the data in the current pipeline:

|httpOut('something')

View (GET): host-or-ip:9092/kapacitor/v1/tasks/task_name/something

Execution diagram

  • Each task returns an execution tree with useful numbers in graphviz format.
  • Take the dot block.
  • Paste it into a viewer and enjoy.
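The steps above can be sketched in Python. This is an assumption-laden illustration: it supposes that the task description returned by the Kapacitor HTTP API is a JSON document that carries the graph under a "dot" key, and the function name extract_dot is hypothetical.

```python
import json


def extract_dot(task_json):
    """Return the graphviz DOT text from a task description, or "" if it is absent.

    task_json is the raw JSON text of one task (assumed to contain a "dot" key).
    """
    task = json.loads(task_json)
    return task.get("dot", "")
```

The returned text can then be pasted into any graphviz viewer.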

Where else can you trip up?

timestamp in influxdb on writeback

For example, we set up an alert on the number of requests per hour (groupBy(1h)) and want to write the alert that fired to influxdb (to nicely show the fact of the problem on a graph in grafana).

influxDBOut() writes the time value from the alert as the timestamp; accordingly, the point on the graph will be recorded earlier or later than the alert actually arrived.

When accuracy is required, we work around this problem by calling a custom handler that writes the data to influxdb with the current timestamp.
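As a rough sketch of that workaround, the handler could build an InfluxDB line-protocol point stamped with the current time. Everything here is hypothetical (the function names, the single value field, the tag set); it only shows the idea of taking the timestamp from the clock rather than from the alert, and it assumes a measurement name without spaces or commas.

```python
import time


def escape_tag(value):
    """Escape commas, equals signs and spaces, as line protocol requires for tags."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def alert_point(measurement, tags, value, now_ns=None):
    """Build one line-protocol point with a nanosecond timestamp taken from the clock."""
    if now_ns is None:
        now_ns = time.time_ns()  # current time instead of the alert's own time
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(val)}" for key, val in sorted(tags.items())
    )
    return f"{measurement}{tag_part} value={float(value)} {now_ns}"
```

The resulting line can be sent to the InfluxDB write endpoint by whatever client the handler already uses.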

docker, build and deployment

At startup, kapacitor can load tasks, templates and handlers from the directory specified in the [load] block of the config.

To create a task correctly, you need the following:

  1. File name: it is expanded into the script id/name
  2. Type: stream/batch
  3. dbrp: the keyword that indicates which database + retention policy the script runs against (dbrp "supplier"."autogen")

If some task has no dbrp line, the whole service will refuse to start and will honestly say so in the log.

In chronograf, on the contrary, this line must not be present; the interface does not accept it and reports an error.

Hack for building the container: the Dockerfile exits with -1 if there are lines matching //.+dbrp, which lets you see the reason for the failure right away at build time.
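The check behind that hack can be sketched in Python. This is an illustration, not the authors' Dockerfile step: the function names and the *.tick glob are assumptions, and only the //.+dbrp pattern comes from the text.

```python
import re
from pathlib import Path

# Pattern from the text: a comment marker followed later on the line by "dbrp".
COMMENTED_DBRP = re.compile(r"//.+dbrp")


def find_commented_dbrp(script_text):
    """Return the 1-based numbers of lines where dbrp appears inside a comment."""
    return [
        number
        for number, line in enumerate(script_text.splitlines(), start=1)
        if COMMENTED_DBRP.search(line)
    ]


def check_directory(directory):
    """Map each *.tick file name (assumed extension) to its offending line numbers."""
    offenders = {}
    for path in sorted(Path(directory).glob("*.tick")):
        hits = find_commented_dbrp(path.read_text(encoding="utf-8"))
        if hits:
            offenders[path.name] = hits
    return offenders
```

A build step could call check_directory on the load directory and exit with a non-zero status if the result is not empty.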

join one to many

Example task: you need to take the 95th percentile of the service response time over a week and compare each minute of the last 10 with this value.

You cannot do a one-to-many join: last/mean/median over a group of points turns the node into a stream, and the error "cannot add child mismatched edges: batch -> stream" is returned.

The result of a batch is also not substituted as a variable in a lambda expression.

There is an option to save the required numbers from the first batch to a file via a udf and load that file via sideload.

What did we solve with this?

We have about 100 hotel suppliers, and each of them can have several connections; let's call such a connection a channel. There are about 300 of these channels, and each of them can go down. Of all the recorded metrics, we will monitor the error rate (requests and errors).

Why not grafana?

Error alerts configured in Grafana have several drawbacks. Some are critical, and some you can turn a blind eye to, depending on the situation.

Grafana cannot do calculations between measurements combined with alerting, but we need the rate (requests-errors)/requests.
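For clarity, the rate in question can be written as a small Python function. The name success_rate is hypothetical, and the guard against zero requests is an assumption added for the sketch; the formula itself is the one from the text, expressed as a percentage.

```python
def success_rate(requests, errors):
    """Share of successful requests in percent: 100 * (requests - errors) / requests.

    Returns 100.0 when there were no errors; the zero-requests guard is an
    assumption of this sketch, added to avoid dividing by zero.
    """
    if errors > 0 and requests > 0:
        return 100.0 * (requests - errors) / requests
    return 100.0
```

For instance, 200 requests with 10 errors give a rate of 95 percent.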

Plotted on their own, the errors look alarming, and much less so when shown alongside the successful requests.

Okay, we could precompute the rate in the service before Grafana, and in some cases that would work. But not in ours, because for each channel its own ratio is considered "normal", while alerts work on static values (we pick them by eye and change them if alerts fire too often).

Different channels have different "normal" levels.

Suppose we ignore the previous point and assume that the "normal" picture is the same for all suppliers. Is everything fine now, and can we get by with alerts in grafana?
We can, but we really do not want to, because we would have to choose one of the options:
a) make many graphs, one for each channel (and painfully maintain them);
b) leave one graph with all channels (and get lost in colored lines and customized alerts).


How do we do it?

Again, there is a good starting example in the documentation (Calculating rates across joined series) that you can peek at or take as a basis for similar problems.

What we did in the end:

  • join two series over several hours, grouping by channel;
  • fill in the series by group if there was no data;
  • compare the median of the last 10 minutes with the previous data;
  • shout if we find something;
  • write the computed rates and the fired alerts to influxdb;
  • send a useful message to slack.
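The comparison step in the list above can be sketched in Python for a single channel. This is only an illustration under assumptions: should_warn and its parameters are hypothetical, the input is a list of per-minute rate values in chronological order, and the default threshold of 15 is an example value.

```python
from statistics import median


def should_warn(rates, today_points=10, warn_alert=15.0):
    """Compare the median of the last points with the median of the whole period.

    rates: per-minute rate values for one channel, oldest first (hypothetical input).
    Returns True when the recent median dropped by more than warn_alert.
    """
    if not rates:
        return False  # nothing to compare
    today = median(rates[-today_points:])  # the last 10 minutes
    prev = median(rates)                   # the whole period
    return (prev - today) > warn_alert
```

A median is used rather than a mean so that a single bad minute does not trigger the alert on its own.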

In my opinion, we managed to do everything we wanted in the end (and even a bit more with custom handlers) as nicely as possible.

You can look at the example code and a minimal diagram (graphviz) of the resulting script on github.com.

An example of the resulting code:

dbrp "supplier"."autogen"
var name = 'requests.rate'
var grafana_dash = 'pczpmYZWU/mydashboard'
var grafana_panel = '26'
var period = 8h
var todayPeriod = 10m
var every = 1m
var warnAlert = 15
var warnReset = 5
var reqQuery = 'SELECT sum("count") AS value FROM "supplier"."autogen"."requests"'
var errQuery = 'SELECT sum("count") AS value FROM "supplier"."autogen"."errors"'

var prevErr = batch
    |query(errQuery)
        .period(period)
        .every(every)
        .groupBy(1m, 'channel', 'supplier')

var prevReq = batch
    |query(reqQuery)
        .period(period)
        .every(every)
        .groupBy(1m, 'channel', 'supplier')

var rates = prevReq
    |join(prevErr)
        .as('req', 'err')
        .tolerance(1m)
        .fill('null')
    // заполняСм значСния нулями, Ссли ΠΈΡ… Π½Π΅ Π±Ρ‹Π»ΠΎ
    |default()
        .field('err.value', 0.0)
        .field('req.value', 0.0)
    // if in lambda: compute the rate only if there were errors
    |eval(lambda: if("err.value" > 0, 100.0 * (float("req.value") - float("err.value")) / float("req.value"), 100.0))
        .as('rate')

// записываСм посчитанныС значСния Π² ΠΈΠ½Ρ„Π»ΡŽΠΊΡ
rates
    |influxDBOut()
        .quiet()
        .create()
        .database('kapacitor')
        .retentionPolicy('autogen')
        .measurement('rates')

// select the data for the last 10 minutes and compute the median
var todayRate = rates
    |where(lambda: duration((unixNano(now()) - unixNano("time")) / 1000, 1u) < todayPeriod)
    |median('rate')
        .as('median')

var prevRate = rates
    |median('rate')
        .as('median')

var joined = todayRate
    |join(prevRate)
        .as('today', 'prev')
    |httpOut('join')

var trigger = joined
    |alert()
        .warn(lambda: ("prev.median" - "today.median") > warnAlert)
        .warnReset(lambda: ("prev.median" - "today.median") < warnReset)
        .flapping(0.25, 0.5)
        .stateChangesOnly()
        // собираСм Π² message ссылку Π½Π° Π³Ρ€Π°Ρ„ΠΈΠΊ Π΄Π°ΡˆΠ±ΠΎΡ€Π΄Π° Π³Ρ€Π°Ρ„Π°Π½Ρ‹
        .message(
            '{{ .Level }}: {{ index .Tags "channel" }} err/req ratio ({{ index .Tags "supplier" }})
{{ if eq .Level "OK" }}It is ok now{{ else }}
'+string(todayPeriod)+' median is {{ index .Fields "today.median" | printf "%0.2f" }}%, by previous '+string(period)+' is {{ index .Fields "prev.median" | printf "%0.2f" }}%{{ end }}
http://grafana.ostrovok.in/d/'+string(grafana_dash)+
'?var-supplier={{ index .Tags "supplier" }}&var-channel={{ index .Tags "channel" }}&panelId='+string(grafana_panel)+'&fullscreen&tz=UTC%2B03%3A00'
        )
        .id('{{ index .Tags "name" }}/{{ index .Tags "channel" }}')
        .levelTag('level')
        .messageField('message')
        .durationField('duration')
        .topic('slack_graph')

// duplicate "today.median" as "value"; also write the remaining alert fields to InfluxDB (keep)
trigger
    |eval(lambda: "today.median")
        .as('value')
        .keep()
    |influxDBOut()
        .quiet()
        .create()
        .database('kapacitor')
        .retentionPolicy('autogen')
        .measurement('alerts')
        .tag('alertName', name)

What's the conclusion?

Kapacitor is great at monitoring and alerting across many groups, at doing additional calculations on metrics that have already been recorded, and at performing custom actions and running scripts (udf).

The barrier to entry is not very high: try it if grafana or other tools do not fully cover your needs.

Source: www.hab.com
