These days, hardly anyone questions why service metrics need to be collected. The next logical step is to set up alerting on the collected metrics so that any deviation in the data is reported to a channel that suits you (email, Slack, Telegram). In our online hotel booking service, we use Kapacitor for this.
Kapacitor is the part of the TICK stack that can process metrics from InfluxDB. It can join several measurements, calculate something useful from the incoming data, write the result back to InfluxDB, and send an alert to Slack/Telegram/mail.
The whole stack is cool and documented in detail.
float and int, calculation errors
A completely standard problem, solved with casts:
var alert_float = 5.0
var alert_int = 10
data|eval(lambda: float("value") > alert_float OR float("value") < float(alert_int))
Using default()
If a tag/field is not filled in, calculation errors will occur:
|default()
.tag('status', 'empty')
.field('value', 0)
fill in join (inner vs outer)
By default, join discards points where there is no data (inner).
With fill('null'), an outer join is performed, after which you need to run default() to fill in the empty values:
var data = res1
|join(res2)
.as('res1', 'res2')
.fill('null')
|default()
.field('res1.value', 0.0)
.field('res2.value', 100.0)
There is still a nuance here. In the example above, if one of the series (res1 or res2) is empty, the resulting series (data) will be empty too. There are several tickets about this on GitHub.
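To make the inner/outer difference concrete, here is a minimal Python sketch of the join semantics described above. It is not Kapacitor code: the function name join_series and the {timestamp: value} representation are assumptions made purely for illustration.

```python
def join_series(res1, res2, fill_null=False, defaults=None):
    """Join two {timestamp: value} series on timestamp.

    fill_null=False mimics the default (inner) join: points missing on
    either side are dropped. fill_null=True mimics fill('null') followed
    by default(): missing values are replaced with the given defaults.
    """
    defaults = defaults or {}
    if fill_null:
        timestamps = sorted(set(res1) | set(res2))
    else:
        timestamps = sorted(set(res1) & set(res2))
    joined = []
    for ts in timestamps:
        joined.append({
            "time": ts,
            "res1.value": res1.get(ts, defaults.get("res1.value")),
            "res2.value": res2.get(ts, defaults.get("res2.value")),
        })
    return joined


res1 = {1: 0.5, 2: 0.7}
res2 = {2: 90.0, 3: 95.0}

# Inner join keeps only timestamp 2; outer join keeps 1, 2 and 3.
inner = join_series(res1, res2)
outer = join_series(res1, res2, fill_null=True,
                    defaults={"res1.value": 0.0, "res2.value": 100.0})
```

With the sample data, only one point survives the inner join, while the outer variant keeps all three and substitutes the defaults where a side had no data.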
Using conditions in calculations (if in lambda)
|eval(lambda: if("value" > 0, true, false))
The last five minutes from a pipeline over a period
For example, you need to compare the values of the last five minutes with the previous week. You can fetch two sets of data in two separate batches, or extract part of the data from a larger period:
|where(lambda: duration((unixNano(now()) - unixNano("time"))/1000, 1u) < 5m)
An alternative for the last five minutes is to use BarrierNode, which cuts off data older than the specified time:
|barrier()
.period(5m)
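The arithmetic in the where() expression above is easy to misread, so here is a rough Python equivalent. It only mirrors the unit conversion (nanoseconds divided by 1000 give microseconds, which are then compared with the window); the helper names unix_nano and is_recent are made up for this sketch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_nano(dt):
    """Nanoseconds since the Unix epoch for a timezone-aware datetime."""
    delta = dt - EPOCH
    return (delta.days * 86400 + delta.seconds) * 10**9 + delta.microseconds * 1000


def is_recent(point_time, now, window=timedelta(minutes=5)):
    """True if the point is strictly younger than the window."""
    age_us = (unix_nano(now) - unix_nano(point_time)) // 1000  # ns -> microseconds
    return timedelta(microseconds=age_us) < window


now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
points = [now - timedelta(minutes=m) for m in (1, 4, 5, 30)]
recent = [p for p in points if is_recent(p, now)]
```

Note that the comparison is strict, so a point that is exactly five minutes old falls outside the window.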
Examples of using Go templates in message
Templates follow the format of Go's text/template package; below are a few tasks that come up often.
if-else
We keep things tidy and don't trigger people with text more than necessary:
|alert()
...
.message(
'{{ if eq .Level "OK" }}It is ok now{{ else }}Chief, everything is broken{{end}}'
)
Two decimal places in message
Improving the readability of the message:
|alert()
...
.message(
'now value is {{ index .Fields "value" | printf "%0.2f" }}'
)
Expanding variables in message
We put more information into the message to answer the question "Why is it yelling?":
var warnAlert = 10
|alert()
...
.message(
'Today value less than '+string(warnAlert)+'%'
)
Unique alert identifier
This is needed when the data contains more than one group; otherwise only one alert will be generated:
|alert()
...
.id('{{ index .Tags "myname" }}/{{ index .Tags "myfield" }}')
Custom handlers
The large list of handlers includes exec, which lets you run your own script with the passed parameters (stdin). Creativity is the only limit.
One of our custom handlers is a small Python script for sending notifications to Slack.
At first we wanted to send an image from an auth-protected grafana in the message. Then, to write OK into the thread of the previous alert from the same group rather than as a separate message. A bit later, to add the most frequent error over the last X minutes to the message.
A separate topic is communication with other services and actions initiated by an alert (only if your monitoring works well enough).
An example handler description, where slack_handler.py is our self-written script:
topic: slack_graph
id: slack_graph.alert
match: level() != INFO AND changed() == TRUE
kind: exec
options:
prog: /sbin/slack_handler.py
args: ["-c", "CHANNELID", "--graph", "--search"]
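Our actual slack_handler.py is not shown here, but the general shape of an exec handler can be sketched as follows. This is a minimal illustration under assumptions: the alert is taken to arrive on stdin as a JSON object with "id", "level" and "message" keys, build_payload and the payload shape are hypothetical, and the --graph and --search flags are only parsed, not implemented.

```python
import argparse
import io
import json
import sys


def build_payload(alert, channel):
    """Turn an alert dict into a message payload (hypothetical shape)."""
    return {
        "channel": channel,
        "text": "[{level}] {id}: {message}".format(
            level=alert.get("level", "UNKNOWN"),
            id=alert.get("id", "?"),
            message=alert.get("message", ""),
        ),
    }


def main(argv, stdin=sys.stdin):
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", "--channel", required=True)
    # Flags from the handler config; their behaviour is not sketched here.
    parser.add_argument("--graph", action="store_true")
    parser.add_argument("--search", action="store_true")
    args = parser.parse_args(argv)

    alert = json.load(stdin)  # alert data passed by the exec handler
    payload = build_payload(alert, args.channel)
    # A real handler would send the payload to Slack; the sketch only prints it.
    print(json.dumps(payload))
    return payload


# Simulated invocation with a made-up alert instead of real stdin.
sample = io.StringIO(
    '{"id": "requests.rate/ch1", "level": "WARNING", "message": "rate dropped"}'
)
payload = main(["-c", "CHANNELID", "--graph", "--search"], stdin=sample)
```

To use something like this as a real script, the simulated invocation at the bottom would be replaced with a call to main(sys.argv[1:]).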
How to debug?
Option with log output
|log()
.level('ERROR')
.prefix('something')
Watch (cli): kapacitor -url
Option with httpOut
Shows the data in the current pipeline:
|httpOut('something')
Watch (get):
Execution scheme
- Each task returns an execution tree with useful numbers in graphviz format.
- Take the dot block.
- Paste it into a viewer and enjoy.
Where else can you step on a rake?
Timestamp in influxdb on write-back
For example, we set up an alert on the sum of requests per hour (groupBy(1h)) and want to record the fired alert in influxdb (to nicely show the fact of the problem on a graph in grafana).
influxDBOut() writes the time value from the alert into the timestamp, so the point on the chart is written earlier/later than the alert actually arrived.
When accuracy is required, we work around this by calling a custom handler that writes the data to influxdb with the current timestamp.
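What such a handler has to produce can be sketched in a few lines of Python. This is an illustration, not our handler: to_line is a made-up helper that formats a point in InfluxDB line protocol and stamps it with the current time unless told otherwise. It assumes names and tag values need no escaping, and it leaves the actual write to influxdb out.

```python
import time


def format_field(value):
    """Format a field value for line protocol (floats, ints, bools, strings)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return "{}i".format(value)
    if isinstance(value, float):
        return repr(value)
    return '"{}"'.format(str(value).replace('"', '\\"'))


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line-protocol point; defaults to "now" in nanoseconds.

    Assumes measurement, keys and tag values contain no spaces, commas
    or equals signs, so no escaping is done for them.
    """
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()  # current time, not the alert's own time
    tag_part = "".join(",{}={}".format(k, v) for k, v in sorted(tags.items()))
    field_part = ",".join(
        "{}={}".format(k, format_field(v)) for k, v in sorted(fields.items())
    )
    return "{}{} {} {}".format(measurement, tag_part, field_part, timestamp_ns)


# Fixed timestamp here only to keep the example reproducible.
line = to_line(
    "alerts",
    {"alertName": "requests.rate", "level": "WARNING"},
    {"value": 82.5},
    timestamp_ns=1577880000000000000,
)
```

Leaving timestamp_ns unset is the whole point of the workaround: the point lands on the chart at the moment the handler ran.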
Docker, build and deploy
On startup, kapacitor can load tasks, templates and handlers from the directory specified in the [load] block of the config.
To create a task correctly, you need the following:
- File name: expanded into the script ID/name
- Type: stream/batch
- dbrp: the keyword that indicates which database and retention policy the script runs against (dbrp "supplier"."autogen")
If some batch task has no line with dbrp, the whole service refuses to start and honestly writes about it in the log.
In chronograf, on the contrary, this line must not be present: it is not accepted through the interface and generates an error.
A hack for the container build: the Dockerfile exits with -1 if there are lines matching //.+dbrp, which makes the reason for a failed build clear right away.
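The same check can be sketched as a small Python script. This is an assumption-laden illustration rather than our build step: it assumes task files use the .tick extension and live in one directory, and the helper names are invented. Only the //.+dbrp pattern comes from the hack described above.

```python
import re
import sys
import tempfile
from pathlib import Path

COMMENTED_DBRP = re.compile(r"//.+dbrp")


def find_commented_dbrp(task_dir):
    """Return (file name, line number, line) for every line matching //.+dbrp."""
    hits = []
    for path in sorted(Path(task_dir).glob("*.tick")):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if COMMENTED_DBRP.search(line):
                hits.append((path.name, number, line.strip()))
    return hits


def check(task_dir):
    """Print offending lines to stderr and return an exit code (0 means OK)."""
    hits = find_commented_dbrp(task_dir)
    for name, number, line in hits:
        print("{}:{}: {}".format(name, number, line), file=sys.stderr)
    return 1 if hits else 0


# Demo on a throwaway directory with one good and one bad task file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.tick").write_text('dbrp "supplier"."autogen"\nvar x = 1\n')
    Path(tmp, "bad.tick").write_text('// dbrp "supplier"."autogen"\nvar x = 1\n')
    hits = find_commented_dbrp(tmp)
    exit_code = check(tmp)
```

A build step could call check() on the task directory and pass its result to sys.exit(), so that a non-zero code fails the build.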
One-to-many join
Example task: take the 95th percentile of service run time over a week and compare each minute of the last 10 with that value.
You cannot do a one-to-many join: last/mean/median over a group of points turns the node into a stream, and the error "cannot add child mismatched edges: batch -> stream" is returned.
The result of a batch is not substituted as a variable in a lambda expression either.
One option is to save the needed numbers from the first batch to a file via udf and load that file via sideload.
What did we solve with this?
We have about 100 hotel suppliers, and each of them can have several connections (let's call them channels). There are roughly 300 of these channels, and any of them can drop out. Of all the recorded metrics, we will monitor the error rate (requests and errors).
Why not Grafana?
Error alerts configured in Grafana have several drawbacks. Some are critical, and some you can turn a blind eye to, depending on the situation.
Grafana cannot do calculations between measurements combined with alerting, and we need the rate (requests - errors)/requests.
Errors on their own look nasty, and less so when viewed alongside successful requests.
OK, we could pre-calculate the rate in the service before grafana, and in some cases that would work. But not in ours, because each channel has its own rate that is considered "normal", while alerts work on static values (we find them by eye and change them if the alert fires too often).
What counts as "normal" differs from channel to channel.
Let's ignore the previous point and assume that the "normal" picture is similar for all suppliers. Is everything fine now, and can we get by with alerts in grafana?
We can, but we really don't want to, because we would have to choose one of these options:
a) make many graphs, one per channel (and painfully maintain them)
b) leave one graph with all the channels (and get lost in the colorful lines and configured alerts)
How did we do it?
Again, the documentation has a good starter example.
What we ended up doing:
- Join two series over several hours, grouping by channel.
- Fill the series per group if there was no data.
- Compare the median of the last 10 minutes with the previous data.
- Yell if we find something.
- Write the calculated rates and the fired alerts to influxdb.
- Send a useful message to Slack.
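The core of the steps above (rate, two medians, threshold) can be sketched in plain Python. This is a simplified model, not the TICKscript itself: it works on one channel at a time, takes per-minute (requests, errors) pairs as input, assumes requests is non-zero whenever errors is, and the function names are made up for the illustration.

```python
from statistics import median


def success_rate(requests, errors):
    """Percentage of requests without errors; 100.0 when there were no errors."""
    if errors > 0:
        return 100.0 * (requests - errors) / requests
    return 100.0


def check_channel(points, today_points=10, warn_alert=15.0):
    """points: list of (requests, errors) per minute, oldest first.

    Compares the median rate of the last `today_points` minutes with the
    median over the whole period and flags a drop larger than warn_alert.
    """
    rates = [success_rate(req, err) for req, err in points]
    prev_median = median(rates)
    today_median = median(rates[-today_points:])
    drop = prev_median - today_median
    return {"prev": prev_median, "today": today_median, "warn": drop > warn_alert}


healthy = [(100, 2)] * 60                       # steady 98% for an hour
degraded = [(100, 2)] * 50 + [(100, 40)] * 10   # last 10 minutes drop to 60%

ok = check_channel(healthy)
bad = check_channel(degraded)
```

Because the comparison is against the channel's own history, the same threshold works for channels whose "normal" rates differ.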
In my opinion, we managed to achieve everything we wanted to get in the end as neatly as possible (and even a bit more with custom handlers).
You can take a look on github.com.
Example of the resulting code:
dbrp "supplier"."autogen"
var name = 'requests.rate'
var grafana_dash = 'pczpmYZWU/mydashboard'
var grafana_panel = '26'
var period = 8h
var todayPeriod = 10m
var every = 1m
var warnAlert = 15
var warnReset = 5
var reqQuery = 'SELECT sum("count") AS value FROM "supplier"."autogen"."requests"'
var errQuery = 'SELECT sum("count") AS value FROM "supplier"."autogen"."errors"'
var prevErr = batch
|query(errQuery)
.period(period)
.every(every)
.groupBy(1m, 'channel', 'supplier')
var prevReq = batch
|query(reqQuery)
.period(period)
.every(every)
.groupBy(1m, 'channel', 'supplier')
var rates = prevReq
|join(prevErr)
.as('req', 'err')
.tolerance(1m)
.fill('null')
// fill missing values with zeros
|default()
.field('err.value', 0.0)
.field('req.value', 0.0)
// if in lambda: compute the rate only if there were errors
|eval(lambda: if("err.value" > 0, 100.0 * (float("req.value") - float("err.value")) / float("req.value"), 100.0))
.as('rate')
// write the computed values to influxdb
rates
|influxDBOut()
.quiet()
.create()
.database('kapacitor')
.retentionPolicy('autogen')
.measurement('rates')
// select data for the last 10 minutes and compute the median
var todayRate = rates
|where(lambda: duration((unixNano(now()) - unixNano("time")) / 1000, 1u) < todayPeriod)
|median('rate')
.as('median')
var prevRate = rates
|median('rate')
.as('median')
var joined = todayRate
|join(prevRate)
.as('today', 'prev')
|httpOut('join')
var trigger = joined
|alert()
.warn(lambda: ("prev.median" - "today.median") > warnAlert)
.warnReset(lambda: ("prev.median" - "today.median") < warnReset)
.flapping(0.25, 0.5)
.stateChangesOnly()
// build a link to the grafana dashboard graph in the message
.message(
'{{ .Level }}: {{ index .Tags "channel" }} err/req ratio ({{ index .Tags "supplier" }})
{{ if eq .Level "OK" }}It is ok now{{ else }}
'+string(todayPeriod)+' median is {{ index .Fields "today.median" | printf "%0.2f" }}%, by previous '+string(period)+' is {{ index .Fields "prev.median" | printf "%0.2f" }}%{{ end }}
http://grafana.ostrovok.in/d/'+string(grafana_dash)+
'?var-supplier={{ index .Tags "supplier" }}&var-channel={{ index .Tags "channel" }}&panelId='+string(grafana_panel)+'&fullscreen&tz=UTC%2B03%3A00'
)
.id('{{ index .Tags "name" }}/{{ index .Tags "channel" }}')
.levelTag('level')
.messageField('message')
.durationField('duration')
.topic('slack_graph')
// duplicate "today.median" as "value"; also write the remaining alert fields to influxdb (keep)
trigger
|eval(lambda: "today.median")
.as('value')
.keep()
|influxDBOut()
.quiet()
.create()
.database('kapacitor')
.retentionPolicy('autogen')
.measurement('alerts')
.tag('alertName', name)
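The pair of warn and warnReset thresholds in the script gives the alert some hysteresis. Below is a simplified Python model of how we would expect them to interact; it is an assumption about the behaviour, not Kapacitor's implementation, and it ignores flapping() and stateChangesOnly() entirely. The function name alert_levels is invented for the sketch.

```python
def alert_levels(drops, warn_alert=15.0, warn_reset=5.0):
    """Return 'WARNING' or 'OK' for each drop (prev.median - today.median).

    The level rises to WARNING when drop > warn_alert and falls back to OK
    only when drop < warn_reset, so values in between keep the current level.
    """
    level = "OK"
    levels = []
    for drop in drops:
        if drop > warn_alert:
            level = "WARNING"
        elif level == "WARNING" and drop < warn_reset:
            level = "OK"
        levels.append(level)
    return levels


# The drop crosses 15, hovers between the thresholds, then falls below 5.
levels = alert_levels([2, 20, 10, 7, 4, 12])
```

The gap between 5 and 15 is what keeps a channel that hovers around the warning threshold from flipping between states on every evaluation.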
So what's the conclusion?
Kapacitor is great at running monitoring alerts with lots of groupings, doing additional calculations on already recorded metrics, performing custom actions, and running scripts (udf).
The barrier to entry is not very high, so give it a try if grafana or other tools don't fully satisfy your needs.
Source: habr.com