The Prometheus 2 time series database (TSDB) is an excellent piece of engineering that brings major improvements over the v2 storage of Prometheus 1 in ingest rate, query execution, and resource efficiency. As we have been adopting Prometheus 2 in Percona Monitoring and Management (PMM), I had a chance to study the performance of the Prometheus 2 TSDB. This article describes what I observed.
The typical Prometheus workload
For anyone used to working with general-purpose databases, the typical Prometheus workload is quite interesting. The ingest rate tends to be stable: the services you monitor usually send about the same number of metrics, and the infrastructure changes relatively slowly.
Requests for information can come from a variety of sources. Some of them, such as alerting, also tend to be stable and predictable. Others, such as user requests, can cause bursts, although this is not the case for most workloads.
Load test
During testing I focused on the ability to ingest data. I deployed Prometheus 2.3.2 compiled with Go 1.10.1 (as part of PMM 1.14) on Linode using a deployment script.
All of the tests below ran on a Linode server with 8 virtual cores and 32 GB of memory, with 20 load simulators monitoring 800 MySQL instances. In Prometheus terms, that is 800 targets, 440 scrapes per second, 380,000 samples per second, and 1.7 million active time series.
Design
The usual approach of traditional databases, including the one used in Prometheus 1.x, is to limit the amount of memory the database may use. Prometheus 2 instead relies on storage.tsdb.min-block-duration, which determines how long samples are kept in memory before being flushed to disk (the default is 2 hours). The amount of memory required depends on the number of time series, labels, and scrapes, in addition to the raw ingest rate. On disk, Prometheus aims to use about 3 bytes per sample. Memory requirements, on the other hand, are much higher.
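The figure of about 3 bytes per sample makes a rough disk estimate straightforward. The following is only a back-of-the-envelope sketch: the function name, the 100,000 samples per second input, and the one-day retention window are illustrative assumptions, and real usage also depends on indexes and compaction.

```python
def estimate_disk_bytes(samples_per_sec: float,
                        retention_seconds: float,
                        bytes_per_sample: float = 3.0) -> float:
    """Rough on-disk size: ingest rate x retention x bytes per sample.

    bytes_per_sample defaults to the ~3 bytes quoted in the article;
    everything else is an assumption supplied by the caller.
    """
    return samples_per_sec * retention_seconds * bytes_per_sample


ONE_DAY_SECONDS = 24 * 60 * 60

# Illustrative input: 100,000 samples per second kept for one day.
size_bytes = estimate_disk_bytes(100_000, ONE_DAY_SECONDS)
print(f"{size_bytes / 1e9:.2f} GB per day")
```

Memory cannot be estimated this way, since it also depends on series and label counts.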
Although the head block size can be configured, tuning it by hand is discouraged, so you have to give Prometheus as much memory as your workload needs.
If there is not enough memory to sustain the incoming stream of metrics, Prometheus runs out of memory and crashes, or the OOM killer gets to it.
Adding swap to delay the crash when Prometheus runs out of memory does not really help, because using swap causes explosive memory consumption. I suspect this has to do with Go, its garbage collector, and the way they interact with swap.
Another interesting design choice is that the head block is flushed to disk at specific points in time rather than at intervals counted from process start.
As the graph shows, flushes to disk happen every two hours. If you change the min-block-duration parameter to one hour, the flushes happen every hour, at 30 minutes past the hour.
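The block timestamps in the meta.json shown later in this article are exact multiples of two hours since the Unix epoch, which is consistent with clock-aligned blocks. Below is a minimal sketch of that alignment, assuming block boundaries are simply multiples of the block duration. The helper block_range is hypothetical, not a Prometheus API, and the sketch does not model when the flush itself is triggered.

```python
from typing import Tuple

TWO_HOURS_MS = 2 * 60 * 60 * 1000


def block_range(ts_ms: int, block_duration_ms: int = TWO_HOURS_MS) -> Tuple[int, int]:
    """Return the [start, end) window, aligned to the epoch, containing ts_ms.

    Assumption: boundaries fall on multiples of block_duration_ms,
    independent of when the process was started.
    """
    start = ts_ms - ts_ms % block_duration_ms
    return start, start + block_duration_ms


# A timestamp inside the first parent block listed in meta.json.
print(block_range(1536475000000))
```

Because the window depends only on the timestamp, restarting the process does not shift the boundaries.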
If you would like this graph and others for your own Prometheus installation, you can use this dashboard.
There is an active block, called the head block, that is kept in memory. Blocks with older data are accessed through mmap(). This removes the need to configure a cache separately, but it also means you need to leave enough room for the operating system cache if you want to query data older than what the head block holds.
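To illustrate what mmap() access means in general, here is a small sketch using Python's standard mmap module. It is not Prometheus code: the file contents and the helper names are made up for the example. The point is that the process reads a slice of a mapped file and leaves caching to the operating system.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path: str, offset: int, length: int) -> bytes:
    """Map a file read-only and return one slice of it.

    Pages are loaded by the OS on access and kept in the page cache,
    so the application does not manage a cache of its own.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


def demo() -> bytes:
    # Hypothetical stand-in for an on-disk block file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"head-block|older-block-data")
        path = tmp.name
    try:
        return read_slice_via_mmap(path, 11, 5)
    finally:
        os.remove(path)


print(demo())
```

Mapped files count toward virtual memory even when few pages are resident.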
It also means that Prometheus's virtual memory usage can look very high, which is nothing to worry about.
Another interesting design point is the use of a WAL (write-ahead log). As the storage documentation shows, Prometheus uses the WAL to protect against data loss in a crash. Unfortunately, the exact mechanisms that guarantee data durability are not well documented. Prometheus 2.3.2 flushes the WAL to disk every 10 seconds, and this setting is not user configurable.
Compaction
Prometheus TSDB 㯠LSM (Log Structured Merge) ã¹ãã¢ã®ããã«èšèšãããŠããŸããããã ãããã¯ã¯å®æçã«ãã£ã¹ã¯ã«ãã©ãã·ã¥ãããŸãããã¯ãšãªäžã«ããŸãã«ãå€ãã®ãããã¯ãã¹ãã£ã³ãããããšãé¿ããããã«ãå§çž®ã¡ã«ããºã ãè€æ°ã®ãããã¯ãçµåããŸãã ããã§ã¯ãXNUMX æ¥è² è·ããããåŸã«ãã¹ã ã·ã¹ãã äžã§èŠ³å¯ããããããã¯ã®æ°ã確èªã§ããŸãã
If you want to learn more about the store, you can examine the meta.json file, which describes the available blocks and how they came about.
{
    "ulid": "01CPZDPD1D9R019JS87TPV5MPE",
    "minTime": 1536472800000,
    "maxTime": 1536494400000,
    "stats": {
        "numSamples": 8292128378,
        "numSeries": 1673622,
        "numChunks": 69528220
    },
    "compaction": {
        "level": 2,
        "sources": [
            "01CPYRY9MS465Y5ETM3SXFBV7X",
            "01CPYZT0WRJ1JB1P0DP80VY5KJ",
            "01CPZ6NR4Q3PDP3E57HEH760XS"
        ],
        "parents": [
            {
                "ulid": "01CPYRY9MS465Y5ETM3SXFBV7X",
                "minTime": 1536472800000,
                "maxTime": 1536480000000
            },
            {
                "ulid": "01CPYZT0WRJ1JB1P0DP80VY5KJ",
                "minTime": 1536480000000,
                "maxTime": 1536487200000
            },
            {
                "ulid": "01CPZ6NR4Q3PDP3E57HEH760XS",
                "minTime": 1536487200000,
                "maxTime": 1536494400000
            }
        ]
    },
    "version": 1
}
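The fields in this meta.json can be read mechanically. The sketch below embeds a trimmed copy of the values above and derives the block's time span, its compaction level, and the average ingest rate it implies. The helper summarize_block and the keys of its result are hypothetical names chosen for illustration.

```python
import json

# Trimmed copy of the meta.json above (ULIDs and sources omitted).
META_JSON = """
{
  "minTime": 1536472800000,
  "maxTime": 1536494400000,
  "stats": {"numSamples": 8292128378, "numSeries": 1673622, "numChunks": 69528220},
  "compaction": {
    "level": 2,
    "parents": [
      {"minTime": 1536472800000, "maxTime": 1536480000000},
      {"minTime": 1536480000000, "maxTime": 1536487200000},
      {"minTime": 1536487200000, "maxTime": 1536494400000}
    ]
  }
}
"""


def summarize_block(meta: dict) -> dict:
    """Derive a few readable facts from one block's metadata."""
    span_seconds = (meta["maxTime"] - meta["minTime"]) / 1000  # times are in ms
    parents = meta["compaction"].get("parents", [])
    # Parents are contiguous if each one ends where the next begins.
    contiguous = all(a["maxTime"] == b["minTime"]
                     for a, b in zip(parents, parents[1:]))
    return {
        "span_hours": span_seconds / 3600,
        "level": meta["compaction"]["level"],
        "parent_count": len(parents),
        "parents_contiguous": contiguous,
        "samples_per_sec": meta["stats"]["numSamples"] / span_seconds,
    }


summary = summarize_block(json.loads(META_JSON))
for key, value in summary.items():
    print(key, value)
```

Here three contiguous two-hour blocks were merged into one six-hour block at compaction level 2.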
Compaction in Prometheus is tied to the moment the head block is flushed to disk. Several such operations may run at that point.
Compactions do not appear to be throttled in any way and can cause large disk I/O spikes while they run.
CPU load spikes as well.
This, of course, has a fairly negative effect on system performance, and it is also a serious challenge for LSM stores in general: how to run compactions so that high query rates are sustained without causing too much overhead.
Memory use during the compaction process is also quite interesting.
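After compaction, you can see a large part of memory move from the Cached state to the Free state, which means potentially valuable data has been evicted. I wonder whether posix_fadvise() or some other technique for minimizing cache eviction is in use here, or whether the cache is simply being freed of blocks that were destroyed during compaction.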
Crash recovery
Recovery after a crash takes time, and for good reason. With an ingest stream of one million samples per second, I had to wait about 25 minutes for recovery to complete, even on an SSD drive.
level=info ts=2018-09-13T13:38:14.09650965Z caller=main.go:222 msg="Starting Prometheus" version="(version=2.3.2, branch=v2.3.2, revision=71af5e29e815795e9dd14742ee7725682fa14b7b)"
level=info ts=2018-09-13T13:38:14.096599879Z caller=main.go:223 build_context="(go=go1.10.1, user=Jenkins, date=20180725-08:58:13)"
level=info ts=2018-09-13T13:38:14.096624109Z caller=main.go:224 host_details="(Linux 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 1bee9e9b78cf (none))"
level=info ts=2018-09-13T13:38:14.096641396Z caller=main.go:225 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-09-13T13:38:14.097715256Z caller=web.go:415 component=web msg="Start listening for connections" address=:9090
level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."
level=info ts=2018-09-13T13:38:14.098718401Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536530400000 maxt=1536537600000 ulid=01CQ0FW3ME8Q5W2AN5F9CB7R0R
level=info ts=2018-09-13T13:38:14.100315658Z caller=web.go:467 component=web msg="router prefix" prefix=/prometheus
level=info ts=2018-09-13T13:38:14.101793727Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536732000000 maxt=1536753600000 ulid=01CQ78486TNX5QZTBF049PQHSM
level=info ts=2018-09-13T13:38:14.102267346Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536537600000 maxt=1536732000000 ulid=01CQ78DE7HSQK0C0F5AZ46YGF0
level=info ts=2018-09-13T13:38:14.102660295Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536775200000 maxt=1536782400000 ulid=01CQ7SAT4RM21Y0PT5GNSS146Q
level=info ts=2018-09-13T13:38:14.103075885Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536753600000 maxt=1536775200000 ulid=01CQ7SV8WJ3C2W5S3RTAHC2GHB
level=error ts=2018-09-13T14:05:18.208469169Z caller=wal.go:275 component=tsdb msg="WAL corruption detected; truncating" err="unexpected CRC32 checksum d0465484, want 0" file=/opt/prometheus/data/.prom2-data/wal/007357 pos=15504363
level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"
level=info ts=2018-09-13T14:05:19.471604598Z caller=main.go:603 msg="Loading configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499156711Z caller=main.go:629 msg="Completed loading of configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499228186Z caller=main.go:502 msg="Server is ready to receive web requests."
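The recovery time can be read off the log above by comparing the timestamps of the "Starting TSDB ..." and "TSDB started" messages. A small sketch of that calculation follows. The parsing helpers are hypothetical, and fractional seconds are truncated to microseconds because Python's datetime does not hold nanoseconds.

```python
import re
from datetime import datetime, timedelta

TS_PATTERN = re.compile(r"ts=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_ts(line: str) -> datetime:
    """Extract the ts= field of a log line as a naive UTC datetime."""
    match = TS_PATTERN.search(line)
    if match is None:
        raise ValueError("no ts= field in line")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Keep at most six fractional digits (microseconds).
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base + timedelta(microseconds=int(fraction))


def recovery_seconds(lines) -> float:
    """Seconds between the 'Starting TSDB' and 'TSDB started' messages."""
    start = end = None
    for line in lines:
        if 'msg="Starting TSDB' in line:
            start = parse_ts(line)
        elif 'msg="TSDB started"' in line:
            end = parse_ts(line)
    if start is None or end is None:
        raise ValueError("start or end message not found")
    return (end - start).total_seconds()


# The two relevant lines, copied from the log above.
LOG_LINES = [
    'level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."',
    'level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"',
]

elapsed = recovery_seconds(LOG_LINES)
print(f"recovery took {elapsed / 60:.1f} minutes")
```

For this log, the gap comes to a little over 27 minutes, almost all of it spent before the WAL corruption message.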
The main problem with the recovery process is its heavy memory consumption. A server that runs stably on a given amount of memory under normal conditions may be unable to recover after a crash because of OOM. The only solution I found was to disable scraping, start the server and let it recover, and then restart it with scraping enabled.
Warm-up
Another behavior to keep in mind is warm-up: the ratio of performance to resource consumption is worse right after start. On some starts, though not all, I observed heavy CPU and memory load.
The gaps in the memory usage graph show that Prometheus cannot perform all configured scrapes at first, so some data is lost.
I have not determined the exact cause of the high CPU and memory load. I suspect it comes from new time series being created in the head block at a high rate.
CPU load spikes
Besides compactions, which generate a fairly high I/O load, I noticed significant CPU load spikes about every two minutes. The bursts last longer when the ingest rate is higher, and they appear to be caused by the Go garbage collector, with at least some cores fully loaded.
These spikes are not insignificant. When they occur, Prometheus's internal /metrics endpoint becomes unavailable, which produces data gaps over the same periods.
You can also see the Prometheus exporter hitting a one-second timeout.
A correlation with garbage collection (GC) is noticeable.
Conclusion
The TSDB in Prometheus 2 is fast: on fairly modest hardware it can handle millions of time series and, at the same time, hundreds of thousands of samples per second. CPU and disk I/O utilization are impressive too. In my case I saw up to 200,000 metrics per second per core used.
When planning for growth, remember to provision enough memory, and it has to be real memory. The memory use I observed was about 5 GB per 100,000 samples per second of ingest, which together with the operating system cache comes to about 8 GB.
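Taking the rule of thumb above as roughly 5 GB of process memory, or about 8 GB including operating system cache, per 100,000 samples per second, a capacity estimate can be sketched as follows. Linear scaling is an assumption here, not something the measurements establish, and the function name and the 200,000 samples per second input are illustrative.

```python
from typing import Tuple

GB_PROCESS_PER_100K = 5.0     # observed Prometheus memory per 100k samples/s
GB_WITH_CACHE_PER_100K = 8.0  # same, including OS cache


def estimate_memory_gb(samples_per_sec: float) -> Tuple[float, float]:
    """Return (process_gb, total_gb_with_os_cache).

    Assumes memory grows linearly with ingest rate, which is only a
    rough planning heuristic.
    """
    units = samples_per_sec / 100_000
    return units * GB_PROCESS_PER_100K, units * GB_WITH_CACHE_PER_100K


process_gb, total_gb = estimate_memory_gb(200_000)
print(f"process: {process_gb:.1f} GB, with OS cache: {total_gb:.1f} GB")
```

Series count, label cardinality, and query load also affect memory, so treat the result as a starting point.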
There is of course still a lot of work to do to tame the CPU and disk I/O spikes, but that is not surprising given how young the Prometheus 2 TSDB is compared with InnoDB, TokuDB, RocksDB, and WiredTiger, all of which had similar problems early in their life cycles.
Source: habr.com