Analysis of the TSDB in Prometheus 2

The time series database (TSDB) in Prometheus 2 is an excellent piece of engineering. It offers major improvements over the v2 storage in Prometheus 1 in ingest speed, query performance, and resource efficiency. We have been adopting Prometheus 2 in Percona Monitoring and Management (PMM), which gave me a chance to study how the Prometheus 2 TSDB performs. This article describes what I observed.

A Typical Prometheus Workload

For anyone used to working with general-purpose databases, the typical Prometheus workload is quite interesting. The ingest rate tends to be stable: monitored services usually send roughly the same number of metrics all the time, and the infrastructure changes relatively slowly.
Queries can come from several sources. Some of them, such as alerting, also tend to be stable and predictable. Others, such as users exploring data, can be spiky, although they rarely make up most of the load.

Load Testing

In my tests I focused on ingest performance. I deployed Prometheus 2.3.2 compiled with Go 1.10.1 (as part of PMM 1.14) on a Linode instance using this StackScript. To generate the most realistic load, I used this StackScript to launch several MySQL nodes running a real workload (the Sysbench TPC-C test), each of which emulated 10 Linux/MySQL nodes.
All of these tests ran on a Linode server with eight virtual cores and 32 GB of memory, with 20 load simulators monitoring 200 MySQL instances. In Prometheus terms, that is 800 targets, 440 scrapes per second, 380 thousand samples ingested per second, and 1.7 million active time series.
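
To get a feel for these workload figures, here is a small back-of-the-envelope sketch in Python. It only divides the numbers quoted above by each other; the variable names are mine, and the results are averages rather than measured values.

```python
# Figures quoted in the benchmark description above.
targets = 800
scrapes_per_sec = 440
samples_per_sec = 380_000
active_series = 1_700_000

# Average time between two scrapes of the same target.
avg_scrape_interval_s = targets / scrapes_per_sec

# Average number of samples returned by a single scrape.
samples_per_scrape = samples_per_sec / scrapes_per_sec

# Average time between two samples of the same series.
avg_series_interval_s = active_series / samples_per_sec

print(f"{avg_scrape_interval_s:.2f} s between scrapes of one target")
print(f"{samples_per_scrape:.0f} samples per scrape")
print(f"{avg_series_interval_s:.2f} s between samples of one series")
```

So each target is scraped about every two seconds on average, and each scrape returns several hundred samples.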

Design

The usual approach in traditional databases, including the one used by Prometheus 1.x, is to limit the amount of memory the database may use. If that limit is too low for the load, you see high latencies and some requests fail. Memory usage in Prometheus 2 is instead governed by the storage.tsdb.min-block-duration setting, which determines how long samples are kept in memory before being flushed to disk (the default is 2 hours). The amount of memory required depends on the number of time series, the number of labels, and the scrape frequency, in addition to the raw ingest rate. On disk, Prometheus aims to use about 3 bytes per sample. Memory requirements, on the other hand, are considerably higher.
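
The figure of about 3 bytes per sample makes a rough disk estimate easy. The sketch below is only an illustration: the function name is hypothetical, and it ignores index overhead and the WAL, so treat the result as a lower bound.

```python
BYTES_PER_SAMPLE = 3  # approximate on-disk cost per sample quoted in the text


def estimate_disk_bytes(samples_per_sec, retention_days,
                        bytes_per_sample=BYTES_PER_SAMPLE):
    """Rough on-disk size of the samples ingested over the retention period."""
    seconds = retention_days * 24 * 60 * 60
    return samples_per_sec * seconds * bytes_per_sample


# One day of data at the ingest rate used in this benchmark.
per_day = estimate_disk_bytes(380_000, 1)
print(f"{per_day / 1e9:.1f} GB per day")
```

At 380 thousand samples per second this comes to just under 100 GB per day of retention.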

Although the block size can be configured, tuning it by hand is discouraged, so in practice you have to give Prometheus as much memory as your workload needs.
If there is not enough memory to sustain the incoming stream of metrics, Prometheus will crash with an out-of-memory error or be killed by the OOM killer.
Adding swap to postpone the crash when Prometheus runs out of memory does not really help, because using swap causes memory consumption to explode. I suspect this has to do with Go, its garbage collector, and the way it interacts with swap.
Another interesting design choice is that head block flushes are aligned to specific clock times rather than counted from the time the process started.

In my graphs, flushes to disk occur every two hours, on the clock. If you set min-block-duration to one hour, the flushes occur every hour, starting at half past the hour.
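
The clock alignment can be illustrated with a few lines of Python. This is a sketch of the idea only, not Prometheus code: block_window is a hypothetical helper, and it shows how block boundaries fall on multiples of the block duration, not the exact moment at which Prometheus decides to flush.

```python
def block_window(timestamp_ms: int, block_duration_ms: int) -> tuple[int, int]:
    """Return the [start, end) window, aligned to the wall clock,
    that contains the given timestamp (milliseconds since the Unix epoch)."""
    start = timestamp_ms - timestamp_ms % block_duration_ms
    return start, start + block_duration_ms


TWO_HOURS_MS = 2 * 60 * 60 * 1000

# A sample taken shortly after the start of a two-hour window.
window = block_window(1536475000000, TWO_HOURS_MS)
print(window)
```

Because the window start is a multiple of the block duration, restarting the process at an arbitrary time does not shift the boundaries.
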
If you want to use this and other graphs in your own Prometheus installation, you can use this dashboard. It was designed for PMM but, with minor changes, fits any Prometheus installation.
There is an active block, called the head block, that is kept in memory; blocks with older data are accessed through mmap(). This removes the need to configure a separate cache, but it also means you must leave enough memory for the operating system cache if you want to query data older than what the head block holds.
It also means that Prometheus's virtual memory usage will look very high, which is nothing to worry about.
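
For readers unfamiliar with memory mapping, here is a minimal sketch using Python's standard mmap module. It is not Prometheus code; the helper name and the demo file are made up for illustration. The point is that mapping a file reserves virtual address space for the whole file, while only the pages actually touched are brought into the operating system cache.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, offset, length):
    """Read `length` bytes at `offset` through a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file into the process's virtual memory.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies only the requested bytes out of the mapping.
            return mapped[offset:offset + length]


# Demo file: 16 KiB made of the byte values 0..255 repeated.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)
    demo_path = tmp.name

chunk = read_slice_mmap(demo_path, 4096, 4)
os.remove(demo_path)
print(chunk)
```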

Another interesting design point is the use of a WAL (write-ahead log). As the storage documentation explains, Prometheus uses the WAL to avoid losing data in a crash. Unfortunately, the exact durability guarantees are not well documented. Prometheus 2.3.2 flushes the WAL to disk every 10 seconds, and this interval is not user-configurable.
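
The general write-ahead-log idea can be sketched as follows. This is a toy in-memory illustration, not the Prometheus WAL format: the record layout (length plus CRC32 header) and the function names are assumptions of mine. Each record carries a checksum, and replay stops at the first record that is torn or corrupt, so everything after that point is discarded.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one checksummed record to the log buffer."""
    log.extend(HEADER.pack(len(payload), zlib.crc32(payload)))
    log.extend(payload)


def replay(log: bytes):
    """Return the valid records and the length of the valid prefix."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupt record: truncate the log here
        records.append(payload)
        pos = start + length
    return records, pos


wal = bytearray()
for name in (b"sample-1", b"sample-2", b"sample-3"):
    append_record(wal, name)

wal[-1] ^= 0xFF  # simulate a crash that damaged the last record
records, valid_len = replay(wal)
print(records, valid_len)
```

With a periodic flush, anything written since the last flush can be lost in a crash, which is why the flush interval matters for durability.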

Compactions

The Prometheus TSDB is designed much like an LSM (log-structured merge) store: the head block is flushed to disk periodically, while a compaction mechanism merges several blocks together so that queries do not have to scan too many blocks. I tracked the number of blocks on the test system over a day of load.
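
The merging step can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the Prometheus implementation: the Block class and the compact function are hypothetical, and real blocks hold compressed chunks and an index rather than plain lists.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    min_time: int
    max_time: int
    level: int = 1
    # series name -> list of (timestamp, value) pairs
    samples: dict = field(default_factory=dict)


def compact(blocks):
    """Merge adjacent blocks into one block covering their combined range."""
    ordered = sorted(blocks, key=lambda b: b.min_time)
    merged = {}
    for block in ordered:
        for series, points in block.samples.items():
            merged.setdefault(series, []).extend(points)
    for points in merged.values():
        points.sort(key=lambda p: p[0])  # keep each series ordered by time
    return Block(
        min_time=ordered[0].min_time,
        max_time=ordered[-1].max_time,
        level=max(b.level for b in ordered) + 1,
        samples=merged,
    )


HOUR = 3_600_000  # milliseconds
small_blocks = [
    Block(4 * HOUR, 6 * HOUR, samples={"up": [(4 * HOUR, 1.0)]}),
    Block(0, 2 * HOUR, samples={"up": [(0, 1.0)]}),
    Block(2 * HOUR, 4 * HOUR, samples={"up": [(2 * HOUR, 0.0)]}),
]
big = compact(small_blocks)
print(big.min_time, big.max_time, big.level, big.samples["up"])
```

A query over the six-hour range now touches one block instead of three, at the cost of rewriting all of the data once.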

If you want to learn more about the storage, you can inspect the meta.json file, which describes the blocks you have and how they came about.

{
       "ulid": "01CPZDPD1D9R019JS87TPV5MPE",
       "minTime": 1536472800000,
       "maxTime": 1536494400000,
       "stats": {
               "numSamples": 8292128378,
               "numSeries": 1673622,
               "numChunks": 69528220
       },
       "compaction": {
               "level": 2,
               "sources": [
                       "01CPYRY9MS465Y5ETM3SXFBV7X",
                       "01CPYZT0WRJ1JB1P0DP80VY5KJ",
                       "01CPZ6NR4Q3PDP3E57HEH760XS"
               ],
               "parents": [
                       {
                               "ulid": "01CPYRY9MS465Y5ETM3SXFBV7X",
                               "minTime": 1536472800000,
                               "maxTime": 1536480000000
                       },
                       {
                               "ulid": "01CPYZT0WRJ1JB1P0DP80VY5KJ",
                               "minTime": 1536480000000,
                               "maxTime": 1536487200000
                       },
                       {
                               "ulid": "01CPZ6NR4Q3PDP3E57HEH760XS",
                               "minTime": 1536487200000,
                               "maxTime": 1536494400000
                       }
               ]
       },
       "version": 1
}
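
A few derived numbers make this file easier to read. The Python sketch below parses a trimmed copy of the meta.json shown above (the "sources" list and "version" field are omitted for brevity) and computes the block length, the length of each parent block, and some averages. The helper name hours is mine.

```python
import json

META_JSON = """
{
  "ulid": "01CPZDPD1D9R019JS87TPV5MPE",
  "minTime": 1536472800000,
  "maxTime": 1536494400000,
  "stats": {"numSamples": 8292128378, "numSeries": 1673622, "numChunks": 69528220},
  "compaction": {
    "level": 2,
    "parents": [
      {"ulid": "01CPYRY9MS465Y5ETM3SXFBV7X", "minTime": 1536472800000, "maxTime": 1536480000000},
      {"ulid": "01CPYZT0WRJ1JB1P0DP80VY5KJ", "minTime": 1536480000000, "maxTime": 1536487200000},
      {"ulid": "01CPZ6NR4Q3PDP3E57HEH760XS", "minTime": 1536487200000, "maxTime": 1536494400000}
    ]
  }
}
"""

MS_PER_HOUR = 3_600_000


def hours(block):
    """Length of a block's time range in hours."""
    return (block["maxTime"] - block["minTime"]) / MS_PER_HOUR


meta = json.loads(META_JSON)
stats = meta["stats"]

block_hours = hours(meta)
parent_hours = [hours(p) for p in meta["compaction"]["parents"]]
samples_per_chunk = stats["numSamples"] / stats["numChunks"]
samples_per_series = stats["numSamples"] / stats["numSeries"]
ingest_rate = stats["numSamples"] / (block_hours * 3600)

print(f"block covers {block_hours:.0f} h, parents cover {parent_hours} h")
print(f"{samples_per_chunk:.0f} samples per chunk")
print(f"{samples_per_series:.0f} samples per series")
print(f"{ingest_rate:,.0f} samples per second on average")
```

The block is a level-2 compaction of three two-hour parents, and the average ingest rate it implies is close to the 380 thousand samples per second quoted for the benchmark.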

Compactions in Prometheus are tied to the moment the head block is flushed to disk. Several compactions may be performed at that point.

Compactions do not appear to be throttled in any way, and they can cause large disk I/O spikes while they run.

They also cause spikes in CPU load.

This, of course, hurts system performance, and it is also a central challenge for LSM storage: how do you run compactions so that they sustain high query rates without causing too much overhead?
Memory usage during compaction is also quite interesting.

After a compaction, most of the memory changes state from Cached to Free, which means potentially valuable data has been evicted from the cache. I wonder whether posix_fadvise() or some other technique for minimizing cache washout is in use here, or whether this happens because the cached blocks are destroyed during compaction.

Crash Recovery

Recovery after a crash takes time, and for good reason. With an ingest rate of about one million samples per second, I had to wait about 25 minutes for recovery to complete on SSD storage.

level=info ts=2018-09-13T13:38:14.09650965Z caller=main.go:222 msg="Starting Prometheus" version="(version=2.3.2, branch=v2.3.2, revision=71af5e29e815795e9dd14742ee7725682fa14b7b)"
level=info ts=2018-09-13T13:38:14.096599879Z caller=main.go:223 build_context="(go=go1.10.1, user=Jenkins, date=20180725-08:58:13OURCE)"
level=info ts=2018-09-13T13:38:14.096624109Z caller=main.go:224 host_details="(Linux 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 1bee9e9b78cf (none))"
level=info ts=2018-09-13T13:38:14.096641396Z caller=main.go:225 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-09-13T13:38:14.097715256Z caller=web.go:415 component=web msg="Start listening for connections" address=:9090
level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."
level=info ts=2018-09-13T13:38:14.098718401Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536530400000 maxt=1536537600000 ulid=01CQ0FW3ME8Q5W2AN5F9CB7R0R
level=info ts=2018-09-13T13:38:14.100315658Z caller=web.go:467 component=web msg="router prefix" prefix=/prometheus
level=info ts=2018-09-13T13:38:14.101793727Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536732000000 maxt=1536753600000 ulid=01CQ78486TNX5QZTBF049PQHSM
level=info ts=2018-09-13T13:38:14.102267346Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536537600000 maxt=1536732000000 ulid=01CQ78DE7HSQK0C0F5AZ46YGF0
level=info ts=2018-09-13T13:38:14.102660295Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536775200000 maxt=1536782400000 ulid=01CQ7SAT4RM21Y0PT5GNSS146Q
level=info ts=2018-09-13T13:38:14.103075885Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536753600000 maxt=1536775200000 ulid=01CQ7SV8WJ3C2W5S3RTAHC2GHB
level=error ts=2018-09-13T14:05:18.208469169Z caller=wal.go:275 component=tsdb msg="WAL corruption detected; truncating" err="unexpected CRC32 checksum d0465484, want 0" file=/opt/prometheus/data/.prom2-data/wal/007357 pos=15504363
level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"
level=info ts=2018-09-13T14:05:19.471604598Z caller=main.go:603 msg="Loading configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499156711Z caller=main.go:629 msg="Completed loading of configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499228186Z caller=main.go:502 msg="Server is ready to receive web requests."

The main problem with the recovery process is its high memory consumption. Even though the server can run stably with a given amount of memory in normal operation, after a crash it may fail to recover because of OOM. The only solution I found was to disable scraping, bring the server up, let it recover, and then restart it with scraping enabled.
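
The recovery time can be read directly from the startup log above by comparing the timestamps of the "Starting TSDB ..." and "TSDB started" lines. Here is a small Python sketch that does this; parse_ts is a hypothetical helper, and it drops the fractional seconds for simplicity.

```python
from datetime import datetime

START_LINE = ('level=info ts=2018-09-13T13:38:14.097400393Z '
              'caller=main.go:533 msg="Starting TSDB ..."')
READY_LINE = ('level=info ts=2018-09-13T14:05:19.471459777Z '
              'caller=main.go:543 msg="TSDB started"')


def parse_ts(line: str) -> datetime:
    """Extract the ts= field of a log line, truncated to whole seconds."""
    for part in line.split():
        if part.startswith("ts="):
            # Skip "ts=" and keep the 19 characters of YYYY-MM-DDTHH:MM:SS.
            return datetime.strptime(part[3:22], "%Y-%m-%dT%H:%M:%S")
    raise ValueError("no ts= field in log line")


elapsed = parse_ts(READY_LINE) - parse_ts(START_LINE)
print(f"TSDB startup took {elapsed.total_seconds() / 60:.1f} minutes")
```

In this particular log, startup took a little under half an hour.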

Warmup

Another behavior to keep in mind is warmup: lower performance combined with higher resource consumption right after startup. On some, but not all, starts I observed significantly higher CPU and memory load.

The gaps in the memory usage graph indicate that Prometheus cannot perform all configured scrapes right after startup, so some data is lost.
I have not investigated the exact cause of the high CPU and memory load. I suspect it comes from new time series being created in the head block at a high rate.

CPU Load Spikes

Besides compactions, which create a fairly high I/O load, I noticed significant spikes in CPU load every two minutes. The spikes last longer when the ingest rate is higher, and they appear to be caused by the Go garbage collector, with at least some cores fully saturated.

These spikes are not merely cosmetic. When they occur, Prometheus's internal /metrics endpoint appears to become unavailable, which causes data gaps during exactly those periods.
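
One way to spot such gaps in your own data is to look for scrape timestamps that are further apart than the scrape interval. The sketch below is a generic illustration: find_gaps is a hypothetical helper, and the timestamps are made up rather than taken from this benchmark.

```python
def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (previous, current) pairs that are further apart than
    expected_interval * tolerance. Timestamps must be sorted."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > expected_interval * tolerance:
            gaps.append((prev, cur))
    return gaps


# Hypothetical scrape times in seconds, with a one-second scrape interval.
scrape_times = [0, 1, 2, 3, 7, 8, 9]
gaps = find_gaps(scrape_times, expected_interval=1)
print(gaps)
```

Lining up the reported gaps with CPU graphs is a quick way to check whether missed scrapes coincide with load spikes.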

The Prometheus exporter also hits its one-second timeout during these periods.

These events correlate with garbage collection (GC).

Conclusion

The TSDB in Prometheus 2 is fast: it can handle millions of time series and, at the same time, hundreds of thousands of samples per second on fairly modest hardware. CPU and disk I/O utilization are also impressive. In my tests it reached up to 200,000 metrics per second per CPU core used.

When planning capacity, remember to provide enough memory, and it must be real memory. The memory usage I observed was about 5 GB per 100,000 samples per second of ingest, and the operating system cache needs additional memory on top of that.
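
As a rough planning aid, the rule of thumb can be turned into a one-line estimate. The sketch below reads the observation as about 5 GB per 100,000 samples per second and assumes memory scales linearly with ingest rate, which is my simplification; it does not include the operating system cache, and the function name is hypothetical.

```python
GB_PER_100K_SAMPLES_PER_SEC = 5.0  # rule of thumb observed in this benchmark


def estimate_memory_gb(samples_per_sec):
    """Rough Prometheus memory estimate, excluding the OS cache."""
    return samples_per_sec / 100_000 * GB_PER_100K_SAMPLES_PER_SEC


needed = estimate_memory_gb(380_000)
print(f"about {needed:.0f} GB for 380 thousand samples per second")
```

For the benchmark's ingest rate this gives a figure that fits within the 32 GB test server while leaving room for the operating system cache.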

Of course, there is still a lot of work to do to tame the CPU and disk I/O spikes. That is not surprising given how young the Prometheus 2 TSDB is compared with InnoDB, TokuDB, RocksDB, or WiredTiger, all of which had similar problems early in their lives.

Source: www.habr.com
