
The time series database (TSDB) in Prometheus 2 is an excellent piece of engineering that delivers major improvements over the v2 storage in Prometheus 1 in ingest speed, query performance, and resource efficiency. We have been adopting Prometheus 2 in Percona Monitoring and Management (PMM), and I had the chance to study the performance of the Prometheus 2 TSDB. In this article I share the results of those observations.
The typical Prometheus workload
For those used to working with general-purpose databases, the typical Prometheus workload is quite interesting. The ingest rate tends to be stable: the services you monitor usually send about the same number of metrics, and the infrastructure changes relatively slowly.
Queries can come from several sources. Some, such as alerting, also tend to be stable and predictable. Others, such as user requests, can cause spikes, although this is not typical for most workloads.
Load testing
During testing I focused on ingest capacity. I deployed Prometheus 2.3.2 compiled with Go 1.10.1 (as part of PMM 1.14) on a Linode instance using a script. To generate the most realistic load, I also used a script to start several MySQL nodes running a real workload (the Sysbench TPC-C test), each of which emulated 10 Linux/MySQL nodes.
All of the tests below were run on a Linode server with eight cores and 32 GB of memory, with 20 load simulators emulating the monitoring of 200 MySQL instances. In Prometheus terms, that is 800 targets, 440 scrapes per second, 380 thousand samples per second, and 1.7 million active time series.
Design
The usual approach of traditional databases, including the one used by Prometheus 1.x, is to limit memory use. If that limit is not enough to handle the load, you get high latency and some requests fail. Memory use in Prometheus 2 is instead governed by the storage.tsdb.min-block-duration setting, which determines how long samples are kept in memory before they are flushed to disk (the default is 2 hours). The amount of memory required depends on the number of time series, the number of labels, and the scrape frequency, on top of the raw ingest rate. On disk, Prometheus tends to use about 3 bytes per sample. Memory requirements, by contrast, are considerably higher.
Although the block size can be tuned, doing so by hand is discouraged, so in practice you have to give Prometheus as much memory as your workload demands.
If there is not enough memory to sustain the incoming metric stream, Prometheus will crash with an out-of-memory error or be killed by the OOM killer.
Adding swap to delay the crash when Prometheus runs out of memory does not really help, because using swap makes memory consumption explode. I suspect this has to do with Go, its garbage collector, and the way it interacts with swap.
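As a rough illustration of the on-disk figure mentioned above, the sketch below turns an ingest rate and a retention period into a disk estimate. The function name, the 15-day retention, and the rate of 100,000 samples per second are assumptions chosen for the example; only the figure of roughly 3 bytes per sample comes from the text, and real usage will vary with the data.

```python
def estimate_disk_bytes(samples_per_sec, retention_days, bytes_per_sample=3):
    """Rough on-disk size for a steady ingest rate (illustrative only).

    bytes_per_sample defaults to the ~3 bytes per sample quoted in the
    article; the other inputs are whatever your own setup uses.
    """
    seconds = retention_days * 24 * 60 * 60
    return samples_per_sec * seconds * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical workload: 100,000 samples/s kept for 15 days.
    size = estimate_disk_bytes(samples_per_sec=100_000, retention_days=15)
    print(f"{size / 1e9:.1f} GB")
```

This says nothing about memory, which the article notes is much more demanding than disk.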
Another interesting design decision is that the head block is flushed to disk at fixed clock times rather than at intervals counted from process start.

As the graph shows, flushes to disk happen every two hours. If you change the min-block-duration parameter to one hour, the flushes happen every hour, at 30 minutes past the hour.
If you want to see this and other graphs for your own Prometheus installation, you can use the same dashboard. It was designed for PMM, but with minor changes it works with any Prometheus installation.
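The clock alignment of block flushes described above can be sketched in a few lines. Block windows are assumed to start at multiples of the block duration since the Unix epoch, which matches the minTime and maxTime values in the meta.json sample in this article. The rule that the head is flushed once it spans one and a half block ranges is an assumption used here to reproduce the "30 minutes past the hour" behavior, and the function names are hypothetical.

```python
from datetime import datetime, timezone

HOUR_MS = 60 * 60 * 1000


def block_start(ts_ms, block_ms):
    """Start of the clock-aligned block window that contains ts_ms."""
    return ts_ms - ts_ms % block_ms


def assumed_flush_time(ts_ms, block_ms):
    """ASSUMPTION: the head is cut once it spans 1.5x the block range."""
    return block_start(ts_ms, block_ms) + block_ms * 3 // 2


def fmt(ts_ms):
    """Format a millisecond timestamp as HH:MM in UTC."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%H:%M")


if __name__ == "__main__":
    ts = 1536475000000  # arbitrary timestamp on 2018-09-09 (UTC)
    for hours in (2, 1):
        rng = hours * HOUR_MS
        print(f"{hours}h blocks: window starts {fmt(block_start(ts, rng))}, "
              f"assumed flush at {fmt(assumed_flush_time(ts, rng))}")
```

Because the windows depend only on the wall clock, restarting the process does not shift the flush schedule.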
The active block, called the head block, is kept in memory; blocks with older data are accessed through mmap(). This removes the need to configure a cache separately, but it also means you need to leave enough room for the operating system cache if you want to query data older than what the head block holds.
It also means that the virtual memory use of Prometheus will look very high, which is nothing to worry about.
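For readers who have not used mmap() directly, here is a minimal sketch of the access pattern, written with Python's standard mmap module rather than anything from Prometheus itself. The file contents and the helper name are made up for the example. The point is that the process maps the whole file into its address space, while only the pages actually touched are brought in through the operating system page cache.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Map a file read-only and copy out one slice of it."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]


if __name__ == "__main__":
    # Stand-in for an on-disk block file (hypothetical contents).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"old-block-data" * 1000)
        path = tmp.name
    try:
        print(read_slice_via_mmap(path, 4, 5))
    finally:
        os.remove(path)
```

The mapped size counts toward virtual memory even when very little of it is resident, which is why the virtual memory figure looks inflated.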

Another interesting design point is the use of a WAL (write-ahead log). As the storage documentation shows, Prometheus uses a WAL to avoid data loss in a crash. The exact durability guarantees are, unfortunately, not well documented. Prometheus 2.3.2 flushes the WAL to disk every 10 seconds, and this parameter is not user-configurable.
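To make the idea of a write-ahead log concrete, here is a toy sketch. It is not the Prometheus WAL format: the record layout (payload length, CRC32, payload) and the function names are assumptions for illustration. It shows the general pattern of checksummed records and a replay that stops at the first damaged record, so that everything before it can still be recovered.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame one payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes):
    """Return (records, valid_length), stopping at the first bad record."""
    records, pos = [], 0
    while pos + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = log[start:start + length]
        # A short read (torn write) or checksum mismatch ends the replay.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        pos = start + length
    return records, pos


if __name__ == "__main__":
    log = b"".join(encode_record(p) for p in (b"a=1", b"b=2", b"c=3"))
    damaged = log[:-1] + b"X"  # corrupt the last byte of the last record
    records, valid = replay(damaged)
    print(records, "truncate log to", valid, "bytes")
```

Anything written after the last intact record is lost, which is why the flush interval matters for durability.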
Compactions
The Prometheus TSDB is designed like a log-structured merge (LSM) store: the head block is flushed to disk periodically, while a compaction mechanism merges several blocks together so that queries do not have to scan too many blocks. Here is the number of blocks I observed on the test system after a day of load.

If you want to learn more about the storage, you can look at the meta.json file, which describes the existing blocks and how they came about.
{
  "ulid": "01CPZDPD1D9R019JS87TPV5MPE",
  "minTime": 1536472800000,
  "maxTime": 1536494400000,
  "stats": {
    "numSamples": 8292128378,
    "numSeries": 1673622,
    "numChunks": 69528220
  },
  "compaction": {
    "level": 2,
    "sources": [
      "01CPYRY9MS465Y5ETM3SXFBV7X",
      "01CPYZT0WRJ1JB1P0DP80VY5KJ",
      "01CPZ6NR4Q3PDP3E57HEH760XS"
    ],
    "parents": [
      {
        "ulid": "01CPYRY9MS465Y5ETM3SXFBV7X",
        "minTime": 1536472800000,
        "maxTime": 1536480000000
      },
      {
        "ulid": "01CPYZT0WRJ1JB1P0DP80VY5KJ",
        "minTime": 1536480000000,
        "maxTime": 1536487200000
      },
      {
        "ulid": "01CPZ6NR4Q3PDP3E57HEH760XS",
        "minTime": 1536487200000,
        "maxTime": 1536494400000
      }
    ]
  },
  "version": 1
}
Compactions in Prometheus are tied to the moment the head block is flushed to disk. Several of them may run at that point.
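The meta.json sample above can be summarized with a few lines of code. The sketch below copies only the fields it needs from that sample (the ULIDs are left out for brevity) and derives the block span, the average ingest rate over the block, the average samples per chunk, and whether the parent blocks line up end to end. The function name and the choice of derived figures are mine, not part of Prometheus.

```python
import json

# Fields copied from the meta.json sample in the article (ULIDs omitted).
META_JSON = """
{
  "minTime": 1536472800000,
  "maxTime": 1536494400000,
  "stats": {"numSamples": 8292128378, "numSeries": 1673622, "numChunks": 69528220},
  "compaction": {
    "level": 2,
    "parents": [
      {"minTime": 1536472800000, "maxTime": 1536480000000},
      {"minTime": 1536480000000, "maxTime": 1536487200000},
      {"minTime": 1536487200000, "maxTime": 1536494400000}
    ]
  }
}
"""


def summarize(meta):
    """Derive a few headline numbers from one block's metadata."""
    span_s = (meta["maxTime"] - meta["minTime"]) / 1000  # times are in ms
    stats = meta["stats"]
    parents = meta["compaction"].get("parents", [])
    # Each parent should end exactly where the next one begins.
    contiguous = all(a["maxTime"] == b["minTime"]
                     for a, b in zip(parents, parents[1:]))
    return {
        "hours": span_s / 3600,
        "samples_per_sec": stats["numSamples"] / span_s,
        "samples_per_chunk": stats["numSamples"] / stats["numChunks"],
        "parents_contiguous": contiguous,
    }


if __name__ == "__main__":
    for key, value in summarize(json.loads(META_JSON)).items():
        print(key, value)
```

For the block shown, this gives a six-hour span built from three contiguous two-hour parents, at roughly 384 thousand samples per second and about 119 samples per chunk.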

Compactions do not appear to be throttled in any way and can cause large disk I/O spikes while they run.

CPU load spikes as well:

Naturally, this has a rather negative effect on system performance, and it is a serious challenge for LSM storage in general: how do you run compactions so as to sustain high query rates without causing too much overhead?
Memory use during compaction also looks quite interesting.

We can see that after compaction a large share of memory changes state from Cached to Free, which means potentially valuable data has been evicted. I wonder whether fadvise() or some other technique for minimizing cache washout is used here, or whether this is caused by the cache being freed from blocks destroyed during compaction.
Crash recovery
Recovery after a crash takes time, and for good reason. With an incoming stream of about a million samples per second, I had to wait about 25 minutes for recovery, even on an SSD.
level=info ts=2018-09-13T13:38:14.09650965Z caller=main.go:222 msg="Starting Prometheus" version="(version=2.3.2, branch=v2.3.2, revision=71af5e29e815795e9dd14742ee7725682fa14b7b)"
level=info ts=2018-09-13T13:38:14.096599879Z caller=main.go:224 build_context="(go=go1.10.1, user=Jenkins, date=20180725-08:58:13)"
level=info ts=2018-09-13T13:38:14.096624109Z caller=main.go:224 host_details="(Linux 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 1bee9e9b78cf (none))"
level=info ts=2018-09-13T13:38:14.096641396Z caller=main.go:225 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-09-13T13:38:14.097715256Z caller=web.go:415 component=web msg="Start listening for connections" address=:9090
level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."
level=info ts=2018-09-13T13:38:14.098718401Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536530400000 maxt=1536537600000 ulid=01CQ0FW3ME8Q5W2AN5F9CB7R0R
level=info ts=2018-09-13T13:38:14.100315658Z caller=web.go:467 component=web msg="router prefix" prefix=/prometheus
level=info ts=2018-09-13T13:38:14.101793727Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536732000000 maxt=1536753600000 ulid=01CQ78486TNX5QZTBF049PQHSM
level=info ts=2018-09-13T13:38:14.102267346Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536537600000 maxt=1536732000000 ulid=01CQ78DE7HSQK0C0F5AZ46YGF0
level=info ts=2018-09-13T13:38:14.102660295Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536775200000 maxt=1536782400000 ulid=01CQ7SAT4RM21Y0PT5GNSS146Q
level=info ts=2018-09-13T13:38:14.103075885Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536753600000 maxt=1536775200000 ulid=01CQ7SV8WJ3C2W5S3RTAHC2GHB
level=error ts=2018-09-13T14:05:18.208469169Z caller=wal.go:275 component=tsdb msg="WAL corruption detected; truncating" err="unexpected CRC32 checksum d0465484, want 0" file=/opt/prometheus/data/.prom2-data/wal/007357 pos=15504363
level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"
level=info ts=2018-09-13T14:05:19.471604598Z caller=main.go:603 msg="Loading configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499156711Z caller=main.go:629 msg="Completed loading of configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499228186Z caller=main.go:502 msg="Server is ready to receive web requests."
The main problem with the recovery process is its high memory consumption. Although the server may normally run stably with the same amount of memory, after a crash it may fail to recover because of OOM. The only solution I found was to disable data collection, bring the server up, let it recover, and then restart it with collection enabled.
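The recovery time can be read directly from a log like the one above by comparing the timestamps of the "Starting TSDB ..." and "TSDB started" messages. The sketch below does that for the two relevant lines, copied from the excerpt. The helper names are hypothetical, and the parsing assumes the `ts=` format shown in this log, with fractional seconds truncated to microseconds.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches ts=YYYY-MM-DDTHH:MM:SS[.fraction]Z as it appears in the excerpt.
TS_RE = re.compile(r"ts=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_ts(line):
    """Extract the ts= field as an aware UTC datetime, or None if absent."""
    m = TS_RE.search(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    # Keep at most microsecond precision; the log has up to nanoseconds.
    frac = (m.group(2) or "")[:6].ljust(6, "0")
    return base + timedelta(microseconds=int(frac))


def tsdb_startup_seconds(lines):
    """Seconds between the 'Starting TSDB' and 'TSDB started' messages."""
    start = end = None
    for line in lines:
        if 'msg="Starting TSDB' in line:
            start = parse_ts(line)
        elif 'msg="TSDB started"' in line:
            end = parse_ts(line)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


LOG = [
    'level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."',
    'level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"',
]

if __name__ == "__main__":
    secs = tsdb_startup_seconds(LOG)
    print(f"{secs:.0f} s, about {secs / 60:.0f} minutes")
```

On these two lines it reports about 1,625 seconds, a little over 27 minutes, which is in the same range as the recovery time quoted in the text.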
Warm-up
Another behavior to keep in mind is warm-up: a lower ratio of performance to resource use right after startup. I saw significant CPU and memory load during some, but not all, starts.


The gaps in the memory usage graph show that Prometheus cannot perform all of its scrapes right from the start, and some data is lost.
I have not pinned down the exact causes of the high CPU and memory load. I suspect it is related to new time series being created in the head block at a high rate.
CPU load spikes
Besides compactions, which create high I/O load, I saw serious CPU load spikes every two minutes. The spikes last longer when the ingest rate is higher, and they appear to be caused by Go's garbage collector; at least some cores are fully saturated.


These spikes are not insignificant. It appears that while they occur, the internal metrics endpoint of Prometheus becomes unavailable, which causes gaps in the data over the same intervals.

You can also see the Prometheus exporter running into a one-second timeout.

We can see that this correlates with garbage collection (GC).
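Data gaps like the ones described in this section can be located by looking for consecutive samples that are further apart than the scrape interval allows. The sketch below is a generic helper, not something taken from Prometheus or PMM; the function name, the tolerance factor of 1.5, and the example timestamps are assumptions.

```python
def find_gaps(timestamps, scrape_interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are too far apart.

    timestamps and scrape_interval must use the same unit (e.g. seconds).
    A gap is reported when the spacing exceeds scrape_interval * tolerance.
    """
    limit = scrape_interval * tolerance
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


if __name__ == "__main__":
    # Hypothetical series scraped every second with a few missed scrapes.
    samples = [0, 1, 2, 3, 7, 8, 9]
    print(find_gaps(samples, scrape_interval=1))
```

Comparing the reported intervals with the timing of CPU spikes is one way to check the correlation the graphs suggest.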

Conclusion
The TSDB in Prometheus 2 is fast, able to handle millions of time series while ingesting hundreds of thousands of samples per second on fairly modest hardware. CPU and disk I/O utilization are impressive as well. My test showed up to 200,000 metrics per second per CPU core used.
When planning capacity, remember to provision enough memory, and it has to be real RAM. The memory use I observed was about 5 GB per 100,000 samples per second of incoming traffic, which together with the operating system cache came to about 8 GB of occupied memory.
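For a first-pass sizing exercise, the memory observation above can be turned into a simple calculation. This is only a sketch: it takes the figure as roughly 5 GB per 100,000 samples per second and assumes that memory scales linearly with the ingest rate, which the article does not claim. The function name is hypothetical, and the result excludes whatever you reserve for the operating system cache.

```python
def estimate_head_memory_gb(samples_per_sec, gb_per_100k=5.0):
    """Scale the observed ~5 GB per 100,000 samples/s linearly.

    ASSUMPTION: linear scaling. Actual use also depends on the number of
    series, labels, and scrape frequency, as noted in the article.
    """
    return samples_per_sec / 100_000 * gb_per_100k


if __name__ == "__main__":
    for rate in (100_000, 200_000, 400_000):
        print(f"{rate} samples/s: about {estimate_head_memory_gb(rate):.0f} GB")
```

Treat the output as a starting point and leave generous headroom, since the article shows that crash recovery and warm-up can need more memory than steady-state operation.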
Of course, a lot of work remains to tame the CPU and disk I/O spikes, which is not surprising given how young the Prometheus 2 TSDB is compared with InnoDB, TokuDB, RocksDB, and WiredTiger, all of which had similar problems early in their life cycles.
Source: www.habr.com
