Go optimizations in VictoriaMetrics. Alexander Valyalkin

I suggest reading this transcript of Alexander Valyalkin's late-2019 talk, "Go optimizations in VictoriaMetrics".

VictoriaMetrics is a fast and scalable DBMS for storing and processing data in the form of time series (a record consists of a time and a set of values corresponding to that time, obtained, for example, by periodically polling the state of sensors or by collecting metrics).

Here is the link to the video of the talk: https://youtu.be/MZ5P21j_HLE

Slides

Tell us about yourself. I am Alexander Valyalkin. Here is my GitHub account. I am passionate about Go and performance optimization. I have written many useful and not-so-useful libraries. Their names start with either the fast or the quick prefix.

I currently work on VictoriaMetrics. What is it and what do I do there? That is what this talk is about.

The outline of the talk is as follows.

  • First, I will tell you what VictoriaMetrics is.
  • Then I will tell you what time series are.
  • Then I will tell you how a time series database works.
  • Next, I will tell you about the database architecture: what it consists of.
  • And then we will move on to the optimizations in VictoriaMetrics: an optimization for the inverted index and an optimization for the bitset implementation in Go.

Does anyone in the audience know what VictoriaMetrics is? Wow, a lot of people already know. That's good news. For those who don't, it is a time series database. It is based on the ClickHouse architecture, on some implementation details of ClickHouse. For example: MergeTree, parallel computation on all available processor cores, and performance optimization by working on blocks of data that fit into the processor cache.

VictoriaMetrics provides better data compression than other time series databases.

It scales vertically: you can add more processors and more RAM to a single computer. VictoriaMetrics will successfully use these available resources, and its performance will grow linearly.

VictoriaMetrics also scales horizontally: you can add more nodes to a VictoriaMetrics cluster, and its performance will grow almost linearly.

As you guessed, VictoriaMetrics is a fast database, because I can't write any other kind. And it is written in Go, which is why I am talking about it at this meetup.

Who knows what a time series is? A lot of people know that too. A time series is a series of (timestamp, value) pairs, sorted by time. The value is a floating-point number, float64.

Each time series is uniquely identified by a key. What does this key consist of? It consists of a non-empty set of key-value pairs.

Here is an example of a time series. The key of this series is a list of pairs: __name__="cpu_usage" is the name of the metric, instance="my-server" is the computer on which this metric is collected, and datacenter="us-east" is the data center where this computer is located.

We end up with a time series name consisting of three key-value pairs. This key corresponds to a list of (timestamp, value) pairs. t1, t2, t3, ..., tN are the timestamps, and 10, 20, 12, ..., 15 are the corresponding values. This is the cpu usage at a given moment for the given series.
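
The data model described above could be sketched in Go roughly as follows. The type names Label, Sample and TimeSeries, the helper exampleSeries, and the integer timestamps standing in for t1, t2, t3 are illustrative assumptions, not types taken from VictoriaMetrics.

```go
package main

import "fmt"

// Label is one key-value pair of the series key (hypothetical type).
type Label struct {
	Name, Value string
}

// Sample is a single (timestamp, value) pair.
type Sample struct {
	Timestamp int64   // placeholder time unit, an assumption of this sketch
	Value     float64 // values are float64, as stated in the talk
}

// TimeSeries ties a non-empty label set to samples sorted by time.
type TimeSeries struct {
	Labels  []Label
	Samples []Sample
}

// exampleSeries builds the cpu_usage series from the text.
func exampleSeries() TimeSeries {
	return TimeSeries{
		Labels: []Label{
			{"__name__", "cpu_usage"},
			{"instance", "my-server"},
			{"datacenter", "us-east"},
		},
		Samples: []Sample{{1, 10}, {2, 20}, {3, 12}},
	}
}

func main() {
	ts := exampleSeries()
	fmt.Println(len(ts.Labels), len(ts.Samples))
}
```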

Where can time series be used? Does anyone have any ideas?

  • In DevOps, you can measure CPU, RAM, network, rps, the number of errors, and so on.
  • IoT: we can measure temperature, pressure, geo coordinates and more.
  • Finance: we can monitor prices for all kinds of stocks and currencies.
  • Time series can also be used for monitoring production processes in factories. We have users who use VictoriaMetrics to monitor wind turbines and robots.
  • Time series are also useful for collecting data from the sensors of various devices. For example, for an engine; for measuring tire pressure; for measuring speed and distance; for measuring fuel consumption, and so on.
  • Time series can also be used for monitoring aircraft. Every aircraft has a black box that collects time series for various health parameters of the aircraft. Time series are also used in the aerospace industry.
  • Healthcare: blood pressure, pulse, and so on.

There may be more applications that I have forgotten, but I hope you understand that time series are widely used in the modern world. And their usage grows every year.

Why do you need a time series database? Why can't you use an ordinary relational database to store time series?

Because time series usually contain a large amount of data, which is hard to store and process in conventional databases. That is why specialized databases for time series appeared. These databases efficiently store (timestamp, value) points under a given key. They provide an API for reading the stored data by key, by a single key-value pair, by several key-value pairs, or by regexp. For example, if you want to find the CPU load of all your services in a data center in America, you need to use this pseudo-query.

Time series databases usually provide specialized query languages, because SQL is not well suited to time series. Although there are databases that support SQL, it is not a good fit. Query languages such as PromQL, InfluxQL, Flux and Q are a better fit. I hope everyone has heard of at least one of these languages. Many people have probably heard of PromQL. It is the Prometheus query language.

This is what the architecture of a modern time series database looks like, using VictoriaMetrics as an example.

It consists of two parts: a storage for the inverted index and a storage for time series values. These storages are separate.

When a new record arrives in the database, we first access the inverted index to find the time series identifier for the given set of label=value pairs of the given metric. We find this identifier and store the value in the data storage.

When a request arrives to fetch data from the TSDB, we first go to the inverted index. We get all the timeseries_ids of the records that match the given set of label=value pairs. Then we fetch all the necessary data from the data storage, which is indexed by timeseries_ids.

Let's look at an example of how a time series database processes an incoming select query.

  • First it gets all the timeseries_ids from the inverted index that contain the given label=value pairs or satisfy the given regular expression.
  • Then it retrieves all the data points from the data storage for the given time interval for the found timeseries_ids.
  • After that, the database performs some calculations on these data points, according to the user's request. And then it returns the answer.

In this talk I will cover the first part: searching for timeseries_ids in the inverted index. You can look at the second and third parts later in the VictoriaMetrics source code, or wait until I prepare other talks :)

Let's move on to the inverted index. Many people may think this is something simple. Who knows what an inverted index is and how it works? Oh, not so many people anymore. Let's try to understand what it is.

It is actually simple. It is just a dictionary that maps a key to a value. What is the key? It is a label=value pair, where label and value are strings. And the value is the set of timeseries_ids that contain the given label=value pair.

The inverted index lets you quickly find all the timeseries_ids that have a given label=value.

It also lets you quickly find the timeseries_ids for several label=value pairs, or for label=regexp pairs. How does this happen? By finding the intersection of the sets of timeseries_ids for each label=value pair.

Let's look at different implementations of the inverted index. Let's start with the simplest, naive implementation. It looks like this.

The getMetricIDs function receives a list of strings. Each string contains label=value. The function returns a list of metricIDs.

How does it work? Here we have a global variable called invertedIndex. It is an ordinary dictionary (map) that maps a string to a slice of ints. The string contains label=value.

The implementation: we get the metricIDs for the first label=value, then we go through every remaining label=value and get the metricIDs for it. And we call the intersectInts function, which will be discussed below. This function returns the intersection of these lists.
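
A minimal sketch of this naive implementation, reconstructed from the description rather than copied from the slide. The sample index contents are made up, and the nested-loop intersectInts here is only a placeholder so that the example runs.

```go
package main

import "fmt"

// invertedIndex maps a label=value string to the ids of the series carrying it.
// It is a global variable, as in the talk; the contents are invented sample data.
var invertedIndex = map[string][]int{
	`__name__="cpu_usage"`: {1, 2, 3, 4},
	`datacenter="us-east"`: {2, 3, 5},
	`instance="my-server"`: {3, 4, 5},
}

// intersectInts is a placeholder: a plain nested-loop intersection
// that assumes b holds no duplicates.
func intersectInts(a, b []int) []int {
	var result []int
	for _, x := range a {
		for _, y := range b {
			if x == y {
				result = append(result, x)
				break
			}
		}
	}
	return result
}

// getMetricIDs returns the ids of series that carry every given label=value.
func getMetricIDs(labelValues []string) []int {
	if len(labelValues) == 0 {
		return nil
	}
	metricIDs := invertedIndex[labelValues[0]]
	for _, lv := range labelValues[1:] {
		metricIDs = intersectInts(metricIDs, invertedIndex[lv])
	}
	return metricIDs
}

func main() {
	fmt.Println(getMetricIDs([]string{`__name__="cpu_usage"`, `datacenter="us-east"`}))
}
```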

As you can see, implementing an inverted index is not very hard. But this is a naive implementation. What drawbacks does it have? The main drawback of the naive implementation is that such an inverted index is stored in RAM. After restarting the application we lose this index. The index is never saved to disk. Such an inverted index is unlikely to be suitable for a database.

The second drawback is also related to memory. The inverted index has to fit into RAM. If it exceeds the size of RAM, we will obviously get an out-of-memory error. And the program will not work.

This problem can be solved with ready-made solutions such as LevelDB or RocksDB.

In short, we need a database that lets us perform three operations quickly.

  • The first operation is writing a key-value pair to this database. It does this very quickly, where key and value are arbitrary strings.
  • The second operation is a fast lookup of a value by a given key.
  • And the third operation is a fast search for all values by a given prefix.

LevelDB and RocksDB were developed by Google and Facebook. LevelDB came first. Then the people at Facebook took LevelDB and started improving it, producing RocksDB. Now almost all internal databases at Facebook run on RocksDB, including MySQL, which was ported to RocksDB. They named it MyRocks.

An inverted index can be implemented with LevelDB. How? We store label=value as the key. And the value is the identifier of a time series that contains the label=value pair.

If we have many time series with the given label=value pair, there will be many rows in this database with the same key and different timeseries_ids. To get the list of all timeseries_ids whose rows start with this label=prefix, we perform a range scan, which this database is optimized for. That is, we select all the rows that start with label=prefix and get the required timeseries_ids.
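
The range scan can be illustrated without LevelDB itself. The sketch below imitates an ordered key-value store with a sorted in-memory slice; the store type and its put and scanPrefix methods are hypothetical names invented for this example and are not part of any LevelDB API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// row is one entry of a hypothetical ordered key-value store:
// the key is "label=value", the value is a single timeseries_id.
type row struct {
	key string
	id  uint64
}

// store keeps rows sorted by key, imitating the ordering of an LSM store.
type store struct {
	rows []row
}

// put inserts a row while keeping the slice sorted by key.
func (s *store) put(key string, id uint64) {
	i := sort.Search(len(s.rows), func(i int) bool { return s.rows[i].key > key })
	s.rows = append(s.rows, row{})
	copy(s.rows[i+1:], s.rows[i:])
	s.rows[i] = row{key, id}
}

// scanPrefix performs a range scan: it seeks to the first key >= prefix
// and collects ids until the keys stop matching the prefix.
func (s *store) scanPrefix(prefix string) []uint64 {
	var ids []uint64
	i := sort.Search(len(s.rows), func(i int) bool { return s.rows[i].key >= prefix })
	for ; i < len(s.rows) && strings.HasPrefix(s.rows[i].key, prefix); i++ {
		ids = append(ids, s.rows[i].id)
	}
	return ids
}

func main() {
	s := &store{}
	s.put("instance=my-server", 7)
	s.put("datacenter=us-east", 1)
	s.put("datacenter=us-east", 2)
	fmt.Println(s.scanPrefix("datacenter=us-east"))
}
```

Note that every lookup has to walk all matching rows and rebuild the id slice.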

Here is a sample of how this implementation would look in Go. We have an inverted index. It is LevelDB.

The function is the same as in the naive implementation. It repeats the naive implementation almost line by line. The only difference is that instead of accessing a map we access the inverted index. We get all the values for the first label=value. Then we go through all the remaining label=value pairs and get the corresponding sets of metricIDs for them. Then we find the intersection.

Everything seems fine, but this solution has drawbacks. VictoriaMetrics initially implemented its inverted index on top of LevelDB. But in the end I had to abandon it.

Why? Because LevelDB is slower than the naive implementation. In the naive implementation, for a given key we immediately get the whole metricIDs slice. This is a very fast operation: the entire slice is ready to use.

In LevelDB, every time the GetValues function is called, you have to go through all the rows that start with label=value, get the timeseries_id value from each row, and collect a slice from these timeseries_ids. Obviously, this is much slower than simply accessing an ordinary map by key.

The second drawback is that LevelDB is written in C. Calling C functions from Go is not very fast. It takes hundreds of nanoseconds. That is not fast, because compared with an ordinary call of a function written in Go, which takes 1-5 nanoseconds, the difference in performance is tens of times. For VictoriaMetrics this was a fatal flaw :)

So I wrote my own implementation of the inverted index. I called it mergeset.

Mergeset is based on the MergeTree data structure. This data structure is borrowed from ClickHouse. Naturally, mergeset had to be optimized for fast lookup of timeseries_ids by a given key. Mergeset is written entirely in Go. You can see the VictoriaMetrics sources on GitHub. The mergeset implementation is in the /lib/mergeset folder. You can try to figure out what is going on there.

The mergeset API is very similar to that of LevelDB and RocksDB. That is, it lets you quickly store new records there and quickly select records by a given prefix.

We will talk about the drawbacks of mergeset later. Now let's talk about the problems that arose with VictoriaMetrics in production when implementing the inverted index.

Why did they arise?

The first reason is a high churn rate, that is, frequent turnover of time series. This is when one time series ends and a new one begins, or when many new time series begin. And this happens often.

The second reason is a large number of time series. In the beginning, when monitoring was gaining popularity, the number of time series was small. For example, for each computer you need to monitor CPU, memory, network and disk load. That is 4 time series per computer. Say you have 100 computers and 400 time series. That is very little.

Over time, people figured out that they could measure more granular information. For example, measure the load not of the whole processor but of each processor core separately. If you have 40 processor cores, you have 40 times more time series for measuring processor load.

But that is not all. Each processor core can be in several states, such as idle, when it is not doing anything, as well as running in user space, running in kernel space, and other states. Each such state can also be measured as a separate time series. This additionally increases the number of series by 7-8 times.

From one metric we got 40 x 8 = 320 metrics for a single computer. Multiplied by 100, we get 32,000 instead of 400.

Then Kubernetes came along. And things got worse, because Kubernetes can host many different services. Each service in Kubernetes consists of many pods. And all of this needs to be monitored. In addition, new versions of your services are deployed constantly. For each new version, new time series have to be created. As a result, the number of time series grows rapidly and we face the problem of a large number of time series, which is called high cardinality. VictoriaMetrics copes with it successfully compared to other time series databases.

Let's take a closer look at high churn rate. What causes a high churn rate? Some values of labels and tags change constantly.

For example, take Kubernetes, which has the concept of a deployment, i.e. rolling out a new version of your application. For some reason, the Kubernetes developers decided to add the deployment id to a label.

What does this lead to? With each new deployment, all the old time series are interrupted, and new time series start in their place with a new value of the deployment_id label. There can be hundreds of thousands and even millions of such series.

The important thing about all this is that the total number of time series grows, but the number of time series that are currently active and receiving data stays constant. This state is called high churn rate.

The main problem with high churn rate is ensuring a constant search speed across all time series for a given set of labels over a certain time interval. Usually this is the interval of the last hour or the last day.

How can this problem be solved? Here is the first option: split the inverted index into independent parts by time. That is, when some time interval has passed, we finish working with the current inverted index and create a new one. Another time interval passes, and we create another one, and so on.

When querying these inverted indexes, we find the set of inverted indexes that fall into the given interval. And, accordingly, we select the ids of the time series from there.

This saves resources because we do not have to look at the parts that do not fall into the given interval. That is, usually, if we select data for the last hour, we skip the indexes for earlier intervals.
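
One possible way to picture time partitioning is sketched below. The partition type, the lookup function and the use of integer timestamps for the interval bounds are assumptions made for illustration; VictoriaMetrics did not adopt this scheme, as the talk goes on to explain.

```go
package main

import "fmt"

// partition is a hypothetical inverted index covering the interval [start, end).
type partition struct {
	start, end int64
	index      map[string][]uint64 // "label=value" -> timeseries_ids
}

// lookup returns the ids for labelValue from partitions overlapping [from, to).
// Partitions outside the requested interval are skipped entirely.
func lookup(parts []partition, labelValue string, from, to int64) []uint64 {
	seen := make(map[uint64]bool)
	var ids []uint64
	for _, p := range parts {
		if p.end <= from || p.start >= to {
			continue // no overlap with the requested interval
		}
		for _, id := range p.index[labelValue] {
			if !seen[id] {
				seen[id] = true
				ids = append(ids, id)
			}
		}
	}
	return ids
}

func main() {
	parts := []partition{
		{0, 100, map[string][]uint64{"job=api": {1, 2}}},
		{100, 200, map[string][]uint64{"job=api": {2, 3}}},
	}
	// Only the second partition overlaps [150, 180).
	fmt.Println(lookup(parts, "job=api", 150, 180))
}
```

Series 2 is stored in both partitions, which is the duplication that the alternative approach avoids.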

There is another option for solving this problem: storing, for each day, a separate list of the ids of the time series that occurred on that day.

The advantage of this solution over the previous one is that we do not duplicate information about time series that do not disappear over time. They are constantly present and do not change.

The drawback is that such a solution is harder to implement and harder to debug. But VictoriaMetrics chose this solution. That is how it happened historically. This solution also holds up well compared to the previous one. The previous solution was not implemented because it requires duplicating data in every partition for time series that do not change, i.e. that do not disappear over time. VictoriaMetrics was optimized primarily for disk space consumption, and the previous implementation made disk space consumption worse. This implementation is better suited to minimizing disk space consumption, so it was chosen.

I had to fight with it. The struggle was that in this implementation you still have to select a much larger number of timeseries_ids than when the inverted index is partitioned by time.

How did we solve this problem? We solved it in an original way: by storing several time series identifiers in each inverted index entry instead of a single identifier. That is, we have a label=value key that occurs in every time series. And now we store several timeseries_ids in one entry.

Here is an example. Previously we had N entries, but now we have a single entry whose prefix is the same as for all the others. The value of this entry contains all the time series ids.

This made it possible to increase the scanning speed of such an inverted index by up to 10 times. It also allowed us to reduce memory consumption for the cache, because now we store the label=value string in the cache only once instead of N times. And this string can be large if you store long strings in your tags and labels, which Kubernetes likes to push there.

Another option for speeding up search in the inverted index is sharding: creating several inverted indexes instead of one and distributing the data between them by key, where the key is the set of key=value pairs. That is, we get several independent inverted indexes, which we can query in parallel on several processors. The previous implementations only allowed working in single-processor mode, i.e. scanning data on only one core. This solution lets you scan data on several cores at once, as ClickHouse likes to do. This is what we plan to implement.

Now let's get back to the matter at hand: the intersection function for timeseries_ids. Let's consider what implementations are possible. This function lets you find the timeseries_ids for a given set of label=value pairs.

The first option is a naive implementation: two nested loops. Here the intersectInts function receives two slices as input, a and b. As output, it should return the intersection of these slices.

The naive implementation looks like this. We iterate over all the values from slice a, and inside this loop we go through all the values of slice b. And we compare them. If they match, we have found an intersection. And we save it in result.
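
A sketch of the nested-loop variant as described, assuming neither slice contains duplicates. It is reconstructed from the prose, not copied from the slide.

```go
package main

import "fmt"

// intersectInts returns the values present in both a and b
// using two nested loops: O(len(a) * len(b)) comparisons.
// It assumes neither slice contains duplicates.
func intersectInts(a, b []int) []int {
	var result []int
	for _, x := range a {
		for _, y := range b {
			if x == y {
				result = append(result, x)
			}
		}
	}
	return result
}

func main() {
	fmt.Println(intersectInts([]int{1, 2, 3, 4}, []int{3, 4, 5}))
}
```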

What are the drawbacks? Quadratic complexity is its main drawback. For example, if your slices a and b each contain a million items, this function will never return an answer, because it would need to perform a trillion iterations, which is a lot even for modern computers.

The second implementation is based on a map. We create a map. We put all the values from slice a into this map. Then we go through slice b in a separate loop. And we check whether each value from slice b is in the map. If it is, we add it to the result.
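
The map-based variant might look like the sketch below. Using map[int]struct{} as the set type is a choice made for this example; the slide may have used a different value type.

```go
package main

import "fmt"

// intersectInts returns the values of b that also occur in a.
// It runs in roughly len(a) + len(b) steps, at the cost of an extra map.
func intersectInts(a, b []int) []int {
	m := make(map[int]struct{}, len(a))
	for _, x := range a {
		m[x] = struct{}{}
	}
	var result []int
	for _, y := range b {
		if _, ok := m[y]; ok {
			result = append(result, y)
		}
	}
	return result
}

func main() {
	fmt.Println(intersectInts([]int{1, 2, 3, 4}, []int{3, 4, 5}))
}
```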

What is the advantage? The advantage is that the complexity is only linear. That is, the function runs much faster for large slices. For slices of a million items, this function completes in 2 million iterations, as opposed to the trillion iterations of the previous function.

The drawback is that this function requires more memory to build the map.

The second drawback is the large overhead of hashing. This drawback is not very obvious. It was not obvious to us either, so initially the intersection in VictoriaMetrics was implemented with a map. But then profiling showed that most of the processor time was spent writing to the map and checking whether a value was present in it.

Why is CPU time wasted in these places? Because Go performs a hashing operation on these lines. That is, it computes the hash of the key in order to then access the HashMap at the given index. Computing the hash takes tens of nanoseconds. That is slow for VictoriaMetrics.

I decided to implement a bitset optimized specifically for this case. This is what the intersection of two slices looks like now. Here we create a bitset. We add the elements of the first slice to it. Then we check whether the elements of the second slice are present in it. And we add them to the result. That is, it is almost identical to the previous example. The only difference is that we replaced the map accesses with the custom functions add and has.
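
To show the shape of this function, the sketch below uses a deliberately simple flat bitset as a stand-in. The bitset type here is an assumption for illustration only: it allocates one bit per possible value and therefore suits only small ids, unlike the specialized structure in VictoriaMetrics.

```go
package main

import "fmt"

// bitset is a simple stand-in for the specialized set:
// one bit per possible value, so it only suits small ids.
type bitset struct {
	words []uint64
}

// add sets the bit for x, growing the backing slice when needed.
func (s *bitset) add(x uint64) {
	w := int(x / 64)
	for w >= len(s.words) {
		s.words = append(s.words, 0)
	}
	s.words[w] |= uint64(1) << (x % 64)
}

// has reports whether the bit for x is set.
func (s *bitset) has(x uint64) bool {
	w := int(x / 64)
	return w < len(s.words) && s.words[w]&(uint64(1)<<(x%64)) != 0
}

// intersectInts has the same shape as the map version,
// with the map accesses replaced by add and has.
func intersectInts(a, b []uint64) []uint64 {
	var s bitset
	for _, x := range a {
		s.add(x)
	}
	var result []uint64
	for _, y := range b {
		if s.has(y) {
			result = append(result, y)
		}
	}
	return result
}

func main() {
	fmt.Println(intersectInts([]uint64{1, 64, 200}, []uint64{64, 200, 300}))
}
```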

At first glance, it seems this should work slower, since a standard map was used there before and now some other functions are called. But profiling shows that it works 10 times faster than the standard map in the VictoriaMetrics case.

In addition, it uses much less memory than the map implementation, because here we store bits instead of eight-byte values.

The drawback of this implementation is that it is not so obvious and not trivial.

Another drawback, which many people may not notice, is that this implementation may work poorly in some cases. That is, it is optimized for a specific case, the intersection of VictoriaMetrics time series ids. This does not mean that it suits every case. If it is used incorrectly, we will get not a performance gain but an out-of-memory error and a slowdown.

Let's look at the implementation of this structure. If you want to take a look, it is in the VictoriaMetrics sources, in the lib/uint64set folder. It is optimized specifically for the VictoriaMetrics case, where timeseries_id is a 64-bit value in which the first 32 bits are mostly constant and only the last 32 bits change.

This data structure is not stored on disk; it works only in memory.

Here is its API. It is not very complicated. The API is tailored specifically to the VictoriaMetrics use case. That is, there are no unnecessary functions here. These are exactly the functions that VictoriaMetrics actually uses.

There is the add function, which adds new values. There is the has function, which checks whether a value is present. And there is the del function, which removes values. There is the helper function len, which returns the size of the set. The clone function clones the set. And the appendto function converts the set into a slice of timeseries_ids.

This is what the implementation of this data structure looks like. The set has two fields:

  • ItemsCount is a helper field for quickly returning the number of elements in the set. It would be possible to do without this helper field, but it had to be added because VictoriaMetrics often queries the length of the bitset in its algorithms.

  • The second field is buckets. It is a slice of bucket32 structs. Each struct stores a hi field, which holds the upper 32 bits, and two slices: b16his and buckets of bucket16 structs.

b16his stores the upper 16 bits of the lower half of the 64-bit value, and buckets stores the bitsets for the lowest 16 bits.

bucket16 consists of an array of uint64. Its length is computed from these constants. A single bucket16 can store at most 2^16=65536 bits. Divided by 8, that is 8 kilobytes. Divided by 8 again, that is 1024 uint64 values, roughly a thousand. That is, bucket16 is our 8-kilobyte structure.
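
The size arithmetic can be checked with a few constants. The constant names and the field name bits below are assumptions for this sketch and may differ from the real lib/uint64set code.

```go
package main

import (
	"fmt"
	"unsafe"
)

const (
	bitsPerBucket  = 1 << 16            // 65536 possible values of the low 16 bits
	bytesPerBucket = bitsPerBucket / 8  // 8192 bytes, i.e. 8 KiB
	wordsPerBucket = bitsPerBucket / 64 // 1024 uint64 words
)

// bucket16 holds one bit per possible 16-bit value.
type bucket16 struct {
	bits [wordsPerBucket]uint64
}

func main() {
	// unsafe.Sizeof confirms that the array occupies bytesPerBucket bytes.
	fmt.Println(bytesPerBucket, wordsPerBucket, unsafe.Sizeof(bucket16{}))
}
```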

Let's look at how one of the methods of this structure, the one that adds a new value, is implemented.

It all starts with a uint64 value. We compute the upper 32 bits and the lower 32 bits. We go through all the buckets. We compare the upper 32 bits in each bucket with those of the value being added. If they match, we call the add function on the b32 bucket and add the lower 32 bits there. If it returns true, it means that we added the value and did not have it before, so we increment the number of elements in the structure. If it returns false, the value was already there.

If we did not find the needed bucket with the required hi value, we call the addAlloc function, which creates a new bucket and adds it to the buckets slice.

This is the implementation of the b32.add function. It is similar to the previous implementation. We compute the most significant 16 bits and the least significant 16 bits.

Then we go through all the upper 16-bit values and look for a match. If there is a match, we call the add method of bucket16, which we will look at on the next slide.

And here is the lowest level, which should be optimized as much as possible. For the given value we compute the index of the uint64 word in the bits array and the bitmask. This is a mask for the given 64-bit word, which can be used to check whether the bit is set or to set it. We check whether the bit is set, set it, and return the result of the check. This is our implementation, which allowed us to speed up the intersection of time series ids by 10 times compared to ordinary maps.
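
Putting the three levels together, the add path could be sketched as follows. This is a simplified reconstruction from the description in the talk, not the actual lib/uint64set code: the allocation step is inlined where the talk mentions addAlloc, and details such as exact field names and the Len helper are assumptions.

```go
package main

import "fmt"

const wordsPerBucket = (1 << 16) / 64

// Set is a sketch of the multi-level bitset for uint64 ids.
type Set struct {
	itemsCount int
	buckets    []*bucket32
}

type bucket32 struct {
	hi      uint32      // upper 32 bits shared by all values in this bucket
	b16his  []uint16    // upper 16 bits of the low half, parallel to buckets
	buckets []*bucket16 // bitsets for the lowest 16 bits
}

type bucket16 struct {
	bits [wordsPerBucket]uint64
}

// Add inserts x into the set.
func (s *Set) Add(x uint64) {
	hi, lo := uint32(x>>32), uint32(x)
	for _, b32 := range s.buckets {
		if b32.hi == hi {
			if b32.add(lo) {
				s.itemsCount++
			}
			return
		}
	}
	// No bucket with this hi yet: allocate one (addAlloc in the talk).
	b32 := &bucket32{hi: hi}
	b32.add(lo)
	s.buckets = append(s.buckets, b32)
	s.itemsCount++
}

// add reports whether x was newly added to the bucket.
func (b *bucket32) add(x uint32) bool {
	hi, lo := uint16(x>>16), uint16(x)
	for i, h := range b.b16his {
		if h == hi {
			return b.buckets[i].add(lo)
		}
	}
	b16 := &bucket16{}
	b.b16his = append(b.b16his, hi)
	b.buckets = append(b.buckets, b16)
	return b16.add(lo)
}

// add sets the bit for x and reports whether it was previously unset.
func (b *bucket16) add(x uint16) bool {
	wordNum, bitMask := x/64, uint64(1)<<(x%64)
	isNew := b.bits[wordNum]&bitMask == 0
	b.bits[wordNum] |= bitMask
	return isNew
}

// Len returns the number of distinct values in the set.
func (s *Set) Len() int { return s.itemsCount }

func main() {
	var s Set
	for _, id := range []uint64{1, 1 << 20, 1<<40 + 7, 1} {
		s.Add(id)
	}
	fmt.Println(s.Len(), len(s.buckets))
}
```

The linear scans over buckets and b16his stay cheap only while the upper bits of the ids rarely differ, which is the assumption this structure is built around.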

Besides this optimization, VictoriaMetrics has many other optimizations. Most of them were not added arbitrarily, but only after profiling the code in production.

This is the main rule of optimization: do not add an optimization on the assumption that there will be a bottleneck here, because it may turn out that there is no bottleneck there. Optimization usually degrades the quality of the code. Therefore it is worth optimizing only after profiling, and preferably in production, so that the data is real. If anyone is interested, you can look at the VictoriaMetrics source code and explore the other optimizations that are there.

I have a question about the bitset. It looks very similar to the C++ vector bool implementation, an optimized bitset. Did you take the implementation from there?

No, not from there. When implementing this bitset, I was guided by knowledge of the structure of the time series ids used in VictoriaMetrics. Their structure is such that the upper 32 bits are mostly constant. The lower 32 bits are subject to change. The lower the bit, the more often it changes. Therefore, this implementation is optimized specifically for this data structure. The C++ implementation, as far as I know, is optimized for the general case. If you optimize for the general case, it means it will not be the most optimal for a specific case.

I also advise you to watch Alexey Milovidov's talk. About a month ago he spoke about optimizations in ClickHouse for specific specializations. He said precisely this: in the general case, a C++ implementation or some other implementation is tailored to work well on average. It may perform worse than a knowledge-specific implementation like ours, where we know that the upper 32 bits are mostly constant.

I have a second question. What is the main difference from InfluxDB?

There are many fundamental differences. In terms of performance and memory consumption, InfluxDB in tests shows 10 times more memory consumption for high-cardinality time series, when you have a lot of them, for example millions. For example, VictoriaMetrics consumes 1 GB for a given number of active series, while InfluxDB consumes 10 GB. And that is a big difference.

The second fundamental difference is that InfluxDB has strange query languages, Flux and InfluxQL. They are not very convenient for working with time series compared to PromQL, which VictoriaMetrics supports. PromQL is the query language from Prometheus.

One more difference is that InfluxDB has a somewhat unusual data model, where each row can store several fields with different sets of tags. These rows are further divided into different tables. These additional complications make subsequent work with this database harder. It is hard to maintain and understand.

In VictoriaMetrics everything is much simpler. There, each time series is a key-value pair. The value is a set of (timestamp, value) points, and the key is a set of label=value pairs. There is no separation between fields and measurements. It lets you select any data and then combine, add, subtract, multiply and divide it, unlike InfluxDB where, as far as I know, calculations between different series are still not implemented. Even if they are implemented, it is hard, and you have to write a lot of code.

I have a clarifying question. Did I understand correctly that there was a problem you talked about, that this inverted index does not fit into memory, and that is why there is partitioning there?

First, I showed a naive implementation of an inverted index on a standard Go map. That implementation is not suitable for a database, because the inverted index is not saved to disk, and a database must save it to disk so that the data stays available across restarts. With that implementation, when you restart the application, your inverted index disappears. And you lose access to all the data because you can no longer find it.
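A naive map-based inverted index of the kind mentioned here might look like the sketch below. It is an assumption about the general shape, not the code from the slides: `InvertedIndex`, `Add`, and `Lookup` are hypothetical names, and the example assumes each id is added once and that `Lookup` receives distinct labels.

```go
package main

import "fmt"

// InvertedIndex maps a "label=value" string to the ids of the series
// that carry this label. It lives only in process memory.
type InvertedIndex map[string][]uint64

// Add registers a series id under each of its labels.
func (idx InvertedIndex) Add(id uint64, labels ...string) {
	for _, l := range labels {
		idx[l] = append(idx[l], id)
	}
}

// Lookup returns the ids that carry all of the given labels.
func (idx InvertedIndex) Lookup(labels ...string) []uint64 {
	if len(labels) == 0 {
		return nil
	}
	// Count how many of the requested labels each id matches.
	counts := map[uint64]int{}
	for _, l := range labels {
		for _, id := range idx[l] {
			counts[id]++
		}
	}
	var out []uint64
	for _, id := range idx[labels[0]] {
		if counts[id] == len(labels) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	idx := InvertedIndex{}
	idx.Add(1, "job=api", "env=prod")
	idx.Add(2, "job=api", "env=dev")
	fmt.Println(idx.Lookup("job=api", "env=prod"))
	// Nothing here is written to disk: after a restart idx starts empty
	// and the stored points can no longer be found by label.
}
```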

Hello! Thanks for the talk! My name is Pavel. I'm from Wildberries. I have a few questions for you. Question one. Do you think that if you had chosen a different principle when designing your application's architecture and partitioned the data by time, you might have been able to intersect data during a search based only on the fact that one partition contains data for a single period of time, and you would not have to worry that your pieces are scattered in different ways? Question two: since you are implementing this kind of algorithm with a bitset and everything else, have you perhaps tried using processor instructions? Maybe you have tried such optimizations?

I'll answer the second one right away. We have not gotten to that point yet. But if necessary, we will. And the first one, what was the question?

You discussed two scenarios. You said you chose the second one, with the more complex implementation, and did not prefer the first one, where the data is partitioned by time.

Yes. In the first case the total size of the index would be larger, because in each partition we would have to store duplicate data for the time series that continue through all of these partitions. And if your time series churn rate is low, i.e. the same series are used constantly, then in the first case we would lose much more disk space compared to the second case.

So yes, time partitioning is a good option. Prometheus uses it. But Prometheus has another drawback. When merging these pieces of data, it needs to keep the meta information for all labels and time series in memory. So if the pieces of data it merges are large, memory consumption grows very strongly during the merge, unlike VictoriaMetrics. When merging, VictoriaMetrics consumes almost no memory at all; only a couple of kilobytes are used, regardless of the size of the pieces being merged.

The algorithm you use consumes memory. It marks the time series labels that contain values. That way you check for paired presence in one data array and in another, and you learn whether an intersection occurred or not. Typically, databases implement cursors and iterators that store their current state and run through sorted data, because these operations have low complexity.

Why don't we use cursors to traverse the data?

Yes.

We store sorted rows in LevelDB or in mergeset. We could move a cursor and find the intersection. Why don't we use that? Because it is slow. Cursors mean that you have to call a function for every row. A function call is 5 nanoseconds. And if you have 100 million rows, it turns out that we spend half a second just calling the function.
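The arithmetic behind that estimate is easy to check. The snippet below is a small worked example; `callOverhead` is a hypothetical helper, and the 5 ns per call and 100 million rows are the figures quoted in the answer, not measurements.

```go
package main

import (
	"fmt"
	"time"
)

// callOverhead estimates the time spent purely on per-row function calls
// when a cursor is advanced once per row.
func callOverhead(rows int64, perCall time.Duration) time.Duration {
	return time.Duration(rows) * perCall
}

func main() {
	// 100 million rows at 5 ns per call.
	fmt.Println(callOverhead(100_000_000, 5*time.Nanosecond)) // prints 500ms
}
```

Processing rows in blocks spreads one call over many rows, which is why the per-call cost matters far less for a bitset-style approach.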

There is such a thing, yes. And my last question. It may sound a little strange. Why not compute all the required aggregates at the moment the data arrives and store them in the required form? Why store huge volumes in systems like VictoriaMetrics, ClickHouse, and so on, and then spend a lot of time on them?

I'll give an example to make it clearer. Say, how does a small toy speedometer work? It records the distance you have traveled, always adding it to one value, and the time to a second value. Then it divides and gets the average speed. You can do roughly the same thing: add up all the needed facts on the fly.

OK, I understand the question. Your example has its place. If you know which aggregates you need, then that is the best implementation. But the problem is that people save these metrics, some data in ClickHouse, and do not yet know how they will aggregate and filter them in the future, so they have to keep all the raw data. But if you know that you need to calculate some average, then why not calculate it instead of storing a bunch of raw values? But that is only if you know exactly what you need.
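The speedometer idea from the question can be sketched as a tiny on-the-fly aggregate. The `Speedometer` type and its methods are hypothetical names for the example; the point is only that two running sums are enough when the desired aggregate is known in advance.

```go
package main

import "fmt"

// Speedometer keeps two running sums instead of the raw samples.
type Speedometer struct {
	distance float64 // total distance, e.g. metres
	seconds  float64 // total elapsed time, seconds
}

// Observe adds one measurement to the running sums.
func (s *Speedometer) Observe(distance, seconds float64) {
	s.distance += distance
	s.seconds += seconds
}

// AvgSpeed returns total distance divided by total time.
func (s *Speedometer) AvgSpeed() float64 {
	if s.seconds == 0 {
		return 0
	}
	return s.distance / s.seconds
}

func main() {
	var s Speedometer
	s.Observe(100, 10)
	s.Observe(50, 5)
	fmt.Println(s.AvgSpeed())
}
```

The trade-off is the one stated in the answer: the raw samples are gone, so a different aggregate (a maximum, a percentile, a filter by some label) can no longer be computed later.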

By the way, time series databases do support precomputing aggregates. For example, Prometheus supports recording rules. That is, this can be done if you know which aggregates you need. VictoriaMetrics does not have this yet, but Prometheus is usually placed in front of it, and there it can be done with recording rules.

For example, at my previous job I needed to count the number of events in a sliding window over the last hour. The problem was that I had to write a custom implementation in Go, i.e. a service for counting this. That service ended up being non-trivial, because this is hard to compute. The implementation can be simple if you need to compute some aggregates over fixed time intervals. If you want to count events in a sliding window, it is not as simple as it seems. I think this has still not been implemented in ClickHouse or in time series databases, because it is hard to implement.
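To give a feel for why a sliding window is harder than a fixed interval, here is one common approximation: a ring of per-minute counters covering the last hour. It is a sketch under assumptions, not the service described above. `SlidingCounter` is a hypothetical type, timestamps are non-negative Unix seconds, and the result is accurate only to the minute.

```go
package main

import "fmt"

// SlidingCounter approximates "events in the last hour" with 60 one-minute buckets.
type SlidingCounter struct {
	counts  [60]int64 // events counted in each slot
	minutes [60]int64 // which absolute minute each slot currently represents
}

// Add records one event that happened at the given Unix time in seconds.
func (c *SlidingCounter) Add(unixSec int64) {
	m := unixSec / 60
	i := m % 60
	if c.minutes[i] != m {
		// The slot still holds data from an hour (or more) ago: recycle it.
		c.minutes[i] = m
		c.counts[i] = 0
	}
	c.counts[i]++
}

// Count returns the number of events in the 60 minute slots ending at now.
func (c *SlidingCounter) Count(nowUnixSec int64) int64 {
	m := nowUnixSec / 60
	var total int64
	for i := range c.counts {
		if age := m - c.minutes[i]; age >= 0 && age < 60 {
			total += c.counts[i]
		}
	}
	return total
}

func main() {
	var c SlidingCounter
	c.Add(0)
	c.Add(30)
	c.Add(3000)
	fmt.Println(c.Count(3000), c.Count(3700))
}
```

Even this simplified version has to track which minute each slot belongs to and expire stale slots on both write and read; an exact window, out-of-order events, or many independent keys make it considerably more involved.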

And one more question. We were just talking about averaging, and I remembered that there once was such a thing as Graphite with the Carbon backend. It could thin out old data, that is, keep one point per minute, one point per hour, and so on. In principle this is quite convenient if we need raw data for, say, a month, and everything older can be thinned out. But Prometheus and VictoriaMetrics do not support this functionality. Is support planned? If not, why not?

Thanks for the question. Our users ask this periodically. They ask when we will add support for downsampling. There are several problems here. First, every user understands downsampling as something different: someone wants to get any arbitrary point on a given interval, someone wants the maximum, minimum, or average values. If many systems write data to your database, you cannot treat them all the same way. It may be that each system needs a different kind of thinning. And that is hard to implement.
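The point that "downsampling" means different things to different users can be illustrated with a sketch where the aggregation is a parameter. This is not a VictoriaMetrics feature or API; `Downsample`, `Avg`, `Max`, and `Last` are hypothetical helpers, and the input is assumed to be sorted by non-negative millisecond timestamps.

```go
package main

import "fmt"

// Point is a single sample: a millisecond timestamp and a value.
type Point struct {
	Timestamp int64
	Value     float64
}

// Downsample groups sorted points into fixed intervals and reduces each
// group to a single point with the supplied aggregation function.
func Downsample(points []Point, interval int64, agg func([]float64) float64) []Point {
	var out []Point
	var window []float64
	var start int64
	for i, p := range points {
		bucket := p.Timestamp - p.Timestamp%interval
		if i > 0 && bucket != start {
			// The interval changed: emit the finished group.
			out = append(out, Point{start, agg(window)})
			window = window[:0]
		}
		start = bucket
		window = append(window, p.Value)
	}
	if len(window) > 0 {
		out = append(out, Point{start, agg(window)})
	}
	return out
}

// Avg returns the mean of a non-empty slice.
func Avg(v []float64) float64 {
	sum := 0.0
	for _, x := range v {
		sum += x
	}
	return sum / float64(len(v))
}

// Max returns the largest value of a non-empty slice.
func Max(v []float64) float64 {
	m := v[0]
	for _, x := range v[1:] {
		if x > m {
			m = x
		}
	}
	return m
}

// Last returns the final value of a non-empty slice.
func Last(v []float64) float64 { return v[len(v)-1] }

func main() {
	pts := []Point{{0, 1}, {10, 3}, {20, 5}, {60000, 7}}
	fmt.Println(Downsample(pts, 60000, Avg))
	fmt.Println(Downsample(pts, 60000, Max))
	fmt.Println(Downsample(pts, 60000, Last))
}
```

The same raw points give three different downsampled series, and once only one of them is stored the others cannot be recovered, which is the difficulty described in the answer.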

And the second thing is that VictoriaMetrics, like ClickHouse, is optimized for working with large volumes of raw data, so it can scan a billion rows in less than a second if you have many cores in your system. The scan speed for time series points in VictoriaMetrics is 50 million points per second per core. And this performance scales with the available cores. That is, if you have 20 cores, for example, you will scan a billion points per second. This property of VictoriaMetrics and ClickHouse reduces the need for downsampling.

Another feature is that VictoriaMetrics compresses this data efficiently. Compression in production averages from 0.4 to 0.8 bytes per point. Each point is a timestamp plus a value. And it is compressed to less than one byte on average.

Sergey. I have a question. What is the minimum time quantum for recording?

One millisecond. We recently had a conversation with other time series database developers. Their minimum time quantum is one second. In Graphite, for example, it is also one second. In OpenTSDB it is also one second. InfluxDB has nanosecond precision. In VictoriaMetrics it is one millisecond, because in Prometheus it is one millisecond, and VictoriaMetrics was originally developed as remote storage for Prometheus. But now it can store data from other systems too.

The person I spoke with said that they have second precision and that this is enough for them, because it depends on the type of data stored in the time series database. If it is DevOps data or data from infrastructure, where you collect it at intervals of 30 seconds or a minute, then second precision is enough; you do not need anything finer. And if you collect this data from high-frequency trading systems, then you need nanosecond precision.

Millisecond precision in VictoriaMetrics is also suitable for the DevOps case, and it can suit most of the cases I mentioned at the beginning of the talk. The only thing it may not be suitable for is high-frequency trading systems.

Thank you! And another question. What about compatibility with PromQL?

Full backward compatibility. VictoriaMetrics fully supports PromQL. In addition, it adds extra advanced functionality on top of PromQL, which is called MetricsQL. There is a talk on YouTube about this extended functionality. I gave it at the Monitoring Meetup in St. Petersburg in the spring.

Telegram channel VictoriaMetrics.


What is stopping you from switching to VictoriaMetrics as your long-term storage for Prometheus? (Write in the comments and I will add it to the poll.)

  • 71.4%: I don't use Prometheus (5 votes)

  • 28.6%: I didn't know about VictoriaMetrics (2 votes)

7 users voted. 12 users abstained.

source: www.habr.com
