High-cardinality TSDB benchmarks: VictoriaMetrics vs TimescaleDB vs InfluxDB

VictoriaMetrics, TimescaleDB and InfluxDB were compared in the previous article on a dataset with a billion data points belonging to 40K unique time series.

A few years ago there was the Zabbix era. Each bare-metal server had no more than a few metrics: CPU usage, RAM usage, disk usage and network usage. Metrics from thousands of servers could therefore fit into 40K unique time series, and Zabbix could use MySQL as a backend for time series data :)

Nowadays a single node_exporter with default settings exposes more than 500 metrics on an average host. There are many exporters for various databases, web servers, hardware systems and so on, and all of them expose a variety of useful metrics. More and more applications are starting to expose metrics of their own. There is Kubernetes, with clusters and pods that expose many metrics. As a result, servers expose thousands of unique metrics per host, so 40K unique time series is no longer high cardinality. It is becoming mainstream and should be handled easily by any modern TSDB on a single server.

What counts as a large number of unique time series today? Perhaps 400K or 4M? Or 40M? Let's benchmark modern TSDBs against these numbers.

Benchmark setup

TSBS is an excellent benchmarking tool for TSDBs. It can generate an arbitrary number of metrics: pass the required number of time series divided by 10 to the -scale flag (formerly -scale-var). 10 is the number of measurements (metrics) generated for each host, or server. The following datasets were generated with TSBS for the benchmark:

  • 400K unique time series, 60-second interval between data points, data spans 3 full days, ~1.7B data points in total.
  • 4M unique time series, 600-second interval, data spans 3 full days, ~1.7B data points in total.
  • 40M unique time series, 1-hour interval, data spans 3 full days, ~2.8B data points in total.
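
The dataset sizes listed above can be cross-checked with a little arithmetic. The sketch below assumes exactly one data point per series per interval over the full 3 days and 10 metrics per TSBS host, as stated in the text; the function names are made up for illustration and are not part of TSBS.

```python
# Sketch: reproduce the dataset sizes and -scale values from the figures in the text.
METRICS_PER_HOST = 10          # measurements TSBS generates per host (from the text)
SPAN_SECONDS = 3 * 24 * 3600   # the data spans 3 full days


def tsbs_scale(unique_series: int) -> int:
    """Value to pass to the -scale flag: series count divided by 10."""
    return unique_series // METRICS_PER_HOST


def total_points(unique_series: int, interval_seconds: int) -> int:
    """Points per series (span / interval) times the number of series."""
    return unique_series * (SPAN_SECONDS // interval_seconds)


# (unique series, interval in seconds) for the three datasets
datasets = {
    "400K": (400_000, 60),
    "4M": (4_000_000, 600),
    "40M": (40_000_000, 3600),
}

for name, (series, interval) in datasets.items():
    print(name, "scale:", tsbs_scale(series), "points:", total_points(series, interval))
```

Under these assumptions the first two datasets come out at 1,728,000,000 points each and the third at 2,880,000,000, which matches the rounded ~1.7B and ~2.8B figures.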

The client and the server ran on dedicated n1-standard-16 instances in Google Cloud. These instances had the following configuration:

  • vCPUs: 16
  • RAM: 60 GB
  • Storage: standard 1 TB HDD. It provides 120 MB/s read/write throughput, 750 read operations per second and 1.5K write operations per second.

The TSDBs were pulled from the official Docker images and run in Docker with the following settings:

  • VictoriaMetrics:

    docker run -it --rm -v /mnt/disks/storage/vmetrics-data:/victoria-metrics-data -p 8080:8080 valyala/victoria-metrics

  • InfluxDB (the -e values are required for high-cardinality support; see the documentation for details):

    docker run -it --rm -p 8086:8086 \
    -e INFLUXDB_DATA_MAX_VALUES_PER_TAG=4000000 \
    -e INFLUXDB_DATA_CACHE_MAX_MEMORY_SIZE=100g \
    -e INFLUXDB_DATA_MAX_SERIES_PER_DATABASE=0 \
    -v /mnt/disks/storage/influx-data:/var/lib/influxdb influxdb

  • TimescaleDB (configuration adapted from this file):

    # Available memory in MB (7th field of the "Mem" row printed by free -m).
    MEM=`free -m | grep "Mem" | awk '{print $7}'`
    # Derive the PostgreSQL memory settings from the available memory.
    let "SHARED=$MEM/4"
    let "CACHE=2*$MEM/3"
    let "WORK=($MEM-$SHARED)/30"
    let "MAINT=$MEM/16"
    let "WAL=$MEM/16"
    docker run -it --rm -p 5432:5432 \
    --shm-size=${SHARED}MB \
    -v /mnt/disks/storage/timescaledb-data:/var/lib/postgresql/data \
    timescale/timescaledb:latest-pg10 postgres \
    -cmax_wal_size=${WAL}MB \
    -clog_line_prefix="%m [%p]: [%x] %u@%d" \
    -clogging_collector=off \
    -csynchronous_commit=off \
    -cshared_buffers=${SHARED}MB \
    -ceffective_cache_size=${CACHE}MB \
    -cwork_mem=${WORK}MB \
    -cmaintenance_work_mem=${MAINT}MB \
    -cmax_files_per_process=100
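
To see what the shell arithmetic in the TimescaleDB configuration produces, here is a small sketch that mirrors the same integer divisions. The input of 60,000 MB is a hypothetical value chosen for illustration; the article does not report what free -m actually printed on the benchmark machine, and the function name is invented here.

```python
def timescale_memory_settings(mem_mb: int) -> dict:
    """Mirror the integer arithmetic of the shell `let` expressions (values in MB)."""
    shared = mem_mb // 4
    return {
        "shared_buffers": shared,                # SHARED = MEM / 4
        "effective_cache_size": 2 * mem_mb // 3, # CACHE  = 2 * MEM / 3
        "work_mem": (mem_mb - shared) // 30,     # WORK   = (MEM - SHARED) / 30
        "maintenance_work_mem": mem_mb // 16,    # MAINT  = MEM / 16
        "max_wal_size": mem_mb // 16,            # WAL    = MEM / 16
    }


# Hypothetical amount of available memory in MB, not a measured value.
for name, value_mb in timescale_memory_settings(60_000).items():
    print(f"{name} = {value_mb}MB")
```

With that hypothetical input, a quarter of the memory goes to shared buffers and each of the 30 work_mem slices gets 1,500 MB.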

The data loader was run with 16 concurrent threads.

This article contains results for insert benchmarks only. Results for select benchmarks will be published in a separate article.

400K unique time series

Let's start with an easy case: 400K. Benchmark results:

  • VictoriaMetrics: 2.6M data points per second; RAM usage: 3 GB; final data size on disk: 965 MB
  • InfluxDB: 1.2M data points per second; RAM usage: 8.5 GB; final data size on disk: 1.6 GB
  • TimescaleDB: 849K data points per second; RAM usage: 2.5 GB; final data size on disk: 50 GB

As the results above show, VictoriaMetrics wins on insert performance and compression ratio. TimescaleDB wins on RAM usage, but it uses a lot of disk space: 29 bytes per data point.
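
The 29 bytes per data point figure can be reproduced from the reported disk sizes. The sketch below assumes decimal units (1 GB = 10^9 bytes) and the exact point count of 400K series at one point per minute for 3 days; both are assumptions, since the article only gives rounded values.

```python
# 400K series, one point every 60 s for 3 days (assumed exact, see lead-in).
TOTAL_POINTS = 400_000 * (3 * 24 * 3600 // 60)

# Final on-disk sizes from the 400K results, converted with decimal units (assumption).
final_size_bytes = {
    "VictoriaMetrics": 965e6,
    "InfluxDB": 1.6e9,
    "TimescaleDB": 50e9,
}


def bytes_per_point(size_bytes: float, points: int = TOTAL_POINTS) -> float:
    """Average on-disk bytes consumed by a single data point."""
    return size_bytes / points


for name, size in final_size_bytes.items():
    print(f"{name}: {bytes_per_point(size):.2f} bytes per data point")
```

Under these assumptions TimescaleDB lands at roughly 29 bytes per point, while VictoriaMetrics and InfluxDB both stay below one byte per point.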

Below are CPU usage graphs for each TSDB during the benchmark:

Screenshot: VictoriaMetrics - CPU usage during the insert benchmark for 400K unique metrics.

Screenshot: InfluxDB - CPU usage during the insert benchmark for 400K unique metrics.

Screenshot: TimescaleDB - CPU usage during the insert benchmark for 400K unique metrics.

VictoriaMetrics uses all available vCPUs, while InfluxDB leaves about 2 of the 16 vCPUs unused.

TimescaleDB uses only 3-4 of the 16 vCPUs. The high share of iowait and system time in the TimescaleDB graph points to a bottleneck in the I/O subsystem. Let's look at the disk bandwidth usage graphs:

Screenshot: VictoriaMetrics - disk bandwidth usage during the insert benchmark for 400K unique metrics.

Screenshot: InfluxDB - disk bandwidth usage during the insert benchmark for 400K unique metrics.

Screenshot: TimescaleDB - disk bandwidth usage during the insert benchmark for 400K unique metrics.

VictoriaMetrics writes data at 20 MB/s with peaks of up to 45 MB/s. The peaks correspond to large part merges in the LSM tree.

InfluxDB writes data at 160 MB/s, even though the 1 TB disk should be limited to 120 MB/s of write throughput.

TimescaleDB is limited by the 120 MB/s write throughput, but it sometimes breaks this limit and reaches 220 MB/s at peaks. These peaks correspond to the dips of CPU under-utilization in the previous graph.

Let's look at the I/O usage graphs:

Screenshot: VictoriaMetrics - I/O usage during the insert benchmark for 400K unique metrics.

Screenshot: InfluxDB - I/O usage during the insert benchmark for 400K unique metrics.

Screenshot: TimescaleDB - I/O usage during the insert benchmark for 400K unique metrics.

It is now clear that TimescaleDB hits the I/O limit, so it cannot use the remaining 12 vCPUs.

4M unique time series

4M time series look a bit more challenging, but our contenders pass this exam successfully. Benchmark results:

  • VictoriaMetrics: 2.2M data points per second; RAM usage: 6 GB; final data size on disk: 3 GB.
  • InfluxDB: 330K data points per second; RAM usage: 20.5 GB; final data size on disk: 18.4 GB.
  • TimescaleDB: 480K data points per second; RAM usage: 2.5 GB; final data size on disk: 52 GB.
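
Comparing these rates with the 400K run shows how much each database slows down when cardinality grows tenfold. The sketch below only divides the insert rates reported in the two result lists; the helper name is invented for illustration.

```python
# (insert rate at 400K series, insert rate at 4M series), in data points per second
rates = {
    "VictoriaMetrics": (2_600_000, 2_200_000),
    "InfluxDB": (1_200_000, 330_000),
    "TimescaleDB": (849_000, 480_000),
}


def slowdown(rate_400k: float, rate_4m: float) -> float:
    """How many times slower inserts became when going from 400K to 4M series."""
    return rate_400k / rate_4m


for name, (before, after) in rates.items():
    print(f"{name}: {slowdown(before, after):.2f}x slower")
```

On these numbers InfluxDB slows down by a factor of about 3.6, TimescaleDB by a bit under 1.8, and VictoriaMetrics by less than 1.2.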

InfluxDB performance dropped from 1.2M data points per second for 400K time series to 330K data points per second for 4M time series. This is a significant loss compared with the other contenders. Let's look at the CPU usage graphs to understand the root cause:

Screenshot: VictoriaMetrics - CPU usage during the insert benchmark for 4M unique time series.

Screenshot: InfluxDB - CPU usage during the insert benchmark for 4M unique time series.

Screenshot: TimescaleDB - CPU usage during the insert benchmark for 4M unique time series.

VictoriaMetrics uses almost all of the available CPU power. The drop at the end corresponds to the remaining LSM merges after all the data has been inserted.

InfluxDB uses only 8 of the 16 vCPUs, while TimescaleDB uses 4 of the 16 vCPUs. What do their graphs have in common? A high share of iowait, which again points to an I/O bottleneck.

TimescaleDB also has a high share of system time. We assume that high cardinality led to many system calls or many minor page faults.

Let's look at the disk bandwidth graphs:

Screenshot: VictoriaMetrics - disk bandwidth usage during the insert benchmark for 4M unique metrics.

Screenshot: InfluxDB - disk bandwidth usage during the insert benchmark for 4M unique metrics.

Screenshot: TimescaleDB - disk bandwidth usage during the insert benchmark for 4M unique metrics.

VictoriaMetrics reached the 120 MB/s limit at its peak, while the average write speed was 40 MB/s. Several heavy LSM merges were probably running during the peak.

InfluxDB again squeezes out an average write throughput of 200 MB/s, with peaks of up to 340 MB/s, on a disk with a 120 MB/s write limit :)

TimescaleDB is no longer limited by the disk. It appears to be limited by something else, related to the high share of system CPU time.

Let's look at the I/O usage graphs:

Screenshot: VictoriaMetrics - I/O usage during the insert benchmark for 4M unique time series.

Screenshot: InfluxDB - I/O usage during the insert benchmark for 4M unique time series.

Screenshot: TimescaleDB - I/O usage during the insert benchmark for 4M unique time series.

The I/O usage graphs mirror the disk bandwidth graphs: InfluxDB is limited by I/O, while VictoriaMetrics and TimescaleDB have spare I/O resources.

40M unique time series

40M unique time series turned out to be too much for InfluxDB :)

Benchmark results:

  • VictoriaMetrics: 1.7M data points per second; RAM usage: 29 GB; disk space usage: 17 GB.
  • InfluxDB: did not finish because it required more than 60 GB of RAM.
  • TimescaleDB: 330K data points per second; RAM usage: 2.5 GB; disk space usage: 84 GB.

TimescaleDB shows exceptionally low and stable RAM usage at 2.5 GB, the same as for 4M and 400K unique metrics.

VictoriaMetrics ramped up slowly at about 100K data points per second until all 40M tagged metric names had been processed. It then reached a steady insert rate of 1.5M-2.0M data points per second, so the final result was 1.7M data points per second.
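
To put the averaged insert rates into perspective, the sketch below estimates how long loading the whole 40M dataset takes at each reported rate. It assumes exactly 72 points per series (one per hour for 3 days) and treats the reported rate as the average over the entire run; the article does not state the measured run times, so these are rough estimates only.

```python
# 40M series, one point per hour for 3 days (assumed exact, see lead-in).
TOTAL_POINTS_40M = 40_000_000 * (3 * 24 * 3600 // 3600)


def ingest_minutes(points: int, points_per_second: float) -> float:
    """Estimated load time in minutes at a constant average insert rate."""
    return points / points_per_second / 60


vm_minutes = ingest_minutes(TOTAL_POINTS_40M, 1_700_000)  # VictoriaMetrics, 1.7M points/s
ts_minutes = ingest_minutes(TOTAL_POINTS_40M, 330_000)    # TimescaleDB, 330K points/s

print(f"VictoriaMetrics: ~{vm_minutes:.0f} min, TimescaleDB: ~{ts_minutes:.0f} min")
```

Under these assumptions the load takes a little under half an hour for VictoriaMetrics and close to two and a half hours for TimescaleDB.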

The graphs for 40M unique time series are similar to those for 4M unique time series, so let's skip them.

Conclusions

  • Modern TSDBs can handle inserts for millions of unique time series on a single server. In the next article we will test how well TSDBs perform selects over millions of unique time series.
  • CPU under-utilization usually points to an I/O bottleneck. It may also point to locking that is too coarse, so that only a few threads can run concurrently.
  • The I/O bottleneck is real, especially on non-SSD storage such as cloud providers' virtualized block devices.
  • VictoriaMetrics provides the best optimization for slow storage with low IOPS. It delivers the best speed and the best compression ratio.

Download the single-server VictoriaMetrics image and try it on your data. The corresponding static binary is available on GitHub.

Read more about VictoriaMetrics in this article.

Update: an article comparing VictoriaMetrics insert performance with InfluxDB, with reproducible results, has been published.

Update #2: See also the article on vertical scalability of VictoriaMetrics vs InfluxDB vs TimescaleDB.

Update #3: VictoriaMetrics is now open source!

Telegram chat: https://t.me/VictoriaMetrics_ru1

source: www.habr.com
