Save a penny on large volumes in PostgreSQL

Continuing the topic of recording large data streams, raised by the previous article on partitioning, in this one we look at ways to reduce the "physical" size of what is stored in PostgreSQL, and at their effect on server performance.

We'll talk about TOAST settings and data alignment. "On average," these methods won't save very many resources, but they require no changes to the application code at all.

However, our experience turned out to be very productive in this respect, since the storage of almost any monitoring system is by nature mostly append-only in terms of the data it records. And if you're wondering how to teach a database to write half as much to disk instead of 200MB/s, read on.

Little secrets of big data

Given what our service does, text packets regularly arrive to it from the logs.

And since the VLSI complex whose databases we monitor is a multi-component product with complex data structures, the queries written for maximum performance turn out to be "multi-volume" works with complex algorithmic logic. So the size of each individual query instance, or of the resulting execution plan, in the log that reaches us is "on average" quite large.

Let's look at the structure of one of the tables into which we write "raw" data, that is, the original text from a log entry:

CREATE TABLE rawdata_orig(
  pack -- PK
    uuid NOT NULL
, recno -- PK
    smallint NOT NULL
, dt -- partition key
    date
, data -- the most important part
    text
, PRIMARY KEY(pack, recno)
);

A typical table of this kind (already partitioned, of course, so this is a partition template), where the most important thing is the text. Sometimes it is quite voluminous.

Recall that the "physical" size of a single record in PG cannot occupy more than one data page, but the "logical" size is a different matter entirely. To write a large value (varchar/text/bytea) to a field, the TOAST technology is used:

PostgreSQL uses a fixed page size (commonly 8 kB), and does not allow tuples to span multiple pages. Therefore, it is not possible to store very large field values directly. To overcome this limitation, large field values are compressed and/or broken up into multiple physical rows. This happens transparently to the user, with only small impact on most of the backend code. The technique is known as TOAST...

In fact, for every table with "potentially large" fields, a paired table is automatically created that "slices" each "large" record into segments of about 2KB:

TOAST(
  chunk_id
    oid
, chunk_seq
    integer
, chunk_data
    bytea
, PRIMARY KEY(chunk_id, chunk_seq)
);

That is, if we have to write a row with a "large" data value, the actual write goes not only to the main table and its PK, but also to TOAST and its PK.
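To see which TOAST relation backs a particular table, it can be looked up in the system catalog. This is a minimal sketch, assuming the rawdata_orig table from above exists in the current database; pg_class.reltoastrelid is a standard catalog column.

```sql
-- Find the TOAST table paired with rawdata_orig and its current size.
-- reltoastrelid is 0 when the table has no TOAST relation,
-- in which case the LEFT JOIN yields NULLs.
SELECT c.relname                               AS table_name
     , t.relname                               AS toast_table
     , pg_size_pretty(pg_relation_size(t.oid)) AS toast_size
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.oid = 'rawdata_orig'::regclass;
```

For a partitioned layout like ours, the same query would have to be run per partition, since each partition has its own TOAST relation.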

Reducing the influence of TOAST

But most of our records are not that large and should fit into 8KB. How can we save on this?

This is where the STORAGE attribute of a table column comes to our aid:

  • EXTENDED allows both compression and out-of-line storage. This is the default for most TOAST-able data types. Compression is attempted first, then out-of-line storage if the row is still too big.
  • MAIN allows compression but not out-of-line storage. (Actually, out-of-line storage will still be performed for such columns, but only as a last resort, when there is no other way to make the row small enough to fit on a page.)

In fact, this is exactly what we need for the text: compress it as much as possible, and only if it does not fit at all, move it to TOAST. This can be done on the fly, with a single command:

ALTER TABLE rawdata_orig ALTER COLUMN data SET STORAGE MAIN;
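To confirm that the strategy was actually switched, the per-column setting can be read back from pg_attribute. This is a sketch, assuming the same rawdata_orig table; the single-letter codes in attstorage are the catalog's own encoding.

```sql
-- Show the storage strategy of every user column of rawdata_orig.
SELECT attname
     , atttypid::regtype AS data_type
     , CASE attstorage
         WHEN 'p' THEN 'PLAIN'
         WHEN 'm' THEN 'MAIN'
         WHEN 'e' THEN 'EXTERNAL'
         WHEN 'x' THEN 'EXTENDED'
       END AS storage
FROM pg_attribute
WHERE attrelid = 'rawdata_orig'::regclass
  AND attnum > 0          -- skip system columns
  AND NOT attisdropped
ORDER BY attnum;
```

Keep in mind that SET STORAGE does not rewrite existing rows; it only changes how rows written afterwards are stored. For an append-only table like this one, that is exactly the behavior we want.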

How to assess the effect

Since the data stream changes every day, we cannot compare absolute numbers, but in relative terms, the smaller the share written to TOAST, the better. There is a danger here, though: the larger the "physical" volume of each individual record, the "wider" the index becomes, because it has to cover more data pages.

Partition before the changes:

heap  = 37GB (39%)
TOAST = 54GB (57%)
PK    =  4GB ( 4%)

Partition after the changes:

heap  = 37GB (67%)
TOAST = 16GB (29%)
PK    =  2GB ( 4%)
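A breakdown like the one above can be obtained with the built-in size functions. The query below is only a sketch of one way to do it, not necessarily how the figures above were collected; it assumes the table name rawdata_orig and counts the TOAST table together with its own index.

```sql
-- Split a table's footprint into heap, TOAST and indexes, with percentage shares.
WITH sz AS (
  SELECT pg_relation_size(c.oid) AS heap
         -- NULLIF guards tables without a TOAST relation (reltoastrelid = 0)
       , coalesce(pg_total_relation_size(NULLIF(c.reltoastrelid, 0::oid)), 0) AS toast
       , pg_indexes_size(c.oid) AS idx
  FROM pg_class c
  WHERE c.oid = 'rawdata_orig'::regclass
)
SELECT pg_size_pretty(heap)  AS heap
     , round(100.0 * heap  / NULLIF(heap + toast + idx, 0)) AS heap_pct
     , pg_size_pretty(toast) AS toast
     , round(100.0 * toast / NULLIF(heap + toast + idx, 0)) AS toast_pct
     , pg_size_pretty(idx)   AS indexes
     , round(100.0 * idx   / NULLIF(heap + toast + idx, 0)) AS indexes_pct
FROM sz;
```

Since the only index on this table is the PK, the indexes column corresponds to the PK line in the figures above.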

Indeed, we started writing to TOAST half as often, which relieved not only the disk but also the CPU:

[Figures: disk and CPU load graphs]

I'll note that we also began to "read" the disk less, not only "write" to it, because when inserting a record into a table we also have to "read" part of the tree of each index in order to determine the record's future position in it.

Who lives well on PostgreSQL 11

After upgrading to PG11, we decided to continue "tuning" TOAST and noticed that, starting with this version, the toast_tuple_target parameter became available for tuning:

The TOAST management code is triggered only when a row value to be stored in a table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code will compress and/or move field values out-of-line until the row value is shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB, adjustable) or no more gains can be had.

We decided that the data we usually have is either "very short" or "very long", so we limited ourselves to the minimum possible value:

ALTER TABLE rawplan_orig SET (toast_tuple_target = 128);
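Whether the setting took effect can be checked in pg_class.reloptions. This is a small sketch, assuming the rawplan_orig table named in the command above; 128 is the lowest value the parameter accepts.

```sql
-- Storage parameters set explicitly on the table are listed in reloptions;
-- expect an entry of the form toast_tuple_target=128 after the ALTER TABLE above.
SELECT relname
     , reloptions
FROM pg_class
WHERE oid = 'rawplan_orig'::regclass;

-- To return to the default later:
-- ALTER TABLE rawplan_orig RESET (toast_tuple_target);
```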

Let's see how the new settings affected the disk load after the reconfiguration:

[Figure: disk load graph]

Not bad! The average disk queue length dropped by about 1.5 times, and the disk "busy" share by 20 percent! But maybe this somehow affected the CPU?

[Figure: CPU load graph]

At least it didn't get worse. Although it is hard to judge when even such volumes still cannot raise the average CPU load above 5%.

Changing the order of the terms changes the sum!

As you know, a penny saves a ruble, and with our storage volumes of about 10TB/month even a small optimization can yield a good profit. So we turned our attention to the physical structure of our data: how exactly the fields are "stacked" inside a record of each table.

Because of data alignment, this directly affects the resulting volume:

Many architectures provide data alignment on machine-word boundaries. For example, on a 32-bit x86 system, integers (integer type, 4 bytes) will be aligned on a 4-byte word boundary, as will double precision floating point numbers (8 bytes). On a 64-bit system, double values will be aligned on 8-byte word boundaries. This is another cause of incompatibility.

Because of alignment, the size of a table row depends on the order of its fields. Usually this effect is barely noticeable, but in some cases it can lead to a significant increase in size. For example, if you interleave char(1) and integer fields, 3 bytes are typically wasted between them.

Let's start with synthetic models:

SELECT pg_column_size(ROW(
  '0000-0000-0000-0000-0000-0000-0000-0000'::uuid
, 0::smallint
, '2019-01-01'::date
));
-- 48 bytes

SELECT pg_column_size(ROW(
  '2019-01-01'::date
, '0000-0000-0000-0000-0000-0000-0000-0000'::uuid
, 0::smallint
));
-- 46 bytes

Where did the 2 "extra" bytes in the first case come from? It's simple: the 2-byte smallint is padded to a 4-byte boundary before the next field, and when it comes last there is nothing to align and no need to.
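The size and alignment requirement of each type can be read from the pg_type catalog, which makes it easier to predict where padding will appear. This is a sketch; the list of type names is just the set of types used in this article.

```sql
-- typlen:   fixed size in bytes, or -1 for variable-length types
-- typalign: c = 1-byte, s = 2-byte, i = 4-byte, d = 8-byte alignment
SELECT typname
     , typlen
     , typalign
FROM pg_type
WHERE typname IN ('uuid', 'int2', 'date', 'int8', 'timestamptz', 'numeric')
ORDER BY typname;
```

A field is padded only up to the alignment of the field that follows it, which is why a smallint placed directly before a date costs 2 extra bytes, while a smallint at the end of the row costs nothing.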

In theory everything is fine, and the fields can be rearranged at will. Let's check this on real data, using one of the tables whose daily partition takes 10-15GB.

Original structure:

CREATE TABLE public.plan_20190220
(
-- Inherited from table plan:  pack uuid NOT NULL,
-- Inherited from table plan:  recno smallint NOT NULL,
-- Inherited from table plan:  host uuid,
-- Inherited from table plan:  ts timestamp with time zone,
-- Inherited from table plan:  exectime numeric(32,3),
-- Inherited from table plan:  duration numeric(32,3),
-- Inherited from table plan:  bufint bigint,
-- Inherited from table plan:  bufmem bigint,
-- Inherited from table plan:  bufdsk bigint,
-- Inherited from table plan:  apn uuid,
-- Inherited from table plan:  ptr uuid,
-- Inherited from table plan:  dt date,
  CONSTRAINT plan_20190220_pkey PRIMARY KEY (pack, recno),
  CONSTRAINT chck_ptr CHECK (ptr IS NOT NULL),
  CONSTRAINT plan_20190220_dt_check CHECK (dt = '2019-02-20'::date)
)
INHERITS (public.plan)

The partition after changing the column order: exactly the same fields, only the order differs:

CREATE TABLE public.plan_20190221
(
-- Inherited from table plan:  dt date NOT NULL,
-- Inherited from table plan:  ts timestamp with time zone,
-- Inherited from table plan:  pack uuid NOT NULL,
-- Inherited from table plan:  recno smallint NOT NULL,
-- Inherited from table plan:  host uuid,
-- Inherited from table plan:  apn uuid,
-- Inherited from table plan:  ptr uuid,
-- Inherited from table plan:  bufint bigint,
-- Inherited from table plan:  bufmem bigint,
-- Inherited from table plan:  bufdsk bigint,
-- Inherited from table plan:  exectime numeric(32,3),
-- Inherited from table plan:  duration numeric(32,3),
  CONSTRAINT plan_20190221_pkey PRIMARY KEY (pack, recno),
  CONSTRAINT chck_ptr CHECK (ptr IS NOT NULL),
  CONSTRAINT plan_20190221_dt_check CHECK (dt = '2019-02-21'::date)
)
INHERITS (public.plan)

The total volume of a partition is determined by the number of "facts" and depends only on external processes, so let's divide the heap size (pg_relation_size) by the number of records in it. That gives us the average size of an actually stored record:

[Figure: average stored record size for the two partitions]

Minus 6% in volume. Great!
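The average record size described above can be computed directly. This is a sketch, assuming the two daily partitions shown earlier; note that count(*) scans the whole partition, which is noticeable at 10-15GB.

```sql
-- Average stored record size: heap bytes divided by the number of rows.
SELECT 'plan_20190220' AS partition_name
     , pg_relation_size('public.plan_20190220')::numeric
       / NULLIF(count(*), 0) AS avg_row_bytes
FROM public.plan_20190220
UNION ALL
SELECT 'plan_20190221'
     , pg_relation_size('public.plan_20190221')::numeric
       / NULLIF(count(*), 0)
FROM public.plan_20190221;
```

Replacing pg_relation_size with pg_total_relation_size in the same query gives the "in total" figure, including TOAST and indexes.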

But of course not everything is so rosy: after all, in indexes we cannot change the order of the fields, and therefore "in total" (pg_total_relation_size)...

[Figure: average total size per record for the two partitions]

...we still saved 1.5% here as well, without changing a single line of code. Yes, yes!

I'll note that the field arrangement shown above is not necessarily the optimal one, because for aesthetic reasons you may not want to "tear apart" some blocks of fields, for example the pair (pack, recno), which is the PK of this table.

In general, finding the "minimal" field arrangement is a fairly simple "brute force" task. So you may well get even better results on your data than we did. Try it!

Source: www.habr.com
