Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL

Txuas ntxiv lub ntsiab lus ntawm kev sau cov ntaub ntawv loj loj tau tsa los ntawm Previous tsab xov xwm hais txog kev faib tawm, hauv no peb yuav saib txoj hauv kev uas koj tuaj yeem ua tau txo qhov "lub cev" loj ntawm qhov khaws cia hauv PostgreSQL, thiab lawv qhov cuam tshuam rau kev ua haujlwm ntawm server.

Peb mam li tham txog TOAST teeb tsa thiab cov ntaub ntawv sib dhos. "Ntawm qhov nruab nrab," cov txheej txheem no yuav tsis txuag ntau cov peev txheej, tab sis tsis muaj kev hloov kho daim ntawv thov code txhua.

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Txawm li cas los xij, peb qhov kev paub dhau los ua tau zoo heev hauv qhov no, txij li kev cia ntawm yuav luag txhua qhov kev saib xyuas los ntawm nws qhov xwm txheej feem ntau append-tsuas nyob rau hauv cov nqe lus ntawm cov ntaub ntawv kaw. Thiab yog tias koj xav paub yuav ua li cas koj tuaj yeem qhia cov ntaub ntawv sau rau disk hloov 200MB / s ib nrab li ntau - thov hauv qab miv.

Me secrets ntawm cov ntaub ntawv loj

Los ntawm txoj haujlwm profile peb qhov kev pabcuam, lawv tsis tu ncua ya mus rau nws los ntawm lub lairs pob ntawv ntawv.

Thiab txij li VLSI complexnws cov ntaub ntawv peb saib xyuas yog cov khoom siv ntau yam khoom nrog cov ntaub ntawv nyuaj, tom qab ntawd cov lus nug rau qhov ua tau zoo tshaj plaws tig tawm zoo li no "multi-volume" nrog complex algorithmic logic. Yog li qhov ntim ntawm txhua tus neeg piv txwv ntawm kev thov lossis cov phiaj xwm ua tiav hauv lub cav uas tuaj rau peb hloov mus ua "qhov nruab nrab" loj heev.

Cia peb saib cov qauv ntawm ib lub rooj uas peb sau "raws" cov ntaub ntawv - uas yog, ntawm no yog cov ntawv qub los ntawm cov ntawv nkag:

CREATE TABLE rawdata_orig(
  pack -- PK
    uuid NOT NULL
, recno -- PK
    smallint NOT NULL
, dt -- ключ секции
    date
, data -- самое главное
    text
, PRIMARY KEY(pack, recno)
);

Ib qho kev kos npe (tam sim no tau faib, tau kawg, yog li qhov no yog cov qauv ntu), qhov tseem ceeb tshaj plaws yog cov ntawv nyeem. Qee zaum heev voluminous.

Nco qab tias "lub cev" loj ntawm ib cov ntaub ntawv hauv PG tsis tuaj yeem tuav ntau tshaj ib nplooj ntawv ntawm cov ntaub ntawv, tab sis qhov loj "logical" yog qhov sib txawv kiag li. Txhawm rau sau tus nqi volumetric (varchar/text/bytea) rau ib daim teb, siv TOAST technology:

PostgreSQL siv cov nplooj ntawv tas li (feem ntau yog 8 KB), thiab tsis pub tuples hla ntau nplooj ntawv. Yog li ntawd, nws yog tsis yooj yim sua kom ncaj qha khaws cov nqi teb loj heev. Txhawm rau kov yeej qhov kev txwv no, cov nqi teb loj yog compressed thiab/lossis faib hla ntau kab ntawm lub cev. Qhov no tshwm sim tsis pom los ntawm tus neeg siv thiab muaj kev cuam tshuam me ntsis rau feem ntau cov server code. Txoj kev no hu ua TOAST...

Qhov tseeb, rau txhua lub rooj nrog "muaj peev xwm loj" teb, tau txais ib lub rooj ua ke nrog "slicing" yog tsim txhua "loj" cov ntaub ntawv hauv 2 KB ntu:

TOAST(
  chunk_id
    integer
, chunk_seq
    integer
, chunk_data
    bytea
, PRIMARY KEY(chunk_id, chunk_seq)
);

Ntawd yog, yog tias peb yuav tsum sau ib txoj hlua nrog tus nqi "loj". data, ces cov ntaubntawv povthawj siv tiag tiag yuav tshwm sim Tsis yog tsuas yog rau lub rooj loj thiab nws PK, tab sis kuj rau TOAST thiab nws PK.

Txo TOAST cawv

Tab sis feem ntau ntawm peb cov ntaub ntawv tseem tsis yog loj, yuav tsum haum rau 8 KB - Kuv yuav txuag tau nyiaj li cas ntawm qhov no? ..

Qhov no yog qhov uas tus cwj pwm tuaj rau peb pab STORAGE ntawm kab lus:

  • TSHAJ tso cai rau ob qho tib si compression thiab cais cia. Qhov no txheem xaiv rau feem ntau TOAST cov ntaub ntawv ua raws hom. Nws thawj zaug sim ua compression, tom qab ntawd khaws nws sab nraum lub rooj yog tias kab tseem loj dhau.
  • TES tso cai compression tab sis tsis cais cia. (Qhov tseeb, cais cia tseem yuav ua rau cov kab ntawv zoo li no, tab sis tsuas yog raws li qhov chaw kawg, thaum tsis muaj lwm txoj hauv kev los txo txoj hlua kom nws haum rau ntawm nplooj ntawv.)

Qhov tseeb, qhov no yog qhov peb xav tau rau cov ntawv nyeem - compress nws kom ntau li ntau tau, thiab yog tias nws tsis haum txhua, muab tso rau hauv TOAST. Qhov no tuaj yeem ua tiav ncaj qha ntawm ya, nrog rau ib qho lus txib:

ALTER TABLE rawdata_orig ALTER COLUMN data SET STORAGE MAIN;

Yuav ntsuas qhov cuam tshuam li cas

Txij li cov ntaub ntawv hloov pauv txhua hnub, peb tsis tuaj yeem sib piv cov lej tiag, tab sis nyob rau hauv cov ntsiab lus txheeb ze qhia me me Peb sau nws hauv TOAST - ntau qhov zoo dua. Tab sis muaj qhov txaus ntshai ntawm no - qhov loj dua "lub cev" ntim ntawm txhua tus neeg cov ntaub ntawv, qhov "dav" qhov ntsuas tau dhau los, vim tias peb yuav tsum npog ntau nplooj ntawv ntawm cov ntaub ntawv.

Feem ua ntej hloov:

heap  = 37GB (39%)
TOAST = 54GB (57%)
PK    =  4GB ( 4%)

Feem tom qab hloov:

heap  = 37GB (67%)
TOAST = 16GB (29%)
PK    =  2GB ( 4%)

Qhov tseeb, peb pib sau ntawv rau TOAST 2 zaug tsawg zaus, uas unloaded tsis tau tsuas yog lub disk, tab sis kuj lub CPU:

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Kuv yuav nco ntsoov tias peb kuj tau ua me me hauv "nyeem" lub disk, tsis yog "sau" nkaus xwb - txij li thaum muab cov ntaub ntawv tso rau hauv ib lub rooj, peb kuj yuav tsum "nyeem" ib feem ntawm tsob ntoo ntawm txhua qhov ntsuas txhawm rau txiav txim siab nws. yav tom ntej txoj hauj lwm nyob rau hauv lawv.

Leej twg tuaj yeem ua neej nyob zoo ntawm PostgreSQL 11

Tom qab hloov kho rau PG11, peb tau txiav txim siab txuas ntxiv "kho" TOAST thiab pom tias pib los ntawm cov qauv no, qhov ntsuas tau dhau los ua rau kev kho toast_tuple_target:

TOAST cov lej ua haujlwm tsuas yog tua hluav taws thaum kab tus nqi khaws cia hauv lub rooj loj dua TOAST_TUPLE_THRESHOLD bytes (feem ntau yog 2 KB). TOAST code yuav compress thiab/los yog tsiv teb qhov tseem ceeb tawm ntawm lub rooj kom txog rau thaum kab tus nqi yuav tsawg dua TOAST_TUPLE_TARGET bytes (tus nqi sib txawv, kuj feem ntau yog 2 KB) lossis qhov loj me tsis tuaj yeem txo.

Peb txiav txim siab tias cov ntaub ntawv peb feem ntau muaj yog "luv heev" lossis "ntev heev", yog li peb txiav txim siab txwv peb tus kheej mus rau qhov tsawg kawg nkaus tus nqi:

ALTER TABLE rawplan_orig SET (toast_tuple_target = 128);

Cia peb saib yuav ua li cas cov chaw tshiab cuam tshuam rau kev thauj khoom disk tom qab rov teeb tsa:

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Tsis phem! Nruab nrab cov kab rau disk tau txo qis kwv yees li 1.5 npaug, thiab disk "tsis khoom" yog 20 feem pua! Tab sis tej zaum qhov no qee yam cuam tshuam rau CPU?

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Yam tsawg kawg nws tsis ua phem dua. Txawm li cas los xij, nws nyuaj rau txiav txim siab yog tias txawm tias cov ntim zoo li no tseem tsis tuaj yeem nce qhov nruab nrab CPU load siab dua 5%.

Los ntawm kev hloov qhov chaw ntawm cov ntsiab lus, cov sum ... hloov!

Raws li koj paub, ib lub nyiaj txuag ib ruble, thiab nrog peb cov ntim ntim nws yog hais txog 10TB / hli txawm tias me ntsis optimization tuaj yeem muab cov txiaj ntsig zoo. Yog li ntawd, peb tau xyuam xim rau lub cev qauv ntawm peb cov ntaub ntawv - yuav ua li cas raws nraim "stacked" teb hauv cov ntaub ntawv txhua lub rooj.

vim vim cov ntaub ntawv sib dhos qhov no yog ncaj nraim rau pem hauv ntej cuam ​​tshuam qhov tshwm sim ntim:

Muaj ntau cov architectures muab cov ntaub ntawv sib dhos ntawm lub tshuab lo lus ciam teb. Piv txwv li, ntawm 32-ntsis x86 system, cov lej (tus lej hom, 4 bytes) yuav raug ua raws li 4-byte lo lus ciam teb, raws li yuav muab ob npaug rau qhov tseeb ntab taw tes tus lej (ob npaug precision ntab taw tes, 8 bytes). Thiab ntawm 64-ntsis system, ob qhov txiaj ntsig yuav raug ua raws li 8-byte lo lus ciam teb. Qhov no yog lwm qhov laj thawj rau kev tsis sib haum xeeb.

Vim kev sib tw, qhov loj ntawm kab lus nyob ntawm qhov kev txiav txim ntawm cov teb. Feem ntau cov nyhuv no tsis pom zoo heev, tab sis qee zaum nws tuaj yeem ua rau muaj qhov loj me. Piv txwv li, yog tias koj sib tov char(1) thiab integer teb, feem ntau yuav muaj 3 bytes nkim ntawm lawv.

Cia peb pib nrog cov qauv hluavtaws:

SELECT pg_column_size(ROW(
  '0000-0000-0000-0000-0000-0000-0000-0000'::uuid
, 0::smallint
, '2019-01-01'::date
));
-- 48 байт

SELECT pg_column_size(ROW(
  '2019-01-01'::date
, '0000-0000-0000-0000-0000-0000-0000-0000'::uuid
, 0::smallint
));
-- 46 байт

Qhov twg yog ob peb lub bytes ntxiv los ntawm thawj rooj plaub? Nws yog qhov yooj yim - 2-byte smallint dlhos ntawm 4-byte ciam teb ua ntej daim teb tom ntej, thiab thaum nws yog qhov kawg, tsis muaj dab tsi thiab tsis tas yuav ua kom haum.

Hauv txoj kev xav, txhua yam zoo thiab koj tuaj yeem hloov kho cov teb raws li koj nyiam. Cia peb xyuas nws ntawm cov ntaub ntawv tiag tiag uas siv cov piv txwv ntawm ib qho ntawm cov ntxhuav, ntu txhua hnub uas nyob 10-15GB.

Thawj qauv:

CREATE TABLE public.plan_20190220
(
-- Унаследована from table plan:  pack uuid NOT NULL,
-- Унаследована from table plan:  recno smallint NOT NULL,
-- Унаследована from table plan:  host uuid,
-- Унаследована from table plan:  ts timestamp with time zone,
-- Унаследована from table plan:  exectime numeric(32,3),
-- Унаследована from table plan:  duration numeric(32,3),
-- Унаследована from table plan:  bufint bigint,
-- Унаследована from table plan:  bufmem bigint,
-- Унаследована from table plan:  bufdsk bigint,
-- Унаследована from table plan:  apn uuid,
-- Унаследована from table plan:  ptr uuid,
-- Унаследована from table plan:  dt date,
  CONSTRAINT plan_20190220_pkey PRIMARY KEY (pack, recno),
  CONSTRAINT chck_ptr CHECK (ptr IS NOT NULL),
  CONSTRAINT plan_20190220_dt_check CHECK (dt = '2019-02-20'::date)
)
INHERITS (public.plan)

Tshooj tom qab hloov kab lus - raws nraim tib teb, tsuas yog sib txawv:

CREATE TABLE public.plan_20190221
(
-- Унаследована from table plan:  dt date NOT NULL,
-- Унаследована from table plan:  ts timestamp with time zone,
-- Унаследована from table plan:  pack uuid NOT NULL,
-- Унаследована from table plan:  recno smallint NOT NULL,
-- Унаследована from table plan:  host uuid,
-- Унаследована from table plan:  apn uuid,
-- Унаследована from table plan:  ptr uuid,
-- Унаследована from table plan:  bufint bigint,
-- Унаследована from table plan:  bufmem bigint,
-- Унаследована from table plan:  bufdsk bigint,
-- Унаследована from table plan:  exectime numeric(32,3),
-- Унаследована from table plan:  duration numeric(32,3),
  CONSTRAINT plan_20190221_pkey PRIMARY KEY (pack, recno),
  CONSTRAINT chck_ptr CHECK (ptr IS NOT NULL),
  CONSTRAINT plan_20190221_dt_check CHECK (dt = '2019-02-21'::date)
)
INHERITS (public.plan)

Tag nrho cov ntim ntawm ntu yog txiav txim los ntawm tus naj npawb ntawm "qhov tseeb" thiab tsuas yog nyob ntawm cov txheej txheem sab nraud, yog li cia peb faib qhov loj ntawm lub heap (pg_relation_size) los ntawm tus naj npawb ntawm cov ntaub ntawv hauv nws - uas yog, peb tau txais qhov nruab nrab qhov loj ntawm cov ntaub ntawv khaws cia:

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
Tshem tawm 6% ntim, Zoo heev!

Tab sis txhua yam, ntawm chav kawm, tsis yog rosy - tom qab tag nrho, hauv indexes peb tsis tuaj yeem hloov qhov kev txiav txim ntawm cov teb, thiab yog li ntawd "nyob rau hauv general" (pg_total_relation_size) ...

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL
... tseem nyob ntawm no thiab khaws tseg 1.5%tsis hloov ib kab ntawm txoj cai. Yog, yog!

Txuag ib lub nyiaj ntawm cov ntim loj hauv PostgreSQL

Kuv nco ntsoov tias qhov kev xaiv saum toj no rau kev npaj teb tsis yog qhov tseeb tias nws yog qhov zoo tshaj plaws. Vim tias koj tsis xav "tua" qee qhov chaw ntawm qhov laj thawj zoo nkauj - piv txwv li, ob peb (pack, recno), uas yog PK rau lub rooj no.

Feem ntau, txiav txim siab qhov "tsawg kawg" kev npaj ntawm thaj teb yog qhov yooj yim yooj yim "brute force" txoj haujlwm. Yog li ntawd, koj tuaj yeem tau txais txiaj ntsig zoo dua los ntawm koj cov ntaub ntawv tshaj li peb li - sim nws!

Tau qhov twg los: www.hab.com

Ntxiv ib saib