Postgres: bloat, pg_repack and deferred constraints


The effect of table and index bloat is widely known and is not unique to Postgres. There are ways to deal with it out of the box, such as VACUUM FULL or CLUSTER, but they lock tables while they run and therefore cannot always be used.

This article gives a little theory on how bloat arises and how to fight it, and then covers deferred constraints and the problems they cause for the pg_repack extension.

This article is based on my talk at PgConf.Russia 2020.

Why does bloat occur?

Postgres is built on a multiversion model (MVCC). Its essence is that each row in a table can have several versions, and a transaction sees at most one of these versions, though not necessarily the same one as another transaction. This lets several transactions work concurrently with practically no effect on one another.

Obviously, all these versions have to be stored. Postgres works with memory page by page, and a page is the minimum amount of data that can be read from or written to disk. Let's look at a small example to see how this happens.

Say we have a table to which we have added several records. The new data has appeared on the first page of the file in which the table is stored. These are live row versions that become available to other transactions after commit (for simplicity, we assume the isolation level is Read Committed).


We then update one of the records, thereby marking its old version as no longer relevant.


Step by step, updating and deleting row versions, we end up with a page in which roughly half of the data is "garbage". This data is not visible to any transaction.


Postgres has the VACUUM mechanism, which cleans out obsolete versions and frees space for new data. But if it is not configured aggressively enough, or is busy working on other tables, the "garbage" stays, and we have to use additional pages for new data.

So in our example, at some point the table will consist of four pages, but only half of it will hold live data. As a result, when accessing the table we will read far more data than necessary.


Even if VACUUM now removes all the irrelevant row versions, the situation will not improve dramatically. We will have free space in pages, or even whole free pages, for new rows, but we will still be reading more data than necessary.
By the way, if the completely empty page (the second one in our example) were at the end of the file, VACUUM could truncate it. But it is in the middle, so nothing can be done with it.


When the number of such empty or sparsely populated pages becomes large, which is what is called bloat, it starts to affect performance.

Everything described above is the mechanics of how bloat arises in tables. In indexes it happens in much the same way.
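The effect is easy to reproduce on a scratch database. The following is a minimal sketch, assuming a throwaway table named bloat_demo (a hypothetical name, not part of the original talk) and that autovacuum does not step in between the statements; exact sizes depend on the Postgres version and settings.

```sql
-- Hypothetical scratch table used only for this illustration.
create table bloat_demo (id integer primary key, payload text);

-- Fill the table: every page now holds only live row versions.
insert into bloat_demo
select g, repeat('x', 100) from generate_series(1, 100000) as g;
select pg_size_pretty(pg_relation_size('bloat_demo')) as size_after_insert;

-- Updating every row writes a new version of each row;
-- the old versions stay in their pages as garbage.
update bloat_demo set payload = repeat('y', 100);
select pg_size_pretty(pg_relation_size('bloat_demo')) as size_after_update;

-- Plain VACUUM makes the dead space reusable, but it normally cannot
-- shrink the file, because the live versions sit in the pages at its end.
vacuum bloat_demo;
select pg_size_pretty(pg_relation_size('bloat_demo')) as size_after_vacuum;

drop table bloat_demo;
```

The size after the update should be roughly twice the size after the insert, and the plain VACUUM should leave it about the same.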

Do I have bloat?

There are several ways to determine whether you have bloat. The idea of the first is to use internal Postgres statistics, which contain approximate information about the number of rows in tables, the number of "live" rows, and so on. You can find many variations of ready-made scripts on the Internet. We took as a basis a script from PostgreSQL Experts that can estimate bloat in tables, including TOAST, and in btree indexes. In our experience, its error is 10-20%.
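As a much cruder first signal than the script mentioned above, the statistics views can be queried directly. This sketch only counts dead row versions that have not been vacuumed yet; it does not see free space left behind after VACUUM, so it is not a bloat estimate in itself.

```sql
-- Tables with the most dead row versions, according to the statistics collector.
select relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) as dead_pct,
       last_autovacuum
from pg_stat_user_tables
order by n_dead_tup desc
limit 20;
```

A table that stays near the top of this list while last_autovacuum keeps advancing is a candidate for closer inspection.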

Another way is to use the pgstattuple extension, which lets you look inside pages and get both an estimated and an exact bloat value. In the second case, though, you have to scan the whole table.
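A minimal sketch of both modes, assuming the extension can be installed in the current database; tablename and index_name are placeholders for your own objects.

```sql
create extension if not exists pgstattuple;

-- Exact figures: reads every page of the table.
select table_len, dead_tuple_percent, free_percent
from pgstattuple('tablename');

-- Approximate figures (Postgres 9.5+): skips pages that the
-- visibility map marks as all-visible, so it is usually much cheaper.
select table_len, dead_tuple_percent, approx_free_percent
from pgstattuple_approx('tablename');

-- B-tree index: a low avg_leaf_density suggests index bloat.
select avg_leaf_density, leaf_fragmentation
from pgstatindex('index_name');
```

Roughly speaking, dead_tuple_percent plus the free-space percentage gives the share of the table that holds no live data.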

We consider a small bloat value, up to 20%, acceptable. It can be regarded as an analogue of fillfactor for tables and indexes. At 50% and above, performance problems may begin.

Ways to fight bloat

Postgres has several ways of dealing with bloat out of the box, but they do not always suit everyone.

Configure AUTOVACUUM so that bloat does not occur. Or, more precisely, so that it stays at a level acceptable to you. This sounds like "captain obvious" advice, but in reality it is not always easy to achieve. For example, you may have active development with regular changes to the data schema, or some kind of data migration under way. As a result, your load profile can change often and will usually differ from table to table. This means you have to stay slightly ahead of the curve all the time and adjust AUTOVACUUM to the changing profile of each table. Obviously, this is not easy to do.
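Per-table tuning is done through storage parameters. The sketch below shows the mechanism only; tablename is a placeholder and the numbers are illustrative assumptions, not recommendations from the original talk.

```sql
-- Make autovacuum trigger earlier and work faster on one hot table.
alter table tablename set (
  autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% of rows change
  autovacuum_vacuum_threshold    = 1000,  -- plus this fixed number of rows
  autovacuum_vacuum_cost_delay   = 2      -- milliseconds between cost-limited pauses
);

-- Inspect the per-table overrides currently in effect.
select relname, reloptions from pg_class where relname = 'tablename';

-- Return to the cluster-wide defaults.
alter table tablename reset (
  autovacuum_vacuum_scale_factor,
  autovacuum_vacuum_threshold,
  autovacuum_vacuum_cost_delay
);
```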

Another common reason AUTOVACUUM cannot keep up with tables is long-running transactions, which prevent it from cleaning up data because that data is still visible to those transactions. The recommendation here is obvious as well: get rid of "dangling" transactions and minimize the duration of active ones. But if the load on your application is a hybrid of OLAP and OLTP, you may simultaneously have many frequent updates and short queries alongside long operations such as building a report. In that situation it is worth thinking about spreading the load across different databases, which allows finer tuning of each of them.
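Long-running and "dangling" transactions can be spotted in pg_stat_activity. A small sketch; the five-minute threshold is an arbitrary assumption to adjust to your workload.

```sql
-- Sessions whose current transaction has been open for more than 5 minutes.
select pid,
       usename,
       state,                       -- 'idle in transaction' marks a dangling one
       now() - xact_start as xact_age,
       left(query, 80)    as query_start
from pg_stat_activity
where xact_start is not null
  and now() - xact_start > interval '5 minutes'
order by xact_start;
```

On Postgres 9.6 and later, the idle_in_transaction_session_timeout setting can additionally terminate sessions that sit idle inside a transaction for too long.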

Another example: even if the profile is homogeneous but the database is under very high load, even the most aggressive AUTOVACUUM may not cope, and bloat will occur. Scaling (vertical or horizontal) is the only solution.

What do you do when you have tuned AUTOVACUUM but the bloat keeps growing?

The VACUUM FULL command rebuilds the contents of tables and indexes and leaves only live data in them. It eliminates bloat perfectly, but while it runs it holds an exclusive lock on the table (AccessExclusiveLock), which blocks all queries against the table, even SELECTs. If you can afford to stop your service or part of it for a while (from tens of minutes to several hours, depending on the size of the database and your hardware), this option is the best. Unfortunately, we do not have time to run VACUUM FULL during our scheduled maintenance window, so this method does not suit us.

The CLUSTER command rebuilds the contents of tables in the same way as VACUUM FULL, but lets you specify an index according to which the data will be physically ordered on disk (the order is not guaranteed for rows added later). In some situations this is a good optimization for queries that read many rows through the index. The drawback of the command is the same as that of VACUUM FULL: it locks the table while it runs.
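For reference, a minimal sketch of how the two commands are typically invoked during a maintenance window; tablename and index_name are placeholders, and the lock_timeout value is an assumption.

```sql
-- Both commands hold AccessExclusiveLock on the table for their whole run.
-- With lock_timeout they give up instead of queueing behind long transactions
-- (and blocking everyone who queues behind them).
set lock_timeout = '5s';

-- Rewrite the table and its indexes, keeping only live row versions.
vacuum (full, verbose) tablename;

-- Same rewrite, but physically ordered by the given index.
cluster verbose tablename using index_name;
```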

The REINDEX command is similar to the previous two, but rebuilds a specific index or all indexes of a table. The locks are slightly weaker: ShareLock on the table (blocks modifications but allows SELECT) and AccessExclusiveLock on the index being rebuilt (blocks queries that use this index). However, version 12 of Postgres introduced the CONCURRENTLY option, which lets you rebuild an index without blocking concurrent inserts, updates, or deletes.

In earlier versions of Postgres, you can achieve a result similar to REINDEX CONCURRENTLY with CREATE INDEX CONCURRENTLY. It lets you create an index without a strict lock (ShareUpdateExclusiveLock, which does not interfere with concurrent queries), then replace the old index with the new one and drop the old one. This removes index bloat without interfering with your application. Keep in mind that rebuilding indexes puts additional load on the disk subsystem.
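The two approaches might look as follows. This is a sketch with placeholder names (tablename, col, idx_tablename_col); it covers a plain index only, since an index that backs a constraint cannot be dropped this way.

```sql
-- Postgres 12 and later: one command.
reindex index concurrently idx_tablename_col;

-- Earlier versions: build a twin, drop the bloated original, take over its name.
-- None of these CONCURRENTLY commands may run inside a transaction block.
create index concurrently idx_tablename_col_new on tablename (col);
drop index concurrently idx_tablename_col;
alter index idx_tablename_col_new rename to idx_tablename_col;

-- A failed concurrent build leaves an invalid index behind; check for leftovers.
select indexrelid::regclass as invalid_index
from pg_index
where not indisvalid;
```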

So while there are ways to eliminate index bloat "on the fly", there are none for tables. This is where external extensions come into play: pg_repack (formerly pg_reorg), pgcompact, pgcompacttable and others. In this article I will not compare them and will only talk about pg_repack, which, after some modification, we use ourselves.

How pg_repack works

Say we have a perfectly ordinary table with indexes, constraints and, unfortunately, bloat. As its first step, pg_repack creates a log table to store data about all changes made while it is running. A trigger replicates these changes on every insert, update and delete. Then a table is created that is similar in structure to the original but has no indexes or constraints, so as not to slow down the data insertion.

Next, pg_repack transfers the data from the old table to the new one, automatically filtering out all irrelevant rows, and then creates the indexes for the new table. While all these operations are running, changes accumulate in the log table.

The next step is to transfer the changes to the new table. The migration is done in several iterations, and when fewer than 20 entries remain in the log table, pg_repack acquires a strong lock, migrates the latest data, and replaces the old table with the new one in the Postgres system tables. This is the only, very short, moment when you cannot work with the table. After that, the old table and the log table are dropped and the space is freed in the file system. The process is complete.

Everything looks great in theory, but what about in practice? We tested pg_repack without load and under load, and checked how it behaves when stopped prematurely (in other words, with Ctrl+C). All tests were positive.

We went to production, and then everything went differently from what we expected.

The first pancake in production

On the first cluster we got an error about a unique constraint violation:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

This constraint had the auto-generated name index_16508; it was created by pg_repack. From the attributes it contained, we identified "our" constraint that corresponds to it. The problem turned out to be that this is not an ordinary constraint but a deferred one (deferred constraint), that is, its check is performed later than the SQL command itself, which leads to unexpected consequences.

Deferred constraints: why they are needed and how they work

A little theory about deferred constraints.
Let's consider a simple example: we have a reference table of cars with two attributes, the name of the car and its position in the directory.

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique
);



Say we need to swap the first and second cars. The straightforward solution is to update the first value to the second and the second to the first:

begin;
  update cars set ord = 2 where name = 'audi';
  update cars set ord = 1 where name = 'bmw';
commit;

But when we run this code, we expectedly get a constraint violation, because the order values in the table are unique:

[23505] ERROR: duplicate key value violates unique constraint "uk_cars"
Detail: Key (ord)=(2) already exists.

How can this be done differently? Option one: add an extra update to a value that is guaranteed not to exist in the table, for example "-1". In programming this is called "swapping the values of two variables through a third". The only drawback of this method is the extra update.
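Option one can be sketched like this, assuming the cars table from above holds 'audi' at position 1 and 'bmw' at position 2, and that -1 is never used as a real position.

```sql
begin;
  update cars set ord = -1 where name = 'audi';  -- park audi on an unused value
  update cars set ord = 1  where name = 'bmw';   -- position 1 is free now
  update cars set ord = 2  where name = 'audi';  -- position 2 is free now
commit;
```

Each statement leaves the table in a state that satisfies uk_cars, so an ordinary, non-deferred constraint never fires.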

Option two: redesign the table to use a floating-point data type for the order value instead of integers. Then, when updating the value from 1 to, say, 2.5, the first entry automatically "stands" between the second and third. This solution works, but it has two limitations. First, it will not work for you if the value is used somewhere in the interface. Second, depending on the precision of the data type, you will have a limited number of possible insertions before having to recalculate the values of all records.

Option three: make the constraint deferred so that it is checked only at commit time:

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique deferrable initially deferred
);

Since the logic of our original query guarantees that all values are unique at commit time, the commit succeeds.
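To see this end to end, here is a small sketch that assumes the cars table was created with the deferrable definition and is still empty; the two sample rows are illustrative.

```sql
insert into cars (name, ord) values ('audi', 1), ('bmw', 2);

begin;
  update cars set ord = 2 where name = 'audi';  -- duplicate for now, check postponed
  update cars set ord = 1 where name = 'bmw';   -- duplicate resolved
commit;                                         -- uk_cars is checked here and passes

select name, ord from cars order by ord;
```

After the commit, bmw should hold position 1 and audi position 2.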

The example above is, of course, very synthetic, but it conveys the idea. In our application, we use deferred constraints to implement the logic responsible for resolving conflicts when users simultaneously work with shared widget objects on a board. Using such constraints lets us make the application code a little simpler.

In general, depending on the type of constraint, Postgres has three levels of granularity for checking them: row, statement, and transaction.
[Table: constraint types and the levels (row, statement, transaction) at which each can be checked. Source: begriffs]

CHECK and NOT NULL are always checked at the row level; for other constraints, as the table shows, there are different options. You can read more here.

To sum up briefly, deferred constraints in a number of situations give more readable code and fewer commands. However, you pay for this with a more complicated debugging process, because the moment the error occurs and the moment you find out about it are separated in time. Another possible problem is that the planner may not always be able to build an optimal plan if the query involves a deferred constraint.
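One way to narrow that gap while debugging is to force the check at a chosen point with SET CONSTRAINTS. A sketch, assuming the deferrable cars table from above with 'audi' at position 1 and 'bmw' at position 2:

```sql
begin;
  update cars set ord = 2 where name = 'audi';
  update cars set ord = 1 where name = 'bmw';

  -- Check uk_cars right now instead of at commit; if duplicates remained,
  -- the error would be raised by this statement.
  set constraints uk_cars immediate;

  -- ... the rest of the transaction runs with immediate checking ...
commit;
```

The setting lasts only until the end of the current transaction.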

Modifying pg_repack

We have covered what deferred constraints are, but how do they relate to our problem? Recall the error we got earlier:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

It occurs when data is copied from the log table to the new table. This looks strange, because the data in the log table is committed together with the data in the original table. If it satisfies the constraints of the original table, how can it violate the same constraints in the new one?

As it turned out, the root of the problem lies in the previous step of pg_repack, which creates only indexes but not constraints: the old table had a unique constraint, and the new one got a unique index instead.


It is important to note here that if the constraint is ordinary and not deferred, then the unique index created instead is equivalent to that constraint, because unique constraints in Postgres are implemented by creating a unique index. But in the case of a deferred constraint the behavior is not the same, because an index cannot be deferred and is always checked at the moment the SQL command is executed.
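Whether a database contains such constraints at all can be checked in the system catalogs before running a table rewrite. A sketch using pg_constraint and pg_index; tablename is a placeholder.

```sql
-- Deferrable unique, primary key and exclusion constraints.
select conrelid::regclass as table_name,
       conname,
       contype,                          -- 'u' unique, 'p' primary key, 'x' exclusion
       condeferrable,
       condeferred,                      -- true for INITIALLY DEFERRED
       conindid::regclass as backing_index
from pg_constraint
where contype in ('u', 'p', 'x')
  and condeferrable
order by 1, 2;

-- For one table: indimmediate is false for the index behind a deferrable constraint.
select indexrelid::regclass as index_name, indisunique, indimmediate
from pg_index
where indrelid = 'tablename'::regclass;
```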

So the essence of the problem lies in the "deferral" of the check: in the original table it happens at commit time, and in the new table at the moment the SQL command is executed. This means we need to make sure the checks are performed the same way in both cases: either always deferred or always immediate.

So what ideas did we have?

Create an index that mirrors the deferred constraint

The first idea is to perform both checks in immediate mode. This may produce some false-positive constraint violations, but if there are few of them, it should not affect users, since such conflicts are a normal situation for them. They occur, for example, when two users start editing the same widget at the same time and the second user's client has not yet received the information that the widget is already locked for editing by the first user. In this situation the server rejects the second user, and that user's client rolls back the changes and locks the widget. A little later, when the first user finishes editing, the second receives the information that the widget is no longer locked and can repeat the action.


To make sure the checks always run in non-deferred mode, we created a new index similar to the original deferred constraint:

CREATE UNIQUE INDEX CONCURRENTLY uk_tablename__immediate ON tablename (id, index);
-- run pg_repack
DROP INDEX CONCURRENTLY uk_tablename__immediate;

In the test environment we got only a few expected errors. Success! We ran pg_repack in production again and got 5 errors on the first cluster in an hour of work. This is an acceptable result. However, on the second cluster the number of errors increased many times over, and we had to stop pg_repack.

Why did this happen? The probability of an error depends on how many users are working with the same widgets at the same time. Apparently, at that moment there was much less contention for the data stored on the first cluster than on the others, i.e. we were simply "lucky".

The idea did not work. At that point we saw two other solutions: rewrite our application code to do without deferred constraints, or "teach" pg_repack to work with them. We chose the second.

Replace indexes in the new table with deferred constraints from the original table

The goal of the modification was obvious: if the original table has a deferred constraint, then for the new table you need to create such a constraint, not an index.
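In plain SQL the difference amounts to the following. This is only an illustration of the idea, not pg_repack's actual code; the table and constraint names are hypothetical, and the columns follow the (id, index) key from the error message above.

```sql
-- Hypothetical stand-in for the new table that the rewrite fills.
create table repack_sketch_new (id bigint not null, index integer not null);

-- An index-only rebuild would do this; uniqueness is then enforced per statement:
--   create unique index uk_repack_sketch_new on repack_sketch_new (id, index);

-- Recreating the constraint itself keeps the commit-time check of the original.
alter table repack_sketch_new
  add constraint uk_repack_sketch_new unique (id, index)
  deferrable initially deferred;

-- Confirm how the constraint was recorded.
select conname, condeferrable, condeferred
from pg_constraint
where conrelid = 'repack_sketch_new'::regclass;
```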

To test our changes, we wrote a simple test:

  • a table with a deferred constraint and one record;
  • in a loop, insert data that conflicts with the existing record;
  • do an update, after which the data no longer conflicts;
  • commit the changes.

create table test_table
(
  id serial,
  val int,
  constraint uk_test_table__val unique (val) deferrable initially deferred 
);

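INSERT INTO test_table (val) VALUES (0);

DO $$
DECLARE
  v_id integer;
BEGIN
  FOR i IN 1..10000 LOOP
    -- conflicts with the existing 0 until the UPDATE below
    INSERT INTO test_table (val) VALUES (0) RETURNING id INTO v_id;
    UPDATE test_table SET val = i WHERE id = v_id;
    COMMIT;  -- the deferred constraint is checked here (Postgres 11+)
  END LOOP;
END $$;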

The original version of pg_repack always crashed on the first insert; the modified version worked without errors. Great.

We went to production and again got an error at the same stage of copying data from the log table to the new one:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

A classic situation: everything works in the test environment, but not in production?!

APPLY_COUNT and the junction of two batches

We began analyzing the code literally line by line and discovered an important point: data is transferred from the log table to the new one in batches, and the APPLY_COUNT constant specifies the batch size:

for (;;)
{
    num = apply_log(connection, table, APPLY_COUNT);

    if (num > MIN_TUPLES_BEFORE_SWITCH)
        continue;  /* there might be still some tuples, repeat. */
    ...
}
The problem is that the data from one original transaction, in which several operations could potentially violate the constraint, can end up at the junction of two batches when transferred: half of the commands are committed in the first batch and the other half in the second. And then it is a matter of luck: if the commands in the first batch violate nothing, all is well, but if they do, an error occurs.

APPLY_COUNT is equal to 1000 records, which explains why our tests succeeded: they did not cover the "batch junction" case. We used two commands, insert and update, so exactly 500 transactions of two commands each always fit into a batch, and we had no problems. After adding a second update, our modification stopped working:

DO $$
DECLARE
  v_id integer;
BEGIN
  FOR i IN 1..10000 LOOP
    INSERT INTO test_table (val) VALUES (1) RETURNING id INTO v_id;
    UPDATE test_table SET val = i WHERE id = v_id;
    UPDATE test_table SET val = i WHERE id = v_id; -- one more update
    COMMIT;
  END LOOP;
END $$;
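With three commands per transaction, a batch of 1000 records holds 333 whole transactions (999 records) plus the INSERT of the 334th, whose UPDATEs fall into the next batch. What the replay then does at the batch boundary can be sketched as follows; batch_demo is a hypothetical stand-in for the new table.

```sql
-- Hypothetical stand-in for the new table, with the deferred constraint recreated.
create table batch_demo
(
  id serial,
  val int,
  constraint uk_batch_demo__val unique (val) deferrable initially deferred
);
insert into batch_demo (val) values (1);  -- a value replayed earlier

-- The batch ends right after the INSERT of a source transaction:
begin;
  insert into batch_demo (val) values (1);  -- duplicate, check is deferred
commit;  -- fails here: duplicate key value violates unique constraint

-- The UPDATE that would have resolved the duplicate sits in the next batch,
-- which is never reached.
```

In the source table the same INSERT was harmless, because the UPDATE ran before that transaction's commit.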

So the next task is to make sure that data from the original table that was changed in one transaction also ends up in the new table within one transaction.

Giving up batching

And again we had two solutions. First: abandon batching entirely and transfer the data in a single transaction. The advantage of this solution is its simplicity: the required code changes are minimal (by the way, in older versions pg_reorg worked exactly like that). But there is a problem: we create a long-running transaction, and that, as mentioned earlier, is a threat for the appearance of new bloat.

The second solution is more complex but probably more correct: add a column to the log table with the identifier of the transaction that added the data to the table. Then, when copying the data, we can group it by this attribute and ensure that related changes are transferred together. A batch will be formed from several transactions (or one large one), and its size will vary depending on how much data those transactions changed. It is important to note that since data from different transactions lands in the log table in arbitrary order, it can no longer be read sequentially as before. A seqscan on every request with filtering by tx_id is too expensive; an index is needed, but it will also slow the process down because of the overhead of updating it. In general, as always, you have to sacrifice something.
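A rough sketch of what this second option could look like. It is not pg_repack's real log table or code: the table, its columns and the 1000-row target are assumptions for illustration, and the ordering of concurrent transactions relative to each other is deliberately glossed over.

```sql
-- Hypothetical log table with a transaction identifier per logged change.
create table repack_log_sketch (
  id       bigserial primary key,                    -- order in which changes were logged
  tx_id    bigint not null default txid_current(),   -- transaction that made the change
  op       char(1) not null,                         -- 'I', 'U' or 'D'
  row_data jsonb
);
create index repack_log_sketch_tx_id on repack_log_sketch (tx_id);

-- Next batch: whole transactions only, adding transactions in order of
-- their first logged change until about 1000 rows are collected.
with per_tx as (
  select tx_id, min(id) as first_id, count(*) as n
  from repack_log_sketch
  group by tx_id
), running as (
  select tx_id,
         sum(n) over (order by first_id) - n as rows_before
  from per_tx
)
select l.*
from repack_log_sketch l
join running r using (tx_id)
where r.rows_before < 1000   -- the first transaction is always taken, however large
order by l.id;
```

The batch size is no longer fixed: one very large transaction produces one very large batch, which is the trade-off described above.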

So we decided to start with the first option, as it is simpler. First, we needed to understand whether a long transaction would be a real problem. Since the main transfer of data from the old table to the new one also happens in one long transaction, the question became "how much will we lengthen this transaction?" The duration of the first transaction depends mainly on the size of the table. The duration of the new one depends on how many changes accumulate in the table during the data transfer, i.e. on the intensity of the load. The pg_repack run took place at a time of minimal service load, and the volume of changes was incomparably small relative to the original size of the table. We decided that we could neglect the duration of the new transaction (for comparison, on average about 1 hour for the first versus 2-3 minutes for the new one).

The experiments were positive. So was the run in production. For clarity, here is a graph with the size of one of the databases after the run:

[Graph: size of one of the databases after the pg_repack run]

Since this solution suited us completely, we did not try to implement the second one, but we are considering discussing it with the extension's developers. Our current revision is, unfortunately, not yet ready for publication, since we solved the problem only for unique deferred constraints, and a full-fledged patch needs to support other types as well. We hope to be able to do this in the future.

You may ask why we got involved in this story of modifying pg_repack at all and did not, for example, use its analogues. At some point we thought about this too, but the positive experience of using it earlier, on tables without deferred constraints, motivated us to try to understand the essence of the problem and fix it. In addition, using other solutions would also take time for testing, so we decided to first try to fix the problem in pg_repack, and if we realized we could not do it in a reasonable time, to start looking at the analogues.

Conclusions

What we can recommend based on our own experience:

  1. Monitor your bloat. Based on the monitoring data, you can understand how well autovacuum is configured.
  2. Tune AUTOVACUUM to keep bloat at an acceptable level.
  3. If bloat is still growing and you cannot overcome it with the out-of-the-box tools, do not be afraid to use external extensions. The main thing is to test everything well.
  4. Do not be afraid to modify external solutions to fit your needs; sometimes this can be more effective and even simpler than changing your own code.

Source: www.hab.com
