Postgres: bloat, pg_repack and deferred constraints

The effect of bloat on tables and indexes is widely known and is not specific to Postgres. There are ways to deal with it out of the box, such as VACUUM FULL or CLUSTER, but they lock tables while they run and therefore cannot always be used.

This article contains a little theory on how bloat arises and how to fight it, and then covers deferred constraints and the problems they cause when using the pg_repack extension.

This article is based on my talk at PgConf.Russia 2020.

Why does bloat occur?

Postgres is based on a multi-version model (MVCC). Its essence is that each row in a table can have several versions, while a transaction sees no more than one of those versions, though not necessarily the same one that other transactions see. This allows several transactions to work concurrently with almost no effect on one another.

Obviously, all these versions need to be stored. Postgres works with memory page by page, and a page is the smallest amount of data that can be read from disk or written to it. Let's look at a small example to understand how this happens.

Suppose we have a table to which we have added several records. The new data appeared in the first page of the file where the table is stored. These are live row versions that become available to other transactions after commit (for simplicity, we assume the isolation level is Read Committed).

We then updated one of the records, thereby marking the old version as no longer relevant.

Step by step, updating and deleting row versions, we ended up with a page in which roughly half of the data is "garbage". This data is not visible to any transaction.

Postgres has the VACUUM mechanism, which cleans out obsolete row versions and makes room for new data. But if it is not configured aggressively enough, or it is busy working on other tables, the "garbage data" remains, and we have to use additional pages for new data.

So in our example, at some point the table will consist of four pages, but only half of it will hold live data. As a result, when accessing the table we will read much more data than necessary.

Even if VACUUM now removes all the irrelevant row versions, the situation will not improve dramatically. We will have free space in pages, or even whole free pages, for new rows, but we will still read more data than necessary.
Incidentally, if a completely empty page (the second one in our example) were at the end of the file, VACUUM would be able to truncate it. But now it is in the middle, so nothing can be done with it.

When the number of such empty or very sparse pages becomes large, which is what is called bloat, it starts to affect performance.

Everything described above is the mechanics of how bloat arises in tables. In indexes it happens in much the same way.

Do I have bloat?

There are several ways to determine whether you have bloat. The idea behind the first is to use internal Postgres statistics, which contain approximate information about the number of rows in tables, the number of "live" rows, and so on. You can find many variations of ready-made scripts on the Internet. We took as a basis a script from PostgreSQL Experts, which can estimate bloat in tables, including TOAST, and in btree indexes. In our experience, its error is 10-20%.
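As a rough first signal, the statistics views can be queried directly. The query below is only a minimal sketch and is not the PostgreSQL Experts script mentioned above: it reports the share of dead rows per table, which hints at tables autovacuum is not keeping up with, but it does not account for free space left behind after vacuuming.

```sql
-- Approximate share of dead rows per table, taken from cumulative statistics.
-- The numbers are estimates and say nothing about already-vacuumed free space.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```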

Another way is to use the pgstattuple extension, which lets you look inside the pages and obtain both an estimated and an exact bloat value. In the second case, though, you will have to scan the whole table.
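A short sketch of what this can look like. The table name `my_table` and the index name `my_table_pkey` are placeholders for illustration, not objects from our system.

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Exact figures: reads every page of the table.
SELECT table_len, dead_tuple_percent, free_percent
FROM pgstattuple('my_table'::regclass);

-- Estimate (Postgres 9.5+): skips pages that the visibility map marks as all-visible.
SELECT table_len, scanned_percent, dead_tuple_percent, approx_free_percent
FROM pgstattuple_approx('my_table'::regclass);

-- B-tree index: a low average leaf density suggests index bloat.
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('my_table_pkey'::regclass);
```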

We consider a small amount of bloat, up to 20%, acceptable. It can be thought of as an analogue of fillfactor for tables and indexes. At 50% and above, performance problems may begin.

Ways to fight bloat

Postgres has several ways of dealing with bloat out of the box, but they do not always suit everyone.

Tune AUTOVACUUM so that bloat does not occur. Or, more precisely, so that it stays at a level acceptable to you. This sounds like "captain obvious" advice, but in reality it is not always easy to achieve. For example, you have active development with regular changes to the data schema, or some kind of data migration is under way. As a result, your load profile can change frequently and will usually differ from table to table. This means you constantly need to work slightly ahead and adjust AUTOVACUUM to the changing profile of each table. Obviously, that is not easy to do.
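Per-table tuning is done through storage parameters. The snippet below is a sketch only: the table name `my_hot_table` and the specific values are assumptions for illustration, and sensible numbers depend entirely on the table's size and write rate.

```sql
-- Make autovacuum trigger earlier and throttle less on one heavily updated table.
ALTER TABLE my_hot_table SET (
  autovacuum_vacuum_scale_factor = 0.01,  -- start after about 1% of rows are dead
  autovacuum_vacuum_threshold    = 1000,  -- plus this fixed number of dead rows
  autovacuum_vacuum_cost_delay   = 2      -- milliseconds to sleep when the cost limit is hit
);

-- Inspect the overrides currently set on the table.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'my_hot_table';
```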

Another common reason AUTOVACUUM cannot keep up with tables is long-running transactions that prevent it from cleaning up data that is still visible to those transactions. The recommendation here is also obvious: get rid of "hanging" transactions and minimize the duration of active ones. But if the load on your application is a hybrid of OLAP and OLTP, you may simultaneously have many frequent updates and short queries as well as long operations, for example building a report. In such a situation, it is worth thinking about spreading the load across different databases, which will allow each of them to be tuned more finely.
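One simple way to spot such transactions is to look at `pg_stat_activity`. This is a minimal sketch; the limit of ten rows and the 80-character query prefix are arbitrary choices.

```sql
-- Oldest open transactions first; sessions in state "idle in transaction"
-- with a large xact_age are the usual suspects.
SELECT pid,
       usename,
       state,
       xact_start,
       now() - xact_start AS xact_age,
       left(query, 80)    AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;
```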

Another example: even if the profile is homogeneous but the database is under very high load, even the most aggressive AUTOVACUUM may not cope, and bloat will occur. Scaling (vertical or horizontal) is the only solution.

What should you do when you have set up AUTOVACUUM but bloat keeps growing?

The VACUUM FULL command rebuilds the contents of tables and indexes and leaves only relevant data in them. It works perfectly for eliminating bloat, but while it runs an exclusive lock is held on the table (AccessExclusiveLock), which blocks all queries against this table, even selects. If you can afford to stop your service or part of it for some time (from tens of minutes to several hours, depending on the size of the database and your hardware), then this option is the best. Unfortunately, we do not have time to run VACUUM FULL during scheduled maintenance, so this method does not suit us.

The CLUSTER command rebuilds the contents of tables in the same way as VACUUM FULL, but lets you specify an index according to which the data will be physically ordered on disk (though the order is not guaranteed for rows added later). In certain situations this is a good optimization for queries that read multiple records by index. The drawback of the command is the same as that of VACUUM FULL: it locks the table while it runs.

The REINDEX command is similar to the previous two, but rebuilds a specific index or all indexes of a table. The locks are slightly weaker: ShareLock on the table (blocks modifications but allows selects) and AccessExclusiveLock on the index being rebuilt (blocks queries that use this index). However, Postgres 12 introduced the CONCURRENTLY parameter, which lets you rebuild an index without blocking concurrent inserts, updates, or deletes.

In earlier versions of Postgres, you can achieve a result similar to REINDEX CONCURRENTLY using CREATE INDEX CONCURRENTLY. It lets you create an index without a strict lock (ShareUpdateExclusiveLock, which does not interfere with concurrent queries), then replace the old index with the new one and drop the old index. This lets you eliminate index bloat without interfering with your application. Keep in mind that rebuilding indexes puts additional load on the disk subsystem.
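The two approaches can be sketched as follows. The table `orders` and the index names are hypothetical. Neither CREATE INDEX CONCURRENTLY nor DROP INDEX CONCURRENTLY can run inside a transaction block, and an index that backs a constraint needs extra steps that are not shown here.

```sql
-- Postgres 12 and later: rebuild in place without blocking writes.
REINDEX INDEX CONCURRENTLY idx_orders_created_at;

-- Earlier versions: build a twin index, drop the bloated one, take over its name.
CREATE INDEX CONCURRENTLY idx_orders_created_at_new ON orders (created_at);
DROP INDEX CONCURRENTLY idx_orders_created_at;
ALTER INDEX idx_orders_created_at_new RENAME TO idx_orders_created_at;
```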

So while there are ways to eliminate bloat "on the fly" for indexes, there are none for tables. This is where various external extensions come into play: pg_repack (formerly pg_reorg), pgcompact, pgcompacttable and others. In this article I will not compare them and will only talk about pg_repack, which, after some modification, we use ourselves.

How pg_repack works

Suppose we have a perfectly ordinary table, with indexes, constraints and, unfortunately, bloat. The first step of pg_repack is to create a log table that stores data about all changes made while it is running. A trigger replicates these changes on every insert, update, and delete. Then a table is created that is similar to the original in structure but has no indexes or constraints, so as not to slow down the data insertion process.

Next, pg_repack transfers data from the old table to the new one, automatically filtering out all irrelevant rows, and then creates indexes for the new table. While all these operations run, changes accumulate in the log table.

The next step is to transfer the changes to the new table. The migration is performed in several iterations, and when fewer than 20 entries remain in the log table, pg_repack acquires a strict lock, migrates the latest data, and replaces the old table with the new one in the Postgres system tables. This is the only, and very short, moment when you cannot work with the table. After that, the old table and the log table are dropped and space is freed in the file system. The process is complete.
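To make the log table and trigger step more concrete, here is a toy sketch of the idea. It is not pg_repack's actual code: the table `demo`, the log table `demo_log`, and the function `demo_log_change` are hypothetical names, and the real extension records changes in its own format.

```sql
CREATE TABLE demo (id int PRIMARY KEY, val text);

-- Every change to demo is appended here in the order it happens.
CREATE TABLE demo_log (
  log_id  bigserial PRIMARY KEY,
  op      text NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
  old_id  int,             -- key of the affected row, for UPDATE and DELETE
  new_row jsonb            -- new row contents, for INSERT and UPDATE
);

CREATE FUNCTION demo_log_change() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    INSERT INTO demo_log (op, old_id) VALUES (TG_OP, OLD.id);
  ELSIF TG_OP = 'UPDATE' THEN
    INSERT INTO demo_log (op, old_id, new_row) VALUES (TG_OP, OLD.id, to_jsonb(NEW));
  ELSE
    INSERT INTO demo_log (op, new_row) VALUES (TG_OP, to_jsonb(NEW));
  END IF;
  RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END $$;

CREATE TRIGGER demo_log_trg
AFTER INSERT OR UPDATE OR DELETE ON demo
FOR EACH ROW EXECUTE PROCEDURE demo_log_change();
```

Replaying `demo_log` in `log_id` order against a fresh copy of the table reproduces the changes made while the copy was being built, which is the role the log table plays in the description above.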

Everything looks great in theory, but what happens in practice? We tested pg_repack without load and under load, and checked how it behaves when stopped prematurely (in other words, with Ctrl+C). All tests passed.

We went to production, and that is where everything went differently from what we expected.

The first pancake is always lumpy

On the first cluster we got an error about a unique constraint violation:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

This constraint had the auto-generated name index_16508; it was created by pg_repack. Based on the attributes in its definition, we identified the constraint of "ours" that corresponds to it. The problem turned out to be that this is not an entirely ordinary constraint but a deferred one, i.e. it is checked later than the SQL command is executed, which leads to unexpected consequences.

Deferred constraints: why they are needed and how they work

A little theory about deferred constraints.
Let's consider a simple example: we have a reference table of cars with two attributes, the name of the car and its position in the directory.

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique
);



Suppose we needed to swap the first and second cars. The straightforward solution is to update the first value to the second, and the second to the first:

begin;
  update cars set ord = 2 where name = 'audi';
  update cars set ord = 1 where name = 'bmw';
commit;

But when we run this code, we get the expected constraint violation, because the order values in the table are unique:

[23505] ERROR: duplicate key value violates unique constraint "uk_cars"
Detail: Key (ord)=(2) already exists.

How can it be done differently? Option one: add an extra update to a value that is guaranteed not to exist in the table, for example "-1". In programming, this is called "swapping the values of two variables through a third". The only drawback of this method is the extra update.
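A sketch of option one against the `cars` table defined above, assuming 'audi' currently has ord = 1, 'bmw' has ord = 2, and no row uses -1:

```sql
begin;
  update cars set ord = -1 where name = 'audi';  -- park audi on an unused value
  update cars set ord = 1  where name = 'bmw';   -- position 1 is free now
  update cars set ord = 2  where name = 'audi';  -- position 2 is free now
commit;
```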

Option two: redesign the table to use a floating-point data type for the order value instead of integers. Then, when updating the value from 1 to, say, 2.5, the first entry will automatically "stand" between the second and the third. This solution works, but there are two limitations. First, it will not work for you if the value is used somewhere in the interface. Second, depending on the precision of the data type, you will have a limited number of possible insertions before you have to recalculate the values of all records.

Option three: make the constraint deferred so that it is checked only at commit time:

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique deferrable initially deferred
);

Since the logic of our original query guarantees that all values are unique at commit time, it will succeed.
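The difference is easy to try out by hand. The sketch below assumes the deferred version of the `cars` table defined just above and two sample rows; `SET CONSTRAINTS` is used to switch the check back to immediate mode for a single transaction.

```sql
insert into cars (name, ord) values ('audi', 1), ('bmw', 2);

begin;
  update cars set ord = 2 where name = 'audi';  -- duplicate ord = 2 is tolerated for now
  update cars set ord = 1 where name = 'bmw';   -- duplicate resolved
commit;                                         -- uk_cars is checked here and passes

begin;
  set constraints uk_cars immediate;            -- affects this transaction only
  update cars set ord = 1 where name = 'audi';  -- fails: bmw already has ord = 1
rollback;
```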

The example discussed above is, of course, very synthetic, but it conveys the idea. In our application, we use deferred constraints to implement the logic responsible for resolving conflicts when users simultaneously work with shared widget objects on a board. Using such constraints lets us make the application code a little simpler.

In general, depending on the constraint type, Postgres has three levels of granularity at which constraints are checked: row, statement, and transaction.
[Table: constraint types and the levels at which each can be checked]
Source: begriffs

CHECK and NOT NULL are always checked at the row level; for the other constraints, as can be seen from the table, there are different options. You can read more here.

To summarize briefly, deferred constraints in a number of situations provide more readable code and fewer commands. However, you pay for this by complicating the debugging process, since the moment the error occurs and the moment you find out about it are separated in time. Another possible problem is that the planner may not always be able to build an optimal plan if the query involves a deferred constraint.
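If you are not sure whether a database uses deferrable constraints at all, the system catalog can tell you. A minimal sketch:

```sql
-- contype: 'u' = unique, 'p' = primary key, 'f' = foreign key, 'x' = exclusion.
-- condeferred is true for INITIALLY DEFERRED constraints.
SELECT conrelid::regclass AS table_name,
       conname,
       contype,
       condeferred
FROM pg_constraint
WHERE condeferrable
ORDER BY conrelid::regclass::text, conname;
```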

Improving pg_repack

We have covered what deferred constraints are, but how do they relate to our problem? Let's recall the error we got earlier:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

It occurs when data is copied from the log table to the new table. This looks strange, because the data in the log table is committed together with the data in the source table. If it satisfies the constraints of the original table, how can it violate the same constraints in the new one?

As it turns out, the root of the problem lies in the previous step of pg_repack, which creates only indexes, but not constraints: the old table had a unique constraint, and the new one got a unique index instead.

It is important to note here that if the constraint is ordinary and not deferred, the unique index created instead is equivalent to this constraint, because unique constraints in Postgres are implemented by creating a unique index. But in the case of a deferred constraint the behavior is not the same, because an index cannot be deferred and is always checked at the moment the SQL command is executed.

Thus, the essence of the problem lies in the "deferral" of the check: in the original table it happens at commit time, and in the new table at the moment the SQL command is executed. This means we need to make sure that the checks are performed in the same way in both cases: either always deferred, or always immediately.
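The mismatch can be reproduced without pg_repack. In this sketch the tables `old_t` and `new_t` are hypothetical stand-ins for the original table (deferred unique constraint) and the rebuilt one (plain unique index); the same two commands succeed against the first and fail against the second.

```sql
-- Stand-in for the original table: deferred unique constraint.
create table old_t (
  id int,
  val int,
  constraint uk_old_t__val unique (val) deferrable initially deferred
);

-- Stand-in for the rebuilt table: only a unique index.
create table new_t (id int, val int);
create unique index uk_new_t__val on new_t (val);

insert into old_t values (1, 0);
insert into new_t values (1, 0);

begin;
  insert into old_t values (2, 0);           -- accepted, the check is postponed
  update old_t set val = 1 where id = 2;     -- conflict resolved
commit;                                      -- check runs here and passes

begin;
  insert into new_t values (2, 0);           -- fails at once with a duplicate key error
rollback;
```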

So what ideas did we have?

Create an index similar to the deferred constraint

The first idea is to perform both checks in immediate mode. This may produce some false positive constraint violations, but if there are few of them, it should not affect users' work, since such conflicts are a normal situation for them. They occur, for example, when two users start editing the same widget at the same time and the second user's client has not yet received the information that the widget is already locked for editing by the first user. In such a situation, the server rejects the second user's request, and their client rolls back the changes and locks the widget. A little later, when the first user finishes editing, the second will receive the information that the widget is no longer locked and will be able to repeat their action.

To ensure that checks are always performed in non-deferred mode, we created a new index similar to the original deferred constraint:

CREATE UNIQUE INDEX CONCURRENTLY uk_tablename__immediate ON tablename (id, index);
-- run pg_repack
DROP INDEX CONCURRENTLY uk_tablename__immediate;

In the test environment, we got only a few expected errors. Success! We ran pg_repack on production again and got 5 errors on the first cluster in an hour of work. This is an acceptable result. However, on the second cluster the number of errors increased many times over, and we had to stop pg_repack.

Why did this happen? The probability of an error depends on how many users are working with the same widgets at the same time. Apparently, at that moment there were far fewer competing changes to the data stored on the first cluster than on the others, i.e. we were simply "lucky".

The idea did not work. At that point, we saw two other solutions: rewrite our application code to do without deferred constraints, or "teach" pg_repack to work with them. We chose the second.

Replace indexes in the new table with deferred constraints from the original table

The purpose of the change was obvious: if the original table has a deferred constraint, then the new table needs to get such a constraint as well, not an index.

To test our changes, we wrote a simple test:

  • a table with a deferred constraint and one record;
  • in a loop, insert data that conflicts with the existing record;
  • perform an update so that the data no longer conflicts;
  • commit the changes.

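create table test_table
(
  id serial,
  val int,
  constraint uk_test_table__val unique (val) deferrable initially deferred
);

INSERT INTO test_table (val) VALUES (0);

-- Needs Postgres 11+ (COMMIT inside DO); run it outside an explicit transaction block.
DO $$
DECLARE
  v_id int;
BEGIN
  FOR i IN 1..10000 LOOP
    -- conflicts with the existing row (val = 0) until the update below
    INSERT INTO test_table (val) VALUES (0) RETURNING id INTO v_id;
    UPDATE test_table SET val = i WHERE id = v_id;
    COMMIT;
  END LOOP;
END $$;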

The original version of pg_repack always crashed on the first insert; the modified version worked without errors. Great.

We go to production and again get an error at the same phase of copying data from the log table to the new one:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

A classic situation: everything works in test environments, but not in production?!

APPLY_COUNT and the junction of two batches

We began analyzing the code literally line by line and discovered an important point: data is transferred from the log table to the new one in batches, and the APPLY_COUNT constant specifies the batch size:

for (;;)
{
    num = apply_log(connection, table, APPLY_COUNT);

    if (num > MIN_TUPLES_BEFORE_SWITCH)
        continue;  /* there might be still some tuples, repeat. */
    ...
}

The problem is that the data from an original transaction, in which several operations could potentially violate the constraint, may end up at the junction of two batches when it is transferred: some of the commands will be committed in the first batch, and the rest in the second. And here it is a matter of luck: if the commands in the first batch do not violate anything, everything is fine, but if they do, an error occurs.

APPLY_COUNT is equal to 1000 records, which explains why our tests succeeded: they did not cover the "batch junction" case. We used two commands, insert and update, so exactly 500 transactions of two commands each always fit into a batch and we had no problems. After adding a second update, our test stopped working:

DO $$
DECLARE
  v_id int;
BEGIN
  FOR i IN 1..10000 LOOP
    INSERT INTO test_table (val) VALUES (1) RETURNING id INTO v_id;
    UPDATE test_table SET val = i WHERE id = v_id;
    UPDATE test_table SET val = i WHERE id = v_id; -- one more update
    COMMIT;
  END LOOP;
END $$;

So, the next task is to make sure that data from the original table that was changed in one transaction also ends up in the new table within one transaction.
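The arithmetic behind the junction is simple: with three log rows per transaction and batches of 1000 rows, transaction number 334 occupies rows 1000 to 1002 (assuming the log starts at the first transaction), so its insert lands in the first batch and both updates in the second. The sketch below replays that situation by hand on a hypothetical table `new_table` that already has the deferred constraint; the batch boundaries are simulated with separate transactions.

```sql
create table new_table (
  id int,
  val int,
  constraint uk_new_table__val unique (val) deferrable initially deferred
);
insert into new_table values (1, 0);

-- Tail of batch 1: only the insert of the straddling transaction.
begin;
  insert into new_table values (2, 0);
commit;   -- the deferred check runs now and fails with a duplicate key error

-- Head of batch 2: the update that would have resolved the conflict.
-- It arrives too late, the row it targets was never committed.
begin;
  update new_table set val = 1 where id = 2;
commit;
```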

Giving up batching

And again we had two solutions. First: give up splitting into batches altogether and transfer the data in a single transaction. The advantage of this solution was its simplicity: the required code changes were minimal (by the way, older versions of pg_reorg worked exactly this way). But there is a problem: we create a long-running transaction, and this, as mentioned earlier, threatens to produce new bloat.

The second solution is more complex, but probably more correct: create a column in the log table with the identifier of the transaction that added the data to the table. Then, when copying the data, we can group it by this attribute and ensure that related changes are transferred together. A batch will be formed from several transactions (or one large one), and its size will vary depending on how much data was changed in those transactions. It is important to note that since data from different transactions enters the log table in random order, it will no longer be possible to read it sequentially, as before. A seqscan for every request with a filter on tx_id is too expensive; an index is needed, but it will also slow the method down because of the overhead of updating it. In general, as always, you have to sacrifice something.
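We did not implement the second option, so the following is only a sketch of how such grouping might look, not a description of pg_repack or of our patch. The table `change_log`, its columns, and the target of roughly 1000 rows per batch are all assumptions made for illustration.

```sql
-- Hypothetical log table that records which transaction produced each row.
create table change_log (
  log_id  bigserial primary key,
  tx_id   bigint not null default txid_current(),
  payload jsonb
);
create index change_log_tx_id_idx on change_log (tx_id);

-- Pick whole transactions, oldest first, until about 1000 rows are covered.
-- A transaction that crosses the 1000-row mark is included in full.
with per_tx as (
  select tx_id, min(log_id) as first_log_id, count(*) as n
  from change_log
  group by tx_id
), ranked as (
  select tx_id, n, sum(n) over (order by first_log_id) as running
  from per_tx
)
select l.*
from change_log l
join ranked r using (tx_id)
where r.running - r.n < 1000
order by l.log_id;
```

Ordering transactions by their first log row is a simplification; a real implementation would also have to decide how to handle transactions that interleave in the log.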

So we decided to start with the first option, as the simpler one. First, we needed to understand whether a long transaction would be a real problem. Since the main transfer of data from the old table to the new one also happens in one long transaction, the question became "by how much will we lengthen this transaction?" The duration of the first transaction depends mainly on the size of the table. The duration of the new one depends on how many changes accumulate in the table during the data transfer, i.e. on the intensity of the load. The pg_repack run took place at a time of minimal service load, and the volume of changes was incomparably small relative to the original size of the table. We decided that we could neglect the duration of the new transaction (for comparison, it is on average 1 hour versus 2-3 minutes).

The tests were positive. We launched on production as well. For clarity, here is a picture showing the size of one of the databases after the run:

[Figure: size of one of the databases after the pg_repack run]

Since we were completely satisfied with this solution, we did not try to implement the second one, but we are considering discussing it with the developers of the extension. Our current revision, unfortunately, is not yet ready for publication, since we solved the problem only for unique deferred constraints, and a complete patch needs to support the other types as well. We hope to be able to do this in the future.

Perhaps you are wondering why we got involved in modifying pg_repack at all and did not, for example, use its analogues? At some point we thought about this too, but our positive experience of using it earlier, on tables without deferred constraints, motivated us to try to understand the essence of the problem and fix it. In addition, using other solutions also requires time for testing, so we decided that we would first try to fix the problem in pg_repack, and if we realized that we could not do it in a reasonable time, then we would start looking at analogues.

Conclusions

What we can recommend based on our own experience:

  1. Monitor your bloat. Based on the monitoring data, you can understand how well autovacuum is configured.
  2. Tune AUTOVACUUM to keep bloat at an acceptable level.
  3. If bloat is still growing and you cannot overcome it using out-of-the-box tools, do not be afraid to use external extensions. The main thing is to test everything well.
  4. Do not be afraid to modify external solutions to suit your needs; sometimes this can be more effective and even easier than changing your own code.

Source: www.habr.com
