Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Sakamakon kumburi a kan tebur da fihirisa an san shi sosai kuma yana nan ba kawai a cikin Postgres ba. Akwai hanyoyin magance shi daga cikin akwatin, kamar VACUUM FULL ko CLUSTER, amma suna kulle tebur yayin aiki don haka ba za a iya amfani da su koyaushe ba.

Labarin zai ƙunshi ƙaramin ka'idar game da yadda kumburi ke faruwa, yadda zaku iya yaƙi da shi, game da ƙuntatawa da aka jinkirta da kuma matsalolin da suke kawowa ga amfani da pg_repack tsawo.

An rubuta wannan labarin bisa ga maganata a PgConf.Russia 2020.

Me yasa kumburi ke faruwa?

Postgres ya dogara ne akan samfuri iri-iri (MVCC). Asalinta shine cewa kowane layi a cikin tebur na iya samun iri-iri da yawa, yayin da ma'amaloli ke ganin ɗaya daga cikin waɗannan nau'ikan, amma ba lallai ɗaya ne ɗaya ɗaya ba. Wannan yana ba da damar ma'amaloli da yawa suyi aiki lokaci guda kuma ba su da wani tasiri akan juna.

Babu shakka, duk waɗannan nau'ikan suna buƙatar adana su. Postgres yana aiki tare da shafi na ƙwaƙwalwar ajiya da shafi kuma shafi shine mafi ƙarancin adadin bayanan da za'a iya karantawa daga faifai ko rubutawa. Bari mu kalli ƙaramin misali don fahimtar yadda hakan ke faruwa.

Bari mu ce muna da tebur wanda muka ƙara bayanai da yawa. Sabbin bayanai sun bayyana a shafin farko na fayil inda aka adana tebur. Waɗannan nau'ikan layuka ne masu rai waɗanda ke samuwa ga wasu ma'amaloli bayan ƙaddamarwa (don sauƙi, za mu ɗauka cewa an ƙaddamar da matakin keɓewa).

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Daga nan mun sabunta ɗaya daga cikin shigarwar, ta haka ne muka yiwa tsohon sigar alama ba ta da amfani.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Mataki-mataki, sabuntawa da share sassan layi, mun ƙare tare da shafi wanda kusan rabin bayanan "datti" ne. Wannan bayanan baya ganuwa ga kowace ma'amala.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Postgres yana da tsari VACUUM, wanda ke tsabtace sifofin da ba a daɗe ba kuma yana ba da damar samun sabbin bayanai. Amma idan ba a daidaita shi da ƙarfi ba ko kuma yana aiki a cikin wasu tebur, to “bayanan shara” ya rage, kuma dole ne mu yi amfani da ƙarin shafuka don sabbin bayanai.

Don haka a cikin misalinmu, a wani lokaci tebur zai ƙunshi shafuka huɗu, amma rabinsa kawai zai ƙunshi bayanan kai tsaye. A sakamakon haka, lokacin samun dama ga tebur, za mu karanta bayanai da yawa fiye da wajibi.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Ko da VACUUM yanzu yana share duk juzu'in jere maras dacewa, lamarin ba zai inganta sosai ba. Za mu sami sarari kyauta a cikin shafuka ko ma gabaɗayan shafuka don sabbin layuka, amma har yanzu za mu ci gaba da karanta ƙarin bayanai fiye da larura.
Af, idan shafi mara komai (na biyu a misalinmu) ya kasance a ƙarshen fayil ɗin, to VACUUM zai iya datsa shi. Amma yanzu tana tsakiya, don haka babu abin da za a yi da ita.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Lokacin da adadin irin waɗannan shafuka marasa komai ko kuma masu ɗimbin yawa sun zama babba, wanda ake kira bloat, ya fara rinjayar aiki.

Duk abin da aka bayyana a sama shine makanikai na faruwar kumburi a cikin tebur. A cikin fihirisa wannan yana faruwa ta hanya ɗaya.

Ina da kumburi?

Akwai hanyoyi da yawa don sanin ko kuna da kumburi. Manufar farko ita ce yin amfani da ƙididdiga na Postgres na ciki, wanda ya ƙunshi kimanin bayanai game da adadin layuka a cikin tebur, yawan layuka na "rayuwa", da dai sauransu. Kuna iya samun yawancin bambance-bambancen rubutun da aka yi a Intanet. Mun dauki a matsayin tushe rubutun daga Kwararrun PostgreSQL, waɗanda za su iya kimanta tebur ɗin bloat tare da toast da btree indexes. A cikin kwarewarmu, kuskurensa shine 10-20%.

Wata hanya ita ce amfani da tsawo pgstattuple, wanda ke ba ku damar duba cikin shafuka kuma ku sami duka ƙididdiga da ƙimar ƙima. Amma a cikin akwati na biyu, dole ne ku duba dukkan teburin.

Muna la'akari da ƙananan ƙima, har zuwa 20%, yarda. Ana iya la'akari da shi azaman analog na fillfactor don alluna и fihirisa. A 50% da sama, matsalolin aiki na iya farawa.

Hanyoyin magance kumburi

Postgres yana da hanyoyi da yawa don magance kumburi daga cikin akwatin, amma ba koyaushe dace da kowa ba.

Sanya AUTOVACUUM don kada kumburi ya faru. Ko fiye daidai, don kiyaye shi a matakin yarda da ku. Wannan yana kama da shawarar "kaftin", amma a gaskiya wannan ba koyaushe ba ne mai sauƙi don cimmawa. Misali, kuna da ci gaba mai aiki tare da canje-canje na yau da kullun zuwa tsarin bayanai, ko wani nau'in ƙaura bayanai yana faruwa. Sakamakon haka, bayanin martabar ku na iya canzawa akai-akai kuma yawanci zai bambanta daga tebur zuwa tebur. Wannan yana nufin kuna buƙatar ci gaba da aiki kaɗan gaba kuma daidaita AUTOVACUUM zuwa bayanin martaba na kowane tebur. Amma a fili wannan ba shi da sauƙi a yi.

Wani dalili na yau da kullun da ya sa AUTOVACUUM ba zai iya ci gaba da kasancewa tare da tebur ba saboda akwai ma'amaloli na dogon lokaci waɗanda ke hana shi tsaftace bayanan da ke akwai ga waɗannan ma'amala. Shawarar a nan ita ma a bayyane take - kawar da ma'amaloli na "dangling" kuma rage lokacin ma'amaloli masu aiki. Amma idan nauyin da ke kan aikace-aikacenku shine matasan OLAP da OLTP, to, zaku iya samun sabuntawa akai-akai akai-akai da gajerun tambayoyin, da kuma ayyuka na dogon lokaci - alal misali, gina rahoto. A cikin irin wannan yanayi, yana da kyau a yi tunani game da yada kaya a kan tushe daban-daban, wanda zai ba da damar ƙarin daidaitawa kowane ɗayansu.

Wani misali - ko da bayanin martaba yana kama da juna, amma bayanan yana ƙarƙashin babban nauyi, to ko da mafi yawan AUTOVACUUM ba zai iya jurewa ba, kuma kumburi zai faru. Scaling (a tsaye ko a kwance) shine kawai mafita.

Abin da za ku yi a cikin yanayin da kuka kafa AUTOVACUUM, amma kumburi ya ci gaba da girma.

tawagar WUTA CIKAKKEN yana sake gina abubuwan da ke cikin tebur da fihirisa kuma ya bar bayanan da suka dace kawai a cikinsu. Don kawar da kumburi, yana aiki daidai, amma yayin aiwatar da shi an kama keɓaɓɓen kulle akan tebur (AccessExclusiveLock), wanda ba zai ƙyale aiwatar da tambayoyi akan wannan tebur ba, har ma da zaɓi. Idan za ku iya samun damar dakatar da sabis ɗin ku ko ɓangarensa na ɗan lokaci (daga dubun-dubatar mintuna zuwa sa'o'i da yawa dangane da girman bayanan da kayan aikin ku), to wannan zaɓi shine mafi kyau. Abin baƙin ciki, ba mu da lokacin da za mu gudu VACUUM FULL a lokacin da aka tsara tsare-tsaren, don haka wannan hanya ba ta dace da mu.

tawagar KUNGIYA Yana sake gina abubuwan da ke cikin tebur daidai da VACUUM FULL, amma yana ba ku damar ƙididdige ƙididdiga bisa ga bayanan da za a yi odar ta jiki akan faifai (amma nan gaba ba a ba da garantin odar don sabbin layuka ba). A wasu yanayi, wannan yana da kyau ingantawa ga yawan tambayoyi - tare da karanta bayanai da yawa ta fihirisa. Rashin amfanin umarnin daidai yake da na VACUUM FULL - yana kulle tebur yayin aiki.

tawagar REINDEX kama da biyun da suka gabata, amma yana sake gina ƙayyadaddun fihirisar ko duk fihirisar tebur. Makullan sun ɗan yi rauni: ShareLock akan tebur (yana hana gyare-gyare, amma yana ba da damar zaɓi) da AccessExclusiveLock akan index ɗin ana sake ginawa (yana toshe tambayoyin ta amfani da wannan fihirisar). Koyaya, a cikin sigar 12th na Postgres wani siga ya bayyana A lokaci guda, wanda ke ba ka damar sake gina fihirisar ba tare da toshe ƙari, gyare-gyare, ko share bayanan lokaci guda ba.

A cikin sigar farko na Postgres, zaku iya samun sakamako mai kama da REINDEX A KAN YIN AMFANI Ƙirƙiri INDEX a lokaci guda. Yana ba ku damar ƙirƙira fihirisar ba tare da tsantsan kullewa ba (ShareUpdateExclusiveLock, wanda baya tsoma baki tare da daidaitattun tambayoyin), sannan ku maye gurbin tsohuwar fihirisar da sabo kuma ku share tsohuwar fihirisar. Wannan yana ba ku damar kawar da kumburin index ba tare da tsoma baki tare da aikace-aikacen ku ba. Yana da mahimmanci a yi la'akari da cewa lokacin sake gina ma'auni za a sami ƙarin kaya akan tsarin faifai.

Don haka, idan don alamomi akwai hanyoyin da za a kawar da kumburi "a kan tashi," to babu wani tebur. Wannan shi ne inda daban-daban kari na waje suka shigo cikin wasa: pg_repack (tsohon pg_reorg), pgcompact, pgcompacttable da sauransu. A cikin wannan labarin, ba zan kwatanta su ba kuma zan yi magana ne kawai game da pg_repack, wanda, bayan wasu gyare-gyare, muna amfani da kanmu.

Yadda pg_repack ke aiki

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta
Bari mu ce muna da tebur na yau da kullun - tare da alamomi, ƙuntatawa kuma, rashin alheri, tare da kumburi. Mataki na farko na pg_repack shine ƙirƙirar tebur log don adana bayanai game da duk canje-canje yayin da yake gudana. Mai kunnawa zai maimaita waɗannan canje-canje don kowane sakawa, sabuntawa da sharewa. Sannan sai a samar da teburi, mai kama da na asali a tsari, amma ba tare da fihirisa da takurawa ba, don kada a rage saurin shigar da bayanai.

Bayan haka, pg_repack yana canja wurin bayanai daga tsohon tebur zuwa sabon tebur, ta atomatik tace duk layuka marasa mahimmanci, sannan ƙirƙirar firikwensin sabon tebur. Yayin aiwatar da duk waɗannan ayyukan, canje-canje suna taruwa a cikin tebur ɗin log.

Mataki na gaba shine don canja wurin canje-canje zuwa sabon tebur. Ana yin ƙaura akan gyare-gyare da yawa, kuma lokacin da aka rage ƙasa da shigarwar 20 a cikin tebur ɗin log ɗin, pg_repack ya sami makulli mai ƙarfi, yana ƙaura sabbin bayanai, kuma ya maye gurbin tsohon tebur tare da sabon a cikin teburin tsarin Postgres. Wannan shine kawai kuma ɗan gajeren lokacin da ba za ku iya yin aiki tare da tebur ba. Bayan wannan, an share tsohon tebur da tebur tare da rajistan ayyukan kuma an saki sarari a cikin tsarin fayil. An kammala tsari.

Komai yana da kyau a ka'idar, amma menene ya faru a aikace? Mun gwada pg_repack ba tare da kaya ba kuma a ƙarƙashin kaya, kuma mun duba aikin sa idan an dakatar da shi (a wasu kalmomi, ta amfani da Ctrl + C). Duk gwaje-gwajen sun tabbata.

Mun je kantin sayar da abinci - sannan komai bai tafi kamar yadda muka zata ba.

Pancake na farko ana sayarwa

A gungu na farko mun sami kuskure game da keta haddi na musamman:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

Wannan iyakance yana da sunan da aka samar ta atomatik index_16508 - pg_repack ne ya ƙirƙira shi. Dangane da halayen da aka haɗa a cikin abun da ke ciki, mun ƙayyade ƙuntatawar "mu" wanda ya dace da shi. Matsalar ta zama cewa wannan ba ƙayyadaddun ƙayyadaddun tsarin ba ne, amma wanda aka jinkirta (ƙuntatawa da aka jinkirta), i.e. Tabbatarwarsa ana yin shi daga baya fiye da umarnin sql, wanda ke haifar da sakamakon da ba a zata ba.

Matsalolin da aka jinkirta: dalilin da yasa ake buƙatar su da kuma yadda suke aiki

Ka'idar kadan game da ƙuntatawa da aka jinkirta.
Bari mu yi la'akari da wani sauki misali: muna da tebur-reference littafin motoci da biyu halaye - sunan da oda na mota a cikin shugabanci.
Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique
);



A ce muna bukatar musanya motoci na farko da na biyu. Madaidaicin bayani shine sabunta ƙimar farko zuwa na biyu, na biyu kuma zuwa ta farko:

begin;
  update cars set ord = 2 where name = 'audi';
  update cars set ord = 1 where name = 'bmw';
commit;

Amma lokacin da muka gudanar da wannan lambar, muna tsammanin cin zarafi saboda tsari na dabi'u a cikin tebur na musamman ne:

[23305] ERROR: duplicate key value violates unique constraint “uk_cars”
Detail: Key (ord)=(2) already exists.

Ta yaya zan iya yi daban? Zabi na ɗaya: ƙara ƙarin maye gurbin ƙima zuwa oda wanda aka ba da tabbacin ba zai wanzu a cikin tebur ba, misali "-1". A cikin shirye-shiryen, ana kiran wannan "musayar dabi'u na masu canji biyu zuwa na uku." Iyakar koma bayan wannan hanya shine ƙarin sabuntawa.

Zabi na biyu: Sake tsara tebur don amfani da nau'in bayanan da ke kan iyo don ƙimar oda maimakon lamba. Sa'an nan, lokacin da ake sabunta darajar daga 1, misali, zuwa 2.5, shigarwa na farko zai "tsaye" ta atomatik tsakanin na biyu da na uku. Wannan maganin yana aiki, amma akwai iyakoki guda biyu. Na farko, ba zai yi aiki a gare ku ba idan ana amfani da ƙimar a wani wuri a cikin dubawa. Na biyu, dangane da madaidaicin nau'in bayanan, za ku sami iyakataccen adadin abubuwan da za a iya sakawa kafin a sake ƙididdige ƙimar duk bayanan.

Zabi na uku: sanya takunkumin ya jinkirta don a duba shi kawai a lokacin aikatawa:

create table cars
(
  name text constraint pk_cars primary key,
  ord integer not null constraint uk_cars unique deferrable initially deferred
);

Tun da ma'anar buƙatarmu ta farko ta tabbatar da cewa duk dabi'u sun bambanta a lokacin ƙaddamarwa, zai yi nasara.

Misalin da aka tattauna a sama, ba shakka, na roba ne, amma yana bayyana ra'ayin. A cikin aikace-aikacen mu, muna amfani da ƙuntatawa da aka jinkirta don aiwatar da dabaru waɗanda ke da alhakin warware rikice-rikice lokacin da masu amfani ke aiki tare da abubuwan widget ɗin da aka raba a kan allo. Yin amfani da irin waɗannan ƙuntatawa yana ba mu damar yin lambar aikace-aikacen ɗan sauƙi.

Gabaɗaya, ya danganta da nau'in ƙuntatawa, Postgres yana da matakai uku na granularity don duba su: jere, ma'amala, da matakan magana.
Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta
source: begriffs

Dubawa da BA NULL koyaushe ana duba su a matakin layi; don wasu ƙuntatawa, kamar yadda ake iya gani daga tebur, akwai zaɓuɓɓuka daban-daban. Kuna iya karantawa a nan.

Don taƙaitawa, ƙuntatawa da aka jinkirta a cikin yanayi da yawa suna ba da ƙarin lambar da za a iya karantawa da ƙarancin umarni. Duk da haka, dole ne ku biya wannan ta hanyar rikitar da tsarin gyarawa, tun lokacin da kuskuren ya faru da lokacin da kuka gano game da shi sun rabu cikin lokaci. Wata matsala mai yuwuwa ita ce mai tsara jadawalin ƙila ba koyaushe zai iya gina ingantaccen tsari ba idan buƙatar ta ƙunshi ƙuntatawa da aka jinkirta.

Inganta pg_repack

Mun rufe mene ne matsalolin da aka jinkirta, amma ta yaya suke da alaƙa da matsalarmu? Mu tuna kuskuren da muka samu a baya:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

Yana faruwa lokacin da aka kwafi bayanai daga tebur log zuwa sabon tebur. Wannan abin mamaki ne saboda... an ƙaddamar da bayanan da ke cikin tebur ɗin log ɗin tare da bayanan da ke cikin teburin tushe. Idan sun gamsar da ƙuntatawa na tebur na asali, ta yaya za su keta ƙuntatawa ɗaya a cikin sabon?

Kamar yadda ya fito, tushen matsalar yana cikin mataki na baya na pg_repack, wanda ke haifar da ƙididdiga kawai, amma ba ƙuntatawa ba: tsohon tebur yana da ƙuntatawa na musamman, kuma sabon ya haifar da ƙididdiga na musamman maimakon.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Yana da mahimmanci a lura a nan cewa idan ƙuntatawa ta al'ada ce kuma ba a jinkirta ba, to, maƙasudin musamman da aka halicce shi a maimakon haka yana daidai da wannan ƙuntatawa, saboda. Ana aiwatar da ƙuntatawa na musamman a cikin Postgres ta hanyar ƙirƙirar fihirisa na musamman. Amma game da ƙuntatawa da aka jinkirta, halayen ba ɗaya ba ne, saboda ba za a iya jinkirta fihirisar ba kuma ana bincika koyaushe a lokacin da aka aiwatar da umarnin sql.

Don haka, ainihin matsalar yana cikin "jinkiri" na rajistan: a cikin tebur na asali yana faruwa a lokacin aikatawa, kuma a cikin sabon tebur a lokacin da aka aiwatar da umarnin sql. Wannan yana nufin muna buƙatar tabbatar da cewa an yi cak ɗin a cikin lokuta biyu: ko dai a jinkirta, ko kuma nan da nan.

To wane ra'ayi muka samu?

Ƙirƙiri fihirisa mai kama da wanda aka jinkirta

Tunanin farko shine a yi duka cak a cikin yanayin nan take. Wannan na iya haifar da hane-hane masu kyau da yawa na ƙarya, amma idan akwai kaɗan daga cikinsu, wannan bai kamata ya shafi aikin masu amfani ba, tunda irin wannan rikice-rikicen yanayi ne na al'ada a gare su. Suna faruwa, alal misali, lokacin da masu amfani biyu suka fara gyara widget ɗin iri ɗaya a lokaci guda, kuma abokin ciniki na mai amfani na biyu ba shi da lokacin karɓar bayanin cewa an riga an toshe widget ɗin don gyara ta mai amfani na farko. A cikin irin wannan yanayi, uwar garken ya ƙi mai amfani na biyu, kuma abokin ciniki ya sake mayar da canje-canje kuma ya toshe widget din. Bayan ɗan lokaci, lokacin da mai amfani na farko ya gama gyara, na biyu zai karɓi bayanin cewa ba a toshe widget ɗin kuma za su iya maimaita aikinsu.

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Don tabbatar da cewa cak ɗin koyaushe suna cikin yanayin da ba a jinkirta ba, mun ƙirƙiri sabon fihirisa mai kama da ainihin ƙuntatawa da aka jinkirta:

CREATE UNIQUE INDEX CONCURRENTLY uk_tablename__immediate ON tablename (id, index);
-- run pg_repack
DROP INDEX CONCURRENTLY uk_tablename__immediate;

A cikin yanayin gwaji, mun sami kurakuran da ake tsammani kawai. Nasara! Mun sake gudu pg_repack akan samarwa kuma mun sami kurakurai 5 akan gungu na farko a cikin awa ɗaya na aiki. Wannan sakamako ne mai karbuwa. Koyaya, tuni akan gungu na biyu adadin kurakurai ya ƙaru sosai kuma dole ne mu dakatar da pg_repack.

Me yasa abin ya faru? Yiwuwar kuskure ya dogara da yawan masu amfani da ke aiki tare da widget ɗin iri ɗaya a lokaci guda. A bayyane yake, a wannan lokacin an sami raguwar sauye-sauye masu gasa tare da bayanan da aka adana a gungu na farko fiye da na sauran, watau. mun kasance kawai "sa'a".

Tunanin bai yi aiki ba. A wannan lokacin, mun ga wasu mafita guda biyu: sake rubuta lambar aikace-aikacen mu don rarraba tare da ƙuntatawa da aka jinkirta, ko "koyar" pg_repack don aiki tare da su. Mun zabi na biyu.

Sauya fihirisa a cikin sabon tebur tare da ƙuntatawa da aka jinkirta daga tebur na asali

Manufar bita ya kasance a bayyane - idan tebur na asali yana da ƙuntatawa da aka jinkirta, to don sabon kana buƙatar ƙirƙirar irin wannan ƙuntatawa, kuma ba index ba.

Don gwada canje-canjenmu, mun rubuta gwaji mai sauƙi:

  • tebur tare da ƙuntatawa da aka jinkirta da kuma rikodin guda ɗaya;
  • saka bayanai a cikin madauki wanda ya yi karo da rikodin da ke akwai;
  • yi sabuntawa - bayanan baya yin rikici;
  • aikata canje-canje.

create table test_table
(
  id serial,
  val int,
  constraint uk_test_table__val unique (val) deferrable initially deferred 
);

INSERT INTO test_table (val) VALUES (0);
FOR i IN 1..10000 LOOP
  BEGIN
    INSERT INTO test_table VALUES (0) RETURNING id INTO v_id;
    UPDATE test_table set val = i where id = v_id;
    COMMIT;
  END;
END LOOP;

Sigar asali ta pg_repack koyaushe tana faɗuwa a farkon sakawa, sigar da aka gyara tana aiki ba tare da kurakurai ba. Mai girma.

Muna zuwa samarwa kuma muna sake samun kuskure a lokaci guda na kwafin bayanai daga tebur log zuwa wani sabo:

$ ./pg_repack -t tablename -o id
INFO: repacking table "tablename"
ERROR: query failed: 
    ERROR: duplicate key value violates unique constraint "index_16508"
DETAIL:  Key (id, index)=(100500, 42) already exists.

Halin yanayi: duk abin da ke aiki a cikin yanayin gwaji, amma ba a cikin samarwa ba?!

APPLY_COUNT da mahaɗin batches biyu

Mun fara nazarin lambar a zahiri ta layi kuma mun gano wani muhimmin batu: ana canja wurin bayanai daga teburin log zuwa wani sabo cikin batches, APPLY_COUNT akai-akai yana nuna girman batch:

for (;;)
{
num = apply_log(connection, table, APPLY_COUNT);

if (num > MIN_TUPLES_BEFORE_SWITCH)
     continue;  /* there might be still some tuples, repeat. */
...
}

Matsalar ita ce, bayanan daga ainihin ma'amala, wanda ayyuka da yawa na iya yuwuwar keta ƙuntatawa, lokacin da aka canjawa wuri, na iya ƙarewa a mahaɗin batches biyu - rabin umarnin za a aiwatar a cikin rukunin farko, da sauran rabin. a karo na biyu. Kuma a nan, dangane da sa'ar ku: idan ƙungiyoyin ba su keta wani abu a cikin rukuni na farko ba, to, duk abin da ke da kyau, amma idan sun yi, kuskure ya faru.

APPLY_COUNT yana daidai da rikodin 1000, wanda ke bayanin dalilin da yasa gwaje-gwajenmu suka yi nasara - ba su rufe batun "batch junction". Mun yi amfani da umarni guda biyu - sakawa da sabuntawa, don haka daidai ma'amaloli 500 na umarni biyu koyaushe ana sanya su cikin tsari kuma ba mu fuskanci wata matsala ba. Bayan ƙara sabuntawa na biyu, gyaran mu ya daina aiki:

FOR i IN 1..10000 LOOP
  BEGIN
    INSERT INTO test_table VALUES (1) RETURNING id INTO v_id;
    UPDATE test_table set val = i where id = v_id;
    UPDATE test_table set val = i where id = v_id; -- one more update
    COMMIT;
  END;
END LOOP;

Don haka, aiki na gaba shine tabbatar da cewa bayanai daga tebur na asali, waɗanda aka canza a cikin ma'amala ɗaya, sun ƙare a cikin sabon tebur kuma a cikin ma'amala ɗaya.

Kin daga batching

Kuma mun sake samun mafita guda biyu. Na farko: bari gaba daya mu bar partitioning cikin batches da kuma canja wurin bayanai a daya ma'amala. Amfanin wannan maganin shine sauƙin sa - canje-canjen lambar da ake buƙata sun kasance kaɗan (a hanya, a cikin tsofaffin sigogin pg_reorg yayi aiki daidai da haka). Amma akwai matsala - muna ƙirƙirar ma'amala mai tsawo, kuma wannan, kamar yadda aka fada a baya, yana da barazana ga bayyanar sabon kumburi.

Magani na biyu ya fi rikitarwa, amma mai yiwuwa ya fi daidai: ƙirƙiri ginshiƙi a cikin tebur na log tare da mai gano ma'amala wanda ya kara bayanai zuwa tebur. Bayan haka, lokacin da muka kwafi bayanai, za mu iya haɗa su ta wannan sifa kuma mu tabbatar da cewa an canza canje-canje masu alaƙa tare. Za a samar da rukunin daga ma'amaloli da yawa (ko babba ɗaya) kuma girmansa zai bambanta dangane da nawa aka canza bayanai a cikin waɗannan ma'amaloli. Yana da mahimmanci a lura cewa tun da bayanai daga ma'amaloli daban-daban sun shiga teburin log ɗin a cikin tsari bazuwar, ba za a ƙara samun damar karanta shi a jere ba, kamar yadda yake a da. seqscan ga kowane buƙatun tare da tacewa ta tx_id yana da tsada sosai, ana buƙatar index, amma kuma zai rage tsarin saboda yawan sabunta shi. Gabaɗaya, kamar koyaushe, kuna buƙatar sadaukar da wani abu.

Don haka, mun yanke shawarar farawa tare da zaɓi na farko, kamar yadda ya fi sauƙi. Da farko, ya zama dole a fahimci ko dogon ma'amala zai zama matsala ta gaske. Tun da babban canja wurin bayanai daga tsohon tebur zuwa sabon kuma yana faruwa a cikin dogon lokaci guda ɗaya, tambayar ta rikide zuwa "nawa za mu ƙara wannan ciniki?" Tsawon lokacin ciniki na farko ya dogara ne akan girman tebur. Tsawon lokacin sabon ya dogara ne akan adadin canje-canje da suka taru a cikin tebur yayin canja wurin bayanai, watau. akan tsananin lodin. Gudun pg_repack ya faru ne a lokacin ƙarancin aikin sabis, kuma ƙarar canje-canjen bai yi daidai ba idan aka kwatanta da ainihin girman tebur. Mun yanke shawarar cewa za mu iya yin watsi da lokacin sabon ma'amala (don kwatanta, a matsakaici shine 1 hour da 2-3 mintuna).

Gwaje-gwajen sun kasance tabbatacce. Kaddamar da samarwa kuma. Don bayyanawa, ga hoto mai girman ɗayan bayanan bayan an gudu:

Postgres: kumburi, pg_repack da ƙuntatawa da aka jinkirta

Tun da mun gamsu da wannan bayani gaba ɗaya, ba mu yi ƙoƙarin aiwatar da na biyu ba, amma muna la'akari da yiwuwar tattaunawa da masu haɓakawa. Bita namu na yanzu, da rashin alheri, bai riga ya shirya don bugawa ba, tunda mun warware matsalar kawai tare da ƙuntatawa na musamman da aka jinkirta, kuma don cikakken faci yana da mahimmanci don ba da tallafi ga sauran nau'ikan. Muna fatan za mu iya yin hakan a nan gaba.

Wataƙila kuna da tambaya, me yasa har muka shiga cikin wannan labarin tare da gyara pg_repack, kuma ba, alal misali, yin amfani da kwatankwacinsa? A wani lokaci kuma mun yi tunani game da wannan, amma kyakkyawar ƙwarewar amfani da shi a baya, a kan tebur ba tare da ƙuntatawa ba, ya motsa mu muyi ƙoƙari mu fahimci ainihin matsalar kuma mu gyara shi. Bugu da ƙari, yin amfani da wasu hanyoyin kuma yana buƙatar lokaci don gudanar da gwaje-gwaje, don haka muka yanke shawarar cewa za mu fara kokarin gyara matsalar a cikinta, kuma idan muka gane cewa ba za mu iya yin haka ba a cikin lokaci mai dacewa, to, za mu fara kallon analogues. .

binciken

Abin da za mu iya ba da shawara dangane da kwarewarmu:

  1. Kula da kumburin ku. Dangane da bayanan sa ido, zaku iya fahimtar yadda ake daidaita autovacuum.
  2. Daidaita AUTOVACUUM don kiyaye kumburi a matakin da aka yarda.
  3. Idan kumburi yana ci gaba da girma kuma ba za ku iya shawo kan ta ta amfani da kayan aikin da ba a cikin akwatin ba, kada ku ji tsoro don amfani da kari na waje. Babban abu shine gwada komai da kyau.
  4. Kada ku ji tsoron canza hanyoyin magance matsalolin waje don dacewa da bukatunku - wani lokacin wannan na iya zama mafi inganci kuma har ma da sauƙi fiye da canza lambar ku.

source: www.habr.com

Add a comment