When VACUUM fails, we clean the table by hand

VACUUM can "clean up" from a PostgreSQL table only what nobody can see, that is, only row versions for which there is no active query that started before those rows were changed.

But what if such an unpleasant guest (a long-running OLAP load on an OLTP database) is still around? How do you clean an actively changing table that is surrounded by long queries without stepping on a rake?

Laying out the rake

First, let's define what problem we want to solve and how it can arise.

Usually this situation arises on a relatively small table that nonetheless sees a lot of changes. Typically it is either some kind of counter/aggregate/rating on which UPDATE is executed very often, or a buffer queue for processing a continuously flowing stream of events, whose records are constantly INSERTed/DELETEd.

Let's try to reproduce the rating variant:

CREATE TABLE tbl(k text PRIMARY KEY, v integer);
CREATE INDEX ON tbl(v DESC); -- we will build the rating on this index

INSERT INTO
  tbl
SELECT
  chr(ascii('a'::text) + i) k
, 0 v
FROM
  generate_series(0, 25) i;

In parallel, in another connection, a very long query starts. It collects some complex statistics but does not touch our table:

SELECT pg_sleep(10000);

Now we update the value of one of the counters many, many times. To keep the experiment clean, let's do this in separate transactions using dblink, as it would happen in reality:

DO $$
DECLARE
  i integer;
  tsb timestamp;
  tse timestamp;
  d double precision;
BEGIN
  PERFORM dblink_connect('dbname=' || current_database() || ' port=' || current_setting('port'));
  FOR i IN 1..10000 LOOP
    tsb = clock_timestamp();
    PERFORM dblink($e$UPDATE tbl SET v = v + 1 WHERE k = 'a';$e$);
    tse = clock_timestamp();
    IF i % 1000 = 0 THEN
      d = (extract('epoch' from tse) - extract('epoch' from tsb)) * 1000;
      RAISE NOTICE 'i = %, exectime = %', lpad(i::text, 5), lpad(d::text, 5);
    END IF;
  END LOOP;
  PERFORM dblink_disconnect();
END;
$$ LANGUAGE plpgsql;

NOTICE:  i =  1000, exectime = 0.524
NOTICE:  i =  2000, exectime = 0.739
NOTICE:  i =  3000, exectime = 1.188
NOTICE:  i =  4000, exectime = 2.508
NOTICE:  i =  5000, exectime = 1.791
NOTICE:  i =  6000, exectime = 2.658
NOTICE:  i =  7000, exectime = 2.318
NOTICE:  i =  8000, exectime = 2.572
NOTICE:  i =  9000, exectime = 2.929
NOTICE:  i = 10000, exectime = 3.808

What happened? Why did the execution time of even the simplest UPDATE of a single record degrade by a factor of 7, from 0.524ms to 3.808ms? And our rating is being built more and more slowly.

It's all MVCC's fault

It is all about the MVCC mechanism, which forces the query to look through all previous versions of the record. So let's clean the "dead" versions out of our table:

VACUUM VERBOSE tbl;

INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 10026 nonremovable row versions in 45 out of 45 pages
DETAIL:  10000 dead row versions cannot be removed yet, oldest xmin: 597439602
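The "oldest xmin" line in this output is the key to reading it. A toy model in Python can make the effect visible. This is only a sketch under simplifying assumptions: `RowVersion`, `run_updates` and `vacuum` are hypothetical names, transaction ids are plain integers, and the removal rule ("a dead version may go only if it died before the oldest still-active transaction started") is a deliberate simplification of PostgreSQL's real visibility checks.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RowVersion:
    xmin: int            # id of the transaction that created this version
    xmax: Optional[int]  # id of the transaction that replaced it (None = live)


def run_updates(versions: List[RowVersion], first_xid: int, count: int) -> int:
    """Apply `count` single-row UPDATEs, one transaction each.

    Every UPDATE marks the current live version as dead and appends a new one.
    Returns the next unused transaction id.
    """
    xid = first_xid
    for _ in range(count):
        versions[-1].xmax = xid
        versions.append(RowVersion(xmin=xid, xmax=None))
        xid += 1
    return xid


def vacuum(versions: List[RowVersion], oldest_active_xid: int) -> Tuple[List[RowVersion], int]:
    """Return (kept versions, number of removed versions).

    Simplified rule: a dead version is removable only if it died before
    the oldest still-active transaction started.
    """
    kept = [v for v in versions if v.xmax is None or v.xmax >= oldest_active_xid]
    return kept, len(versions) - len(kept)


versions = [RowVersion(xmin=100, xmax=None)]
next_xid = run_updates(versions, first_xid=102, count=10000)

# A long query that started at xid 101 is still running: nothing can be removed.
_, removable = vacuum(versions, oldest_active_xid=101)
print(removable)  # 0

# Once it has finished, every dead version becomes removable.
_, removable = vacuum(versions, oldest_active_xid=next_xid)
print(removable)  # 10000
```

In this model, as in the real output, the 10000 dead versions stay in place for as long as the old transaction is alive, and every UPDATE has to walk past them.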

Oops, there is nothing to clean up! The query running in parallel is getting in our way: after all, it might someday want to look at these versions (what if?), so they must remain available to it. That is why even VACUUM FULL will not help us.

"Collapsing" the table

But we know for certain that this query does not need our table. So we will still try to bring system performance back within reasonable limits by throwing everything unnecessary out of the table, at least "by hand", since VACUUM gives up.

To make it clearer, let's look at the buffer table case. That is, there is a heavy flow of INSERT/DELETE, and sometimes the table is completely empty. But if it is not empty, we must preserve its current contents.

#0: Assessing the situation

It is clear that you could try to do something with the table after every single operation, but that makes little sense: the maintenance overhead would clearly exceed the throughput of the target queries.

Let's formulate the criteria for "it's time to act":

  • VACUUM last ran a long time ago
    We expect a heavy load, so let it be 60 seconds since the last [auto]VACUUM.
  • The physical table size is larger than the target
    Let's define the target as twice the minimum size in pages (8KB blocks), where the minimum for a potentially empty table is 1 blk for the heap + 1 blk for each index. If we expect a certain amount of data to always stay in the buffer "normally", it makes sense to adjust this formula.

Verification query

SELECT
  relpages
, ((
    SELECT
      count(*)
    FROM
      pg_index
    WHERE
      indrelid = cl.oid
  ) + 1) << 13 size_norm -- it would be more correct to use * current_setting('block_size')::bigint here, but who changes the block size?..
, pg_total_relation_size(oid) size
, coalesce(extract('epoch' from (now() - greatest(
    pg_stat_get_last_vacuum_time(oid)
  , pg_stat_get_last_autovacuum_time(oid)
  ))), 1 << 30) vaclag
FROM
  pg_class cl
WHERE
  oid = $1::regclass -- tbl
LIMIT 1;

relpages | size_norm | size    | vaclag
-------------------------------------------
       0 |     24576 | 1105920 | 3392.484835
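The two criteria can be written down as a small decision function. This is a minimal sketch, not the author's code: `size_norm` and `needs_cleanup` are hypothetical helper names, the 8192-byte block size is the default that the `<< 13` shift in the query assumes, and the thresholds (2x the minimum size, 60 seconds) are the ones chosen in the text.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size (1 << 13)


def size_norm(index_count: int, block_size: int = BLOCK_SIZE) -> int:
    """Minimum size of an empty table: one heap block plus one block per index."""
    return (index_count + 1) * block_size


def needs_cleanup(size: int, norm: int, vaclag: float,
                  max_ratio: int = 2, max_vaclag: float = 60.0) -> bool:
    """True when the table is bloated AND the last [auto]VACUUM was long ago."""
    return size > max_ratio * norm and vaclag > max_vaclag


# tbl has two indexes: the primary key and the one on v DESC
norm = size_norm(2)
print(norm)                                        # 24576
print(needs_cleanup(1105920, norm, 3392.484835))   # True
```

With the values from the query result above, the table is far beyond twice its minimum size and VACUUM last ran almost an hour ago, so both conditions hold.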

#1: VACUUM anyway

We cannot know in advance whether a parallel query is seriously getting in our way, that is, exactly how many records have become "stale" since it started. So once we decide to process the table somehow, we should in any case first run VACUUM on it: unlike VACUUM FULL, it does not interfere with parallel processes that read and write the data.

At the same time, it can immediately clean out most of what we would like to remove. Besides, subsequent queries on this table will hit a "hot cache", which shortens them and therefore shortens the total time during which our maintenance transaction blocks others.

#2: Is anybody home?

Let's check whether there is anything in the table at all:

TABLE tbl LIMIT 1;

If not a single record is left, we can save a lot on processing by simply running TRUNCATE:

It works the same as an unqualified DELETE on each table, but is much faster because it does not actually scan the tables. In addition, it reclaims disk space immediately, so there is no need to run VACUUM afterwards.

Whether you need to reset the table's sequence counter (RESTART IDENTITY) is up to you.

#3: Everybody, take turns!

Since we work in a highly concurrent environment, while we are busy checking that the table has no records, someone could already have written something there. We must not lose that information, so what do we do? That's right, we need to make sure that nobody can write to the table at all.

To do this we enable SERIALIZABLE isolation for our transaction (yes, this is where we start a transaction) and lock the table "tightly":

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;

This lock level is dictated by the operations we want to perform on the table.

#4: Conflict of interest

So here we come, wanting to "lock" the table, but what if someone is active on it at that moment, for example, reading from it? We will "hang" waiting for that lock to be released, and everyone else who wants to read will pile up behind us...

To prevent this, we will "sacrifice ourselves": if we cannot acquire the lock within a certain (acceptably short) time, we get an exception from the database, but at least we do not get in the way of the others too much.

To do this, set the session variable lock_timeout (for versions 9.3+) and/or statement_timeout. The main thing to remember is that a statement_timeout value only takes effect starting from the next statement. That is, gluing them together like this will not work:

SET statement_timeout = ...;LOCK TABLE ...;

So as not to have to restore the "old" value of the variable afterwards, we use the SET LOCAL form, which limits the scope of the setting to the current transaction.

Remember that statement_timeout applies to all subsequent statements, so the transaction cannot stretch to unacceptable lengths if there turns out to be a lot of data in the table.
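One way to avoid gluing by accident is to split the string on the client and send each statement as its own query. A minimal sketch, assuming a hypothetical helper `split_statements` and SQL text with no semicolons inside string literals or comments (the split is deliberately naive):

```python
from typing import List


def split_statements(batch: str) -> List[str]:
    """Naively split a glued SQL string into single statements.

    Assumption: no ';' appears inside string literals or comments.
    """
    return [part.strip() + ";" for part in batch.split(";") if part.strip()]


glued = "SET LOCAL statement_timeout = '1s';LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;"
for statement in split_statements(glued):
    print(statement)  # send each one to the server as a separate query
```

Sent this way, the SET LOCAL has already taken effect by the time the LOCK TABLE statement arrives.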

#5: Copy the data

If the table turns out not to be completely empty, the data has to be re-saved through an auxiliary temporary table:

CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE tbl;

The ON COMMIT DROP clause means that the temporary table ceases to exist the moment the transaction ends, so there is no need to drop it manually within the connection.

Since we assume there is not much "live" data, this operation should finish quickly.

Well, that's all! Don't forget to run ANALYZE after the transaction completes to normalize the table statistics, if necessary.

Putting together the final script

We use this "pseudo-Python":

# collect statistics from the table
stat <-
  SELECT
    relpages
  , ((
      SELECT
        count(*)
      FROM
        pg_index
      WHERE
        indrelid = cl.oid
    ) + 1) << 13 size_norm
  , pg_total_relation_size(oid) size
  , coalesce(extract('epoch' from (now() - greatest(
      pg_stat_get_last_vacuum_time(oid)
    , pg_stat_get_last_autovacuum_time(oid)
    ))), 1 << 30) vaclag
  FROM
    pg_class cl
  WHERE
    oid = $1::regclass -- table_name
  LIMIT 1;

# the table is larger than the target size and VACUUM ran a long time ago
if stat.size > 2 * stat.size_norm and (stat.vaclag is None or stat.vaclag > 60):
  -> VACUUM %table;
  try:
    -> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    # try to acquire an exclusive lock with a maximum wait time of 1s
    -> SET LOCAL statement_timeout = '1s'; SET LOCAL lock_timeout = '1s';
    -> LOCK TABLE %table IN ACCESS EXCLUSIVE MODE;
    # the emptiness of the table must be verified inside the transaction, under the lock
    row <- TABLE %table LIMIT 1;
    # if the table has no "live" records, truncate it completely; otherwise "re-insert" all records via a temporary table
    if row is None:
      -> TRUNCATE TABLE %table RESTART IDENTITY;
    else:
      # create a temporary table with the data of the original table
      -> CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE %table;
      # truncate the original without resetting the sequence
      -> TRUNCATE TABLE %table;
      # insert all the data saved in the temporary table back
      -> INSERT INTO %table TABLE _tmp_swap;
    -> COMMIT;
  except Exception as e:
    # if we got an error but the connection is still "alive", we hit the timeout
    if not isinstance(e, InterfaceError):
      -> ROLLBACK;
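The locked part of the pseudo-code above could look roughly like this in ordinary Python. It is a sketch under stated assumptions, not a drop-in implementation: `collapse_table` is a hypothetical name, and `execute` is a hypothetical callable that sends exactly one SQL statement over an autocommit connection and returns the resulting rows (VACUUM cannot run inside a transaction block). The table name is interpolated as-is, so it is assumed to be a trusted identifier, and `connection_errors` stands in for whatever exception your driver raises when the connection itself is lost.

```python
from typing import Callable, Sequence, Tuple, Type

Execute = Callable[[str], Sequence[tuple]]


def collapse_table(execute: Execute, table: str, timeout: str = "1s",
                   connection_errors: Tuple[Type[BaseException], ...] = (ConnectionError,)) -> str:
    """Return 'truncated', 'rewritten' or 'skipped'."""
    execute(f"VACUUM {table};")
    try:
        execute("BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;")
        # each SET LOCAL goes out as its own statement, before the LOCK
        execute(f"SET LOCAL statement_timeout = '{timeout}';")
        execute(f"SET LOCAL lock_timeout = '{timeout}';")
        execute(f"LOCK TABLE {table} IN ACCESS EXCLUSIVE MODE;")
        # emptiness is checked only after the lock is held
        if not execute(f"TABLE {table} LIMIT 1;"):
            execute(f"TRUNCATE TABLE {table} RESTART IDENTITY;")
            outcome = "truncated"
        else:
            execute(f"CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE {table};")
            execute(f"TRUNCATE TABLE {table};")
            execute(f"INSERT INTO {table} TABLE _tmp_swap;")
            outcome = "rewritten"
        execute("COMMIT;")
        return outcome
    except Exception as exc:
        # a timeout leaves the connection alive, so the transaction must be rolled back
        if not isinstance(exc, connection_errors):
            execute("ROLLBACK;")
        return "skipped"
```

Passing `execute` in as a parameter keeps the sketch independent of any particular driver, and the statistics check from the start of the pseudo-code would decide whether to call it at all.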

Is it possible to avoid copying the data a second time? In principle, yes, provided the oid of the table itself is not tied to any other activity on the business-logic side or to foreign keys on the database side:

CREATE TABLE _swap_%table(LIKE %table INCLUDING ALL);
INSERT INTO _swap_%table TABLE %table;
DROP TABLE %table;
ALTER TABLE _swap_%table RENAME TO %table;

Let's run the script on the source table and check the metrics:

VACUUM tbl;
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SET LOCAL statement_timeout = '1s'; SET LOCAL lock_timeout = '1s';
  LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;
  CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE tbl;
  TRUNCATE TABLE tbl;
  INSERT INTO tbl TABLE _tmp_swap;
COMMIT;

relpages | size_norm | size   | vaclag
-------------------------------------------
       0 |     24576 |  49152 | 32.705771

Everything worked! The table has shrunk by a factor of about 22 (from 1105920 to 49152 bytes), and all UPDATEs run fast again.

Source: www.habr.com
