Thaum VACUUM ua tsis tiav, peb ntxuav lub rooj manually

NQUS PLUA PLAV tuaj yeem "huv" los ntawm lub rooj hauv PostgreSQL tsuas yog dab tsi tsis muaj leej twg pom - uas yog, tsis muaj ib qho kev thov nquag uas tau pib ua ntej cov ntaub ntawv no tau hloov pauv.

Tab sis yuav ua li cas yog tias hom tsis zoo li no (lub sijhawm ntev OLAP thauj khoom ntawm OLTP database) tseem muaj? Yuav ua li cas huv si hloov lub rooj nyob ib puag ncig los ntawm cov lus nug ntev thiab tsis nqis los ntawm ib qho rake?

Thaum VACUUM ua tsis tiav, peb ntxuav lub rooj manually

Nthuav tawm lub rake

Ua ntej, cia peb txiav txim siab seb qhov teeb meem peb xav daws yog dab tsi thiab nws tuaj yeem tshwm sim li cas.

Feem ntau qhov xwm txheej no tshwm sim ntawm lub rooj me me, tab sis nyob rau hauv uas nws tshwm sim ntau yam kev hloov. Feem ntau qhov no los yog txawv meters/tag nrho/ntsuas, uas UPDATE feem ntau raug tua, los yog tsis-queue txhawm rau ua qee qhov txuas txuas ntxiv ntawm cov xwm txheej, cov ntaub ntawv uas tas li INSERT/DELETE.

Cia peb sim rov tsim dua qhov kev xaiv nrog kev ntsuas:

CREATE TABLE tbl(k text PRIMARY KEY, v integer);
CREATE INDEX ON tbl(v DESC); -- по этому индексу будем строить рейтинг

INSERT INTO
  tbl
SELECT
  chr(ascii('a'::text) + i) k
, 0 v
FROM
  generate_series(0, 25) i;

Thiab nyob rau hauv parallel, nyob rau hauv lwm yam kev twb kev txuas, ib tug ntev, ntev thov pib, sau ib co complex txheeb cais, tab sis tsis cuam tshuam rau peb lub rooj:

SELECT pg_sleep(10000);

Tam sim no peb hloov kho tus nqi ntawm ib qho ntawm cov txee ntau, ntau zaus. Rau qhov purity ntawm qhov kev sim, cia peb ua qhov no hauv kev sib cais sib pauv siv dblinkYuav ua li cas nws yuav tshwm sim hauv kev muaj tiag:

DO $$
DECLARE
  i integer;
  tsb timestamp;
  tse timestamp;
  d double precision;
BEGIN
  PERFORM dblink_connect('dbname=' || current_database() || ' port=' || current_setting('port'));
  FOR i IN 1..10000 LOOP
    tsb = clock_timestamp();
    PERFORM dblink($e$UPDATE tbl SET v = v + 1 WHERE k = 'a';$e$);
    tse = clock_timestamp();
    IF i % 1000 = 0 THEN
      d = (extract('epoch' from tse) - extract('epoch' from tsb)) * 1000;
      RAISE NOTICE 'i = %, exectime = %', lpad(i::text, 5), lpad(d::text, 5);
    END IF;
  END LOOP;
  PERFORM dblink_disconnect();
END;
$$ LANGUAGE plpgsql;

NOTICE:  i =  1000, exectime = 0.524
NOTICE:  i =  2000, exectime = 0.739
NOTICE:  i =  3000, exectime = 1.188
NOTICE:  i =  4000, exectime = 2.508
NOTICE:  i =  5000, exectime = 1.791
NOTICE:  i =  6000, exectime = 2.658
NOTICE:  i =  7000, exectime = 2.318
NOTICE:  i =  8000, exectime = 2.572
NOTICE:  i =  9000, exectime = 2.929
NOTICE:  i = 10000, exectime = 3.808

Dab tsi tshwm sim? Vim li cas txawm rau qhov yooj yim tshaj UPDATE ntawm ib cov ntaub ntawv execution lub sij hawm degraded los ntawm 7 lub sij hawm - los ntawm 0.524ms rau 3.808ms? Thiab peb qhov kev ntsuam xyuas tau tsim ntau dua thiab maj mam.

Nws yog tag nrho MVCC qhov txhaum.

Nws yog txhua yam hais txog MVCC mechanism, uas ua rau cov lus nug mus saib los ntawm tag nrho cov yav dhau los versions ntawm nkag. Yog li cia peb ntxuav peb lub rooj los ntawm "tuag" versions:

VACUUM VERBOSE tbl;

INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 10026 nonremovable row versions in 45 out of 45 pages
DETAIL:  10000 dead row versions cannot be removed yet, oldest xmin: 597439602

Auj, tsis muaj dab tsi los ntxuav! Parallel Qhov kev thov khiav yog cuam tshuam nrog peb - Tom qab tag nrho, ib hnub nws yuav xav tig mus rau cov ntawv no (dab tsi yog tias?), thiab lawv yuav tsum muaj rau nws. Thiab yog li ntawd txawm VACUUM FULL yuav tsis pab peb.

"Collapsing" lub rooj

Tab sis peb paub tseeb tias cov lus nug ntawd tsis xav tau peb lub rooj. Yog li ntawd, peb tseem yuav sim rov qab ua cov txheej txheem kev ua haujlwm kom txaus los ntawm kev tshem tawm txhua yam tsis tsim nyog los ntawm lub rooj - tsawg kawg "manually", txij li VACUUM muab rau hauv.

Txhawm rau kom meej meej, cia peb saib qhov piv txwv ntawm rooj plaub ntawm lub rooj tsis khoom. Ntawd yog, muaj qhov ntws loj ntawm INSERT / DELETE, thiab qee zaum lub rooj yog khoob kiag li. Tab sis yog tias nws tsis yog khoob, peb yuav tsum txuag nws cov ntsiab lus tam sim no.

#0: Ntsuas qhov xwm txheej

Nws yog qhov tseeb tias koj tuaj yeem sim ua ib yam dab tsi nrog lub rooj txawm tias tom qab txhua qhov kev ua haujlwm, tab sis qhov no tsis ua rau muaj kev nkag siab ntau - kev saib xyuas nyiaj siv ua haujlwm yuav pom tseeb ntau dua qhov kev nkag mus ntawm cov lus nug ntawm lub hom phiaj.

Cia peb tsim cov qauv - "nws yog lub sijhawm los ua" yog tias:

  • VACUUM tau pib ua lub sijhawm ntev dhau los
    Peb cia siab tias yuav hnyav hnyav, yog li cia nws ua 60 vib nas this txij thaum kawg [auto] VACUUM.
  • lub cev lub rooj loj loj dua lub hom phiaj
    Cia peb txhais nws li ob zaug ntawm cov nplooj ntawv (8KB blocks) txheeb ze rau qhov tsawg kawg nkaus - 1 blk rau heap + 1 blk rau txhua qhov ntsuas - rau lub rooj zaum khoob. Yog tias peb cia siab tias qee cov ntaub ntawv yuav nyob twj ywm hauv qhov tsis "ib txwm", nws yog qhov tsim nyog los tweak cov qauv no.

Kev lees paub thov

SELECT
  relpages
, ((
    SELECT
      count(*)
    FROM
      pg_index
    WHERE
      indrelid = cl.oid
  ) + 1) << 13 size_norm -- тут правильнее делать * current_setting('block_size')::bigint, но кто меняет размер блока?..
, pg_total_relation_size(oid) size
, coalesce(extract('epoch' from (now() - greatest(
    pg_stat_get_last_vacuum_time(oid)
  , pg_stat_get_last_autovacuum_time(oid)
  ))), 1 << 30) vaclag
FROM
  pg_class cl
WHERE
  oid = $1::regclass -- tbl
LIMIT 1;

relpages | size_norm | size    | vaclag
-------------------------------------------
       0 |     24576 | 1105920 | 3392.484835

#1: Tseem VACUUM

Peb tsis tuaj yeem paub ua ntej seb qhov kev nug sib txuas puas cuam tshuam nrog peb - ​​muaj pes tsawg cov ntaub ntawv tau dhau los ua "tam sim no" txij li thaum nws pib. Yog li ntawd, thaum peb txiav txim siab los ua txheej txheem ntawm lub rooj, nyob rau hauv txhua rooj plaub, peb yuav tsum xub ua rau nws NQUS PLUA PLAV - tsis zoo li VACUUM FULL, nws tsis cuam tshuam nrog cov txheej txheem sib luag ua haujlwm nrog cov ntaub ntawv nyeem-sau.

Nyob rau tib lub sijhawm, nws tuaj yeem ntxuav tam sim ntawd feem ntau ntawm qhov peb xav tshem tawm. Yog, thiab cov lus nug tom ntej ntawm lub rooj no yuav mus rau peb los ntawm "kub cache", uas yuav txo lawv lub sijhawm - thiab, yog li ntawd, tag nrho lub sijhawm ntawm kev thaiv lwm tus los ntawm peb cov kev pabcuam kev lag luam.

#2: Puas muaj leej twg nyob hauv?

Cia peb xyuas seb puas muaj dab tsi hauv lub rooj txhua:

TABLE tbl LIMIT 1;

Yog tias tsis muaj ib daim ntawv teev tseg, ces peb tuaj yeem txuag tau ntau ntawm kev ua los ntawm kev ua yooj yim HLOOV:

Nws ua tib yam li qhov tsis muaj cai DELETE hais kom ua rau txhua lub rooj, tab sis sai dua vim tias nws tsis tau luam theej duab cov ntxhuav. Ntxiv mus, nws tam sim ntawd tso qhov chaw disk, yog li tsis tas yuav ua haujlwm VACUUM tom qab.

Txawm hais tias koj yuav tsum tau rov pib dua lub rooj sib tw txee (RESTART IDENTITY) yog nyob ntawm koj txiav txim siab.

#3: Txhua tus - hloov pauv!

Txij li thaum peb ua hauj lwm nyob rau hauv ib tug muaj kev sib tw ib puag ncig, thaum peb tseem nyob ntawm no xyuas tias tsis muaj kev nkag rau hauv lub rooj, ib tug neeg yuav tau sau ib yam dab tsi nyob rau ntawd. Peb yuav tsum tsis txhob poob cov ntaub ntawv no, yog li cas? Yog lawm, peb yuav tsum ua kom paub tseeb tias tsis muaj leej twg sau tau tseeb.

Ua li no peb yuav tsum tau pab SERIALIZABLE-kev cais tawm rau peb cov kev lag luam (yog, ntawm no peb pib ua lag luam) thiab kaw lub rooj "nruj":

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;

Qhov kev thaiv theem no yog txiav txim siab los ntawm cov haujlwm uas peb xav ua rau nws.

#4: Kev tsis sib haum xeeb

Peb tuaj ntawm no thiab xav "xauv" lub cim - ua li cas yog tias ib tug neeg ua haujlwm ntawm nws lub sijhawm ntawd, piv txwv li, nyeem los ntawm nws? Peb yuav "tso" tos qhov thaiv no tso tawm, thiab lwm tus uas xav nyeem yuav khiav mus rau peb ...

Txhawm rau tiv thaiv qhov no los ntawm qhov tshwm sim, peb yuav "tshem peb tus kheej" - yog tias peb tsis tuaj yeem tau txais lub xauv nyob rau hauv ib lub sijhawm (tsim luv), ces peb yuav tau txais kev zam los ntawm lub hauv paus, tab sis tsawg kawg peb yuav tsis cuam tshuam ntau dhau. lwm tus.

Txhawm rau ua qhov no, teeb tsa lub rooj sib tham sib txawv lock_timeout (rau versions 9.3+) lossis / thiab lus_timeout. Qhov tseem ceeb uas yuav tsum nco ntsoov yog tias tus nqi nqe lus_timeout tsuas yog siv los ntawm nqe lus tom ntej. Ntawd yog, zoo li qhov no hauv gluing - yuav tsis ua haujlwm:

SET statement_timeout = ...;LOCK TABLE ...;

Txhawm rau kom tsis txhob muaj kev cuam tshuam nrog kev kho qhov "laus" tus nqi ntawm qhov hloov pauv tom qab, peb siv daim ntawv SET LOCAL, uas txwv lub Scope ntawm kev teeb tsa rau qhov kev lag luam tam sim no.

Peb nco ntsoov tias statement_timeout siv rau txhua qhov kev thov tom ntej kom qhov kev hloov pauv tsis tuaj yeem ncav cuag qhov tsis txaus ntseeg yog tias muaj cov ntaub ntawv ntau hauv cov lus.

#5: Luam cov ntaub ntawv

Yog tias lub rooj tsis tas tas li, cov ntaub ntawv yuav tsum tau rov khaws cia siv lub rooj pab ib ntus:

CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE tbl;

Kos npe RAU COMMIT DROP txhais tau hais tias thaum lub sij hawm qhov kev sib pauv xaus, lub rooj ib ntus yuav tsum tsis muaj nyob, thiab tsis tas yuav tsum rho tawm nws hauv cov ntsiab lus sib txuas.

Txij li thaum peb xav tias tsis muaj ntau cov ntaub ntawv "nyob", qhov haujlwm no yuav tsum tau ua sai sai.

Zoo, qhov ntawd yog txhua yam! Tsis txhob hnov ​​qab tom qab ua tiav qhov kev sib pauv khiav ANALYZE normalize cov ntaub ntawv txheeb cais yog tias tsim nyog.

Muab cov ntawv kawg ua ke

Peb siv qhov "pseudo-python":

# собираем статистику с таблицы
stat <-
  SELECT
    relpages
  , ((
      SELECT
        count(*)
      FROM
        pg_index
      WHERE
        indrelid = cl.oid
    ) + 1) << 13 size_norm
  , pg_total_relation_size(oid) size
  , coalesce(extract('epoch' from (now() - greatest(
      pg_stat_get_last_vacuum_time(oid)
    , pg_stat_get_last_autovacuum_time(oid)
    ))), 1 << 30) vaclag
  FROM
    pg_class cl
  WHERE
    oid = $1::regclass -- table_name
  LIMIT 1;

# таблица больше целевого размера и VACUUM был давно
if stat.size > 2 * stat.size_norm and stat.vaclag is None or stat.vaclag > 60:
  -> VACUUM %table;
  try:
    -> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    # пытаемся захватить монопольную блокировку с предельным временем ожидания 1s
    -> SET LOCAL statement_timeout = '1s'; SET LOCAL lock_timeout = '1s';
    -> LOCK TABLE %table IN ACCESS EXCLUSIVE MODE;
    # надо убедиться в пустоте таблицы внутри транзакции с блокировкой
    row <- TABLE %table LIMIT 1;
    # если в таблице нет ни одной "живой" записи - очищаем ее полностью, в противном случае - "перевставляем" все записи через временную таблицу
    if row is None:
      -> TRUNCATE TABLE %table RESTART IDENTITY;
    else:
      # создаем временную таблицу с данными таблицы-оригинала
      -> CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE %table;
      # очищаем оригинал без сброса последовательности
      -> TRUNCATE TABLE %table;
      # вставляем все сохраненные во временной таблице данные обратно
      -> INSERT INTO %table TABLE _tmp_swap;
    -> COMMIT;
  except Exception as e:
    # если мы получили ошибку, но соединение все еще "живо" - словили таймаут
    if not isinstance(e, InterfaceError):
      -> ROLLBACK;

Puas muaj peev xwm tsis luam cov ntaub ntawv thib ob?Nyob rau hauv txoj cai, nws yog ua tau yog hais tias lub oid ntawm lub rooj nws tus kheej tsis khi rau lwm yam kev ua ub no los ntawm sab BL los yog FK los ntawm DB sab:

CREATE TABLE _swap_%table(LIKE %table INCLUDING ALL);
INSERT INTO _swap_%table TABLE %table;
DROP TABLE %table;
ALTER TABLE _swap_%table RENAME TO %table;

Cia peb khiav cov ntawv sau rau ntawm lub rooj thiab xyuas cov metrics:

VACUUM tbl;
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SET LOCAL statement_timeout = '1s'; SET LOCAL lock_timeout = '1s';
  LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;
  CREATE TEMPORARY TABLE _tmp_swap ON COMMIT DROP AS TABLE tbl;
  TRUNCATE TABLE tbl;
  INSERT INTO tbl TABLE _tmp_swap;
COMMIT;

relpages | size_norm | size   | vaclag
-------------------------------------------
       0 |     24576 |  49152 | 32.705771

Txhua yam ua tiav! Lub rooj tau txo qis los ntawm 50 zaug thiab txhua qhov kev hloov kho tshiab tau khiav nrawm dua.

Tau qhov twg los: www.hab.com

Ntxiv ib saib