PostgreSQL Antipatterns: CTE x CTE

In my line of work I regularly run into situations where a developer writes a query and thinks: "The database is smart, it can work everything out by itself!"

In some cases (partly from not knowing what the database can do, partly from premature optimization), this approach produces "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max of the values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs and the min/max values by the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To evaluate how well the query performs, let's generate some arbitrary data:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turns out that reading the data took less than a quarter of the query execution time:

[view on explain.tensor.ru]

Taking it apart piece by piece

Let's look closely at the query and ask ourselves:

  1. Why is WITH RECURSIVE here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are then joined back to the original sample anyway?
    + 25% time
  3. Why use an unconditional 'SELECT * FROM' at the end that merely repeats the previous CTE?
    + 14% time
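
Points 1 and 3 can be addressed mechanically, without touching the logic. The following is a minimal sketch, assuming the same tbl as above: the RECURSIVE keyword is dropped and the body of cte_a_bind becomes the main SELECT. The separate grouping CTE from point 2 is deliberately left in place here.

```sql
-- Sketch only: same result set as the original query,
-- minus the unused RECURSIVE keyword and the pass-through final CTE.
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- the former cte_a_bind body is now the main query
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```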

In this case we were very lucky that a Hash Join was chosen for the join rather than a Nested Loop, because then we would have gotten not a single CTE Scan pass, but 10K of them!

A bit about CTE Scan: here we should remember that a CTE Scan is analogous to a Seq Scan, that is, no indexing, only a full scan. That would cost 10K x 0.3ms = 3000ms for loops over cte_max, or 1K x 1.5ms = 1500ms for loops over cte_bind!
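
To see what the unlucky case looks like, one option is to discourage the other join methods for a single transaction and inspect the plan. This is a sketch under assumptions: enable_hashjoin and enable_mergejoin are standard planner settings, but the exact plan shape (for example, whether cte_max shows up as a CTE Scan or gets inlined) depends on the PostgreSQL version, and the timings will not match the figures quoted above.

```sql
BEGIN;

-- Scoped to this transaction only; undone by ROLLBACK
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;

EXPLAIN (ANALYZE)
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

ROLLBACK;
```

In the output, the "loops" value on the inner side of the Nested Loop node shows how many times that side was rescanned.
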
So what did we actually want to get as a result? Yes, this is usually the question that comes up somewhere around the fifth minute of analyzing "three-story" queries.

We wanted to output, for each unique key pair, the min/max from the group by key_a.
So let's use window functions for this:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view on explain.tensor.ru]

Since reading the data takes roughly the same 4-5ms in both variants, our entire time gain of -32% is, in its purest form, load taken off the database CPU, provided the query is executed often enough.

In general, you shouldn't force the database to "carry the round and roll the square."
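
To reproduce the comparison on your own instance, one approach is to run each variant under EXPLAIN (ANALYZE, BUFFERS) and compare the reported execution times. Below is a sketch for the window-function variant, assuming the tbl generated earlier; since the data is random and hardware differs, the absolute numbers will not match those quoted here.

```sql
-- Refresh planner statistics for the freshly created table
ANALYZE tbl;

-- Executes the query and reports actual timings and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (key_a, key_b)
  key_a a
, key_b b
, max(fld1) OVER (w) bind_fld1
, min(fld2) OVER (w) bind_fld2
FROM
  tbl
WINDOW
  w AS (PARTITION BY key_a);
```

Running each variant several times and ignoring the first run helps keep cold-cache effects out of the comparison.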

Source: www.hab.com