PostgreSQL Antipatterns: CTE x CTE

Because of the nature of my work, I have to deal with situations where a developer writes a query and thinks: "The database is smart, it can handle everything by itself!"

In some cases (partly from ignorance of the database's capabilities, partly from premature optimization), this approach produces "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max of the values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs and the min/max values by the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To evaluate the quality of the query properly, let's create some arbitrary data:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turned out that reading the data took less than a quarter of the query execution time:

[view on explain.tensor.ru]
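For readers who want to reproduce the measurement, a minimal sketch follows. It assumes the `tbl` table created above and simply wraps the example query in `EXPLAIN (ANALYZE, BUFFERS)`. The absolute timings and even the plan shape will differ by machine and PostgreSQL version (since version 12 the planner may inline a side-effect-free CTE that is referenced only once), so the percentages quoted in this article should not be expected to match exactly.

```sql
-- Refresh planner statistics on the freshly created table first.
ANALYZE tbl;

-- Run the query and report actual per-node timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;
```

In the output, comparing the actual time of the scan over `tbl` with the total execution time gives the share spent on reading the data.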

Taking it apart piece by piece

Let's look at the query more closely and ask ourselves:

  1. Why is RECURSIVE here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are then joined back to the original sample anyway?
    +25% of the time
  3. Why use an unconditional 'SELECT * FROM' at the end just to re-read the previous CTE?
    +14% of the time
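As an illustrative intermediate step (not part of the original article), points 1 and 3 can be addressed without touching the logic: drop the unneeded `RECURSIVE` keyword and let the final `SELECT` perform the join directly instead of re-reading a wrapper CTE. This sketch assumes the same `tbl` table and still keeps the separate `cte_max` grouping from point 2.

```sql
-- Same logic as the original query, minus RECURSIVE and the final wrapper CTE.
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- The join is now the top-level query, so no extra CTE Scan is needed.
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```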

In this case we were very lucky that Hash Join was chosen for the join rather than Nested Loop, because otherwise we would have had not a single CTE Scan pass, but 10K of them!

A note on CTE Scan
Keep in mind that CTE Scan is analogous to Seq Scan: there is no index, only a full scan. With Nested Loop that would cost 10K x 0.3ms = 3000ms when looping over cte_max, or 1K x 1.5ms = 1500ms when looping over cte_bind!
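To see the repeated scans for yourself, one option is to discourage the other join strategies for a single transaction and look at the `loops` value on the inner node of the plan. This is a hedged sketch, not something from the original article: `enable_hashjoin` and `enable_mergejoin` are standard planner settings, but they only penalize those join types rather than forbid them, and the resulting timings depend on the PostgreSQL version and hardware.

```sql
BEGIN;

-- Discourage Hash Join and Merge Join for this transaction only,
-- so the planner is left with Nested Loop for the join.
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;

EXPLAIN (ANALYZE)
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;

-- Nothing was modified; ROLLBACK just discards the SET LOCAL overrides.
ROLLBACK;
```

The per-loop time of the inner scan multiplied by its loop count is the same kind of product as the estimates above.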
So what did you actually want to get as a result? Yes, this is usually the question that comes up around the fifth minute of analyzing "three-story" queries.

We wanted to output, for each unique key pair, the min/max from the group by key_a.
So let's use window functions for that:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view on explain.tensor.ru]

Since reading the data takes about 4-5ms in both variants, the entire 32% time saving is pure load removed from the database CPU, which matters if such a query is executed often enough.

In general, you should not force the database to "carry the round and roll the square."

Source: www.habr.com
