I-PostgreSQL Antipatterns: CTE x CTE

Ngenxa yomsebenzi wami, kufanele ngibhekane nezimo lapho umthuthukisi ebhala isicelo bese ecabanga β€œIsisekelo sihlakaniphile, singakwazi ukusingatha yonke into ngokwayo!Β«

Kwezinye izimo (ngokwengxenye ngenxa yokungazi kahle kwamakhono e-database, ngokwengxenye ekuthuthukisweni ngaphambi kwesikhathi), le ndlela iholela ekubukeni kwe-"Frankensteins".

Okokuqala, ngizonikeza isibonelo sesicelo esinjalo:

-- для ΠΊΠ°ΠΆΠ΄ΠΎΠΉ ΠΊΠ»ΡŽΡ‡Π΅Π²ΠΎΠΉ ΠΏΠ°Ρ€Ρ‹ Π½Π°Ρ…ΠΎΠ΄ΠΈΠΌ ассоциированныС значСния ΠΏΠΎΠ»Π΅ΠΉ
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- Π½Π°Ρ…ΠΎΠ΄ΠΈΠΌ min/max Π·Π½Π°Ρ‡Π΅Π½ΠΈΠΉ для ΠΊΠ°ΠΆΠ΄ΠΎΠ³ΠΎ ΠΏΠ΅Ρ€Π²ΠΎΠ³ΠΎ ΠΊΠ»ΡŽΡ‡Π°
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- связываСм ΠΏΠΎ ΠΏΠ΅Ρ€Π²ΠΎΠΌΡƒ ΠΊΠ»ΡŽΡ‡Ρƒ ΠΊΠ»ΡŽΡ‡Π΅Π²Ρ‹Π΅ ΠΏΠ°Ρ€Ρ‹ ΠΈ min/max-значСния
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

Ukuze sihlole ngokujulile ikhwalithi yesicelo, ake sakhe isethi yedatha engafanele:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

Kuvele ukuthi ukufunda idatha kuthathe isikhathi esingaphansi kwekota yesikhathi ukuqaliswa kombuzo:

I-PostgreSQL Antipatterns: CTE x CTE[buka kokuthi explain.tensor.ru]

Eyihlukanisa kancane kancane

Ake sibhekisise isicelo futhi sixakeke:

  1. Kungani KUTHI NGE-RECURSIVE lapha uma engekho ama-CTE aphindaphindayo?
  2. Kungani iqoqo lamanani aphansi/aphezulu ku-CTE ehlukile uma-ke eboshelwa kusampula yoqobo?
    + 25% isikhathi
  3. Kungani usebenzise okuthi 'KHETHA *KUSUKA' okungenamibandela ekugcineni ukuphinda i-CTE yangaphambilini?
    + 14% isikhathi

Kulokhu, sibe nenhlanhla enkulu ukuthi Joyina i-Hash ikhethelwe uxhumano esikhundleni se-Nest Loop, kusukela lapho besingeke sithole iphasi eyodwa ye-CTE Scan, kodwa engu-10K!

okuncane mayelana ne-CTE ScanLapha kufanele sikukhumbule lokho I-CTE Scan ifana ne-Seq Scan - okungukuthi, akukho indexing, kodwa kuphela usesho oluphelele, okungase kudingeke 10K x 0.3ms = Ama-3000ms emijikelezweni nge-cte_max noma 1K x 1.5ms = Ama-1500ms lapho uvula nge-cte_bind!
Empeleni, yini obufuna ukuyizuza ngenxa yalokho? Yebo, ngokuvamile lona umbuzo ovela endaweni ethile ngomzuzu wesi-5 wokuhlaziya imibuzo "yezitezi ezintathu".

Besifuna ukukhipha ukhiye ngamunye ohlukile iminithi/ubukhulu eqenjini ngokhiye_a.
Ngakho-ke masiyisebenzisele lokhu imisebenzi yewindi:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

I-PostgreSQL Antipatterns: CTE x CTE
[buka kokuthi explain.tensor.ru]

Njengoba idatha yokufunda kuzo zombili izinketho ithatha cishe ama-4-5ms afanayo, ngakho sonke isikhathi sethu siyazuza -32% - lokhu kusesimweni sakho esimsulwa umthwalo ukhishiwe ku-base CPU, uma isicelo esinjalo senziwe ngokuvamile ngokwanele.

Ngokuvamile, akufanele uphoqe isisekelo ukuthi "ugqoke umjikelezo, ugoqe isikwele."

Source: www.habr.com

Engeza amazwana