PostgreSQL Antipatterns: CTE x CTE

In my line of work I regularly run into situations where a developer writes a query and thinks: "The database is smart, it can handle everything itself!"

In some cases (partly from not knowing what the database can do, partly from premature optimization), this approach produces "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max of the values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs and the min/max values on the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To judge the quality of the query on something concrete, let's create an arbitrary dataset:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turns out that reading the data took less than a quarter of the query execution time:

[view on explain.tensor.ru]
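A plan like the linked one can be obtained by prefixing a statement with EXPLAIN (ANALYZE, BUFFERS). As a small sketch, the statements below refresh the planner statistics for the freshly created table and then measure only a plain read of tbl, which gives a baseline to compare against the total time of the full query. Absolute timings are an assumption of your own hardware and will differ from the article's.

```sql
-- Refresh planner statistics after CREATE TABLE AS.
ANALYZE tbl;

-- Baseline: the cost of just reading all four columns of tbl.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  key_a
, key_b
, fld1
, fld2
FROM
  tbl;
```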

Taking it apart piece by piece

Let's take a closer look at the query and ask ourselves:

  1. Why is WITH RECURSIVE here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are then joined back to the original sample anyway?
    +25% time
  3. Why use an unconditional 'SELECT * FROM' at the end just to repeat the previous CTE?
    +14% time
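As an illustration of points 1 and 3 only, here is a sketch of the same query with the RECURSIVE keyword dropped and the final pass-through CTE folded into the main SELECT. It deliberately leaves point 2 (the separate min/max CTE) untouched, and it is one possible cleanup, not necessarily the form the author had in mind.

```sql
-- No RECURSIVE: none of the CTEs refers to itself.
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- The join is the final statement; no extra CTE is scanned just to re-emit rows.
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```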

In this case we were very lucky that Hash Join was chosen for the join rather than Nested Loop, because otherwise we would have gotten not one CTE Scan pass but 10K of them!

A little about CTE Scan
Keep in mind that CTE Scan behaves like Seq Scan: there is no index, only a full pass over the rows. With Nested Loop that would cost 10K x 0.3ms = 3000ms when looping over cte_max, or 1K x 1.5ms = 1500ms when looping over cte_bind!
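The "no index" behaviour can be observed on the test table with a pair of plans. This is a sketch under a few assumptions: the MATERIALIZED keyword exists only in PostgreSQL 12 and later (on older versions CTEs are always materialized, so the keyword can simply be dropped), the value 42 is an arbitrary key chosen for illustration, and the exact plan nodes depend on your version and statistics.

```sql
-- Filtering a materialized CTE: expect a CTE Scan with a Filter,
-- because the materialized rows carry no index.
EXPLAIN (ANALYZE)
WITH cte AS MATERIALIZED (
  SELECT
    key_a
  , key_b
  FROM
    tbl
)
SELECT * FROM cte WHERE key_a = 42;

-- Filtering the table directly: the planner is free to use
-- the index on tbl(key_a, key_b) instead of a full pass.
EXPLAIN (ANALYZE)
SELECT
  key_a
, key_b
FROM
  tbl
WHERE
  key_a = 42;
```

If the first plan reports a full scan of all 10K rows on every execution, repeating that scan once per outer row of a Nested Loop is where estimates like the ones above come from.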
So what did we actually want to get as a result? Yes, that is usually the question that comes up somewhere around the fifth minute of analyzing "three-story" queries.

For each unique key pair, we wanted to output the min/max from the group by key_a.
So let's use window functions for that:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view on explain.tensor.ru]
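One detail worth checking before swapping one query for the other: the original computes min/max over a single row per (key_a, key_b) pair, picked arbitrarily by DISTINCT ON without ORDER BY, while the window version computes them over every row of the key_a group. The two can disagree only for groups that contain a duplicated pair. A small sketch for counting such pairs in the test table (the aliases d and duplicated_pairs are illustrative names):

```sql
-- Count the (key_a, key_b) pairs that occur more than once in tbl.
SELECT
  count(*) AS duplicated_pairs
FROM (
  SELECT
    key_a
  , key_b
  FROM
    tbl
  GROUP BY
    key_a
  , key_b
  HAVING
    count(*) > 1
) d;
```

A result of zero means both queries see exactly the same rows per group; a nonzero result means the outputs may differ for the affected key_a values.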

Since reading the data takes about the same 4-5ms in both variants, the entire 32% time saving is pure load taken off the database server's CPU, provided such a query runs often enough.

In general, you shouldn't force the database to "carry the round and roll the square".

Source: www.habr.com
