PostgreSQL Antipatterns: CTE x CTE

In my line of work, I regularly run into situations where a developer writes a query and thinks, "The database is smart, it can handle everything by itself!"

In some cases (partly from not knowing what the database can do, partly from premature optimization), this approach leads to the appearance of "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max of the values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs and the min/max values on the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To evaluate the quality of the query concretely, let's create some arbitrary data set:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turns out that reading the data took only a small fraction of the query execution time:

[view on explain.tensor.ru]
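
A plan like the one linked above can be captured with EXPLAIN. The following is a minimal sketch, assuming the tbl table created earlier. ANALYZE and BUFFERS are standard EXPLAIN options, and the timings you get will depend on your hardware and on the random data.

```sql
-- Execute the query for real and report per-node timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;
```

Comparing the actual time of the node that reads tbl with the total execution time shows how much of the work is spent outside of reading the data.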

Taking it apart piece by piece

Let's take a closer look at the query and ask ourselves:

  1. Why is WITH RECURSIVE used here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are then joined back to the original sample anyway?
    +25% time
  3. Why use an unconditional 'SELECT * FROM' at the end to re-read the previous CTE?
    +14% time
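
As a rough intermediate step, points 1 and 3 can be addressed without touching the logic: drop the RECURSIVE keyword and let the final SELECT perform the join itself instead of re-reading a third CTE. This is only a sketch of that idea, using the same table and column names as the original query. It does not yet deal with point 2.

```sql
-- No RECURSIVE: none of the CTEs refers to itself.
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- The join is done directly in the final SELECT,
-- so there is no extra CTE to scan once more.
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```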

In the original query's plan, we were also very lucky that a Hash Join was chosen for the join rather than a Nested Loop, because otherwise we would have gotten not a single CTE Scan pass, but 10K of them!

A bit about CTE Scan

Here we need to remember that a CTE Scan is analogous to a Seq Scan: there is no index, only a full scan. That would cost 10K x 0.3ms = 3000ms for loops over cte_max, or 1K x 1.5ms = 1500ms for loops over cte_bind!
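
To see the Nested Loop cost for yourself, the planner can be steered away from the other join methods for a single transaction. This is a sketch, assuming the tbl table from above. enable_hashjoin and enable_mergejoin are standard planner settings, but whether the planner then actually chooses a Nested Loop over the CTEs depends on your PostgreSQL version and statistics.

```sql
BEGIN;
-- Discourage hash and merge joins for this transaction only.
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;

EXPLAIN (ANALYZE)
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

-- Nothing was changed; ROLLBACK also discards the SET LOCAL settings.
ROLLBACK;
```

In the output, the loops= value on the inner CTE Scan node shows how many times that CTE was scanned in full.
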
So what did you actually want to get as a result? Yes, this is usually the question that comes up somewhere around the 5th minute of analyzing "three-story" queries.

We wanted to output, for each unique key pair, the min/max from the group by key_a.
So let's use window functions for this:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view on explain.tensor.ru]
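
Before swapping one query for the other, it is worth checking that they return the same rows on your data. The sketch below reuses the table and column names from above and counts the rows that appear in only one of the two results. The names old_q, new_q, only_in_old and only_in_new are made up for this illustration. One caveat: DISTINCT ON without ORDER BY keeps an unpredictable row per key pair, so if tbl contains duplicate (key_a, key_b) pairs, the original query takes min/max over one arbitrary row per pair while the window version takes it over every row. In that case the counts may be nonzero.

```sql
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- the original logic: aggregate per key_a, then join back to the pairs
, old_q AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , m.bind_fld1
  , m.bind_fld2
  FROM
    cte_bind
  INNER JOIN (
    SELECT
      a
    , max(bind_fld1) bind_fld1
    , min(bind_fld2) bind_fld2
    FROM
      cte_bind
    GROUP BY
      a
  ) m
    ON m.a = cte_bind.a
)
-- the window-function version
, new_q AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , max(fld1) OVER (w) bind_fld1
  , min(fld2) OVER (w) bind_fld2
  FROM
    tbl
  WINDOW
    w AS (PARTITION BY key_a)
)
SELECT
  (SELECT count(*) FROM (SELECT * FROM old_q EXCEPT SELECT * FROM new_q) x) only_in_old
, (SELECT count(*) FROM (SELECT * FROM new_q EXCEPT SELECT * FROM old_q) y) only_in_new;
```

If both counts are zero, the two queries agree on this data set.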

Since reading the data takes about the same 4-5ms in both variants, our entire time saving is, in its purest form, load removed from the database CPU, which adds up if such a query is executed often enough.

In general, you should not force the database to "carry what is round and roll what is square."

Source: www.habr.com
