PostgreSQL Antipatterns: CTE x CTE

Because of my line of work, I have to deal with situations where a developer writes a query and thinks, "The database is smart, it can handle everything by itself!"

In some cases (partly from not knowing what the database can do, partly from premature optimization), this approach produces "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs and the min/max values on the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To evaluate the quality of the query properly, let's create an arbitrary data set:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turns out that reading the data took less than a quarter of the query execution time:

[view the plan at explain.tensor.ru]

Taking it apart piece by piece

Let's look closely at the query and ask what is puzzling about it.

  1. Why is WITH RECURSIVE here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are joined back to the original sample anyway?
    +25% time
  3. Why use an unconditional 'SELECT * FROM' at the end, over the previous CTE?
    +14% time
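
Points 1 and 3 above can be addressed without changing what the query returns. The following is only a sketch of that intermediate step, not the author's final version: it drops the RECURSIVE keyword and moves the join into the main SELECT, while keeping the separate cte_max from point 2.

```sql
-- Sketch only: same CTE names and aliases as the query above
WITH cte_bind AS (              -- plain WITH: nothing here is recursive
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (                  -- still a separate aggregation (point 2)
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- the join is now the main query, so no extra pass re-reads a wrapper CTE
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```

The join itself is still there, which is why the choice of join strategy matters.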

In this case we were very lucky that Hash Join was chosen for the join rather than Nested Loop, because otherwise we would have gotten not a single CTE Scan pass, but 10K of them!

A little about CTE Scan

Here we should remember that CTE Scan is similar to Seq Scan: there are no indexes, only a full scan. That would cost 10K x 0.3ms = 3000ms for the loops over cte_max, or 1K x 1.5ms = 1500ms when looping over cte_bind!

So what did you actually want as the result? Yes, this is usually the question that comes up around the 5th minute of analyzing "three-story" queries.

We wanted to output, for each unique key pair, the min/max from the group by key_a.
So let's use window functions for this:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view the plan at explain.tensor.ru]
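
To reproduce a plan like the one linked above on a local instance, one option is to prefix the query with EXPLAIN (ANALYZE, BUFFERS). This is a minimal sketch: the ANALYZE tbl step is an assumption added here (it is not part of the original article) to refresh planner statistics after CREATE TABLE AS, and the timings you get will differ from the numbers quoted in this article.

```sql
-- Assumption: refresh planner statistics for the freshly created test table
ANALYZE tbl;

-- EXPLAIN with ANALYZE actually executes the query and reports real timings;
-- BUFFERS adds buffer usage (pages hit/read) for each plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);
```

The text output can be read directly in psql or pasted into a plan visualizer.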

Since reading the data takes about the same 4-5ms in both variants, our entire time gain of -32% is pure load taken off the database server's CPU, provided such a query is executed often enough.

In general, you should not force the database to "carry the round and roll the square".

Source: www.habr.com
