PostgreSQL Antipatterns: CTE x CTE

In my line of work, I regularly run into situations where a developer writes a query and thinks, "The database is smart, it can handle everything by itself!"

In some cases (partly from ignorance of the database's capabilities, partly from premature optimization), this approach produces "Frankensteins".

First, here is an example of such a query:

-- for each key pair, find the associated field values
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
-- find the min/max of the values for each first key
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
-- join the key pairs with the min/max values on the first key
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;

To assess the quality of the query concretely, let's create an arbitrary data set:

CREATE TABLE tbl AS
SELECT
  (random() * 1000)::integer key_a
, (random() * 1000)::integer key_b
, (random() * 10000)::integer fld1
, (random() * 10000)::integer fld2
FROM
  generate_series(1, 10000);
CREATE INDEX ON tbl(key_a, key_b);

It turns out that reading the data took less than a quarter of the query's execution time:

[view at explain.tensor.ru]
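A timing breakdown of this kind can be obtained by prefixing the query with EXPLAIN. The following is a minimal sketch, assuming the `tbl` table created above; `ANALYZE` actually executes the query and `BUFFERS` adds page-access statistics. Absolute timings depend on the machine and PostgreSQL version, so they will not match the article's numbers exactly.

```sql
-- Run the original query and print its plan with per-node timings.
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
, cte_a_bind AS (
  SELECT
    cte_bind.a
  , cte_bind.b
  , cte_max.bind_fld1
  , cte_max.bind_fld2
  FROM
    cte_bind
  INNER JOIN
    cte_max
      ON cte_max.a = cte_bind.a
)
SELECT * FROM cte_a_bind;
```

In the output, the `actual time` of the scan over `tbl` can be compared with the total `Execution Time` to see how much of the work is spent outside of reading the data.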

Picking it apart

Let's look at the query closely and ask ourselves:

  1. Why is WITH RECURSIVE used here if there are no recursive CTEs?
  2. Why group the min/max values in a separate CTE if they are joined back to the original selection anyway?
    + 25% time
  3. Why use an unconditional 'SELECT * FROM' at the end just to re-read the previous CTE?
    + 14% time
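Questions 1 and 3 can be addressed without touching the logic at all. As an intermediate sketch (not the final form, and assuming only the final result set matters), the RECURSIVE keyword is dropped and the body of the last CTE becomes the main query, which removes the extra pass over `cte_a_bind`:

```sql
-- Same logic as the original, minus RECURSIVE and the trailing
-- "SELECT * FROM cte_a_bind" wrapper.
WITH cte_bind AS (
  SELECT DISTINCT ON (key_a, key_b)
    key_a a
  , key_b b
  , fld1 bind_fld1
  , fld2 bind_fld2
  FROM
    tbl
)
, cte_max AS (
  SELECT
    a
  , max(bind_fld1) bind_fld1
  , min(bind_fld2) bind_fld2
  FROM
    cte_bind
  GROUP BY
    a
)
SELECT
  cte_bind.a
, cte_bind.b
, cte_max.bind_fld1
, cte_max.bind_fld2
FROM
  cte_bind
INNER JOIN
  cte_max
    ON cte_max.a = cte_bind.a;
```

Question 2, the separate grouping CTE, remains open in this version.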

In this case we were very lucky that a Hash Join was chosen for the join rather than a Nested Loop, because otherwise we would have gotten not a single CTE Scan pass, but 10K of them!

A little about CTE Scan
Here we have to remember that a CTE Scan is similar to a Seq Scan: there is no index, only a full scan, which would take 10K x 0.3ms = 3000ms when looping over cte_max or 1K x 1.5ms = 1500ms when looping over cte_bind!

So what did you actually want to get as a result? Yes, this is usually the question that comes up somewhere around the fifth minute of analyzing "three-story" queries.

We wanted to output, for each unique key pair, the min/max from the group by key_a.
So let's use window functions for this:

SELECT DISTINCT ON(key_a, key_b)
	key_a a
,	key_b b
,	max(fld1) OVER(w) bind_fld1
,	min(fld2) OVER(w) bind_fld2
FROM
	tbl
WINDOW
	w AS (PARTITION BY key_a);

[view at explain.tensor.ru]

Since reading the data takes about the same 4-5ms in both variants, our entire time saving of 32% is pure load removed from the database CPU, provided such a query is executed often enough.

In general, you shouldn't force the database to "carry what is round and roll what is square."

source: www.habr.com
