PostgreSQL Antipatterns: How Deep Is the Rabbit Hole? Let's Walk Through the Hierarchy

In complex ERP systems many entities are hierarchical by nature, with homogeneous objects arranged in a tree of ancestor-descendant relationships: the organizational structure of an enterprise (all those branches, departments and work groups), the product catalog, areas of work, the geography of sales points, ...

In fact, there is hardly a business automation domain where no hierarchy turns up. But even if you don't work "for business", you can still easily run into hierarchical relationships. Even your family tree or the floor plan of a shopping mall is the same kind of structure.

There are many ways to store such a tree in a DBMS, but today we will focus on just one option:

CREATE TABLE hier(
  id
    integer
      PRIMARY KEY
, pid
    integer
      REFERENCES hier
, data
    json
);

CREATE INDEX ON hier(pid); -- remember that, unlike a PK, an FK does not automatically create an index

And while you peer into the depths of the hierarchy, it waits patiently to see how (in)efficient your "naive" ways of working with such a structure turn out to be.

Let's look at the typical tasks that come up, how they are implemented in SQL, and try to improve their performance.

#1. How deep is the rabbit hole?

Let's assume, for definiteness, that this structure reflects the subordination of units in an organizational structure: departments, divisions, sectors, branches, work groups... whatever you call them.

First, let's generate our "tree" of 10K elements:

INSERT INTO hier
WITH RECURSIVE T AS (
  SELECT
    1::integer id
  , '{1}'::integer[] pids
UNION ALL
  SELECT
    id + 1
  , pids[1:(random() * array_length(pids, 1))::integer] || (id + 1)
  FROM
    T
  WHERE
    id < 10000
)
SELECT
  pids[array_length(pids, 1)] id
, pids[array_length(pids, 1) - 1] pid
FROM
  T;

Let's start with the simplest task: finding all employees who work within a particular sector or, in terms of the hierarchy, finding all descendants of a node. It would also be nice to get the "depth" of each descendant... All this may be needed, for example, to build some kind of complex selection based on the list of these employees' IDs.
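The task just described can be sketched on a tiny hand-filled tree. This is only an illustration: the table demo_depth and its six rows are assumptions made up for the example, and the start node itself is reported at depth 0.

```sql
-- Hypothetical demo table: same id/pid shape as hier, but tiny and hand-filled.
CREATE TEMP TABLE demo_depth(
  id integer PRIMARY KEY
, pid integer REFERENCES demo_depth
);

INSERT INTO demo_depth(id, pid) VALUES
  (1, NULL) -- root of the subtree we search
, (2, 1)
, (3, 1)
, (4, 2)
, (5, 4)
, (6, NULL); -- a separate root, must not show up in the result

-- All descendants of node 1 together with their depth below it.
WITH RECURSIVE T AS (
  SELECT
    id
  , 0 depth
  FROM
    demo_depth
  WHERE
    id = 1
UNION ALL
  SELECT
    demo_depth.id
  , T.depth + 1
  FROM
    T
  JOIN
    demo_depth
      ON demo_depth.pid = T.id
)
SELECT
  id
, depth
FROM
  T
ORDER BY
  depth, id;
-- rows: (1,0), (2,1), (3,1), (4,2), (5,3)
```

The depth counter costs almost nothing here, since it is carried along with each row of the recursion.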

Everything would be fine if there were only a couple of levels of descendants and their number stayed within a dozen, but if there are more than 5 levels and already dozens of descendants, problems can arise. Let's look at how the options for searching down the tree are written (and how they perform). But first, let's determine which nodes are the most interesting for our study.

The "deepest" subtrees:

WITH RECURSIVE T AS (
  SELECT
    id
  , pid
  , ARRAY[id] path
  FROM
    hier
  WHERE
    pid IS NULL
UNION ALL
  SELECT
    hier.id
  , hier.pid
  , T.path || hier.id
  FROM
    T
  JOIN
    hier
      ON hier.pid = T.id
)
TABLE T ORDER BY array_length(path, 1) DESC;

 id  | pid  | path
---------------------------------------------
7624 | 7623 | {7615,7620,7621,7622,7623,7624}
4995 | 4994 | {4983,4985,4988,4993,4994,4995}
4991 | 4990 | {4983,4985,4988,4989,4990,4991}
...

The "widest" subtrees:

...
SELECT
  path[1] id
, count(*)
FROM
  T
GROUP BY
  1
ORDER BY
  2 DESC;

id   | count
------------
5300 |   30
 450 |   28
1239 |   27
1573 |   25

In these queries we used a typical recursive JOIN:

Obviously, with this query model the number of iterations will match the total number of descendants (and there are several dozen of them), and this can consume quite significant resources and, as a consequence, time.

Let's check this on the "widest" subtree:

WITH RECURSIVE T AS (
  SELECT
    id
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    hier.id
  FROM
    T
  JOIN
    hier
      ON hier.pid = T.id
)
TABLE T;

[view on explain.tensor.ru]

As expected, we found all 30 records. But 60% of the total time was spent on this, because it also took 30 index lookups. Can we do with fewer?

Bulk reads by index

Do we need a separate index query for each node? It turns out we don't: we can read from the index by several keys at once in a single call using = ANY(array).

And as each such group of identifiers we can take all the IDs found at the previous step, level by level. That is, at each next step we will search for all descendants of a given level at once.

There is just one catch: in a recursive query you cannot refer to the recursive CTE itself from a nested subquery, yet we somehow need to select only what was found at the previous level... It turns out that a nested query over the whole result set is impossible, but over a specific field of it is possible. And that field can be an array, which is exactly what we need for ANY.

It sounds a bit crazy, but in a diagram it is all simple.

WITH RECURSIVE T AS (
  SELECT
    ARRAY[id] id$
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    ARRAY(
      SELECT
        id
      FROM
        hier
      WHERE
        pid = ANY(T.id$)
    ) id$
  FROM
    T
  WHERE
    coalesce(id$, '{}') <> '{}' -- loop exit condition: an empty array
)
SELECT
  unnest(id$) id
FROM
  T;

[view on explain.tensor.ru]

And the most important thing here is not even the 1.5x win in time, but that we read fewer buffers, since we make only 5 index calls instead of 30!

An additional bonus is that after the final unnest the identifiers stay ordered by "level".
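Since SQL does not formally guarantee row order without ORDER BY, the level ordering can be made explicit by carrying a level counter next to the array. The sketch below assumes a tiny hand-filled table (demo_levels is a hypothetical name) and adds a lvl column that the query above does not have.

```sql
-- Hypothetical demo table, hand-filled; not the 10K-row hier from the article.
CREATE TEMP TABLE demo_levels(
  id integer PRIMARY KEY
, pid integer REFERENCES demo_levels
);

INSERT INTO demo_levels(id, pid) VALUES
  (1, NULL)
, (2, 1)
, (3, 1)
, (4, 2)
, (5, 4);

WITH RECURSIVE T AS (
  SELECT
    ARRAY[id] id$
  , 0 lvl -- assumed extra column: number of the level this array belongs to
  FROM
    demo_levels
  WHERE
    id = 1
UNION ALL
  SELECT
    ARRAY(
      SELECT
        id
      FROM
        demo_levels
      WHERE
        pid = ANY(T.id$) -- one index call for the whole previous level
    ) id$
  , T.lvl + 1
  FROM
    T
  WHERE
    coalesce(T.id$, '{}') <> '{}' -- stop once a level comes back empty
)
SELECT
  T.lvl
, u.id
FROM
  T
CROSS JOIN LATERAL
  unnest(T.id$) AS u(id)
ORDER BY
  T.lvl, u.id;
-- rows: (0,1), (1,2), (1,3), (2,4), (3,5)
```

The last iteration produces an empty array, which contributes no rows after unnest.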

Node flag

The next consideration that helps improve performance: "leaves" cannot have children, so there is no need to look "down" from them at all. In terms of our task, this means that if we followed the chain of departments and reached an employee, there is no need to look any further along that branch.

Let's add an extra boolean field to our table that immediately tells us whether a given entry in our tree is a "node", that is, whether it can have descendants at all.

ALTER TABLE hier
  ADD COLUMN branch boolean;

UPDATE
  hier T
SET
  branch = TRUE
WHERE
  EXISTS(
    SELECT
      NULL
    FROM
      hier
    WHERE
      pid = T.id
    LIMIT 1
);
-- Query returned successfully: 3033 rows affected in 42 ms.

Great! It turns out that only a little over 30% of all tree elements have descendants.
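The share of nodes with descendants can be checked directly from the flag. A minimal sketch on a made-up six-row table (demo_flags is a hypothetical name); unlike the UPDATE above, it sets the flag to FALSE for leaves instead of leaving it NULL, which makes no difference to FILTER(WHERE branch).

```sql
-- Hypothetical demo table with the same id/pid/branch columns.
CREATE TEMP TABLE demo_flags(
  id integer PRIMARY KEY
, pid integer REFERENCES demo_flags
, branch boolean
);

INSERT INTO demo_flags(id, pid) VALUES
  (1, NULL)
, (2, 1)
, (3, 1)
, (4, 2)
, (5, 4)
, (6, NULL);

-- Mark every row that has at least one child.
UPDATE
  demo_flags T
SET
  branch = EXISTS(
    SELECT
      NULL
    FROM
      demo_flags
    WHERE
      pid = T.id
  );

-- Count the flagged rows and their share of the whole tree.
SELECT
  count(*) FILTER(WHERE branch) branches
, count(*) total
, round(100.0 * count(*) FILTER(WHERE branch) / count(*), 1) pct
FROM
  demo_flags;
-- rows: (3, 6, 50.0)
```

In a live table the flag would also have to be kept up to date whenever children are added or removed; how to do that is outside this sketch.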

Now let's use a slightly different mechanism: joining to the recursive part via LATERAL, which lets us access the fields of the recursive "table" directly, and using an aggregate function with a filter condition based on the node flag to shrink the set of keys:

WITH RECURSIVE T AS (
  SELECT
    array_agg(id) id$
  , array_agg(id) FILTER(WHERE branch) ns$
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    X.*
  FROM
    T
  JOIN LATERAL (
    SELECT
      array_agg(id) id$
    , array_agg(id) FILTER(WHERE branch) ns$
    FROM
      hier
    WHERE
      pid = ANY(T.ns$)
  ) X
    ON coalesce(T.ns$, '{}') <> '{}'
)
SELECT
  unnest(id$) id
FROM
  T;

[view on explain.tensor.ru]

We managed to cut one more index call and won more than 2x in the volume of data read.

#2. Back to the roots

This algorithm is useful when you need to collect the records of all elements "up the tree" while keeping track of which source leaf (and with which indicators) caused each one to be included in the selection, for example, to generate a summary report with aggregation into nodes.

What follows should be taken as a proof of concept only, since the query turns out to be very cumbersome. But if it dominates your database, it is worth considering similar techniques.

Let's start with a couple of simple statements:

  • It is best to read the same record from the database only once.
  • It is more efficient to read records from the database in batches than one by one.

Now let's try to build the query we need.

Step 1

Obviously, when initializing the recursion (where would we be without it!) we have to read the leaf records themselves based on the set of initial identifiers:

WITH RECURSIVE tree AS (
  SELECT
    rec -- the whole table row
  , id::text chld -- the "set" of source leaves that led here
  FROM
    hier rec
  WHERE
    id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
  ...

If it seems odd to anyone that the "set" is stored as a string and not as an array, there is a simple explanation. There is a built-in aggregating "gluing" function for strings, string_agg, but not for arrays. It is easy to implement one yourself, though.
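A minimal sketch of such a user-defined aggregate, built on the built-in array_cat function. The name array_concat_agg is an assumption made up for this example, and the declaration targets PostgreSQL 14 or later, where array_cat is declared over anycompatiblearray; on older versions anyarray would be needed in both places.

```sql
-- Hypothetical aggregate name; array_cat itself is a built-in function.
-- PostgreSQL 14+: anycompatiblearray. PostgreSQL 13 and earlier: use anyarray.
CREATE AGGREGATE array_concat_agg(anycompatiblearray) (
  SFUNC = array_cat            -- state := array_cat(state, next_value)
, STYPE = anycompatiblearray   -- state starts as NULL; array_cat(NULL, x) yields x
);

-- Usage: glue several integer arrays into one, in a defined order.
SELECT
  array_concat_agg(a ORDER BY a) glued
FROM
  (VALUES
    (ARRAY[1, 2])
  , (ARRAY[3])
  , (ARRAY[4, 5])
  ) v(a);
-- glued: {1,2,3,4,5}
```

The article keeps the string representation, so this aggregate is not used in the queries that follow.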

Step 2

Now we would get a set of section IDs that need to be read next. They will almost always be duplicated across different records of the original set, so we would group them, while preserving the information about the source leaves.

But three troubles await us here:

  1. The "recursive" part of the query cannot contain aggregate functions with GROUP BY.
  2. A reference to the recursive "table" cannot appear in a nested subquery.
  3. A query in the recursive part cannot contain a CTE.

Fortunately, all these problems are quite easy to work around. Let's start from the end.

CTE in the recursive part

This does not work:

WITH RECURSIVE tree AS (
  ...
UNION ALL
  WITH T (...)
  SELECT ...
)

But this does work; the parentheses make all the difference!

WITH RECURSIVE tree AS (
  ...
UNION ALL
  (
    WITH T (...)
    SELECT ...
  )
)

Nested query against the recursive "table"

Hmm... A recursive CTE cannot be accessed in a subquery. But it can be accessed inside a CTE! And a nested query can then access that CTE!

GROUP BY inside the recursion

It is unpleasant, but... We have a simple way to emulate GROUP BY using DISTINCT ON and window functions!

SELECT
  (rec).pid id
, string_agg(chld::text, ',') chld
FROM
  tree
WHERE
  (rec).pid IS NOT NULL
GROUP BY 1 -- does not work!

And this works!

SELECT DISTINCT ON((rec).pid)
  (rec).pid id
, string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
FROM
  tree
WHERE
  (rec).pid IS NOT NULL

Now we see why the numeric ID was cast to text: so that the values could be glued together, separated by commas!
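Outside of a recursion both forms can be run side by side to see that they agree. A minimal sketch on a made-up table (demo_pairs and its four rows are assumptions); an explicit ORDER BY with a full-partition frame is added to the window only to make the glued string deterministic, since ORDER BY alone would switch the window to a running frame and return partial strings.

```sql
-- Hypothetical demo data: (parent id, source leaf id as text).
CREATE TEMP TABLE demo_pairs(
  pid integer
, chld text
);

INSERT INTO demo_pairs(pid, chld) VALUES
  (10, '1')
, (10, '2')
, (20, '4')
, (10, '8');

-- The ordinary aggregate form (the one not allowed in the recursive part).
SELECT
  pid id
, string_agg(chld, ',' ORDER BY chld) chld
FROM
  demo_pairs
GROUP BY
  1
ORDER BY
  1;
-- rows: (10, '1,2,8'), (20, '4')

-- The emulation: window aggregate over the whole partition + DISTINCT ON.
SELECT DISTINCT ON(pid)
  pid id
, string_agg(chld, ',') OVER(
    PARTITION BY pid
    ORDER BY chld
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) chld
FROM
  demo_pairs
ORDER BY
  pid;
-- rows: (10, '1,2,8'), (20, '4')
```

Every row of a partition carries the same full string, so it does not matter which one DISTINCT ON keeps.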

Step 3

For the final step there is very little left to do:

  • read the "section" records based on the set of grouped IDs
  • match the sections we read with the "sets" of source leaves
  • "expand" the set-string using unnest(string_to_array(chld, ',')::integer[])

WITH RECURSIVE tree AS (
  SELECT
    rec
  , id::text chld
  FROM
    hier rec
  WHERE
    id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
  (
    WITH prnt AS (
      SELECT DISTINCT ON((rec).pid)
        (rec).pid id
      , string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
      FROM
        tree
      WHERE
        (rec).pid IS NOT NULL
    )
    , nodes AS (
      SELECT
        rec
      FROM
        hier rec
      WHERE
        id = ANY(ARRAY(
          SELECT
            id
          FROM
            prnt
        ))
    )
    SELECT
      nodes.rec
    , prnt.chld
    FROM
      prnt
    JOIN
      nodes
        ON (nodes.rec).id = prnt.id
  )
)
SELECT
  unnest(string_to_array(chld, ',')::integer[]) leaf
, (rec).*
FROM
  tree;

[view on explain.tensor.ru]

Source: www.habr.com