PostgreSQL Antipatterns: How deep is the rabbit hole? Let's walk through the hierarchy

In complex ERP systems many entities are hierarchical by nature: homogeneous objects line up in a tree of ancestor-descendant relationships. Examples are the organizational structure of a company (all those branches, departments and work groups), the product catalog, work areas, the geography of sales points, ...


In fact, there is no area of business automation where some hierarchy does not turn up. But even if you do not work "for business," you can still easily run into hierarchical relationships. It is trite, but even your family tree or the floor plan of a shopping mall has the same structure.

There are many ways to store such a tree in a DBMS, but today we will focus on just one option:

CREATE TABLE hier(
  id
    integer
      PRIMARY KEY
, pid
    integer
      REFERENCES hier
, data
    json
);

CREATE INDEX ON hier(pid); -- remember that an FK does not automatically create an index, unlike a PK
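
The comment about the missing FK index can be checked directly. The following is a minimal sketch on a hypothetical temporary table, hier_fk_demo (not part of the original schema), that lists the indexes before and after the explicit CREATE INDEX:

```sql
-- hier_fk_demo is a hypothetical stand-in with the same shape as hier
CREATE TEMP TABLE hier_fk_demo(
  id
    integer
      PRIMARY KEY
, pid
    integer
      REFERENCES hier_fk_demo
, data
    json
);

-- at this point only the index backing the primary key exists
SELECT indexname FROM pg_indexes WHERE tablename = 'hier_fk_demo';

CREATE INDEX ON hier_fk_demo(pid);

-- now there are two: the PK index and the one on pid
SELECT indexname FROM pg_indexes WHERE tablename = 'hier_fk_demo' ORDER BY 1;
```

Without the second index, every "find the children of X" lookup on pid would have to scan the whole table.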

And while you are peering into the depths of the hierarchy, it patiently waits to see how [in]efficient your "naive" ways of working with such a structure will turn out to be.

Let's look at the typical tasks that arise, their implementation in SQL, and try to improve their performance.

#1. How deep is the rabbit hole?

For definiteness, let's assume that this structure reflects the subordination of units in an organization: departments, divisions, sectors, branches, work groups... whatever you call them.

First, let's generate our "tree" of 10K elements:

INSERT INTO hier
WITH RECURSIVE T AS (
  SELECT
    1::integer id
  , '{1}'::integer[] pids
UNION ALL
  SELECT
    id + 1
  , pids[1:(random() * array_length(pids, 1))::integer] || (id + 1)
  FROM
    T
  WHERE
    id < 10000
)
SELECT
  pids[array_length(pids, 1)] id
, pids[array_length(pids, 1) - 1] pid
FROM
  T;
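
How the generator picks a parent may not be obvious at first glance, so here is a small sketch of the array operations it relies on (the literal values are made up for illustration). Each row keeps the path from a root in pids; the next node keeps a random-length prefix of the previous path and appends its own id. When the random length rounds to 0, the prefix is empty and the new node becomes another root, because its parent is read from subscript 0, which yields NULL.

```sql
SELECT
  ('{10,20,30}'::integer[])[1:2] || 40 AS child_path -- prefix of 2 elements kept: {10,20,40}
, ('{10,20,30}'::integer[])[1:0] || 40 AS root_path  -- empty prefix: {40}
, ('{40}'::integer[])[0]               AS root_pid;  -- out-of-range subscript: NULL
```

So the generated data is a forest with several roots rather than a single tree, which is why the queries below start from pid IS NULL.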

Let's start with the simplest task: finding all employees who work within a specific sector or, in terms of the hierarchy, finding all descendants of a node. It would also be nice to get the "depth" of each descendant... All of this may be needed, for example, to build some complex selection based on the list of IDs of these employees.

Everything would be fine if there were only a couple of levels of descendants and about a dozen of them in total, but with more than 5 levels and already dozens of descendants, problems are likely. Let's look at how the traditional top-down tree search is written (and how it works). But first, let's determine which nodes will be the most interesting for our research.

The "deepest" subtrees:

WITH RECURSIVE T AS (
  SELECT
    id
  , pid
  , ARRAY[id] path
  FROM
    hier
  WHERE
    pid IS NULL
UNION ALL
  SELECT
    hier.id
  , hier.pid
  , T.path || hier.id
  FROM
    T
  JOIN
    hier
      ON hier.pid = T.id
)
TABLE T ORDER BY array_length(path, 1) DESC;

 id  | pid  | path
---------------------------------------------
7624 | 7623 | {7615,7620,7621,7622,7623,7624}
4995 | 4994 | {4983,4985,4988,4993,4994,4995}
4991 | 4990 | {4983,4985,4988,4989,4990,4991}
...

The "widest" subtrees:

...
SELECT
  path[1] id
, count(*)
FROM
  T
GROUP BY
  1
ORDER BY
  2 DESC;

id   | count
------------
5300 |   30
 450 |   28
1239 |   27
1573 |   25

For these queries we used the typical recursive JOIN.

Obviously, with this query model the number of iterations (one index lookup per found node) will match the total number of descendants, and there are several dozen of them. That can take quite significant resources and, as a result, time.

Let's check on the "widest" subtree:

WITH RECURSIVE T AS (
  SELECT
    id
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    hier.id
  FROM
    T
  JOIN
    hier
      ON hier.pid = T.id
)
TABLE T;

[query plan: view on explain.tensor.ru]

As expected, we found all 30 records. But this took 60% of the total query time, because it also required 30 index searches. Can it be done with fewer?
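
To reproduce such numbers locally, one option is to run the same query under EXPLAIN (ANALYZE, BUFFERS). This is only a sketch: it assumes the hier table generated above, and since the data is random, the subtree of id 5300 and the plan the planner chooses will differ from the ones discussed here.

```sql
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE T AS (
  SELECT
    id
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    hier.id
  FROM
    T
  JOIN
    hier
      ON hier.pid = T.id
)
TABLE T;
```

In the output, the loops value on the index scan over hier(pid) shows how many separate lookups were made, and the Buffers lines show how many pages were touched.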

Bulk reading from the index

Do we really have to make a separate index query for each node? It turns out we do not: we can read from the index by several keys at once in a single call using = ANY(array).
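
As a minimal sketch of the idea in isolation, the following uses a hypothetical five-row table, hier_any_demo, and fetches the children of two parents with one condition. On a table this small the planner will most likely prefer a sequential scan; the point is only the shape of the predicate.

```sql
-- hier_any_demo is a hypothetical miniature of hier
CREATE TEMP TABLE hier_any_demo(id, pid) AS
VALUES (1, NULL::integer), (2, 1), (3, 1), (4, 2), (5, 3);

CREATE INDEX ON hier_any_demo(pid);

-- one statement, one index condition, several keys
SELECT
  id
, pid
FROM
  hier_any_demo
WHERE
  pid = ANY('{2,3}'::integer[])
ORDER BY
  id;
```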

And as each such group of identifiers we can take all the IDs found at the previous step. That is, at each next step we will search for all descendants of a given level at once.

The only trouble is that in a recursive query you cannot refer to the recursive table itself from a nested query, yet we somehow need to select only what was found at the previous level... It turns out that a nested query over the whole recursive selection is not allowed, but a nested query over a specific field of it is. And that field can also be an array, which is exactly what we need for ANY.

It sounds a bit crazy, but in practice everything is simple:


WITH RECURSIVE T AS (
  SELECT
    ARRAY[id] id$
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    ARRAY(
      SELECT
        id
      FROM
        hier
      WHERE
        pid = ANY(T.id$)
    ) id$
  FROM
    T
  WHERE
    coalesce(id$, '{}') <> '{}' -- loop exit condition: an empty array
)
SELECT
  unnest(id$) id
FROM
  T;

[query plan: view on explain.tensor.ru]

And the most important thing here is not even the 1.5x win in time, but that we read fewer buffers, since we make only 5 index calls instead of 30!

An additional bonus is that after the final unnest the identifiers remain ordered by "level".
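
Since each recursion step now corresponds to exactly one level, the "depth" mentioned earlier comes almost for free. The following is a sketch under assumptions: hier_lvl_demo is a hypothetical six-row tree, and lvl is an illustrative counter column that is not part of the original query. An explicit ORDER BY is added because SQL does not formally guarantee row order without one.

```sql
-- hier_lvl_demo is a hypothetical miniature of hier
CREATE TEMP TABLE hier_lvl_demo(id, pid) AS
VALUES (1, NULL::integer), (2, 1), (3, 1), (4, 2), (5, 2), (6, 4);

WITH RECURSIVE T AS (
  SELECT
    ARRAY[id] id$
  , 0 lvl -- depth of the starting node
  FROM
    hier_lvl_demo
  WHERE
    id = 1
UNION ALL
  SELECT
    ARRAY(
      SELECT
        id
      FROM
        hier_lvl_demo
      WHERE
        pid = ANY(T.id$)
    ) id$
  , T.lvl + 1 -- one step of recursion is one level down
  FROM
    T
  WHERE
    coalesce(id$, '{}') <> '{}'
)
SELECT
  unnest(id$) id
, lvl
FROM
  T
ORDER BY
  lvl, id;
```

For this tree the result is node 1 at level 0, nodes 2 and 3 at level 1, nodes 4 and 5 at level 2, and node 6 at level 3.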

Node flag

The next consideration that will help improve performance: "leaves" cannot have children, so there is no need to look "down" from them at all. In terms of our task, this means that if we followed the chain of departments and reached an employee, there is no need to search further along that branch.

Let's add an extra boolean field to our table that tells us immediately whether this particular entry of our tree is a "node", that is, whether it can have descendants at all.

ALTER TABLE hier
  ADD COLUMN branch boolean;

UPDATE
  hier T
SET
  branch = TRUE
WHERE
  EXISTS(
    SELECT
      NULL
    FROM
      hier
    WHERE
      pid = T.id
    LIMIT 1
);
-- Query returned successfully: 3033 rows affected in 42 ms.

Great! It turns out that only slightly more than 30% of all tree elements have descendants.
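
Note that the UPDATE only sets TRUE; rows without children keep NULL in branch. The sketch below, on made-up VALUES rather than the real table, shows how an aggregate with FILTER(WHERE branch) treats such rows: NULL is simply not selected, and when nothing passes the filter, array_agg returns NULL rather than an empty array. That is why a coalesce(..., '{}') is needed in an "is there anything left to expand?" check.

```sql
-- one node with children (TRUE) and two leaves (NULL)
SELECT
  array_agg(id) FILTER(WHERE branch) AS ns$ -- {1}
, coalesce(array_agg(id) FILTER(WHERE branch), '{}') <> '{}' AS keep_going
FROM
  (VALUES (1, TRUE), (2, NULL), (3, NULL)) AS v(id, branch);

-- leaves only: the filtered aggregate is NULL, so keep_going is false
SELECT
  array_agg(id) FILTER(WHERE branch) AS ns$
, coalesce(array_agg(id) FILTER(WHERE branch), '{}') <> '{}' AS keep_going
FROM
  (VALUES (2, NULL::boolean), (3, NULL)) AS v(id, branch);
```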

Now let's use slightly different mechanics: a join to the recursive part via LATERAL, which gives us immediate access to the fields of the recursive "table", and an aggregate function with a filter condition on the node flag to reduce the set of keys:


WITH RECURSIVE T AS (
  SELECT
    array_agg(id) id$
  , array_agg(id) FILTER(WHERE branch) ns$
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    X.*
  FROM
    T
  JOIN LATERAL (
    SELECT
      array_agg(id) id$
    , array_agg(id) FILTER(WHERE branch) ns$
    FROM
      hier
    WHERE
      pid = ANY(T.ns$)
  ) X
    ON coalesce(T.ns$, '{}') <> '{}'
)
SELECT
  unnest(id$) id
FROM
  T;

[query plan: view on explain.tensor.ru]

We managed to cut one more index call and won more than 2x in the volume of data read.

#2. Back to the roots

This algorithm is useful when you need to collect records for all elements "up the tree" while keeping track of which source leaf (and with which indicators) caused each record to be included in the selection, for example to generate a summary report with aggregation into nodes.

What follows should be taken purely as a proof of concept, since the query turns out to be very cumbersome. But if such a query dominates your database, it is worth considering similar techniques.

Let's start with a couple of simple statements:

  • The same record should be read from the database only once.
  • Records are read from the database more efficiently in a batch than one at a time.

Now let's try to construct the query we need.

Step 1

Obviously, when initializing the recursion (where would we be without it!) we have to select the records of the leaves themselves based on the set of initial identifiers:

WITH RECURSIVE tree AS (
  SELECT
    rec -- the whole table record
  , id::text chld -- the "set" of source leaves that led here
  FROM
    hier rec
  WHERE
    id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
  ...

If it seemed strange to anyone that the "set" is stored as a string and not as an array, there is a simple explanation. Strings have a built-in aggregate "gluing" function, string_agg, but arrays do not. Although it is easy to implement one yourself.
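
For readers who would rather stay with arrays, one possible shape of such a hand-made aggregate is sketched below. The name int_array_concat_agg is hypothetical; the definition simply reuses the built-in array_cat as the state transition function and is fixed to integer[] to keep it simple.

```sql
-- hypothetical user-defined aggregate: concatenates integer arrays
CREATE AGGREGATE int_array_concat_agg(integer[]) (
  SFUNC = array_cat
, STYPE = integer[]
, INITCOND = '{}'
);

SELECT
  pid
, int_array_concat_agg(chld) AS chld
FROM
  (VALUES (10, '{1,2}'::integer[]), (10, '{3}'), (20, '{4}')) AS v(pid, chld)
GROUP BY
  pid
ORDER BY
  pid;
```

The article itself continues with the string representation, so this aggregate is not used in the queries that follow.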

Step 2

Now we would get a set of section IDs that need to be read next. Almost always they will be duplicated across different records of the original set, so we would like to group them while keeping the information about the source leaves.

But here three troubles await us:

  1. The "recursive" part of the query cannot contain aggregate functions with GROUP BY.
  2. A reference to the recursive "table" cannot appear in a nested subquery.
  3. A query in the recursive part cannot contain a CTE.

Fortunately, all of these problems are fairly easy to work around. Let's start from the end.

CTE in the recursive part

This does not work:

WITH RECURSIVE tree AS (
  ...
UNION ALL
  WITH T (...)
  SELECT ...
)

And this works; the parentheses make the difference!

WITH RECURSIVE tree AS (
  ...
UNION ALL
  (
    WITH T (...)
    SELECT ...
  )
)
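
A minimal runnable instance of the parenthesized form may help, since the template above elides the details. The sketch below is a plain counter with hypothetical names (t, prev, n); it has nothing to do with the tree yet and only shows that a CTE inside the parentheses may read the recursive "table".

```sql
WITH RECURSIVE t AS (
  SELECT
    1 AS n
UNION ALL
  (
    WITH prev AS (
      SELECT
        n
      FROM
        t -- the recursive reference lives inside the inner CTE
    )
    SELECT
      n + 1
    FROM
      prev
    WHERE
      n < 5
  )
)
SELECT
  n
FROM
  t;
```

The query returns the numbers 1 through 5.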

Nested query against the recursive "table"

Hmm... The recursive CTE cannot be accessed from a subquery. But it can be accessed from inside a CTE! And a nested query can access that CTE!

GROUP BY inside the recursion

It is unpleasant, but... we have a simple way to emulate GROUP BY using DISTINCT ON and window functions!

SELECT
  (rec).pid id
, string_agg(chld::text, ',') chld
FROM
  tree
WHERE
  (rec).pid IS NOT NULL
GROUP BY 1 -- does not work!

And this is how it works!

SELECT DISTINCT ON((rec).pid)
  (rec).pid id
, string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
FROM
  tree
WHERE
  (rec).pid IS NOT NULL

Now we see why the numeric ID was converted to text: so that the values can be glued together, separated by commas!
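
The same emulation can be tried outside of any recursion. The sketch below uses a made-up src set instead of tree: the window function computes the full per-group string on every row of the group, and DISTINCT ON then keeps one row per group.

```sql
WITH src(pid, chld) AS (
  VALUES (10, '1'), (10, '2'), (20, '4')
)
SELECT DISTINCT ON(pid)
  pid id
, string_agg(chld, ',') OVER(PARTITION BY pid) chld -- same value on every row of the partition
FROM
  src
ORDER BY
  pid;
```

It yields one row per pid, just as GROUP BY pid would; the order of the values inside each string is not fixed unless the window itself is ordered.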

Step 3

For the finale we have almost nothing left to do:

  • we read the "section" records based on the set of grouped IDs
  • we match the selected sections with the "sets" of source leaves
  • we "expand" the set string using unnest(string_to_array(chld, ',')::integer[])
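
The last of these steps can be sketched on its own with made-up values: each comma-separated string is split into an array, cast to integers, and turned back into one row per source leaf.

```sql
SELECT
  id
, unnest(string_to_array(chld, ',')::integer[]) leaf -- one output row per leaf ID
FROM
  (VALUES (10, '1,2'), (20, '4')) AS v(id, chld);
```

This produces three rows: leaves 1 and 2 for id 10, and leaf 4 for id 20.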

WITH RECURSIVE tree AS (
  SELECT
    rec
  , id::text chld
  FROM
    hier rec
  WHERE
    id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
  (
    WITH prnt AS (
      SELECT DISTINCT ON((rec).pid)
        (rec).pid id
      , string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
      FROM
        tree
      WHERE
        (rec).pid IS NOT NULL
    )
    , nodes AS (
      SELECT
        rec
      FROM
        hier rec
      WHERE
        id = ANY(ARRAY(
          SELECT
            id
          FROM
            prnt
        ))
    )
    SELECT
      nodes.rec
    , prnt.chld
    FROM
      prnt
    JOIN
      nodes
        ON (nodes.rec).id = prnt.id
  )
)
SELECT
  unnest(string_to_array(chld, ',')::integer[]) leaf
, (rec).*
FROM
  tree;

[query plan: view on explain.tensor.ru]

Source: www.hab.com
