In complex ERP systems, many entities are hierarchical by nature: homogeneous objects line up in a tree of ancestor-descendant relationships. This applies to the organizational structure of an enterprise (all those branches, departments and working groups), the product catalog, areas of work, the geography of sales points, ...
In fact, there is hardly a domain where some kind of hierarchy does not show up.
There are many ways to store such trees in a DBMS, but today we will focus on just one option:
CREATE TABLE hier(
id integer PRIMARY KEY
, pid integer REFERENCES hier
, data json
);
CREATE INDEX ON hier(pid); -- remember: unlike a PK, an FK does not create an index automatically
And while you are peering into the depths of the hierarchy, it is patiently waiting to see how [in]efficient your "naive" ways of working with such a structure turn out to be.
Let's look at the typical tasks that come up, how they are implemented in SQL, and try to improve their performance.
#1. How deep is the rabbit hole?
For definiteness, let's assume that this structure reflects the subordination of units in an organization: departments, divisions, sectors, branches, working groups ... whatever you call them.
First, let's generate our "tree" of 10K elements:
INSERT INTO hier
WITH RECURSIVE T AS (
SELECT
1::integer id
, '{1}'::integer[] pids
UNION ALL
SELECT
id + 1
, pids[1:(random() * array_length(pids, 1))::integer] || (id + 1)
FROM
T
WHERE
id < 10000
)
SELECT
pids[array_length(pids, 1)] id
, pids[array_length(pids, 1) - 1] pid
FROM
T;
Let's start with the simplest task: finding all employees who work within a specific sector, or, in hierarchy terms, finding all descendants of a node. It would also be nice to get the "depth" of each descendant ... All of this may be needed, for example, as the basis for some further selection.
Everything would be fine if there were only a couple of levels of descendants and their number stayed within a dozen, but with more than 5 levels and dozens of descendants there may be problems. Let's see how the traditional down-the-tree search options are written (and how they work). But first, let's determine which nodes are the most interesting for our study.
The "deepest" subtrees:
WITH RECURSIVE T AS (
SELECT
id
, pid
, ARRAY[id] path
FROM
hier
WHERE
pid IS NULL
UNION ALL
SELECT
hier.id
, hier.pid
, T.path || hier.id
FROM
T
JOIN
hier
ON hier.pid = T.id
)
TABLE T ORDER BY array_length(path, 1) DESC;
id | pid | path
---------------------------------------------
7624 | 7623 | {7615,7620,7621,7622,7623,7624}
4995 | 4994 | {4983,4985,4988,4993,4994,4995}
4991 | 4990 | {4983,4985,4988,4989,4990,4991}
...
The "widest" subtrees:
...
SELECT
path[1] id
, count(*)
FROM
T
GROUP BY
1
ORDER BY
2 DESC;
id | count
------------
5300 | 30
450 | 28
1239 | 27
1573 | 25
For these queries we used the typical recursive JOIN.
Obviously, with this query model the number of iterations matches the total number of descendants (and there are several dozen of them), and this can take significant resources and, as a result, time.
Let's check on the "widest" subtree:
WITH RECURSIVE T AS (
SELECT
id
FROM
hier
WHERE
id = 5300
UNION ALL
SELECT
hier.id
FROM
T
JOIN
hier
ON hier.pid = T.id
)
TABLE T;
As expected, we found all 30 records. But 60% of the total query time was spent on this, because it also took 30 separate index lookups. Can we get away with fewer?
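The timing and lookup figures above come from the query plan. A minimal sketch of how one might reproduce such a measurement, assuming the `hier` table from above and that node 5300 exists in your randomly generated data (the IDs will differ from run to run):

```sql
-- Sketch: show actual timings and buffer usage for the naive recursive query.
-- The node id 5300 is taken from the article's data set; substitute your own.
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE T AS (
  SELECT id FROM hier WHERE id = 5300
UNION ALL
  SELECT hier.id FROM T JOIN hier ON hier.pid = T.id
)
TABLE T;
```

If the planner picks a nested loop with an index scan on `hier(pid)` for the recursive part, the `loops` value on that index scan node shows how many separate lookups were made. The plan shape is not guaranteed and may differ on your installation.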
Bulk reading by index
Do we really have to make a separate index lookup for every node? It turns out we do not: we can read from the index using several keys at once in a single call with = ANY(array).
And in each such group of identifiers we can take all the IDs found at the previous step as "nodes". That is, at each next step we look for all the descendants of a given level at once.
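As a small illustration of the `= ANY(array)` form outside of any recursion, the following sketch reads the direct children of several parents in one statement. The three IDs are simply the "widest" roots from the listing above and are assumptions about your data:

```sql
-- Sketch: one statement, several pid keys at once.
SELECT
  id
, pid
FROM
  hier
WHERE
  pid = ANY('{5300,450,1239}'::integer[]);
```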
There is one catch, though: in a recursive query you cannot refer to the query itself from a nested subquery, yet we somehow need to select exactly what was found at the previous level ... It turns out that a nested query over the whole recursive result set is impossible, but a nested query driven by one specific field of it is possible. And this field can be an array, which is exactly what we need for ANY.
It sounds a little crazy, but in practice everything is simple:
WITH RECURSIVE T AS (
SELECT
ARRAY[id] id$
FROM
hier
WHERE
id = 5300
UNION ALL
SELECT
ARRAY(
SELECT
id
FROM
hier
WHERE
pid = ANY(T.id$)
) id$
FROM
T
WHERE
coalesce(id$, '{}') <> '{}' -- loop exit condition: an empty array
)
SELECT
unnest(id$) id
FROM
T;
And the most important thing here is not even the 1.5x win in time, but that we read fewer buffers, since we make only 5 index calls instead of 30!
An additional bonus is that after the final unnest the identifiers remain ordered by "level" (in practice; add an explicit ORDER BY if you depend on it).
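Since each iteration of this query corresponds to exactly one level of the tree, the "depth" of each descendant mentioned at the start of this section can be carried along as a counter. A minimal sketch, where the column name `lvl` is an assumption introduced here for illustration:

```sql
-- Sketch: same level-by-level traversal, plus a depth counter per level.
WITH RECURSIVE T AS (
  SELECT
    ARRAY[id] id$
  , 0 lvl -- hypothetical depth column: 0 for the starting node
  FROM
    hier
  WHERE
    id = 5300
UNION ALL
  SELECT
    ARRAY(
      SELECT
        id
      FROM
        hier
      WHERE
        pid = ANY(T.id$)
    ) id$
  , lvl + 1
  FROM
    T
  WHERE
    coalesce(id$, '{}') <> '{}' -- stop once a level comes back empty
)
SELECT
  unnest(id$) id
, lvl
FROM
  T;
```

The last iteration produces a row with an empty array, which `unnest` expands to zero rows, so it does not appear in the output.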
Node flag
The next consideration that helps performance: "leaves" cannot have children, so there is no need to look "down" from them at all. In terms of our task, this means that if we followed the chain of departments and reached an employee, there is no need to look any further along this branch.
Let's add an extra boolean field to our table, which tells us immediately whether a particular entry in our tree is a "node", that is, whether it can have descendants at all.
ALTER TABLE hier
ADD COLUMN branch boolean;
UPDATE
hier T
SET
branch = TRUE
WHERE
EXISTS(
SELECT
NULL
FROM
hier
WHERE
pid = T.id
LIMIT 1
);
-- Query completed successfully: 3033 rows changed in 42 ms.
Great! It turns out that only a little more than 30% of all tree elements have descendants.
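The share of nodes can be checked directly. The sketch below assumes the `branch` column was filled by the UPDATE above, so leaves hold NULL rather than FALSE; `FILTER (WHERE branch)` counts only the rows where the flag is TRUE:

```sql
-- Sketch: how many rows are flagged as nodes, and what share of the table that is.
SELECT
  count(*) FILTER (WHERE branch) AS branches
, count(*) AS total
, round(100.0 * count(*) FILTER (WHERE branch) / count(*), 1) AS pct
FROM
  hier;
```

With the figures reported above, 3033 flagged rows out of 10,000 is about 30.3%.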
Now let's use slightly different mechanics: joining to the recursive part via LATERAL, which lets us access the fields of the recursive "table" directly, and using an aggregate function with a filter condition on the node flag to reduce the set of keys:
WITH RECURSIVE T AS (
SELECT
array_agg(id) id$
, array_agg(id) FILTER(WHERE branch) ns$
FROM
hier
WHERE
id = 5300
UNION ALL
SELECT
X.*
FROM
T
JOIN LATERAL (
SELECT
array_agg(id) id$
, array_agg(id) FILTER(WHERE branch) ns$
FROM
hier
WHERE
pid = ANY(T.ns$)
) X
ON coalesce(T.ns$, '{}') <> '{}'
)
SELECT
unnest(id$) id
FROM
T;
We managed to cut one more index call and won more than 2x in the volume read.
#2. Let's go back to the roots
This algorithm is useful if you need to collect records for all elements "up the tree" while keeping track of which source leaf (and with which metrics) caused each one to be included in the selection, for example, to generate a summary report with aggregation into nodes.
What follows should be taken purely as a proof of concept, since the query turns out to be very cumbersome. But if this kind of query dominates your database workload, similar techniques are worth considering.
Let's start with a couple of simple statements:
- The same record is best read from the database only once.
- Records are read from the database more efficiently in batches than one by one.
Now let's try to construct the query we need.
Step 1
Obviously, when initializing the recursion (where would we be without it!) we have to read the records of the leaves themselves from the set of initial identifiers:
WITH RECURSIVE tree AS (
SELECT
rec -- the whole table record
, id::text chld -- the "set" of source leaves that led here
FROM
hier rec
WHERE
id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
...
If it seems strange that the "set" is stored as a string and not as an array, there is a simple explanation. For strings there is a built-in "gluing" aggregate function, string_agg, but there is no built-in aggregate that concatenates arrays.
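For readers who would rather keep arrays, a concatenating aggregate can be defined by hand. This is a minimal sketch and not part of the original query: the name `array_cat_agg` is hypothetical, and the `anycompatiblearray` pseudo-type assumes PostgreSQL 14 or later (on older versions the built-in `array_cat` is declared with `anyarray` instead, so the definition would need to change accordingly).

```sql
-- Sketch: a user-defined aggregate that concatenates arrays.
-- "array_cat_agg" is a made-up name; array_cat is the built-in concatenation function.
CREATE AGGREGATE array_cat_agg(anycompatiblearray) (
  SFUNC = array_cat
, STYPE = anycompatiblearray
);

-- Usage: glue two integer arrays into one.
SELECT
  array_cat_agg(a)
FROM
  (VALUES (ARRAY[1, 2]), (ARRAY[3])) v(a);
```

The rest of the article keeps the string-based approach, so this aggregate is not needed to follow the remaining steps.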
Step 2
Next we get the set of section IDs that need to be read further. Almost always they will be duplicated across different records of the original set, so we would like to group them while preserving the information about the source leaves.
But here three problems await us:
- The recursive part of the query cannot contain aggregate functions with GROUP BY.
- A reference to the recursive "table" cannot appear in a nested subquery.
- A query in the recursive part cannot contain a CTE.
Fortunately, all of these problems are quite easy to work around. Let's start from the end.
CTE in the recursive part
This does not work:
WITH RECURSIVE tree AS (
...
UNION ALL
WITH T (...)
SELECT ...
)
But this works, the parentheses make all the difference!
WITH RECURSIVE tree AS (
...
UNION ALL
(
WITH T (...)
SELECT ...
)
)
Nested query against the recursive "table"
Hmm... A recursive CTE cannot be referenced from a nested subquery. But it can be referenced from inside a CTE! And a nested query can then refer to that CTE!
GROUP BY inside recursion
It is unpleasant, but ... we have a simple way to emulate GROUP BY using DISTINCT ON and window functions!
SELECT
(rec).pid id
, string_agg(chld::text, ',') chld
FROM
tree
WHERE
(rec).pid IS NOT NULL
GROUP BY 1 -- does not work!
And this is how it does work!
SELECT DISTINCT ON((rec).pid)
(rec).pid id
, string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
FROM
tree
WHERE
(rec).pid IS NOT NULL
Now we see why the numeric ID was converted to text: so that the values could be glued together, separated by commas!
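To see that the window-based form really plays the role of GROUP BY, the two variants can be compared outside of any recursion, where both are allowed. A minimal sketch against the plain `hier` table, building the comma-separated list of direct children for each parent:

```sql
-- Regular aggregation: one row per parent.
SELECT
  pid AS id
, string_agg(id::text, ',') AS chld
FROM
  hier
WHERE
  pid IS NOT NULL
GROUP BY
  pid;

-- Emulation: the window function computes the same list on every row
-- of the partition, and DISTINCT ON keeps a single row per parent.
SELECT DISTINCT ON (pid)
  pid AS id
, string_agg(id::text, ',') OVER (PARTITION BY pid) AS chld
FROM
  hier
WHERE
  pid IS NOT NULL;
```

Both queries return one row per `pid`. The order of the IDs inside `chld` is not specified in either variant, so the strings may differ in element order.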
Step 3
For the finale, there is almost nothing left to do:
- read the "section" records for the set of grouped IDs
- match the sections we read with the "sets" of source leaves
- "expand" the set string using unnest(string_to_array(chld, ',')::integer[])
WITH RECURSIVE tree AS (
SELECT
rec
, id::text chld
FROM
hier rec
WHERE
id = ANY('{1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192}'::integer[])
UNION ALL
(
WITH prnt AS (
SELECT DISTINCT ON((rec).pid)
(rec).pid id
, string_agg(chld::text, ',') OVER(PARTITION BY (rec).pid) chld
FROM
tree
WHERE
(rec).pid IS NOT NULL
)
, nodes AS (
SELECT
rec
FROM
hier rec
WHERE
id = ANY(ARRAY(
SELECT
id
FROM
prnt
))
)
SELECT
nodes.rec
, prnt.chld
FROM
prnt
JOIN
nodes
ON (nodes.rec).id = prnt.id
)
)
SELECT
unnest(string_to_array(chld, ',')::integer[]) leaf
, (rec).*
FROM
tree;
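The "expand" step from the list above can be tried in isolation. In this sketch the literal '1,2,4' stands in for a `chld` value and is purely illustrative:

```sql
-- Sketch: turn a comma-separated "set" string back into integer rows.
SELECT
  unnest(string_to_array('1,2,4', ',')::integer[]) AS leaf;
-- returns three rows: 1, 2, 4
```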
Source: www.hab.com