SQL HowTo: writing a while loop directly in a query, or the "Elementary three-step"

From time to time, the task comes up of searching for related data by a set of keys until we have collected the required total number of records.

The most "real-life" example is displaying the 20 oldest tasks assigned to a list of employees (for example, within one department). Various management "dashboards" with brief summaries of work areas need something similar quite often.

In this article we will look at PostgreSQL implementations of a "naive" solution to this problem, a "smarter" one, and a very complex algorithm: a "loop" in SQL with an exit condition based on the data found, which can be useful both for general development and for other similar cases.

Let's take the test dataset from the previous article. To keep the output records from "jumping" from run to run when the sort values coincide, we extend the index by adding the primary key. This also makes every index entry unique and guarantees that the sort order is unambiguous:

CREATE INDEX ON task(owner_id, task_date, id);
-- and drop the old one
DROP INDEX task_owner_id_task_date_idx;

As it is heard, so it is written

First, let's sketch the simplest version of the query, passing the assignee IDs as an array in the input parameter:

SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;

[view on explain.tensor.ru]

A bit sad: we asked for only 20 records, but the Index Scan returned 960 rows, which then also had to be sorted... Let's try to read less.
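To check how many rows a query really reads on your own data, the plan can be requested with EXPLAIN (ANALYZE, BUFFERS). The sketch below is only an illustration under assumptions: the task_demo table and its generated contents are hypothetical stand-ins, not the dataset from the previous article, so the row counts will not match the 960 mentioned above.

```sql
-- Hypothetical stand-in for the article's task table (names and data are assumptions).
DROP TABLE IF EXISTS pg_temp.task_demo;
CREATE TEMP TABLE task_demo AS
SELECT
  id
, id % 1000 AS owner_id
, DATE '2020-01-01' + (id % 365) AS task_date
FROM
  generate_series(1, 100000) AS id;
CREATE INDEX ON task_demo(owner_id, task_date, id);
ANALYZE task_demo;

-- ANALYZE runs the query and reports actual row counts for each plan node;
-- BUFFERS adds the number of pages that were touched.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  *
FROM
  task_demo
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;
```

In the output, the rows= value in the "actual" part of the index scan node shows how many rows were read before the sort and the LIMIT were applied.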

unnest + ARRAY

The first consideration that helps us: if we need only 20 sorted records, then it is enough to read no more than 20 records, sorted in the same order, for each key. Fortunately, we have a suitable index (owner_id, task_date, id).

Let's use the same mechanism for extracting an entire table record and "expanding it into columns" as in the previous article. We can also fold the rows into an array with the ARRAY() function:

WITH T AS (
  SELECT
    unnest(ARRAY(
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = unnest
      ORDER BY
        task_date, id
      LIMIT 20 -- limit here...
    )) r
  FROM
    unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
  (r).*
FROM
  T
ORDER BY
  (r).task_date, (r).id
LIMIT 20; -- ... and here too

[view on explain.tensor.ru]

Oh, much better already! 40% faster, and 4.5 times less data had to be read.
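The same "no more than 20 rows per key" idea can also be written with a LATERAL subquery instead of unnest + ARRAY. This is a swapped-in formulation, not the one measured in this article, and the task_demo table with its generated data is a hypothetical stand-in, so treat it as a sketch to compare plans against rather than a recommendation.

```sql
-- Hypothetical stand-in for the article's task table (names and data are assumptions).
DROP TABLE IF EXISTS pg_temp.task_demo;
CREATE TEMP TABLE task_demo AS
SELECT
  id
, id % 10 AS owner_id
, DATE '2020-01-01' + (id % 100) AS task_date
FROM
  generate_series(1, 1000) AS id;
CREATE INDEX ON task_demo(owner_id, task_date, id);

SELECT
  t.*
FROM
  unnest('{1,2,4}'::integer[]) AS k(owner_id)
, LATERAL (
    SELECT
      *
    FROM
      task_demo
    WHERE
      owner_id = k.owner_id -- one index range per key
    ORDER BY
      task_date, id
    LIMIT 20 -- per-key limit
  ) t
ORDER BY
  t.task_date, t.id
LIMIT 20; -- overall limit
```

Here the subquery returns ordinary columns rather than a whole record, so nothing has to be "expanded" afterwards.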

Materializing table records via a CTE

Note that in some cases an attempt to work with the fields of a record right after looking it up in a subquery, without "wrapping" it in a CTE, can "multiply" the InitPlan in proportion to the number of those fields:

SELECT
  ((
    SELECT
      t
    FROM
      task t
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ).*);

Result  (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
  Buffers: shared hit=16
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t  (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_1  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 3 (returns $2)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_2  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 4 (returns $3)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_3  (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4

The same record was "looked up" 4 times... Up to PostgreSQL 11 this behavior occurs regularly, and the solution is to "wrap" the lookup in a CTE, which is an absolute barrier for the optimizer in those versions.
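For reference, here is a minimal sketch of that "wrapping". Starting with PostgreSQL 12 a CTE that is referenced only once may be inlined into the outer query, so the sketch spells out the MATERIALIZED keyword to keep the barrier; on version 11 and older the keyword does not exist and has to be omitted. The task_demo table and its data are hypothetical stand-ins for the article's task table.

```sql
-- Hypothetical stand-in for the article's task table (names and data are assumptions).
DROP TABLE IF EXISTS pg_temp.task_demo;
CREATE TEMP TABLE task_demo AS
SELECT
  id
, id % 10 AS owner_id
, DATE '2020-01-01' + (id % 100) AS task_date
FROM
  generate_series(1, 1000) AS id;
CREATE INDEX ON task_demo(owner_id, task_date, id);

WITH wrap AS MATERIALIZED ( -- keyword accepted from PostgreSQL 12 on
  SELECT
    (
      SELECT
        t
      FROM
        task_demo t
      WHERE
        owner_id = 1
      ORDER BY
        task_date, id
      LIMIT 1
    ) r -- the whole record is computed once, inside the CTE
)
SELECT
  (r).* -- field access now reads the stored record
FROM
  wrap;
```

Running this under EXPLAIN (ANALYZE) should show the lookup once, with a CTE Scan on top of it, rather than one copy per field.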

Recursive accumulator

In the previous version we read a total of 200 rows for the sake of the required 20. That is not 960, but can we do even fewer?

Let's try to use the knowledge that we need 20 records in total. That is, we will keep reading data only until we reach the number we need.

Step 1: The starting list

Obviously, our "target" list of 20 records must begin with the "first" record for one of our owner_id keys. So we first find such a "very first" record for each key and put them in a list, sorting it in the order we want: (task_date, id).
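Taken in isolation, this step might look like the sketch below. It uses a tiny hypothetical task_demo table (the names and the five rows are assumptions made up for illustration) so that the result is easy to check by eye.

```sql
-- Hypothetical miniature of the task table (names and rows are assumptions).
DROP TABLE IF EXISTS pg_temp.task_demo;
CREATE TEMP TABLE task_demo(id integer, owner_id integer, task_date date);
INSERT INTO task_demo VALUES
  (1, 1, '2020-01-03')
, (2, 1, '2020-01-05')
, (3, 2, '2020-01-01')
, (4, 2, '2020-01-04')
, (5, 4, '2020-01-02');

SELECT
  array_agg(r ORDER BY (r).task_date, (r).id) list -- sorted starting list
FROM
  (
    SELECT
      (
        SELECT
          t
        FROM
          task_demo t
        WHERE
          owner_id = k
        ORDER BY
          task_date, id
        LIMIT 1 -- the "very first" record of this key
      ) r
    FROM
      unnest('{1,2,4}'::integer[]) k
  ) first_rows;
-- The list holds one record per key, in the order id 3 (owner 2), id 5 (owner 4), id 1 (owner 1).
```

A key that has no records at all would contribute a NULL element to the list.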

Step 2: Finding the "next" records

Now, if we take the first record from our list and start "stepping" further along the index while keeping the same owner_id key, then all the records we find are exactly the next ones in the resulting selection. Of course, this holds only until we cross the application key of the second record in the list.

If it turns out that we have "crossed" the second record, then the last record read must be put into the list in place of the first one (with the same owner_id), after which we re-sort the list.
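The "step along the index" can be expressed with a row-wise comparison: (task_date, id) > (x, y) compares the pairs lexicographically, which matches the ORDER BY. The sketch below shows only this lookup on a few inline rows; the demo name and the data are assumptions made up for illustration.

```sql
WITH demo(id, owner_id, task_date) AS (
  VALUES
    (1, 1, DATE '2020-01-01')
  , (2, 1, DATE '2020-01-01')
  , (3, 1, DATE '2020-01-02')
  , (4, 2, DATE '2020-01-01')
)
SELECT
  *
FROM
  demo
WHERE
  owner_id = 1 AND -- stay on the same key
  (task_date, id) > (DATE '2020-01-01', 1) -- strictly after the previous record
ORDER BY
  task_date, id
LIMIT 1;
-- Returns the row with id = 2: same date as the previous record, larger id.
```

Comparing the pair as a whole matters here: a condition on task_date alone would either skip or repeat records that share a date.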

That is, we always end up with a list that has no more than one entry per key (if the records run out and we have not "crossed", the first entry simply disappears from the list and nothing is added), and the entries are always sorted in ascending order of the application key (task_date, id).

Step 3: Filtering and "expanding" the records

In some rows of our recursive selection, some rv records are duplicated: first we find a record as the one "crossing the boundary of the 2nd entry of the list", and then we substitute it as the 1st entry of the list. So the first occurrence has to be filtered out.

The dreaded final query

WITH RECURSIVE T AS (
  -- #1 : put the "first" record for each key of the set into the list
  WITH wrap AS ( -- "materialize" the records so that field access does not multiply InitPlan/SubPlan
    WITH T AS (
      SELECT
        (
          SELECT
            r
          FROM
            task r
          WHERE
            owner_id = unnest
          ORDER BY
            task_date, id
          LIMIT 1
        ) r
      FROM
        unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
    )
    SELECT
      array_agg(r ORDER BY (r).task_date, (r).id) list -- sort the list in the required order
    FROM
      T
  )
  SELECT
    list
  , list[1] rv
  , FALSE not_cross
  , 0 size
  FROM
    wrap
UNION ALL
  -- #2 : read records for the key that is 1st in order until we step over the record of the 2nd
  SELECT
    CASE
      -- if nothing was found for the key of the 1st record
      WHEN X._r IS NOT DISTINCT FROM NULL THEN
        T.list[2:] -- remove it from the list
      -- if we did NOT cross the application key of the 2nd record
      WHEN X.not_cross THEN
        T.list -- just carry the same list over unchanged
      -- if the list no longer has a 2nd record
      WHEN T.list[2] IS NULL THEN
        -- just return an empty list
        '{}'
      -- re-sort the list, dropping the 1st record and adding the last one found
      ELSE (
        SELECT
          coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
        FROM
          unnest(T.list[3:] || X._r) r
      )
    END
  , X._r
  , X.not_cross
  , T.size + X.not_cross::integer
  FROM
    T
  , LATERAL(
      WITH wrap AS ( -- "materialize" the record
        SELECT
          CASE
            -- if we did "step over" the 2nd record after all
            WHEN NOT T.not_cross
              -- then the record we need is the first one in the list
              THEN T.list[1]
            ELSE ( -- if we did not cross, the key is the same as in the previous record, so continue from it
              SELECT
                _r
              FROM
                task _r
              WHERE
                owner_id = (rv).owner_id AND
                (task_date, id) > ((rv).task_date, (rv).id)
              ORDER BY
                task_date, id
              LIMIT 1
            )
          END _r
      )
      SELECT
        _r
      , CASE
          -- if the list no longer has a 2nd record, but we did find something
          WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
            TRUE
          ELSE -- found nothing or "stepped over"
            coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
        END not_cross
      FROM
        wrap
    ) X
  WHERE
    T.size < 20 AND -- limit the number of records here
    T.list IS DISTINCT FROM '{}' -- or stop once the list is exhausted
)
-- #3 : "expand" the records; the order is guaranteed by construction
SELECT
  (rv).*
FROM
  T
WHERE
  not_cross; -- take only the "non-crossing" records

[view on explain.tensor.ru]

As a result, we traded 50% of the data reads for 20% of the execution time. That is, if you have reason to believe that reading may take a long time (for example, the data is often not in the cache and has to be fetched from disk), then this approach lets you depend less on reading.

In any case, the execution time turned out better than in the "naive" first option. But which of these 3 options to use is up to you.

Source: www.hab.com