SQL HowTo: writing a while loop directly in a query, or "An elementary three-step"

From time to time, the task arises of searching for related data by a set of keys until we have collected the required total number of records.

The most "real-life" example is displaying the 20 oldest tasks assigned to a list of employees (for example, within one department). For various management "dashboards" with brief summaries of work areas, a similar task comes up quite often.

In this article we will look at PostgreSQL implementations of a "naive" solution to this problem, a "smarter" one, and a rather complex algorithm: a "loop" in SQL with an exit condition based on the data found. The latter may be useful both for general development and for use in other similar cases.

Let's take the test dataset from the previous article. To keep the displayed records from "jumping" from time to time when the sort values coincide, we extend the index by adding the primary key to it. At the same time, this makes the sort key unique and guarantees that the sort order is unambiguous:

CREATE INDEX ON task(owner_id, task_date, id);
-- and drop the old one
DROP INDEX task_owner_id_task_date_idx;

As it sounds, so it is written

First, let's sketch the simplest version of the query, passing the IDs of the assignees as an array in the input parameter:

SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;

[view on explain.tensor.ru]

A little sad: we asked for only 20 records, but Index Scan returned 960 rows, which then also had to be sorted... Let's try to read less.
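Row counts like the ones quoted here can be inspected by running the query under EXPLAIN. This is a minimal sketch, assuming the same task table and index as above; the option list is an assumption and not necessarily what was used to produce the plans in this article:

```sql
-- ANALYZE actually executes the query and reports real row counts and timings;
-- BUFFERS adds the number of pages touched by each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;
```

The "rows" figure on the Index Scan node shows how many rows were fetched before the sort and the LIMIT were applied.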

unnest + ARRAY

The first consideration that helps us: if we need only 20 sorted records, then it is enough to read no more than 20 records, sorted in the same order, for each key. Fortunately, we have a suitable index (owner_id, task_date, id).

Let's use the same mechanism of extracting a whole table record and "expanding it into columns" as in the previous article. We will also fold the records into an array with the ARRAY() function:

WITH T AS (
  SELECT
    unnest(ARRAY(
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = unnest
      ORDER BY
        task_date, id
      LIMIT 20 -- limit here...
    )) r
  FROM
    unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
  (r).*
FROM
  T
ORDER BY
  (r).task_date, (r).id
LIMIT 20; -- ... and here too

[view on explain.tensor.ru]

Oh, much better already! 40% faster, and 4.5 times less data had to be read.
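The same "no more than 20 rows per key" idea can also be written with a LATERAL join instead of unnest + ARRAY. This is an alternative formulation, not the one used in this article, and it has not been measured against the plans shown here; the aliases k and t are illustrative:

```sql
SELECT
  t.*
FROM
  unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[]) AS k(owner_id)
CROSS JOIN LATERAL (
  SELECT
    *
  FROM
    task
  WHERE
    task.owner_id = k.owner_id -- one index range per key
  ORDER BY
    task_date, id
  LIMIT 20 -- per-key limit: at most 20 rows are read for each key
) t
ORDER BY
  t.task_date, t.id
LIMIT 20; -- overall limit
```

Both forms rely on the (owner_id, task_date, id) index to return each key's rows already sorted, so the per-key LIMIT can stop early.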

Materializing table records via a CTE

Note that in some cases, trying to work with the fields of a record right after looking it up in a subquery, without "wrapping" it in a CTE, can "multiply" the InitPlan in proportion to the number of those fields:

SELECT
  ((
    SELECT
      t
    FROM
      task t
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ).*);

Result  (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
  Buffers: shared hit=16
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t  (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_1  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 3 (returns $2)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_2  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 4 (returns $3)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_3  (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4

The same record was "looked up" 4 times... Up to PostgreSQL 11, this behavior occurs consistently, and the solution is to "wrap" the lookup in a CTE, which is an absolute boundary for the optimizer in these versions.
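A minimal sketch of that "wrapping", using the same single-key lookup as above. The MATERIALIZED keyword is an addition for PostgreSQL 12 and later, where a CTE that is referenced once may otherwise be inlined into the outer query; on version 11 and earlier the keyword does not exist and a plain WITH wrap AS ( is used instead:

```sql
WITH wrap AS MATERIALIZED ( -- PostgreSQL 12+ only; drop MATERIALIZED on 11 and earlier
  SELECT
    (
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = 1
      ORDER BY
        task_date, id
      LIMIT 1
    ) r -- the whole row is computed once, as a single composite value
)
SELECT
  (r).* -- expanding the fields now reads the already computed record
FROM
  wrap;
```

Checking the plan with EXPLAIN is the way to confirm that only one lookup remains on a given server version.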

Recursive accumulator

In the previous version, we read 200 rows in total for the sake of the 20 we needed. Not 960 anymore, but can we do even less?

Let's try to use the knowledge that we need 20 records in total. That is, we will keep iterating over the data only until we reach the amount we need.

Step 1: Starting list

Obviously, our "target" list of 20 records must start with the "first" record of one of our owner_id keys. So first we find such a "very first" record for each key and put them into a list, sorted in the order we want: (task_date, id).

Step 2: Find the "next" records

Now, if we take the first entry from our list and start "stepping" further along the index while keeping the same owner_id key, then all the records we find are exactly the next ones in the resulting selection. Of course, only until we cross the application key (task_date, id) of the second entry in the list.

If it turns out that we have "crossed" the second entry, then the last record read must be added to the list in place of the first one (it has the same owner_id), after which we re-sort the list.
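A single "step" along the index can be sketched in isolation. This is a simplified illustration, assuming key 1 has at least two records; the aliases cur and nxt are illustrative and do not appear in the final query:

```sql
SELECT
  nxt.*
FROM
  ( -- the current position: the first record of key 1
    SELECT
      *
    FROM
      task
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ) cur
, LATERAL ( -- the next record of the same key in index order
    SELECT
      *
    FROM
      task
    WHERE
      owner_id = cur.owner_id AND
      (task_date, id) > (cur.task_date, cur.id) -- row-wise comparison follows the index order
    ORDER BY
      task_date, id
    LIMIT 1
  ) nxt;
```

The row comparison (task_date, id) > (...) is what lets the step continue from the exact position in the (owner_id, task_date, id) index instead of rescanning the key from the start.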

That is, the list always holds no more than one entry per key (if a key's records run out before we "cross", the first entry simply disappears from the list and nothing is added), and the entries are always sorted in ascending order of the application key (task_date, id).

Step 3: Filter and "expand" the records

In some rows of our recursive selection, the rv records are duplicated: first we find a record as the one "crossing the boundary of the second entry in the list", and then we substitute it as the first entry of the list. So the first occurrence must be filtered out.

The scary final query

WITH RECURSIVE T AS (
  -- #1: put the "first" record for each key of the set into the list
  WITH wrap AS ( -- "materialize" the records so that field access does not multiply InitPlan/SubPlan
    WITH T AS (
      SELECT
        (
          SELECT
            r
          FROM
            task r
          WHERE
            owner_id = unnest
          ORDER BY
            task_date, id
          LIMIT 1
        ) r
      FROM
        unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
    )
    SELECT
      array_agg(r ORDER BY (r).task_date, (r).id) list -- sort the list in the required order
    FROM
      T
  )
  SELECT
    list
  , list[1] rv
  , FALSE not_cross
  , 0 size
  FROM
    wrap
UNION ALL
  -- #2: read the records of the 1st key in order until we step over the record of the 2nd
  SELECT
    CASE
      -- if nothing was found for the key of the 1st entry
      WHEN X._r IS NOT DISTINCT FROM NULL THEN
        T.list[2:] -- remove it from the list
      -- if we did NOT cross the application key of the 2nd entry
      WHEN X.not_cross THEN
        T.list -- just carry the same list forward unmodified
      -- if the list no longer has a 2nd entry
      WHEN T.list[2] IS NULL THEN
        -- just return an empty list
        '{}'
      -- re-sort the list, removing the 1st entry and adding the last record found
      ELSE (
        SELECT
          coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
        FROM
          unnest(T.list[3:] || X._r) r
      )
    END
  , X._r
  , X.not_cross
  , T.size + X.not_cross::integer
  FROM
    T
  , LATERAL(
      WITH wrap AS ( -- "materialize" the record
        SELECT
          CASE
            -- if we did "step over" the 2nd entry
            WHEN NOT T.not_cross
              -- then the record we need is the first one in the list
              THEN T.list[1]
            ELSE ( -- if we did not cross, the key is the same as in the previous record, so we continue from it
              SELECT
                _r
              FROM
                task _r
              WHERE
                owner_id = (rv).owner_id AND
                (task_date, id) > ((rv).task_date, (rv).id)
              ORDER BY
                task_date, id
              LIMIT 1
            )
          END _r
      )
      SELECT
        _r
      , CASE
          -- if the list no longer has a 2nd entry, but we found at least something
          WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
            TRUE
          ELSE -- found nothing, or "stepped over"
            coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
        END not_cross
      FROM
        wrap
    ) X
  WHERE
    T.size < 20 AND -- limit the number of records here
    T.list IS DISTINCT FROM '{}' -- or stop when the list runs out
)
-- #3: "expand" the records; the order is guaranteed by construction
SELECT
  (rv).*
FROM
  T
WHERE
  not_cross; -- take only the "non-crossing" records

[view on explain.tensor.ru]

Thus, we traded 50% of the data reads for 20% of the execution time, reading less at the cost of running somewhat longer. That is, if you have reason to believe that reading may take a long time (for example, the data is often not in the cache and you have to go to disk for it), then this approach lets you depend less on reading.

In any case, the execution time turned out better than in the "naive" first variant. Which of these three options to use is up to you.

Source: www.habr.com
