SQL HowTo: writing a while-loop directly in a query, or "An elementary three-mover"

From time to time, the task comes up of fetching related data for a set of keys until we have accumulated the required total number of records.

The most "real-life" example is displaying the 20 oldest tasks assigned to a list of employees (for example, within one department). Various management "dashboards" with brief summaries of work areas need something like this quite often.


In this article we look at PostgreSQL implementations of a "naive" solution to this problem, a "smarter" one, and a rather complex algorithm: a "loop" in SQL whose exit condition depends on the data found. The last one can be useful both for general knowledge and for other similar cases.

Let's take the test data set from the previous article. To keep the displayed records from "jumping" between runs when the sorted values coincide, we extend the subject index by appending the primary key. This also makes the index unique and guarantees an unambiguous sort order:

CREATE INDEX ON task(owner_id, task_date, id);
-- and drop the old one
DROP INDEX task_owner_id_task_date_idx;
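
To see why the extra tie-breaker column matters, here is a minimal Python sketch. It is a model of the sorting behavior, not PostgreSQL itself, and the rows and helper names are hypothetical. Two tasks share a date, so sorting by date alone returns whichever of them happened to be stored first, while sorting by (task_date, id) always returns the same row.

```python
# Hypothetical rows as (id, task_date); tasks 1 and 2 share the same date.
rows = [(1, "2020-01-01"), (2, "2020-01-01"), (3, "2020-01-02")]

def top(rows_in_storage_order, key, limit=1):
    # sorted() is stable, so tied rows keep their arrival order. This mimics
    # a database that is free to return tied rows in any physical order.
    return sorted(rows_in_storage_order, key=key)[:limit]

def by_date(row):
    return row[1]

def by_date_id(row):
    return (row[1], row[0])

# Same data, two different "physical" orders.
ambiguous_a = top(rows, by_date)
ambiguous_b = top(list(reversed(rows)), by_date)
stable_a = top(rows, by_date_id)
stable_b = top(list(reversed(rows)), by_date_id)

print(ambiguous_a, ambiguous_b)  # the "first" task differs between runs
print(stable_a, stable_b)        # the "first" task is always the same
```

With the unique key appended, the sort key has no ties left, so the result no longer depends on storage order.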

As it sounds, so it is written

First, let's sketch the simplest version of the query, passing the assignee IDs as an array input parameter:

SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;

[view on explain.tensor.ru]

A little sad: we asked for only 20 records, yet the Index Scan returned 960 rows, which then also had to be sorted... Let's try to read less.

unnest + ARRAY

The first observation that helps: if we need only 20 sorted records, it is enough to read at most 20 records per key, sorted in the same order. Fortunately, we have a suitable index on (owner_id, task_date, id).

We will use the same mechanism for extracting a whole table record and "expanding it into columns" as in the previous article, and also fold the rows into an array with the ARRAY() function:

WITH T AS (
  SELECT
    unnest(ARRAY(
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = unnest
      ORDER BY
        task_date, id
      LIMIT 20 -- limit here...
    )) r
  FROM
    unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
  (r).*
FROM
  T
ORDER BY
  (r).task_date, (r).id
LIMIT 20; -- ... and here too

[view on explain.tensor.ru]

Oh, much better already! It is 40% faster and had to read 4.5 times less data.
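
The difference between the two approaches can be modeled outside the database. The following Python sketch uses hypothetical toy data (a dict standing in for the index, with rows as (task_date, id) tuples and integers in place of dates) and simply counts how many rows each strategy touches. It illustrates the idea only and is not a benchmark.

```python
from itertools import islice

# Hypothetical stand-in for the index: owner_id -> rows in (task_date, id) order.
index = {
    1: [(1, 10), (4, 11), (7, 12)],
    2: [(2, 20), (3, 21), (9, 22)],
    4: [(5, 30), (6, 31), (8, 32)],
}
LIMIT = 2

def naive(index, limit):
    # Read every row of every key, then sort everything and cut.
    read = [row for rows in index.values() for row in rows]
    return sorted(read)[:limit], len(read)

def per_key_limit(index, limit):
    # Read at most `limit` rows per key (they arrive already sorted),
    # then sort the much smaller combined set and cut.
    read = [row for rows in index.values() for row in islice(rows, limit)]
    return sorted(read)[:limit], len(read)

naive_result, naive_reads = naive(index, LIMIT)
limited_result, limited_reads = per_key_limit(index, LIMIT)
print(naive_result, naive_reads)      # same rows, more reads
print(limited_result, limited_reads)  # same rows, fewer reads
```

Both strategies return the same rows, because a row that is not among the first `limit` rows of its own key cannot be among the first `limit` rows overall.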

Materializing table records via a CTE

Note that in some cases, trying to work with the fields of a record immediately after finding it in a subquery, without "wrapping" it in a CTE, can "multiply" the InitPlan in proportion to the number of those fields:

SELECT
  ((
    SELECT
      t
    FROM
      task t
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ).*);

Result  (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
  Buffers: shared hit=16
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t  (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_1  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 3 (returns $2)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_2  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 4 (returns $3)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_3  (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4

The same record was "looked up" 4 times... Up to and including PostgreSQL 11 this behavior occurs regularly, and the fix is to "wrap" the subquery in a CTE, which is an unconditional optimization fence for the planner in those versions.

Recursive accumulator

In the previous version we read 200 rows in total for the sake of the 20 we needed. That is no longer 960, but can we read even less?

Let's try to use the knowledge that we need only 20 records in total. That is, we will keep iterating over the data only until we reach the number we need.

Step 1: the starting list

Obviously, our "target" list of 20 records must start with the "first" record of one of our owner_id keys. So first we find this "very first" record for each of the keys and put them into a list, sorted in the order we want: (task_date, id).


Step 2: find the "next" records

Now, if we take the first entry from our list and start "stepping" further along the index while keeping the same owner_id key, then all the records we find are exactly the next ones in the resulting selection. Of course, this holds only until we cross the application key (task_date, id) of the second entry in the list.

If it turns out that we have "crossed" the second entry, then the last record read must be added to the list in place of the first entry (it has the same owner_id), after which we re-sort the list.


In other words, the list always contains at most one entry per key (if a key's records run out before we "cross", the first entry simply disappears from the list and nothing is added), and the entries are always sorted in ascending order of the application key (task_date, id).


Step 3: filter and "expand" the records

In some rows of our recursive selection, certain rv records are duplicated: first we find a record as the one "crossing the boundary of the 2nd entry in the list", and then we substitute it as the 1st entry of the list. So the first occurrence has to be filtered out.
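
Before reading the SQL, it may help to see the three steps as an ordinary loop. The Python sketch below is a simplified model under stated assumptions, not a translation of the query: `index` is a hypothetical dict standing in for the (owner_id, task_date, id) index, rows are (task_date, id) tuples with integers in place of dates, and `merge_with_heads` is an illustrative name. Unlike the recursive query, the model stores a "crossing" row once and emits it only when it reaches the head of the list, so no duplicate filtering is needed here.

```python
# Hypothetical stand-in for the index: owner_id -> rows in (task_date, id) order.
index = {
    1: [(1, 10), (4, 11), (7, 12)],
    2: [(2, 20), (3, 21), (9, 22)],
    4: [(5, 30), (6, 31), (8, 32)],
}

def merge_with_heads(index, limit):
    """Return (first `limit` rows in (task_date, id) order, number of rows read)."""
    reads = 0
    cursors = {}  # owner_id -> position of the next unread row for that key
    heads = []    # at most one (row, owner_id) entry per key, kept sorted
    # Step 1: the first row of every key forms the sorted starting list.
    for owner, rows in index.items():
        if rows:
            heads.append((rows[0], owner))
            cursors[owner] = 1
            reads += 1
    heads.sort()
    result = []
    # Step 2: emit the head, then walk its key until we cross the next head.
    while heads and len(result) < limit:
        row, owner = heads.pop(0)
        result.append(row)
        while len(result) < limit and cursors[owner] < len(index[owner]):
            nxt = index[owner][cursors[owner]]
            cursors[owner] += 1
            reads += 1
            if heads and nxt > heads[0][0]:  # tuple comparison, like (a, b) > (c, d) in SQL
                # Crossed the 2nd entry: this row becomes the key's new list entry.
                heads.append((nxt, owner))
                heads.sort()
                break
            result.append(nxt)
    return result, reads

rows, reads = merge_with_heads(index, 4)
print(rows, reads)
```

On this toy data the loop stops as soon as 4 rows are collected and touches 6 of the 9 rows, whereas reading every row of every key would touch all 9.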

The scary final query

WITH RECURSIVE T AS (
  -- #1 : put the "first" record for each key of the set into the list
  WITH wrap AS ( -- "materialize" the records so that field access does not multiply InitPlan/SubPlan
    WITH T AS (
      SELECT
        (
          SELECT
            r
          FROM
            task r
          WHERE
            owner_id = unnest
          ORDER BY
            task_date, id
          LIMIT 1
        ) r
      FROM
        unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
    )
    SELECT
      array_agg(r ORDER BY (r).task_date, (r).id) list -- sort the list in the required order
    FROM
      T
  )
  SELECT
    list
  , list[1] rv
  , FALSE not_cross
  , 0 size
  FROM
    wrap
UNION ALL
  -- #2 : read records of the key that is 1st in order until we step over the record of the 2nd
  SELECT
    CASE
      -- if nothing was found for the key of the 1st record
      WHEN X._r IS NOT DISTINCT FROM NULL THEN
        T.list[2:] -- remove it from the list
      -- if we did NOT cross the application key of the 2nd record
      WHEN X.not_cross THEN
        T.list -- just carry the same list over unchanged
      -- if there is no 2nd record left in the list
      WHEN T.list[2] IS NULL THEN
        -- just return an empty list
        '{}'
      -- re-sort the list, removing the 1st record and adding the last one found
      ELSE (
        SELECT
          coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
        FROM
          unnest(T.list[3:] || X._r) r
      )
    END
  , X._r
  , X.not_cross
  , T.size + X.not_cross::integer
  FROM
    T
  , LATERAL(
      WITH wrap AS ( -- "materialize" the record
        SELECT
          CASE
            -- if we did "step over" the 2nd record after all
            WHEN NOT T.not_cross
              -- then the record we need is the first one in the list
              THEN T.list[1]
            ELSE ( -- if we did not cross, the key is the same as in the previous record, so continue from it
              SELECT
                _r
              FROM
                task _r
              WHERE
                owner_id = (rv).owner_id AND
                (task_date, id) > ((rv).task_date, (rv).id)
              ORDER BY
                task_date, id
              LIMIT 1
            )
          END _r
      )
      SELECT
        _r
      , CASE
          -- if there is no 2nd record left in the list, but we did find something
          WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
            TRUE
          ELSE -- nothing found, or we "stepped over"
            coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
        END not_cross
      FROM
        wrap
    ) X
  WHERE
    T.size < 20 AND -- limit the number of records here
    T.list IS DISTINCT FROM '{}' -- or stop when the list runs out
)
-- #3 : "expand" the records; the order is guaranteed by construction
SELECT
  (rv).*
FROM
  T
WHERE
  not_cross; -- take only the "non-crossing" records

[view on explain.tensor.ru]

So we traded a 50% reduction in data read for a 20% increase in execution time. That is, if you have reason to believe that reads may be slow (for example, the data is often not in the cache and has to be fetched from disk), this approach lets you depend less on reading.

In any case, the execution time turned out better than in the "naive" first version. Which of these 3 options to use is up to you.

Source: www.habr.com
