SQL HowTo: writing a while loop directly in a query, or the "Elementary three-step"

From time to time the task comes up of looking for related data by a set of keys until we have collected the required total number of records.

The most "real-life" example is showing the 20 oldest tasks assigned to a list of employees (for example, within one department). Various management "dashboards" with short summaries of work areas need something like this quite often.

In this article we look at a PostgreSQL implementation of a "naive" solution to this problem, then a "smarter" one, and finally a rather complex algorithm: a "loop" in SQL with an exit condition that depends on the data found. It can be useful both for general development and for other similar cases.

Let's take the test dataset from the previous article. To keep the displayed records from "jumping" from run to run when the sort values coincide, we extend the index by adding the primary key to it. This also makes the index unique right away and guarantees that the sort order is unambiguous:

CREATE INDEX ON task(owner_id, task_date, id);
-- ...and drop the old one
DROP INDEX task_owner_id_task_date_idx;
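
The dataset itself comes from the earlier article and is not reproduced here. For readers who want to run the queries on a scratch database, a stand-in could be generated roughly as follows. The column types, value ranges and row count are assumptions made purely for illustration, so plans and row counts will differ from the ones quoted in this article. It would need to run before the two index statements above, because it also creates the old index that they drop.

```sql
-- Hypothetical stand-in for the test table: types, ranges and size are assumptions
CREATE TABLE task AS
SELECT
  id
, (random() * 1000)::integer AS owner_id
, (now() - random() * interval '365 days')::date AS task_date
FROM
  generate_series(1, 1000000) AS id;

ALTER TABLE task ADD PRIMARY KEY (id);

-- receives the default name task_owner_id_task_date_idx
CREATE INDEX ON task(owner_id, task_date);

-- refresh planner statistics for the new table
ANALYZE task;
```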

As it is heard, so it is written

First, let's sketch the simplest version of the query, passing the IDs of the assignees as an array input parameter:

SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;

[view on explain.tensor.ru]

A little sad: we asked for only 20 records, yet the Index Scan returned 960 rows, which then had to be sorted as well... Let's try to read less.
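
The row counts quoted here are read off the execution plan. One way to obtain a comparable plan locally is to prefix the query with `EXPLAIN (ANALYZE, BUFFERS)`; the exact numbers depend on the data, so they will not necessarily match the ones in this article.

```sql
-- ANALYZE actually executes the query and reports real row counts and timings;
-- BUFFERS adds the "Buffers: shared hit=..." lines to each plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;
```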

unnest + ARRAY

The first observation that helps us: if we need only 20 sorted records in total, then it is enough to read no more than 20 records per key, sorted in the same order. Fortunately, a suitable index (owner_id, task_date, id) is already available.

Let's use the same mechanism for extracting a whole table record and "expanding it into columns" as in the previous article. We also fold the subquery result into an array with the ARRAY() function:

WITH T AS (
  SELECT
    unnest(ARRAY(
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = unnest
      ORDER BY
        task_date, id
      LIMIT 20 -- limit here...
    )) r
  FROM
    unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
  (r).*
FROM
  T
ORDER BY
  (r).task_date, (r).id
LIMIT 20; -- ...and here too

[view on explain.tensor.ru]

Oh, that is much better! It is 40% faster, and 4.5 times less data had to be read.
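
For comparison, the same "no more than 20 per key" idea can also be written with a LATERAL subquery. This is not the form used in this article, only a sketch of an equivalent formulation; the alias `k` is introduced here for illustration, and whether its plan matches the one above would have to be checked on real data.

```sql
SELECT
  t.*
FROM
  unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[]) AS k(owner_id)
, LATERAL (
    -- at most 20 rows per key, read in index order (owner_id, task_date, id)
    SELECT
      *
    FROM
      task
    WHERE
      owner_id = k.owner_id
    ORDER BY
      task_date, id
    LIMIT 20
  ) t
ORDER BY
  t.task_date, t.id
LIMIT 20; -- final cut across all keys
```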

Materializing table records via a CTE

Note that in some cases, trying to work with the fields of a record right after looking it up in a subquery, without "wrapping" it in a CTE, can "multiply" the InitPlan in proportion to the number of those fields:

SELECT
  ((
    SELECT
      t
    FROM
      task t
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ).*);

Result  (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
  Buffers: shared hit=16
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t  (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_1  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 3 (returns $2)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_2  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 4 (returns $3)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_3  (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4

The same record was "looked up" 4 times... Up to PostgreSQL 11 this behavior occurs consistently, and the solution is to "wrap" the lookup in a CTE, which is an unconditional boundary for the optimizer in those versions.
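
A minimal sketch of that wrapping for the query above: the record is fetched once inside the CTE, and the outer query only expands its fields. The CTE name `wrap` and the column alias `r` are chosen to match the naming used elsewhere in this article.

```sql
WITH wrap AS ( -- on PostgreSQL 12+, write "AS MATERIALIZED (" so the CTE is not inlined
  SELECT
    (
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = 1
      ORDER BY
        task_date, id
      LIMIT 1
    ) r
)
SELECT
  (r).* -- expands the fields of the already fetched record
FROM
  wrap;
```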

Recursive accumulator

In the previous version we read 200 rows in total for the sake of the 20 we needed. That is no longer 960, but can we get it even lower?

Let's try to use the knowledge that we need 20 records in total. That is, we iterate over the data only until we have reached the amount we need.
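
The general shape of such a loop in SQL is a recursive CTE whose recursive term stops producing rows once a condition on the accumulated state fails. The toy example below is unrelated to the task table; the names `step`, `i` and `total` are made up for illustration. It keeps adding integers until the running total reaches 20.

```sql
WITH RECURSIVE step AS (
  -- loop initialization
  SELECT
    1 AS i
  , 1 AS total
UNION ALL
  -- loop body: runs for the row produced by the previous iteration
  SELECT
    i + 1
  , total + (i + 1)
  FROM
    step
  WHERE
    total < 20 -- exit condition, checked against the accumulated state
)
SELECT
  *
FROM
  step;
-- rows (i, total): (1,1), (2,3), (3,6), (4,10), (5,15), (6,21)
```

The last row overshoots the threshold because the condition is checked on the previous state, before the next row is computed.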

Step 1: Starting list

Obviously, our "target" list of 20 records must start with the "first" record for one of our owner_id keys. So we first find that "very first" record for each key and put it into a list, sorted in the order we want: (task_date, id).
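
Taken in isolation, this step could be sketched as the query below. It returns the starting entries as ordinary rows rather than as an array, and uses LATERAL with an illustrative alias `k`, so it is a simplified view of the idea and not the exact form used in the final query.

```sql
SELECT
  t.*
FROM
  unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[]) AS k(owner_id)
, LATERAL (
    -- the "very first" record for this key
    SELECT
      *
    FROM
      task
    WHERE
      owner_id = k.owner_id
    ORDER BY
      task_date, id
    LIMIT 1
  ) t
ORDER BY
  t.task_date, t.id; -- the starting list, in the target order
```

Keys that have no tasks at all simply produce no row here.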

Step 2: Finding the "next" records

Now, if we take the first entry of our list and start "stepping" further along the index while keeping the same owner_id key, then all the records we find are exactly the next ones in the resulting selection. Of course, this holds only until we cross the application key of the second entry in the list.

If it turns out that we have "stepped over" the second entry, then the last record read must be put into the list in place of the first one (it has the same owner_id), after which we re-sort the list.

That is, at every moment the list contains no more than one entry per key (if the records for a key run out and we have not "crossed", the first entry simply disappears from the list and nothing is added), and the entries are always sorted in ascending order of the application key (task_date, id).
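
A single "step along the index" can be sketched with a row-wise comparison on (task_date, id). In the sketch below the current position is simply the first record for owner_id = 1; the names `cur` and `n` are illustrative only.

```sql
WITH cur AS (
  -- current position: here, the first record for one key
  SELECT
    *
  FROM
    task
  WHERE
    owner_id = 1
  ORDER BY
    task_date, id
  LIMIT 1
)
SELECT
  n.*
FROM
  cur
, LATERAL (
    -- the next record for the same key, strictly after the current one
    SELECT
      *
    FROM
      task t
    WHERE
      t.owner_id = cur.owner_id AND
      (t.task_date, t.id) > (cur.task_date, cur.id)
    ORDER BY
      t.task_date, t.id
    LIMIT 1
  ) n;
```

Because the comparison uses the same columns and order as the tail of the index (owner_id, task_date, id), the lookup can continue from the current index position instead of rescanning the key from the start.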

Step 3: Filtering and "expanding" the records

In some rows of our recursive selection, certain rv records are duplicated: first we find the record as the one "crossing the boundary of the 2nd entry of the list", and then we substitute it as the 1st entry of the list. So the first occurrence has to be filtered out.

The scary final query

WITH RECURSIVE T AS (
  -- #1 : put the "first" record for each key of the set into the list
  WITH wrap AS ( -- "materialize" the records so that accessing their fields does not multiply InitPlan/SubPlan
    WITH T AS (
      SELECT
        (
          SELECT
            r
          FROM
            task r
          WHERE
            owner_id = unnest
          ORDER BY
            task_date, id
          LIMIT 1
        ) r
      FROM
        unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
    )
    SELECT
      array_agg(r ORDER BY (r).task_date, (r).id) list -- sort the list in the required order
    FROM
      T
  )
  SELECT
    list
  , list[1] rv
  , FALSE not_cross
  , 0 size
  FROM
    wrap
UNION ALL
  -- #2 : read the records of the 1st key in order until we step over the record of the 2nd
  SELECT
    CASE
      -- if nothing was found for the key of the 1st entry
      WHEN X._r IS NOT DISTINCT FROM NULL THEN
        T.list[2:] -- remove it from the list
      -- if we did NOT cross the application key of the 2nd entry
      WHEN X.not_cross THEN
        T.list -- just carry the same list over without modification
      -- if there is no 2nd entry in the list any more
      WHEN T.list[2] IS NULL THEN
        -- just return an empty list
        '{}'
      -- re-sort the list, removing the 1st entry and adding the last record found
      ELSE (
        SELECT
          coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
        FROM
          unnest(T.list[3:] || X._r) r
      )
    END
  , X._r
  , X.not_cross
  , T.size + X.not_cross::integer
  FROM
    T
  , LATERAL(
      WITH wrap AS ( -- "materialize" the record
        SELECT
          CASE
            -- if we did "step over" the 2nd entry after all
            WHEN NOT T.not_cross
              -- then the record we need is the first one in the list
              THEN T.list[1]
            ELSE ( -- if we did not cross, the key is the same as in the previous record, so continue from it
              SELECT
                _r
              FROM
                task _r
              WHERE
                owner_id = (rv).owner_id AND
                (task_date, id) > ((rv).task_date, (rv).id)
              ORDER BY
                task_date, id
              LIMIT 1
            )
          END _r
      )
      SELECT
        _r
      , CASE
          -- if the 2nd entry is no longer in the list, but we did find something
          WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
            TRUE
          ELSE -- found nothing, or "stepped over"
            coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
        END not_cross
      FROM
        wrap
    ) X
  WHERE
    T.size < 20 AND -- limit the number of records here
    T.list IS DISTINCT FROM '{}' -- or stop once the list is exhausted
)
-- #3 : "expand" the records; the order is guaranteed by construction
SELECT
  (rv).*
FROM
  T
WHERE
  not_cross; -- take only the "non-crossing" records

[view on explain.tensor.ru]
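
The query above leans on a few small PostgreSQL constructs that are easy to misread: open-ended array slices (available since PostgreSQL 9.6), appending an element to an array, and casting a boolean to an integer to advance the counter. A standalone sketch with made-up values and column names:

```sql
SELECT
  (ARRAY[10, 20, 30])[2:]      AS without_first   -- {20,30}: drops the 1st entry, like T.list[2:]
, (ARRAY[10, 20, 30])[3:] || 5 AS rest_plus_new   -- {30,5}: same shape as T.list[3:] || X._r
, 3 + TRUE::integer            AS size_after_hit  -- 4: a "non-crossing" record increments size
, 3 + FALSE::integer           AS size_after_miss; -- 3: a "crossing" row leaves size unchanged
```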

So we traded 50% of the data reads for 20% of the execution time. That is, if you have reason to believe that reading may be slow (for example, the data is often not in the cache and has to be fetched from disk), then this approach lets you depend less on reading.

In any case, the execution time turned out better than with the "naive" first option. But which of these 3 options to use is up to you.

Source: www.habr.com
