SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"

Periodically, the task of searching for related data by a set of keys arises, until we get the required total number of records.

The most "lifelike" example is to display 20 oldest problems, listed on the list of employees (for example, within the same department). For various managerial "dashboards" with brief summaries of areas of work, a similar topic is required quite often.

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"

In the article, we will consider the implementation on PostgreSQL of a β€œnaive” version of solving such a problem, a β€œsmarter” and very complex algorithm "loop" in SQL with an exit condition from the found data, which can be useful both for general development and for use in other similar cases.

Let's take a test dataset from previous article. So that the output records do not β€œjump” from time to time when the sorted values ​​match, extend the subject index by adding a primary key. At the same time, this will immediately give it uniqueness, and guarantees us the uniqueness of the sort order:

CREATE INDEX ON task(owner_id, task_date, id);
-- Π° старый - ΡƒΠ΄Π°Π»ΠΈΠΌ
DROP INDEX task_owner_id_task_date_idx;

As heard, and spelled

First, let's sketch out the simplest version of the request, passing the IDs of the performers array as input:

SELECT
  *
FROM
  task
WHERE
  owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
  task_date, id
LIMIT 20;

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"
[look at explain.tensor.ru]

A bit sad - we ordered only 20 records, and Index Scan returned us 960 term, which then also had to be sorted ... And let's try to read less.

unnest + ARRAY

The first consideration that will help us - if we need total 20 sorted records, it is enough to read no more than 20 sorted in the same order for each key. Good, suitable index (owner_id, task_date, id) we have.

Let's use the same mechanism of extracting and "turning into columns" integral table entryas in last article. And also apply the convolution to an array using the function ARRAY():

WITH T AS (
  SELECT
    unnest(ARRAY(
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = unnest
      ORDER BY
        task_date, id
      LIMIT 20 -- ΠΎΠ³Ρ€Π°Π½ΠΈΡ‡ΠΈΠ²Π°Π΅ΠΌ Ρ‚ΡƒΡ‚...
    )) r
  FROM
    unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
  (r).*
FROM
  T
ORDER BY
  (r).task_date, (r).id
LIMIT 20; -- ... ΠΈ Ρ‚ΡƒΡ‚ - Ρ‚ΠΎΠΆΠ΅

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"
[look at explain.tensor.ru]

Oh, it's already much better! 40% faster and 4.5 times less data had to read.

Materialization of table records via CTEI will note that in Π½Π΅ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Ρ… случаях an attempt to immediately work with the record fields after searching for it in a subquery, without β€œwrapping” in a CTE, can lead to "multiplication" InitPlan proportional to the number of these same fields:

SELECT
  ((
    SELECT
      t
    FROM
      task t
    WHERE
      owner_id = 1
    ORDER BY
      task_date, id
    LIMIT 1
  ).*);

Result  (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
  Buffers: shared hit=16
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t  (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_1  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4
  InitPlan 3 (returns $2)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_2  (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4"
  InitPlan 4 (returns $3)
    ->  Limit  (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using task_owner_id_task_date_id_idx on task t_3  (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
                Index Cond: (owner_id = 1)
                Buffers: shared hit=4

The same record was β€œsearched” 4 times… Until PostgreSQL 11, this behavior occurs regularly, and the solution is to β€œwrap” in a CTE, which is an unconditional boundary for the optimizer in these versions.

recursive accumulator

In the previous version, in total, we read 200 term for the sake of the necessary 20. Already not 960, but even less - is it possible?

Let's try to use the knowledge that we need total xnumx records. That is, we will iterate the data subtraction only until the amount we need is reached.

Step 1: Start List

Obviously, our "target" list of 20 entries should start with the "first" entries for one of our owner_id keys. Therefore, we first find such "very first" for each of the keys and put it in the list, sorting it in the order we want - (task_date, id).

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"

Step 2: find the "next" records

Now if we take the first entry from our list and start "step" further down the index with saving the owner_id-key, then all the found records are just the next ones in the resulting selection. Of course, only until we cross the applied key second entry in the list.

If it turned out that we β€œcrossed” the second entry, then the last read entry should be added to the list instead of the first (with the same owner_id), after which the list is sorted again.

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"

That is, we always get that the list has no more than one entry for each of the keys (if the entries are over, and we have not β€œcrossed”, then the first entry will simply disappear from the list and nothing will be added), and they always sorted in ascending order of the application key (task_date, id).

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"

Step 3: Filtering and Expanding Records

In the part of the rows of our recursive selection, some records rv are duplicated - first we find such as β€œcrossing the border of the 2nd entry of the list”, and then we substitute it as the 1st from the list. And so the first occurrence should be filtered out.

Terrible final query

WITH RECURSIVE T AS (
  -- #1 : заносим Π² список "ΠΏΠ΅Ρ€Π²Ρ‹Π΅" записи ΠΏΠΎ ΠΊΠ°ΠΆΠ΄ΠΎΠΌΡƒ ΠΈΠ· ΠΊΠ»ΡŽΡ‡Π΅ΠΉ Π½Π°Π±ΠΎΡ€Π°
  WITH wrap AS ( -- "ΠΌΠ°Ρ‚Π΅Ρ€ΠΈΠ°Π»ΠΈΠ·ΡƒΠ΅ΠΌ" record'Ρ‹, Ρ‡Ρ‚ΠΎΠ±Ρ‹ ΠΎΠ±Ρ€Π°Ρ‰Π΅Π½ΠΈΠ΅ ΠΊ полям Π½Π΅ Π²Ρ‹Π·Ρ‹Π²Π°Π»ΠΎ умноТСния InitPlan/SubPlan
    WITH T AS (
      SELECT
        (
          SELECT
            r
          FROM
            task r
          WHERE
            owner_id = unnest
          ORDER BY
            task_date, id
          LIMIT 1
        ) r
      FROM
        unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
    )
    SELECT
      array_agg(r ORDER BY (r).task_date, (r).id) list -- сортируСм список Π² Π½ΡƒΠΆΠ½ΠΎΠΌ порядкС
    FROM
      T
  )
  SELECT
    list
  , list[1] rv
  , FALSE not_cross
  , 0 size
  FROM
    wrap
UNION ALL
  -- #2 : Π²Ρ‹Ρ‡ΠΈΡ‚Ρ‹Π²Π°Π΅ΠΌ записи 1-Π³ΠΎ ΠΏΠΎ порядку ΠΊΠ»ΡŽΡ‡Π°, ΠΏΠΎΠΊΠ° Π½Π΅ ΠΏΠ΅Ρ€Π΅ΡˆΠ°Π³Π½Π΅ΠΌ Ρ‡Π΅Ρ€Π΅Π· запись 2-Π³ΠΎ
  SELECT
    CASE
      -- Ссли Π½ΠΈΡ‡Π΅Π³ΠΎ Π½Π΅ Π½Π°ΠΉΠ΄Π΅Π½ΠΎ для ΠΊΠ»ΡŽΡ‡Π° 1-ΠΉ записи
      WHEN X._r IS NOT DISTINCT FROM NULL THEN
        T.list[2:] -- ΡƒΠ±ΠΈΡ€Π°Π΅ΠΌ Π΅Π΅ ΠΈΠ· списка
      -- Ссли ΠΌΡ‹ НЕ пСрСсСкли ΠΏΡ€ΠΈΠΊΠ»Π°Π΄Π½ΠΎΠΉ ΠΊΠ»ΡŽΡ‡ 2-ΠΉ записи
      WHEN X.not_cross THEN
        T.list -- просто протягиваСм Ρ‚ΠΎΡ‚ ΠΆΠ΅ список Π±Π΅Π· ΠΌΠΎΠ΄ΠΈΡ„ΠΈΠΊΠ°Ρ†ΠΈΠΉ
      -- Ссли Π² спискС ΡƒΠΆΠ΅ Π½Π΅Ρ‚ 2-ΠΉ записи
      WHEN T.list[2] IS NULL THEN
        -- просто Π²ΠΎΠ·Π²Ρ€Π°Ρ‰Π°Π΅ΠΌ пустой список
        '{}'
      -- пСрСсортировываСм ΡΠ»ΠΎΠ²Π°Ρ€ΡŒ, убирая 1-ю запись ΠΈ добавляя послСднюю ΠΈΠ· Π½Π°ΠΉΠ΄Π΅Π½Π½Ρ‹Ρ…
      ELSE (
        SELECT
          coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
        FROM
          unnest(T.list[3:] || X._r) r
      )
    END
  , X._r
  , X.not_cross
  , T.size + X.not_cross::integer
  FROM
    T
  , LATERAL(
      WITH wrap AS ( -- "ΠΌΠ°Ρ‚Π΅Ρ€ΠΈΠ°Π»ΠΈΠ·ΡƒΠ΅ΠΌ" record
        SELECT
          CASE
            -- Ссли всС-Ρ‚Π°ΠΊΠΈ "ΠΏΠ΅Ρ€Π΅ΡˆΠ°Π³Π½ΡƒΠ»ΠΈ" Ρ‡Π΅Ρ€Π΅Π· 2-ю запись
            WHEN NOT T.not_cross
              -- Ρ‚ΠΎ нуТная запись - пСрвая ΠΈΠ· спписка
              THEN T.list[1]
            ELSE ( -- Ссли Π½Π΅ пСрСсСкли, Ρ‚ΠΎ ΠΊΠ»ΡŽΡ‡ остался ΠΊΠ°ΠΊ Π² ΠΏΡ€Π΅Π΄Ρ‹Π΄ΡƒΡ‰Π΅ΠΉ записи - отталкиваСмся ΠΎΡ‚ Π½Π΅Π΅
              SELECT
                _r
              FROM
                task _r
              WHERE
                owner_id = (rv).owner_id AND
                (task_date, id) > ((rv).task_date, (rv).id)
              ORDER BY
                task_date, id
              LIMIT 1
            )
          END _r
      )
      SELECT
        _r
      , CASE
          -- Ссли 2-ΠΉ записи ΡƒΠΆΠ΅ Π½Π΅Ρ‚ Π² спискС, Π½ΠΎ ΠΌΡ‹ Ρ…ΠΎΡ‚ΡŒ Ρ‡Ρ‚ΠΎ-Ρ‚ΠΎ нашли
          WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
            TRUE
          ELSE -- Π½ΠΈΡ‡Π΅Π³ΠΎ Π½Π΅ нашли ΠΈΠ»ΠΈ "ΠΏΠ΅Ρ€Π΅ΡˆΠ°Π³Π½ΡƒΠ»ΠΈ"
            coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
        END not_cross
      FROM
        wrap
    ) X
  WHERE
    T.size < 20 AND -- ΠΎΠ³Ρ€Π°Π½ΠΈΡ‡ΠΈΠ²Π°Π΅ΠΌ Ρ‚ΡƒΡ‚ количСство
    T.list IS DISTINCT FROM '{}' -- ΠΈΠ»ΠΈ ΠΏΠΎΠΊΠ° список Π½Π΅ кончился
)
-- #3 : "Ρ€Π°Π·Π²ΠΎΡ€Π°Ρ‡ΠΈΠ²Π°Π΅ΠΌ" записи - порядок Π³Π°Ρ€Π°Π½Ρ‚ΠΈΡ€ΠΎΠ²Π°Π½ ΠΏΠΎ ΠΏΠΎΡΡ‚Ρ€ΠΎΠ΅Π½ΠΈΡŽ
SELECT
  (rv).*
FROM
  T
WHERE
  not_cross; -- Π±Π΅Ρ€Π΅ΠΌ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ "Π½Π΅ΠΏΠ΅Ρ€Π΅ΡΠ΅ΠΊΠ°ΡŽΡ‰ΠΈΠ΅" записи

SQL HowTo: write a while-loop directly in the query, or "Elementary three-way"
[look at explain.tensor.ru]

Thus, we traded 50% data reads for 20% execution time. That is, if you have reason to believe that reading can be long (for example, data is often not in the cache, and you have to go to disk for it), then in this way you can depend on reading less.

In any case, the execution time turned out better than in the "naive" first option. But which of these 3 options to use is up to you.

Source: habr.com

Add a comment