DBMS in this respect work on the same principles - "they said to dig, I dig". Your request can not only slow down neighboring processes, constantly taking up processor resources, but also "drop" the entire database as a whole, "eating" all available memory. Therefore infinite recursion protection is the responsibility of the developer.
In PostgreSQL, the ability to use recursive queries through WITH RECURSIVE
Don't write recursive queries
And write non-recursive. Sincerely, Your K.O.
In fact, PostgreSQL provides quite a lot of functionality that you can use to not apply recursion.
Use a fundamentally different approach to the problem
Sometimes you can just look at the problem "from the other side." I gave an example of such a situation in the article
WITH RECURSIVE src AS (
SELECT '{2,3,5,7,11,13,17,19}'::integer[] arr
)
, T(i, val) AS (
SELECT
1::bigint
, 1
UNION ALL
SELECT
i + 1
, val * arr[i]
FROM
T
, src
WHERE
i <= array_length(arr, 1)
)
SELECT
val
FROM
T
ORDER BY -- ΠΎΡΠ±ΠΎΡ ΡΠΈΠ½Π°Π»ΡΠ½ΠΎΠ³ΠΎ ΡΠ΅Π·ΡΠ»ΡΡΠ°ΡΠ°
i DESC
LIMIT 1;
Such a request can be replaced with a variant from experts in mathematics:
WITH src AS (
SELECT unnest('{2,3,5,7,11,13,17,19}'::integer[]) prime
)
SELECT
exp(sum(ln(prime)))::integer val
FROM
src;
Use generate_series instead of loops
Suppose we are faced with the task of generating all possible prefixes for a string 'abcdefgh'
:
WITH RECURSIVE T AS (
SELECT 'abcdefgh' str
UNION ALL
SELECT
substr(str, 1, length(str) - 1)
FROM
T
WHERE
length(str) > 1
)
TABLE T;
Do you really need recursion here? .. If you use LATERAL
ΠΈ generate_series
, then even CTEs are not needed:
SELECT
substr(str, 1, ln) str
FROM
(VALUES('abcdefgh')) T(str)
, LATERAL(
SELECT generate_series(length(str), 1, -1) ln
) X;
Change the structure of the database
For example, you have a table of forum posts with links to who replied to whom or a thread in
CREATE TABLE message(
message_id
uuid
PRIMARY KEY
, reply_to
uuid
REFERENCES message
, body
text
);
CREATE INDEX ON message(reply_to);
Well, a typical request to download all messages on one topic looks something like this:
WITH RECURSIVE T AS (
SELECT
*
FROM
message
WHERE
message_id = $1
UNION ALL
SELECT
m.*
FROM
T
JOIN
message m
ON m.reply_to = T.message_id
)
TABLE T;
But since we always need the whole subject from the root message, why don't we add its id to each entry automatic?
-- Π΄ΠΎΠ±Π°Π²ΠΈΠΌ ΠΏΠΎΠ»Π΅ Ρ ΠΎΠ±ΡΠΈΠΌ ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡΠΎΠΌ ΡΠ΅ΠΌΡ ΠΈ ΠΈΠ½Π΄Π΅ΠΊΡ Π½Π° Π½Π΅Π³ΠΎ
ALTER TABLE message
ADD COLUMN theme_id uuid;
CREATE INDEX ON message(theme_id);
-- ΠΈΠ½ΠΈΡΠΈΠ°Π»ΠΈΠ·ΠΈΡΡΠ΅ΠΌ ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡ ΡΠ΅ΠΌΡ Π² ΡΡΠΈΠ³Π³Π΅ΡΠ΅ ΠΏΡΠΈ Π²ΡΡΠ°Π²ΠΊΠ΅
CREATE OR REPLACE FUNCTION ins() RETURNS TRIGGER AS $$
BEGIN
NEW.theme_id = CASE
WHEN NEW.reply_to IS NULL THEN NEW.message_id -- Π±Π΅ΡΠ΅ΠΌ ΠΈΠ· ΡΡΠ°ΡΡΠΎΠ²ΠΎΠ³ΠΎ ΡΠΎΠ±ΡΡΠΈΡ
ELSE ( -- ΠΈΠ»ΠΈ ΠΈΠ· ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ, Π½Π° ΠΊΠΎΡΠΎΡΠΎΠ΅ ΠΎΡΠ²Π΅ΡΠ°Π΅ΠΌ
SELECT
theme_id
FROM
message
WHERE
message_id = NEW.reply_to
)
END;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER ins BEFORE INSERT
ON message
FOR EACH ROW
EXECUTE PROCEDURE ins();
Now our entire recursive query can be reduced to just this:
SELECT
*
FROM
message
WHERE
theme_id = $1;
Use applied "limiters"
If we are unable to change the structure of the database for some reason, let's see what we can rely on so that even the presence of an error in the data does not lead to endless recursion.
Recursion "depth" counter
We simply increase the counter by one at each step of the recursion until the limit is reached, which we consider obviously inadequate:
WITH RECURSIVE T AS (
SELECT
0 i
...
UNION ALL
SELECT
i + 1
...
WHERE
T.i < 64 -- ΠΏΡΠ΅Π΄Π΅Π»
)
Pro: When trying to loop, we will still make no more than the specified limit of iterations "in depth".
cons: There is no guarantee that we will not re-process the same record - for example, at a depth of 15 and 25, and then every +10. And no one promised anything about "breadth".
Formally, such a recursion will not be infinite, but if at each step the number of records increases exponentially, we all know well how it ends ...
Guardian of the Way
In turn, we add all the object identifiers we encountered along the recursion path to an array, which is a unique βpathβ to it:
WITH RECURSIVE T AS (
SELECT
ARRAY[id] path
...
UNION ALL
SELECT
path || id
...
WHERE
id <> ALL(T.path) -- Π½Π΅ ΡΠΎΠ²ΠΏΠ°Π΄Π°Π΅Ρ Π½ΠΈ Ρ ΠΎΠ΄Π½ΠΈΠΌ ΠΈΠ·
)
Pro: If there is a cycle in the data, we absolutely will not process the same record repeatedly within the same path.
cons: But at the same time, we can bypass, literally, all the records without repeating ourselves.
Path length limit
To avoid the situation of "wandering" recursion at an incomprehensible depth, we can combine the two previous methods. Or, if we do not want to support extra fields, supplement the recursion continuation condition with an estimate of the path length:
WITH RECURSIVE T AS (
SELECT
ARRAY[id] path
...
UNION ALL
SELECT
path || id
...
WHERE
id <> ALL(T.path) AND
array_length(T.path, 1) < 10
)
Choose the way you like!
Source: habr.com