PostgreSQL Antipatterns: Condition Evaluation in SQL

SQL is not C++, nor is it JavaScript. Therefore, the evaluation of logical expressions is different, and this is not the same thing at all:

WHERE fncondX() AND fncondY()

= fncondX() && fncondY()

While optimizing the execution plan of a PostgreSQL query can arbitrarily "rearrange" the equivalent conditions, do not calculate any of them for individual records, refer to the condition of the applied index ... In short, the easiest way is to assume that you can't manage the order in which they will be (and whether they will be calculated at all) equitable conditions.

Therefore, if you still want to manage priority, you need to structurally make these conditions unequal with conditional expressions ΠΈ operators.

PostgreSQL Antipatterns: Condition Evaluation in SQL
Data and working with them is the basis of our VLSI complex, so it is very important for us that operations on them are performed not only correctly, but also efficiently. Let's look at specific examples where errors in the calculation of expressions can be made, and where it is worth improving their efficiency.

#0: RTFM

Starting example from documentation:

When the order of evaluation is important, it can be fixed with the construct CASE. For example, this way to avoid division by zero in a sentence WHERE unreliable:

SELECT ... WHERE x > 0 AND y/x > 1.5;

Safe option:

SELECT ... WHERE CASE WHEN x > 0 THEN y/x > 1.5 ELSE false END;

The construction used CASE protects the expression from optimization, so it should only be used when necessary.

#1: trigger condition

BEGIN
  IF cond(NEW.fld) AND EXISTS(SELECT ...) THEN
    ...
  END IF;
  RETURN NEW;
END;

Everything seems to look good, but... No one promises that the invested SELECT will not be executed if the first condition is false. Fix it with nested IF:

BEGIN
  IF cond(NEW.fld) THEN
    IF EXISTS(SELECT ...) THEN
      ...
    END IF;
  END IF;
  RETURN NEW;
END;

Now let's look carefully - the whole body of the trigger function turned out to be "wrapped" in IF. And this means that nothing prevents us from removing this condition from the procedure using WHEN-conditions:

BEGIN
  IF EXISTS(SELECT ...) THEN
    ...
  END IF;
  RETURN NEW;
END;
...
CREATE TRIGGER ...
  WHEN cond(NEW.fld);

This approach allows you to save server resources with a guarantee if the condition is false.

#2: OR/AND chain

SELECT ... WHERE EXISTS(... A) OR EXISTS(... B)

Otherwise, it can be obtained that both EXISTS will be true, but both will be executed.

But if we know for sure that one of them is "true" much more often (or "false" - for AND-chains) - is it possible to somehow "increase its priority" so that the second one is not executed once again?

It turns out that it is possible - the algorithmically approach is close to the topic of the article PostgreSQL Antipatterns: Rare entry reaches the middle of a JOIN.

Let's just "shove under CASE" both of these conditions:

SELECT ...
WHERE
  CASE
    WHEN EXISTS(... A) THEN TRUE
    WHEN EXISTS(... B) THEN TRUE
  END

In this case, we did not define ELSE-value, that is, if both conditions are false CASE will return NULL, which is interpreted as FALSE Π² WHERE- conditions.

This example can be combined in another way - to taste and color:

SELECT ...
WHERE
  CASE
    WHEN NOT EXISTS(... A) THEN EXISTS(... B)
    ELSE TRUE
  END

#3: how [not] to write conditions

We spent two days on analyzing the reasons for the β€œstrange” triggering of this trigger - let's see why.

Source:

IF( NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" is null or NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = (select '"ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚"'::regclass::oid) or NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = (select to_regclass('"Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅"')::oid)
     AND (   OLD."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠΠ°ΡˆΠ°ΠžΡ€Π³Π°Π½ΠΈΠ·Π°Ρ†ΠΈΡ" <> NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠΠ°ΡˆΠ°ΠžΡ€Π³Π°Π½ΠΈΠ·Π°Ρ†ΠΈΡ"
          OR OLD."Π£Π΄Π°Π»Π΅Π½" <> NEW."Π£Π΄Π°Π»Π΅Π½"
          OR OLD."Π”Π°Ρ‚Π°" <> NEW."Π”Π°Ρ‚Π°"
          OR OLD."ВрСмя" <> NEW."ВрСмя"
          OR OLD."Π›ΠΈΡ†ΠΎΠ‘ΠΎΠ·Π΄Π°Π»" <> NEW."Π›ΠΈΡ†ΠΎΠ‘ΠΎΠ·Π΄Π°Π»" ) ) THEN ...

Problem #1: Inequality doesn't account for NULL

Let's assume that everything OLD-fields mattered NULL. What will happen?

SELECT NULL <> 1 OR NULL <> 2;
-- NULL

And from the point of view of working out the conditions NULL equivalent FALSE, as mentioned above.

Solution: use operator IS DISTINCT FROM from ROW-operator, comparing entire records at once:

SELECT (NULL, NULL) IS DISTINCT FROM (1, 2);
-- TRUE

Problem number 2: different implementation of the same functionality

Let's compare:

NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = (select '"ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚"'::regclass::oid)
NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = (select to_regclass('"Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅"')::oid)

Why are there extra investments SELECT? A function to_regclass? Why is it different...

Let's fix:

NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = '"ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚"'::regclass::oid
NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_" = '"Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅"'::regclass::oid

Problem #3: bool precedence

Let's format the source:

{... IS NULL} OR
{... ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚} OR
{... Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅} AND
( {... нСравСнства} )

Oops ... In fact, it turned out that in the case of the truth of any of the first two conditions, the entire condition turns into TRUE, disregarding inequalities. And this is not at all what we wanted.

Let's fix:

(
  {... IS NULL} OR
  {... ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚} OR
  {... Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅}
) AND
( {... нСравСнства} )

Problem #4 (small): complex OR condition for one field

Actually, we had problems in No. 3 precisely because there were three conditions. But instead of them, you can get by with one, using the mechanism coalesce ... IN:

coalesce(NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_"::text, '') IN ('', '"ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚"', '"Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅"')

So are we NULL "catch", and complex OR You don't have to fuss with parentheses.

Total

Let's fix what we got:

IF (
  coalesce(NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚_"::text, '') IN ('', '"ΠšΠΎΠΌΠΏΠ»Π΅ΠΊΡ‚"', '"Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠŸΠΎΠ—Π°Ρ€ΠΏΠ»Π°Ρ‚Π΅"') AND
  (
    OLD."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠΠ°ΡˆΠ°ΠžΡ€Π³Π°Π½ΠΈΠ·Π°Ρ†ΠΈΡ"
  , OLD."Π£Π΄Π°Π»Π΅Π½"
  , OLD."Π”Π°Ρ‚Π°"
  , OLD."ВрСмя"
  , OLD."Π›ΠΈΡ†ΠΎΠ‘ΠΎΠ·Π΄Π°Π»"
  ) IS DISTINCT FROM (
    NEW."Π”ΠΎΠΊΡƒΠΌΠ΅Π½Ρ‚ΠΠ°ΡˆΠ°ΠžΡ€Π³Π°Π½ΠΈΠ·Π°Ρ†ΠΈΡ"
  , NEW."Π£Π΄Π°Π»Π΅Π½"
  , NEW."Π”Π°Ρ‚Π°"
  , NEW."ВрСмя"
  , NEW."Π›ΠΈΡ†ΠΎΠ‘ΠΎΠ·Π΄Π°Π»"
  )
) THEN ...

And given that this trigger function can only be used in UPDATEtrigger due to the presence OLD/NEW in the upper-level condition, then this condition can generally be taken out in WHEN-condition as shown in #1...

Source: habr.com

Add a comment