PostgreSQL Query Profiler: How to match plan and query

Many who already use explain.tensor.ru - our PostgreSQL plan visualization service may not be aware of one of its superpowers - to turn a hard-to-read piece of the server log ...

PostgreSQL Query Profiler: How to match plan and query
… into a beautifully designed query with contextual hints for the relevant plan nodes:

PostgreSQL Query Profiler: How to match plan and query
In this transcript of the second part of his report at PGConf.Russia 2020 I will tell you how we managed to do it.

The transcript of the first part, devoted to typical query performance problems and their solutions, can be found in the article "Recipes for Sick SQL Queries".



First, let's do the coloring - and we will no longer color the plan, we have already painted it, it is already beautiful and understandable, but the request.

It seemed to us that the request pulled out of the log with an unformatted “sheet” looks very ugly and therefore inconvenient.
PostgreSQL Query Profiler: How to match plan and query

Especially when developers “glue” the request body in the code (this, of course, is an anti-pattern, but it happens) in one line. Horror!

Let's draw it somehow more beautifully.
PostgreSQL Query Profiler: How to match plan and query

And if we can draw it beautifully, that is, disassemble and assemble the request body back, then we can then “attach” a hint to each object of this request - what happened at the corresponding point in the plan.

Query Syntax Tree

To do this, the query must first be parsed.
PostgreSQL Query Profiler: How to match plan and query

Because, we have the core of the system runs on NodeJS, then we made a module for it, you can find it on github. In fact, these are extended "bindings" to the internals of the PostgreSQL parser itself. That is, the grammar is simply binary compiled and bindings are made to it by NodeJS. We took other people's modules as a basis - there is no big secret here.

We feed the body of the input request to our function - at the output we get a parsed syntax tree in the form of a JSON object.
PostgreSQL Query Profiler: How to match plan and query

Now we can run through this tree in the opposite direction and assemble the request with the indents, coloring, formatting that we want. No, this is not configurable, but we thought that this would be convenient.
PostgreSQL Query Profiler: How to match plan and query

Mapping Query and Plan Nodes

Now let's see how we can combine the plan that we analyzed in the first step and the query that we analyzed in the second.

Let's take a simple example - we have a request that forms a CTE and reads it twice. He generates such a plan.
PostgreSQL Query Profiler: How to match plan and query

CTE

If you look at it carefully, that before the 12th version (or starting from it with the keyword MATERIALIZED) formation CTE is an unconditional barrier to the scheduler.
PostgreSQL Query Profiler: How to match plan and query

And, that means, if we see somewhere in the request the generation of CTE and somewhere in the plan the node CTE, then these nodes uniquely “fight” among themselves, we can immediately combine them.

Task "with an asterisk"Note: CTEs can be nested.
PostgreSQL Query Profiler: How to match plan and query
There are very poorly nested, and even the same name. For example, you can inside CTE A do CTE X, and at the same level inside CTE B do again CTE X:

WITH A AS (
  WITH X AS (...)
  SELECT ...
)
, B AS (
  WITH X AS (...)
  SELECT ...
)
...

When comparing, you must understand this. It is very difficult to understand this with "eyes" - even seeing the plan, even seeing the body of the request. If your CTE generation is complex, nested, the requests are large, then it is completely unconscious.

UNITY

If we have a keyword in the request UNION [ALL] (the operator of joining two samples), then it corresponds in the plan to either the node Append, or some Recursive Union.
PostgreSQL Query Profiler: How to match plan and query

That which is "above" UNION - this is the first child of our node, which is "from below" - the second. If through UNION we have "glued" several blocks at once, then Append-node will still have only one, but it will have not two children, but many - in order as they go, respectively:

  (...) -- #1
UNION ALL
  (...) -- #2
UNION ALL
  (...) -- #3

Append
  -> ... #1
  -> ... #2
  -> ... #3

Task "with an asterisk": inside recursive fetch generation (WITH RECURSIVE) can also be more than one UNION. But only the very last block after the last one is always recursive UNION. Everything above is one but different UNION:

WITH RECURSIVE T AS(
  (...) -- #1
UNION ALL
  (...) -- #2, тут кончается генерация стартового состояния рекурсии
UNION ALL
  (...) -- #3, только этот блок рекурсивный и может содержать обращение к T
)
...

Such examples also need to be able to "paste". In this example, we see that UNION-segments in our request were 3 pieces. Accordingly, one UNION соответствует Append-node, and the other - Recursive Union.
PostgreSQL Query Profiler: How to match plan and query

Read-write data

Everything, laid out, now we know which piece of the request corresponds to which piece of the plan. And in these pieces we can easily and naturally find those objects that are “readable”.

From the point of view of the query, we do not know if this is a table or a CTE, but they are denoted by the same node RangeVar. And in the “readable” plan, this is also a fairly limited set of nodes:

  • Seq Scan on [tbl]
  • Bitmap Heap Scan on [tbl]
  • Index [Only] Scan [Backward] using [idx] on [tbl]
  • CTE Scan on [cte]
  • Insert/Update/Delete on [tbl]

We know the structure of the plan and the request, we know the correspondence of the blocks, we know the names of the objects - we make an unambiguous comparison.
PostgreSQL Query Profiler: How to match plan and query

Again task "with an asterisk". We take a request, execute it, we don’t have any aliases - we just read it twice from one CTE.
PostgreSQL Query Profiler: How to match plan and query

Let's look at the plan - what's the trouble? Why did we have an alias? We didn't order it. Where does he get such a "number" from?

PostgreSQL adds it itself. You just need to understand that just such an alias for us, for the purposes of comparison with the plan, it does not make any sense, it is simply added here. Let's not pay attention to him.

The second task "with an asterisk": if we are reading from a partitioned table, then we will get a node Append or Merge Append, which will consist of a large number of "children", and each of which will be some Scan'om from the table-section: Seq Scan, Bitmap Heap Scan or Index Scan. But, in any case, these “children” will not be complex queries - this is how these nodes can be distinguished from Append with UNION.
PostgreSQL Query Profiler: How to match plan and query

We also understand such knots, we collect them “in one pile” and say: “everything you read from megatable is right here and down the tree".

"Simple" Data Acquisition Nodes

PostgreSQL Query Profiler: How to match plan and query

Values Scan in plan corresponds VALUES in the request.

Result is a request without FROM sort of SELECT 1. Or when you have a false expression in WHERE-block (then the attribute appears One-Time Filter):

EXPLAIN ANALYZE
SELECT * FROM pg_class WHERE FALSE; -- или 0 = 1

Result  (cost=0.00..0.00 rows=0 width=230) (actual time=0.000..0.000 rows=0 loops=1)
  One-Time Filter: false

Function Scan "Mapyatsya" on the SRF of the same name.

But with nested queries, everything is more complicated - unfortunately, they do not always turn into InitPlan/SubPlan. Sometimes they turn into ... Join or ... Anti Join, especially when you write something like WHERE NOT EXISTS .... And it’s not always possible to combine there - in the text of the plan there are no operators corresponding to the nodes of the plan.

Again task "with an asterisk": some VALUES in the request. In this case and in the plan you will get several nodes Values Scan.
PostgreSQL Query Profiler: How to match plan and query

"Number" suffixes will help to distinguish them from one another - it is added exactly in the order in which the corresponding VALUES-blocks in the course of the request from top to bottom.

Data processing

It seems that everything in our request was sorted out - only Limit.
PostgreSQL Query Profiler: How to match plan and query

But everything is simple here - such nodes as Limit, Sort, Aggregate, WindowAgg, Unique They “map” one-to-one to the corresponding operators in the request, if they are there. There are no "stars" and no difficulties.
PostgreSQL Query Profiler: How to match plan and query

JOIN

Difficulties arise when we want to combine JOIN between themselves. This is not always possible, but it is possible.
PostgreSQL Query Profiler: How to match plan and query

From the query parser's point of view, we have a node JoinExpr, which has exactly two children - left and right. This, respectively, is what is “above” your JOIN and what is written “under” it in the query.

And from the point of view of the plan, these are two descendants of some * Loop/* Join-node. Nested Loop, Hash Anti Join. . . is something like that.

Let's use simple logic: if we have tables A and B that "join" each other in the plan, then in the query they could be located either A-JOIN-Bor B-JOIN-A. Let's try to combine like this, try to combine in the opposite way, and so on until such pairs run out.

Take our syntax tree, take our plan, look at them... it doesn't look like it!
PostgreSQL Query Profiler: How to match plan and query

Let's redraw it in the form of graphs - oh, something has already become similar to something!
PostgreSQL Query Profiler: How to match plan and query

Let's notice that we have nodes that have children B and C at the same time - it doesn't matter to us in what order. Let's combine them and flip the picture of the node.
PostgreSQL Query Profiler: How to match plan and query

Let's see again. Now we have nodes with children A and pairs (B + C) - compatible with them.
PostgreSQL Query Profiler: How to match plan and query

Great! It turns out that we are these two JOIN from the query with the nodes of the plan were successfully combined.

Alas, this problem is not always solved.
PostgreSQL Query Profiler: How to match plan and query

For example, if the request A JOIN B JOIN C, and in the plan, the “extreme” nodes A and C were connected first of all. But there is no such operator in the query, we have nothing to highlight, there is nothing to bind the hint to. Same with "comma" when you write A, B.

But, in most cases, almost all nodes manage to “untie” and get such profiling on the left in time - literally, like in Google Chrome, when you analyze JavaScript code. You can see how long each line and each statement "executed".
PostgreSQL Query Profiler: How to match plan and query

And to make it easier for you to use all this, we have made storage the archive, where you can save and then find your plans along with associated queries or share a link with someone.

If you just need to bring an unreadable request into an adequate form, use our "normalizer".

PostgreSQL Query Profiler: How to match plan and query

Source: habr.com

Add a comment