You've used it over 6000 times since then, but one of the handy features might have gone unnoticed is structural clues, which look something like this:
Listen to them and your requests will "become silky smooth". 🙂
But seriously, many situations that make a request slow and “gluttonous” in terms of resources, are typical and can be recognized by the structure and data of the plan.
In this case, each individual developer will not have to look for an optimization option on his own, relying solely on his own experience - we can tell him what is happening here, what could be the reason, and how to come up with a solution. Which is what we did.
Let's take a closer look at these cases - how they are defined and what recommendations they lead to.
For a better immersion in the topic, you can first listen to the corresponding block from my report at PGConf.Russia 2020, and only then go to a detailed analysis of each example:
CREATE TABLE tbl AS
SELECT
generate_series(1, 100000) pk -- 100K "фактов"
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей
CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key
SELECT
*
FROM
tbl
WHERE
fk_cli = 1 -- отбор по конкретной связи
ORDER BY
pk DESC -- хотим всего одну "последнюю" запись
LIMIT 1;
Even on such a primitive sample - 8.5x faster and 33x fewer reads. The effect will be clearer, the more "facts" you have for each value. fk.
I note that such an index will work as a “prefix” index no worse than the previous one for other queries with fk, where sorting by pk was not and is not (you can read more about this in my article about finding inefficient indexes). In particular, it will provide normal explicit foreign key support by this field.
#2: index intersection (BitmapAnd)
When arises
Show all contracts for the client "LLC Kolokolchik" concluded on behalf of "NJSC Lyutik".
How to identify
-> BitmapAnd
-> Bitmap Index Scan
-> Bitmap Index Scan
Recommendations
Create composite index by fields from both source or expand one of the existing fields from the second.
Example:
CREATE TABLE tbl AS
SELECT
generate_series(1, 100000) pk -- 100K "фактов"
, (random() * 100)::integer fk_org -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей
CREATE INDEX ON tbl(fk_org); -- индекс для foreign key
CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key
SELECT
*
FROM
tbl
WHERE
(fk_org, fk_cli) = (1, 999); -- отбор по конкретной паре
Here the gain is smaller, since Bitmap Heap Scan is quite effective on its own. But anyway 7x faster and 2.5x fewer reads.
#3: Combining Indexes (BitmapOr)
When arises
Show the first 20 oldest "own" or unassigned requests for processing, with own in priority.
How to identify
-> BitmapOr
-> Bitmap Index Scan
-> Bitmap Index Scan
Recommendations
Use UNION [ALL] to combine subqueries for each of the condition OR blocks.
Example:
CREATE TABLE tbl AS
SELECT
generate_series(1, 100000) pk -- 100K "фактов"
, CASE
WHEN random() < 1::real/16 THEN NULL -- с вероятностью 1:16 запись "ничья"
ELSE (random() * 100)::integer -- 100 разных внешних ключей
END fk_own;
CREATE INDEX ON tbl(fk_own, pk); -- индекс с "вроде как подходящей" сортировкой
SELECT
*
FROM
tbl
WHERE
fk_own = 1 OR -- свои
fk_own IS NULL -- ... или "ничьи"
ORDER BY
pk
, (fk_own = 1) DESC -- сначала "свои"
LIMIT 20;
(
SELECT
*
FROM
tbl
WHERE
fk_own = 1 -- сначала "свои" 20
ORDER BY
pk
LIMIT 20
)
UNION ALL
(
SELECT
*
FROM
tbl
WHERE
fk_own IS NULL -- потом "ничьи" 20
ORDER BY
pk
LIMIT 20
)
LIMIT 20; -- но всего - 20, больше и не надо
We took advantage of the fact that all 20 necessary records were immediately obtained in the first block, so the second one, with the more “expensive” Bitmap Heap Scan, was not even executed - as a result 22x faster, 44x fewer reads!
As a rule, it occurs when you want to “attach another filter” to an existing request.
“And you don’t have the same, but with mother of pearl buttons?» film "Diamond Hand"
For example, modifying the task above, show the first 20 oldest "critical" requests for processing, regardless of their purpose.
How to identify
-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
&& 5 × rows < RRbF -- отфильтровано >80% прочитанного
&& loops × RRbF > 100 -- и при этом больше 100 записей суммарно
Recommendations
Create [more] specialized index with WHERE clause or include additional fields in the index.
If the filtering condition is "static" for your tasks - that is does not include expansion list of values in the future - it is better to use a WHERE index. Various boolean/enum statuses fit well into this category.
If the filtration condition can take on different values, it is better to expand the index with these fields - as in the situation with BitmapAnd above.
Example:
CREATE TABLE tbl AS
SELECT
generate_series(1, 100000) pk -- 100K "фактов"
, CASE
WHEN random() < 1::real/16 THEN NULL
ELSE (random() * 100)::integer -- 100 разных внешних ключей
END fk_own
, (random() < 1::real/50) critical; -- 1:50, что заявка "критичная"
CREATE INDEX ON tbl(pk);
CREATE INDEX ON tbl(fk_own, pk);
SELECT
*
FROM
tbl
WHERE
critical
ORDER BY
pk
LIMIT 20;
As you can see, the filtering from the plan is completely gone, and the request has become 5 times faster.
#5: sparse table
When arises
Various attempts to make your own task processing queue, when a large number of updates / deletions of records on the table lead to a situation of a large number of "dead" records.
How to identify
-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
&& loops × (rows + RRbF) < (shared hit + shared read) × 8
-- прочитано больше 1KB на каждую запись
&& shared hit + shared read > 64
Recommendations
Manually carry out regularly VACUUM [FULL] or achieve adequately frequent processing autovacuum by fine-tuning its parameters, including for a specific table.
It seems that they read a little, and everything was indexed, and they didn’t filter anyone extra - but still, significantly more pages were read than we would like.
How to identify
-> Index [Only] Scan [Backward]
&& loops × (rows + RRbF) < (shared hit + shared read) × 8
-- прочитано больше 1KB на каждую запись
&& shared hit + shared read > 64
Recommendations
Take a close look at the structure of the index used and the key fields specified in the query - most likely, index part not set. You will most likely need to create a similar index, but without prefix fields, or learn to iterate their values.
Example:
CREATE TABLE tbl AS
SELECT
generate_series(1, 100000) pk -- 100K "фактов"
, (random() * 100)::integer fk_org -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей
CREATE INDEX ON tbl(fk_org, fk_cli); -- все почти как в #2
-- только вот отдельный индекс по fk_cli мы уже посчитали лишним и удалили
SELECT
*
FROM
tbl
WHERE
fk_cli = 999 -- а fk_org не задано, хотя стоит в индексе раньше
LIMIT 20;
Everything seems to be fine, even in terms of the index, but somehow suspicious - for each of the 20 records read, 4 pages of data had to be subtracted, 32KB per record - isn't it bold? Yes and index name tbl_fk_org_fk_cli_idx leads to thought.
One-time processing (sorting or uniqueization) of a large number of records does not fit into the memory allocated for this.
How to identify
-> *
&& temp written > 0
Recommendations
If the amount of memory used by the operation does not greatly exceed the set value of the parameter work_mem, it should be corrected. You can immediately in the config for everyone, or you can through SET [LOCAL] for a specific request/transaction.
Example:
SHOW work_mem;
-- "16MB"
SELECT
random()
FROM
generate_series(1, 1000000)
ORDER BY
1;
For obvious reasons, if only memory is used, and not disk, then the query will be executed much faster. At the same time, part of the load is also removed from the HDD.
But you need to understand that allocating a lot of memory will always not work either - it simply won’t be enough for everyone.
#9: Irrelevant statistics
When arises
A lot was poured into the base at once, but they did not have time to drive it away ANALYZE.
How to identify
-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
&& ratio >> 10
There was a lock waiting for a competing request, or there was not enough CPU/hypervisor hardware resources.
How to identify
-> *
&& (shared hit / 8K) + (shared read / 1K) < time / 1000
-- RAM hit = 64MB/s, HDD read = 8MB/s
&& time > 100ms -- читали мало, но слишком долго
Recommendations
Use external monitoring system server for blocking or abnormal resource consumption. We have already talked about our version of organizing this process for hundreds of servers. here и here.