
Philosophical introduction
As you know, there are only two methods for solving problems:
- Method of analysis or method of deduction, or from the general to the particular.
- Method of synthesis or method of induction, or from particular to general.
To solve the “improve database performance” problem, it might look like this.
Analysis - we analyze the problem into separate parts and by solving them we try to improve the performance of the database as a whole as a result.
In practice, the analysis looks something like this:
- A problem occurs (performance incident)
- Gathering statistical information about the state of the database
- Looking for bottlenecks
- We solve problems from bottlenecks
Database bottlenecks - infrastructure (CPU, Memory, Disks, Network, OS), settings (postgresql.conf), queries:
Infrastructure: the possibilities of influence and change for the engineer are almost zero.
Database settings: the possibilities for changes are slightly more than in the previous case, but as a rule they are still quite difficult, especially in the clouds.
Inquiries to the database: the only area for maneuvers.
Synthesis - we improve the performance of individual parts, expecting that the database performance will improve as a result.
Lyrical introduction or why all this is necessary
How does the performance incident resolution process work if database performance is not monitored:
Customer - “everything is bad with us, for a long time, do us good”
Engineer - “How bad is that?”
Customer - “this is how it is now (an hour ago, yesterday, it was the last case), slowly”
Engineer - “when was it good?”
Customer – “a week (two weeks) ago it was not bad. “(It's lucky)
Customer - “I don’t remember when it was good, but now it’s bad” (Regular answer)
The result is a classic picture:

Who is to blame and what to do?
The first part of the question is the easiest to answer - the DBA engineer is always to blame.
The answer to the second part is also not too difficult - you need to implement a database performance monitoring system.
The first question arises - what to monitor?
Path 1. We will monitor EVERYTHING

CPU load, the number of disk read / write operations, the size of the allocated memory, and a megaton of different counters that any more or less working monitoring system can provide.
The result is a bunch of graphs, pivot tables, and continuous email notifications and 100% employment of an engineer solving a bunch of identical tickets, however, as a rule, with the standard wording - “Temporary issue. No action need.” But, everyone is busy, and there is always something to show the customer - the work is in full swing.
Path 2. Monitor only what is needed, and what is not needed, do not need to be monitored
You can monitor, a little differently - only entities and events:
- Which DBA Engineer Can Influence
- For which there is an algorithm of actions when an event occurs or an entity changes.
Based on this assumption and remembering "Philosophical introduction» in order to avoid regular repetition «Lyrical introduction or why all this is necessary» It would be advisable to monitor the performance of individual queries for optimization and analysis, which should ultimately lead to an improvement in the performance of the entire database.
But in order to improve a heavy query that affects the overall performance of the database, you must first find it.
So there are two related questions:
- which request is considered heavy
- how to search for heavy queries.
Obviously, a heavy query is a query that uses a lot of OS resources to get a result.
Let's move on to the second question - how to search and then monitor heavy queries?
What are the query monitoring features in PostgreSQL?
Compared to Oracle, there are few opportunities, but still something can be done.

PG_STAT_STATEMENTS
The standard pg_stat_statements extension is intended for searching and monitoring heavy queries in PostgreSQL.
After installing the extension, a view of the same name appears in the target database, which should be used for monitoring purposes.
Target columns pg_stat_statements for building a monitoring system:
- queryid Internal hash code computed from operator parse tree
- max_time Maximum time spent per operator, in milliseconds
By accumulating and using statistics on these two columns, you can build a monitoring system.
How pg_stat_statements is used to monitor PostgreSQL performance

To monitor query performance, use:
On the target database side, the pg_stat_statements view
By the Server and monitoring databases - a set of bash scripts and service tables.
Stage 1 - collection of statistical data
The cron monitoring host regularly runs a script that copies the contents of the pg_stat_statements view from the target database to the pg_stat_history table in the monitoring database.
Thus, a history of execution of individual queries is formed, which can be used to generate performance reports and configure metrics.
Stage 2 - setting performance metrics
Based on the collected data, we select the requests, the execution of which is the most critical/important for the client (application). By agreement with the customer, we set the values of performance metrics using the queryid and max_time fields.
The result is the start of performance monitoring
- The monitoring script, when run, checks the configured performance metrics by comparing the max_time value of the metric with the value from the pg_stat_statements view in the target database.
- If the value in the target database exceeds the value of the metric, a warning is generated (incident in the ticket system)
Additional option 1
History of query execution plans
For later resolution of performance incidents, it is very good to have a history of query execution plan changes.
The log_query service table is used to store the history. The table is populated when parsing the loaded PostgreSQL log file. Since the log file, unlike the pg_stat_statements view, gets the full text with the values of the execution parameters, and not the normalized text, it is possible to log not only the time and duration of requests, but also store execution plans for the current time.
Additional option 2
Continuous performance improvement process
Monitoring individual queries in general is not intended to solve the problem of continuous improvement of the performance of the database as a whole, since it monitors and solves performance problems only for individual queries. However, you can extend the method and set up monitoring queries for the entire database.
To do this, you need to enter additional performance metrics:
- In recent days
- For base period
The script selects statements from the pg_stat_statements view in the target database and compares the max_time value with the average max_time value, in the first case over the last days, or over the selected time period (baseline), in the second case.
Thus, in case of performance degradation for any query, a warning will be generated automatically, without manual analysis of reports.
And what about synthesis?
In the described approach, as the synthesis method suggests, by improving individual parts of the system, we improve the system as a whole.
- Query executed by the database - thesis
- Changed request - antithesis
- Change the state of the system - synthesis

System development
- Extending collected statistics by adding history for pg_stat_activity system view
- Expanding the collected statistics by adding history for the statistics of individual tables participating in queries
- Integration with Monitoring System in the AWS Cloud
- Also, anything else you can think of...
Source: habr.com
