An attempt to create an analogue of ASH for PostgreSQL

Formulation of the problem

To optimize PostgreSQL queries, it is very necessary to be able to analyze the history of activity, in particular, waits, locks, table statistics.

Opportunities Available

Historical Load Analysis Tool or "AWR for Postgres": very interesting solution, but no history of pg_stat_activity and pg_locks.

pgsentinel extension :
Β«All accumulated information is stored only in RAM, and the amount of memory consumed is regulated by the number of the last stored records.

The queryid field is added - the same queryid from the pg_stat_statements extension (pre-installation required).Β«

This, of course, would help a lot, but the most troublesome thing is the first point β€œAll accumulated information is stored only in RAM ”, i.e. we have an impact on the target base. In addition, there is no lock history and table statistics. Those. Generally speaking, the solution is incomplete:There is no ready package for installation yet. It is proposed to download the source code and build the library yourself. First you need to install the β€œdevel” package for your server and set the path to pg_config in the PATH variable.".

In general, there is a lot of fuss, and in the case of serious production bases, it may not be possible to do something with the server. You need to come up with something of your own again.

A warning.

Due to the rather large volume and due to the non-completion of the testing period, the article is mainly for informational purposes, rather as a set of theses and intermediate results.
More detailed material will be prepared later, in parts

Draft solution requirements

It is necessary to develop a tool to store:

pg_stat_activity view history
Session lock history using the pg_locks view

Decision requirement– Minimize the impact on the target database.

General idea– the data collection agent is not started in the target database, but in the monitoring database as a systemd service. Yes, some data loss is possible, but this is not critical for reporting, but there is no impact on the target database in terms of memory and disk space. And in the case of using a connection pool, the impact on user processes is minimal.

Stages of implementation

1.Service tables

A separate schema is used to store tables, so as not to complicate the analysis of the main tables used.

DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;

Important: The schema is not created in the target database, but in the monitoring database.

pg_stat_activity view history

The pg_stat_activity table is used to store current snapshots of the pg_stat_activity view.

activity_hist.history_pg_stat_activity :

--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
  timepoint timestamp without time zone ,
  datid             oid  , 
  datname           name ,
  pid               integer,
  usesysid          oid    ,
  usename           name   ,
  application_name  text   ,
  client_addr       inet   ,
  client_hostname   text   ,
  client_port       integer,
  backend_start     timestamp with time zone ,
  xact_start        timestamp with time zone ,
  query_start       timestamp with time zone ,
  state_change      timestamp with time zone ,
  wait_event_type   text ,                     
  wait_event        text ,                   
  state             text ,                  
  backend_xid       xid  ,                 
  backend_xmin      xid  ,                
  query             text ,               
  backend_type      text ,  
  queryid           bigint
);

To speed up the insert - no indexes and restrictions.

To store the history itself, a partitioned table is used:

activity_hist.archive_pg_stat_activity :

DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
  timepoint timestamp without time zone ,
  datid             oid  , 
  datname           name ,
  pid               integer,
  usesysid          oid    ,
  usename           name   ,
  application_name  text   ,
  client_addr       inet   ,
  client_hostname   text   ,
  client_port       integer,
  backend_start     timestamp with time zone ,
  xact_start        timestamp with time zone ,
  query_start       timestamp with time zone ,
  state_change      timestamp with time zone ,
  wait_event_type   text ,                     
  wait_event        text ,                   
  state             text ,                  
  backend_xid       xid  ,                 
  backend_xmin      xid  ,                
  query             text ,               
  backend_type      text ,
  queryid           bigint
)
PARTITION BY RANGE (timepoint);

Since there are no insert speed requirements in this case, some indexes have been created to speed up reporting.

Session lock history

The following table is used to store current session lock snapshots:

activity_hist.history_locking :

--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
	timepoint timestamp without time zone ,
	locktype text ,
	relation oid ,
	mode text ,
	tid xid ,
	vtid text ,
	pid integer ,
	blocking_pids integer[] ,
	granted boolean
);

Also, to speed up the insert - no indexes and restrictions.

To store the history itself, a partitioned table is used:

activity_hist.archive_locking:

DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
	timepoint timestamp without time zone ,
	locktype text ,
	relation oid ,
	mode text ,
	tid xid ,
	vtid text ,
	pid integer ,
	blocking_pids integer[] ,
	granted boolean	
)
PARTITION BY RANGE (timepoint);

Since there are no insert speed requirements in this case, some indexes have been created to speed up reporting.

2. Filling in the current history

To directly capture view snapshots, a bash script is used that runs the plpgsql function.

get_current_activity.sh

#!/bin/bash
#########################################################
#get_current_activity.sh

ERROR_FILE='/home/demon/get_current_activity'$(date +%Y%m%d-)'T'$(date +%H)$(date +%M)$(date +%S)
host=$1
s_name=$2
s_pass=$3

psql  -A -t -q -v ON_ERROR_STOP=1 -c "SELECT activity_hist.get_current_activity( '$host' , '$s_name' , '$s_pass' )" >/dev/null 2>$ERROR_FILE

line_count=`cat $ERROR_FILE | wc -l`
if [[ $line_count != '0' ]];
then
    rm -f /home/demon/*.err >/dev/null 2>/dev/null
	cp $ERROR_FILE $ERROR_FILE'.err' >/dev/null 2>/dev/null  
fi
rm $ERROR_FILE >/dev/null 2>/dev/null
exit 0

plpgsql the dblink function accesses the views in the target database and inserts rows into the service tables in the monitoring database.

get_current_activity.sql

CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE 
  database_rec record;
  dblink_str text ;
BEGIN   

	EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'||
	                                         ' user='||current_s_name||' password='||current_s_pass|| ' '')';



--------------------------------------------------------------------
--GET pg_stat_activity stats
	INSERT INTO activity_hist.history_pg_stat_activity
	(
		SELECT * FROM dblink('LINK1',
			'SELECT 
			now() , 
			datid             , 
			datname           ,
			pid               ,
			usesysid              ,
			usename              ,
			application_name     ,
			client_addr          ,
			client_hostname      ,
			client_port       ,
			backend_start         ,
			xact_start            ,
			query_start           ,
			state_change          ,
			wait_event_type    ,                     
			wait_event         ,                   
			state              ,                  
			backend_xid         ,                 
			backend_xmin        ,                
			query              ,               
			backend_type   			
		FROM pg_stat_activity
		') 
		AS t (
		    timepoint 		  timestamp without time zone ,			
			datid             oid  , 
			datname           name ,
			pid               integer,
			usesysid          oid    ,
			usename           name   ,
			application_name  text   ,
			client_addr       inet   ,
			client_hostname   text   ,
			client_port       integer,
			backend_start     timestamp with time zone ,
			xact_start        timestamp with time zone ,
			query_start       timestamp with time zone ,
			state_change      timestamp with time zone ,
			wait_event_type   text ,                     
			wait_event        text ,                   
			state             text ,                  
			backend_xid       xid  ,                 
			backend_xmin      xid  ,                
			query             text ,               
			backend_type      text 			
		)
	);

---------------------------------------	
--ACTIVITY_HIST.HISTORY_LOCKING	
	INSERT INTO activity_hist.history_locking
	(
		SELECT * FROM dblink('LINK1',
			'SELECT 
			now() , 
			lock.locktype,
			lock.relation,
			lock.mode,
			lock.transactionid as tid,
			lock.virtualtransaction as vtid,
			lock.pid,
			pg_blocking_pids(lock.pid), 
			lock.granted
			FROM 	pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
			WHERE NOT lock.pid = pg_backend_pid()	
		') 
		AS t (
			timepoint timestamp without time zone ,
			locktype text ,
			relation oid , 
			mode text ,
			tid xid ,
			vtid text ,
			pid integer ,
			blocking_pids integer[] ,
			granted boolean
		)
	);
	PERFORM dblink_disconnect('LINK1');
	
	RETURN TRUE ;
END
$$ LANGUAGE plpgsql;

The systemd service is used to collect snapshots of the view, and two scripts:

pg_current_activity.service

# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks 
Wants=pg_current_activity.timer

[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres

[Install]
WantedBy=multi-user.target

pg_current_activity.timer

# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service

[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1

[Install]
WantedBy=timers.target

Assign rights to scripts:
# chmod 755 pg_current_activity.timer
# chmod 755 pg_current_activity.service

We start the service:
# systemctl daemon-reload
# systemctl start pg_current_activity.service

Thus, the view history is collected as snapshots every second. Of course, if everything is left as it is, the tables will grow in size very quickly and more or less productive work will become impossible.

It is necessary to organize data archiving.

3.History archiving

Archive* partitioned tables are used for archiving.

New partitions are created every hour, while old data from the history* tables is deleted, so the size of the history* tables does not change much and the insert speed does not degrade over time.

The creation of new sections is done by the plpgsql function activity_hist.archive_current_activity. The operation algorithm is very simple (on the example of a section for the archive_pg_stat_activity table).

Create and populate a new section

EXECUTE format(
'CREATE TABLE ' || partition_name || 
' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' , 
to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_min_range ),'MM')||'-'||
to_char(date_trunc('day', partition_min_range ),'DD')||' '||
to_char(date_trunc('hour', partition_min_range ),'HH24')||':00', 
to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_max_range ),'MM')||'-'||
to_char(date_trunc('day', partition_max_range ),'DD')||' '||
to_char(date_trunc('hour', partition_max_range ),'HH24')||':00'
);

INSERT INTO activity_hist.archive_pg_stat_activity
(
	SELECT 	* 
	FROM 	activity_hist.history_pg_stat_activity
	WHERE 	timepoint BETWEEN partition_min_range AND partition_max_range 		
);

Creating indexes

EXECUTE format	(
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )' 
);

EXECUTE format	('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )' 
);

Delete old data from history_pg_stat_activity table

DELETE 
FROM 	activity_hist.history_pg_stat_activity
WHERE 	timepoint < partition_max_range;

Of course, periodically, the old sections are deleted as unnecessary.

Basic reports

Exactly what it's all about. For reporting very remotely, roughly reminiscent of AWR Oracle.

It is important to add that in order to receive reports, you need to build a relationship between the pg_stat_activity and pg_stat_statements views. The tables are linked by adding the 'queryid' column to the 'history_pg_stat_activity', 'archive_pg_stat_activity' tables. The way to add a column value is beyond the scope of this article and is described here βˆ’ pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .

TOTAL CPU TIME FOR QUERIES

Request :

WITH hist AS
(
SELECT 
	aa.query ,aa.queryid ,			
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND  pg_stat_history_end+(current_hour_diff * interval '1 hour')  AND backend_type = 'client backend' AND datname != 'postgres' AND	( aa.wait_event_type IS NULL  ) ANDaa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid		
UNION 
SELECT 
	ha.query ,ha.queryid,
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')  AND 	backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL  )AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid		
)
SELECT 	query , queryid , SUM( duration ) as duration 
FROM hist
GROUP BY  query , queryid 
ORDER BY 3 DESC

Example:

-------------------------------------------------------------------
| TOTAL CPU TIME FOR QUERIES : 07:47:36
+----+----------------------------------------+--------------------
|   #|                                 queryid|            duration
+----+----------------------------------------+--------------------
|   1|                      389015618226997618|            04:28:58
|   2|                                        |            01:07:29
|   3|                     1237430309438971376|            00:59:38
|   4|                     4710212362688288619|            00:50:48
|   5|                       28942442626229688|            00:15:50
|   6|                     9150846928388977274|            00:04:46
|   7|                    -6572922443698419129|            00:00:06
|   8|                                        |            00:00:01
+----+----------------------------------------+--------------------

TOTAL WAITINGS TIME FOR QUERIES

Request :

WITH hist AS
(
SELECT 
	aa.query ,aa.queryid ,			
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')  AND 
	backend_type = 'client backend' AND datname != 'postgres' AND
	( aa.wait_event_type IS NOT NULL  ) 
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid		
UNION 
SELECT 
	ha.query ,ha.queryid,
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')  AND 
	backend_type = 'client backend' AND datname != 'postgres' AND				
	( ha.wait_event_type IS NOT NULL  )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid		
)
SELECT 	query , queryid , SUM( duration ) as duration 
FROM hist
GROUP BY  query , queryid 
ORDER BY 3 DESC 

Example:

-------------------------------------------------------------------
| TOTAL WAITINGS TIME FOR QUERIES : 21:55:04
+----+----------------------------------------+--------------------
|   #|                                 queryid|            duration
+----+----------------------------------------+--------------------
|   1|                      389015618226997618|            16:19:05
|   2|                                        |            03:47:04
|   3|                     8085340880788646241|            00:40:20
|   4|                     4710212362688288619|            00:13:35
|   5|                     9150846928388977274|            00:12:25
|   6|                       28942442626229688|            00:11:32
|   7|                     1237430309438971376|            00:09:45
|   8|                     2649515222348904837|            00:09:37
|   9|                                        |            00:03:45
|  10|                     3167065002719415275|            00:02:20
|  11|                     5731212217001535134|            00:02:13
|  12|                     8304755792398128062|            00:01:31
|  13|                     2649515222348904837|            00:00:59
|  14|                     2649515222348904837|            00:00:22
|  15|                                        |            00:00:12
|  16|                     3422818749220588372|            00:00:08
|  17|                    -5730801771815999400|            00:00:03
|  18|                    -1473395109729441239|            00:00:02
|  19|                     2404820632950544954|            00:00:02
|  20|                    -6572922443698419129|            00:00:02
|  21|                     2369289265278398647|            00:00:01
|  22|                      180077086776069052|            00:00:01
+----+----------------------------------------+--------------------

WAITINGS FOR QUERIES

Requests:

WITH hist AS
(
SELECT 
	aa.wait_event_type , aa.wait_event 
FROM 	activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
	backend_type = 'client backend' AND datname != 'postgres' AND
	aa.wait_event IS NOT NULL 
GROUP BY aa.wait_event_type , aa.wait_event
UNION 
SELECT 
	ha.wait_event_type , ha.wait_event 
FROM 	activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
	backend_type = 'client backend' AND datname != 'postgres' AND
	ha.wait_event IS NOT NULL 
GROUP BY ha.wait_event_type , ha.wait_event		
)
SELECT 	wait_event_type , wait_event 
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC

----------------------------------------------------------------------

WITH hist AS
(
SELECT 
	aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,			
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
	backend_type = 'client backend' AND datname != 'postgres' AND
	( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid		
UNION 
SELECT 
	ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
	count(*) * interval '1 second' AS duration 
FROM 	activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
	backend_type = 'client backend' AND datname != 'postgres' AND				
	( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid		
)
SELECT 	query , queryid , SUM( duration ) as duration 
FROM hist
GROUP BY  query , queryid 
ORDER BY 3 DESC

Example:

------------------------------------------------
| WAITINGS FOR QUERIES
+-----------------------------------------------
|                      wait_event_type = Client|
|                       wait_event = ClientRead|
|                        Total time  = 00:46:56|
------------------------------------------------
|    #|             queryid|            duration
+-----+--------------------+--------------------
|    1| 8085340880788646241|            00:40:20
|    2|                    |            00:03:45
|    3| 5731212217001535134|            00:01:53
|    4|                    |            00:00:12
|    5| 9150846928388977274|            00:00:09
|    6| 3422818749220588372|            00:00:08
|    7| 1237430309438971376|            00:00:06
|    8|   28942442626229688|            00:00:05
|    9| 4710212362688288619|            00:00:05
|   10|-5730801771815999400|            00:00:03
|   11| 8304755792398128062|            00:00:02
|   12|-6572922443698419129|            00:00:02
|   13|-1473395109729441239|            00:00:02
|   14| 2404820632950544954|            00:00:02
|   15|  180077086776069052|            00:00:01
|   16| 2369289265278398647|            00:00:01

+-----------------------------------------------
|                          wait_event_type = IO|
|                      wait_event = BufFileRead|
|                        Total time  = 00:00:38|
------------------------------------------------
|    #|             queryid|            duration
+-----+--------------------+--------------------
|    1|   28942442626229688|            00:00:38

+-----------------------------------------------

LOCKED PROCESSES HISTORY

Inquiry:

SELECT 
MIN(date_trunc('second',timepoint)) AS started , 
	count(*) * interval '1 second' as duration ,
	pid , blocking_pids , relation , mode , locktype 	 
FROM 
	activity_hist.archive_locking al 
WHERE 
	timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
	NOT granted AND 
	locktype = 'relation' 
GROUP BY pid , blocking_pids , relation , mode , locktype			
UNION
SELECT 
	MIN(date_trunc('second',timepoint)) AS started , 
	count(*) * interval '1 second' as duration ,
	pid , blocking_pids , relation , mode , locktype
FROM 
	activity_hist.history_locking 
WHERE 
	timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
	NOT granted AND 
	locktype = 'relation' 
GROUP BY pid , blocking_pids , relation , mode , locktype			
ORDER BY 1

Example:

-------------------------------------------------- -------------------------------------------------- --------------------------------- | LOCKED PROCESSES HISTORY +-----+----------+--------------------+-------- --+--------------------+--------------------+----- ---------------+-------------------- | #| pid| started| duration| blocking_pids| relation| mode| locktype +-----+----------+--------------------+---------- +--------------------+--------------------+------- -------------+-------------------- | 1| 26224| 2019-09-02 19:32:16| 00:01:45| {26211}| 16541| AccessShareLock| relation | 2| 26390| 2019-09-02 19:34:03| 00:00:53| {26211}| 16541| AccessShareLock| relation | 3| 26391| 2019-09-02 19:34:03| 00:00:53| {26211}| 16541| AccessShareLock| relation | 4| 26531| 2019-09-02 19:35:27| 00:00:12| {26211}| 16541| AccessShareLock| relation | 5| 27284| 2019-09-02 19:44:02| 00:00:19| {27276}| 16541| AccessShareLock| relation | 6| 27283| 2019-09-02 19:44:02| 00:00:19| {27276}| 16541| AccessShareLock| relation | 7| 27286| 2019-09-02 19:44:02| 00:00:19| {27276}| 16541| AccessShareLock| relation | 8| 27423| 2019-09-02 19:45:24| 00:00:12| {27394}| 16541| AccessShareLock| relation | 9| 27648| 2019-09-02 19:48:06| 00:00:20| {27647}| 16541| AccessShareLock| relation | 10| 27650| 2019-09-02 19:48:06| 00:00:20| {27647}| 16541| AccessShareLock| relation | 11| 27735| 2019-09-02 19:49:08| 00:00:06| {27650}| 16541| AccessExclusiveLock| relation | 12| 28380| 2019-09-02 19:56:03| 00:01:56| {28379}| 16541| AccessShareLock| relation | 13| 28379| 2019-09-02 19:56:03| 00:00:01| 28377| 16541| AccessExclusiveLock| relation | | | | | 28376| | 

BLOCKING PROCESSES HISTORY

Requests:

SELECT 
blocking_pids 
FROM 
	activity_hist.archive_locking al 
WHERE 
	timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
	NOT granted AND 
	locktype = 'relation' 
GROUP BY blocking_pids 		
UNION
SELECT 
	blocking_pids 
FROM 
	activity_hist.history_locking 
WHERE 
	timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
	NOT granted AND 
	locktype = 'relation' 
GROUP BY blocking_pids 		
ORDER BY 1

---------------------------------------------------------------

SELECT 
	pid , usename , application_name , datname ,
	MIN(date_trunc('second',timepoint)) as started , 
	count(*) * interval '1 second' as duration ,		 
	state , 
	query
				FROM  	activity_hist.archive_pg_stat_activity
				WHERE 	pid= current_pid AND 
						timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') 						 
				GROUP BY pid , usename , application_name , 
						datname , 
						state_change, 
						state , 
						query 
				UNION
				SELECT 
					pid , usename , application_name , datname ,
					MIN(date_trunc('second',timepoint)) as started , 
					count(*) * interval '1 second' as duration ,		 
					state , 
					query
				FROM  	activity_hist.history_pg_stat_activity_for_reports
				WHERE 	pid= current_pid AND 
						timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') 						 
				GROUP BY pid , usename , application_name , 
						datname , 
						state_change, 
						state , 
						query 
				ORDER BY 5 , 1

Example:

-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- ----------------------- BLOCKING PROCESSES HISTORY +----+----------+------- ---+--------------------+----------+-------------- ------+--------------------+---------------------- --------+---------------------------------------- | #| pid| usename| application_name| datename| started| duration| state| query +----+----------+----------+--------------------+ ----------+--------------------+------------ --+------------------------------+---------------- ----------------------- | 1| 26211| tuser| psql| tdb1| 2019-09-02 19:31:54| 00:00:04| idle | | 2| 26211| tuser| psql| tdb1| 2019-09-02 19:31:58| 00:00:06| idle in transaction start; | 3| 26211| tuser| psql| tdb1| 2019-09-02 19:32:16| 00:01:45| idle in transaction lock table wafer_data; | 4| 26211| tuser| psql| tdb1| 2019-09-02 19:35:54| 00:01:23| idle | commit; | 5| 26211| tuser| psql| tdb1| 2019-09-02 19:38:46| 00:00:02| idle in transaction start; | 6| 26211| tuser| psql| tdb1| 2019-09-02 19:38:54| 00:00:08| idle in transaction lock table wafer_data; | 7| 26211| tuser| psql| tdb1| 2019-09-02 19:39:08| 00:42:42| idle | commit; | 8| 26211| tuser| psql| tdb1| 2019-09-03 07:12:07| 00:00:52| active| select test_del();

Development.

The basic queries shown and the resulting reports already make life much easier when analyzing performance incidents.
Based on basic queries, you can get a report that is vaguely reminiscent of AWR Oracle.
Summary Report Example

+------------------------------------------------- ------------------------------------ | CONSOLIDATED REPORT FOR ACTIVITY AND WAITINGS . 

To be continued. Next in line is the formation of the lock history (pg_stat_locks), a more detailed description of the process of filling tables.

Source: habr.com

Add a comment