Continuation of the article "
The article will consider and show on specific queries and examples what useful information can be obtained using the pg_locks view history.
A warning.
Due to the novelty of the topic and the incompleteness of the testing period, the article may contain errors. Criticism and comments are welcome and expected.
Input data
pg_locks representation history
archive_locking
CREATE TABLE archive_locking
( timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean ,
queryid bigint
);
Basically, the table is similar to the table archive_pg_stat_activitydescribed in more detail here -
To fill a column queryid function is used
update_history_locking_by_queryid
--update_history_locking_by_queryid.sql
CREATE OR REPLACE FUNCTION update_history_locking_by_queryid() RETURNS boolean AS $$
DECLARE
result boolean ;
current_minute double precision ;
start_minute integer ;
finish_minute integer ;
start_period timestamp without time zone ;
finish_period timestamp without time zone ;
lock_rec record ;
endpoint_rec record ;
current_hour_diff double precision ;
BEGIN
RAISE NOTICE '***update_history_locking_by_queryid';
result = TRUE ;
current_minute = extract ( minute from now() );
SELECT * FROM endpoint WHERE is_need_monitoring
INTO endpoint_rec ;
current_hour_diff = endpoint_rec.hour_diff ;
IF current_minute < 5
THEN
RAISE NOTICE 'Current time is less than 5 minute.';
start_period = date_trunc('hour',now()) + (current_hour_diff * interval '1 hour');
finish_period = start_period - interval '5 minute' ;
ELSE
finish_minute = extract ( minute from now() ) / 5 ;
start_minute = finish_minute - 1 ;
start_period = date_trunc('hour',now()) + interval '1 minute'*start_minute*5+(current_hour_diff * interval '1 hour');
finish_period = date_trunc('hour',now()) + interval '1 minute'*finish_minute*5+(current_hour_diff * interval '1 hour') ;
END IF ;
RAISE NOTICE 'start_period = %', start_period;
RAISE NOTICE 'finish_period = %', finish_period;
FOR lock_rec IN
WITH act_queryid AS
(
SELECT
pid ,
timepoint ,
query_start AS started ,
MAX(timepoint) OVER (PARTITION BY pid , query_start ) AS finished ,
queryid
FROM
activity_hist.history_pg_stat_activity
WHERE
timepoint BETWEEN start_period and
finish_period
GROUP BY
pid ,
timepoint ,
query_start ,
queryid
),
lock_pids AS
(
SELECT
hl.pid ,
hl.locktype ,
hl.mode ,
hl.timepoint ,
MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started
FROM
activity_hist.history_locking hl
WHERE
hl.timepoint between start_period and
finish_period
GROUP BY
hl.pid ,
hl.locktype ,
hl.mode ,
hl.timepoint
)
SELECT
lp.pid ,
lp.locktype ,
lp.mode ,
lp.timepoint ,
aq.queryid
FROM lock_pids lp LEFT OUTER JOIN act_queryid aq ON ( lp.pid = aq.pid AND lp.started BETWEEN aq.started AND aq.finished )
WHERE aq.queryid IS NOT NULL
GROUP BY
lp.pid ,
lp.locktype ,
lp.mode ,
lp.timepoint ,
aq.queryid
LOOP
UPDATE activity_hist.history_locking SET queryid = lock_rec.queryid
WHERE pid = lock_rec.pid AND locktype = lock_rec.locktype AND mode = lock_rec.mode AND timepoint = lock_rec.timepoint ;
END LOOP;
RETURN result ;
END
$$ LANGUAGE plpgsql;
Explanation: the value of the queryid column is updated in the history_locking table, and then when a new partition is created for the archive_locking table, the value will be stored in historical values.
ΠΡΡ ΠΎΠ΄Π½ΡΠ΅ Π΄Π°Π½Π½ΡΠ΅
General information on processes in general.
WAITING FOR LOCKS BY LOCKTYPES
Request
WITH
t AS
(
SELECT
locktype ,
mode ,
count(*) as total
FROM
activity_hist.archive_locking
WHERE
timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted
GROUP BY
locktype ,
mode
)
SELECT
locktype ,
mode ,
total * interval '1 second' as duration
FROM t
ORDER BY 3 DESC
Example
| WAITING FOR LOCKS BY LOCKTYPES +--------------------+----------------------- -------+-------------------- | locktype| mode| duration +--------------------+--------------------- ---+-------------------- | transactionid| sharelock| 19:39:26 | tuple| AccessExclusiveLock| 00:03:35 +--------------------+----------------------- -------+--------------------
TAKINGS OF LOCKS BY LOCKTYPES
Request
WITH
t AS
(
SELECT
locktype ,
mode ,
count(*) as total
FROM
activity_hist.archive_locking
WHERE
timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
granted
GROUP BY
locktype ,
mode
)
SELECT
locktype ,
mode ,
total * interval '1 second' as duration
FROM t
ORDER BY 3 DESC
Example
| TAKINGS OF LOCKS BY LOCKTYPES +--------------------+----------------------- -------+-------------------- | locktype| mode| duration +--------------------+--------------------- ---+-------------------- | relation| RowExclusiveLock| 51:11:10 | virtualxid| ExclusiveLock| 48:10:43 | transactionid| ExclusiveLock| 44:24:53 | relation| AccessShareLock| 20:06:13 | tuple| AccessExclusiveLock| 17:58:47 | tuple| ExclusiveLock| 01:40:41 | relation| ShareUpdateExclusiveLock| 00:26:41 | object| RowExclusiveLock| 00:00:01 | transactionid| sharelock| 00:00:01 | extend| ExclusiveLock| 00:00:01 +--------------------+----------------------- -------+--------------------
Detailed information for specific queryid queries
WAITING FOR LOCKS BY LOCKTYPES BY QUERYID
Request
WITH
lt AS
(
SELECT
pid ,
locktype ,
mode ,
timepoint ,
queryid ,
blocking_pids ,
MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started
FROM
activity_hist.archive_locking
WHERE
timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND
pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
queryid IS NOT NULL
GROUP BY
pid ,
locktype ,
mode ,
timepoint ,
queryid ,
blocking_pids
)
SELECT
lt.pid ,
lt.locktype ,
lt.mode ,
lt.started ,
lt.queryid ,
lt.blocking_pids ,
COUNT(*) * interval '1 second' as duration
FROM lt
GROUP BY
lt.pid ,
lt.locktype ,
lt.mode ,
lt.started ,
lt.queryid ,
lt.blocking_pids
ORDER BY 4
Example
| WAITING FOR LOCKS BY LOCKTYPES BY QUERYID +----------+-------------------------+----- ---------------+------------------------------+--- ------------------+--------------------+----------- --------- | pid| locktype| mode| started| queryid| blocking_pids| duration +----------+-------------------------+----------- ---------+------------------------------+--------- ---+--------------------+----------------- --- | 11288| transactionid| sharelock| 2019-09-17 10:00:00.302936| 389015618226997618| {11092}| 00:03:34 | 11626| transactionid| sharelock| 2019-09-17 10:00:21.380921| 389015618226997618| {12380}| 00:00:29 | 11626| transactionid| sharelock| 2019-09-17 10:00:21.380921| 389015618226997618| {11092}| 00:03:25 | 11626| transactionid| sharelock| 2019-09-17 10:00:21.380921| 389015618226997618| {12213}| 00:01:55 | 11626| transactionid| sharelock| 2019-09-17 10:00:21.380921| 389015618226997618| {12751}| 00:00:01 | 11629| transactionid| sharelock| 2019-09-17 10:00:24.331935| 389015618226997618| {11092}| 00:03:22 | 11629| transactionid| sharelock| 2019-09-17 10:00:24.331935| 389015618226997618| {12007}| 00:00:01 | 12007| transactionid| sharelock| 2019-09-17 10:05:03.327933| 389015618226997618| {11629}| 00:00:13 | 12007| transactionid| sharelock| 2019-09-17 10:05:03.327933| 389015618226997618| {11092}| 00:01:10 | 12007| transactionid| sharelock| 2019-09-17 10:05:03.327933| 389015618226997618| {11288}| 00:00:05 | 12213| transactionid| sharelock| 2019-09-17 10:06:07.328019| 389015618226997618| {12007}| 00:00:10
TAKING LOCKS BY LOCKTYPES BY QUERYID
Request
WITH
lt AS
(
SELECT
pid ,
locktype ,
mode ,
timepoint ,
queryid ,
blocking_pids ,
MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started
FROM
activity_hist.archive_locking
WHERE
timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND
pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
granted AND
queryid IS NOT NULL
GROUP BY
pid ,
locktype ,
mode ,
timepoint ,
queryid ,
blocking_pids
)
SELECT
lt.pid ,
lt.locktype ,
lt.mode ,
lt.started ,
lt.queryid ,
lt.blocking_pids ,
COUNT(*) * interval '1 second' as duration
FROM lt
GROUP BY
lt.pid ,
lt.locktype ,
lt.mode ,
lt.started ,
lt.queryid ,
lt.blocking_pids
ORDER BY 4
Example
| TAKING LOCKS BY LOCKTYPES BY QUERYID +----------+-------------------------+------ --------------+------------------------------+---- -+--------------------+------------ -------- | pid| locktype| mode| started| queryid| blocking_pids| duration +----------+-------------------------+----------- ---------+------------------------------+--------- ---+--------------------+----------------- --- | 11288| relation| RowExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {11092}| 00:03:34 | 11092| transactionid| ExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {}| 00:03:34 | 11288| relation| RowExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {}| 00:00:10 | 11092| relation| RowExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {}| 00:03:34 | 11092| virtualxid| ExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {}| 00:03:34 | 11288| virtualxid| ExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {11092}| 00:03:34 | 11288| transactionid| ExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {11092}| 00:03:34 | 11288| tuple| AccessExclusiveLock| 2019-09-17 10:00:00.302936| 389015618226997618| {11092}| 00:03:34
Using blocking history when analyzing performance incidents.
- A query with queryid=389015618226997618 executed by a process with pid=11288 has been waiting for a lock since 2019-09-17 10:00:00 for 3 minutes.
- The lock was held by a process with pid=11092
- A process with pid=11092 executing a query with queryid=389015618226997618 since 2019-09-17 10:00:00 held the lock for 3 minutes.
Π‘onclusion
Now, I hope, the most interesting and useful thing will begin - collecting statistics and analyzing cases on the history of expectations and blocking.
In the future, I would like to believe, you will get a set of some kind of note (similar to Oracle's metalink).
In general, it is for this reason that the methodology used is displayed as quickly as possible for public familiarization.
In the very near future I will try to put the project on github.
Source: habr.com