Адзін з метадаў атрымання гісторыі блакіровак у PostgreSQL

Працяг артыкула «Спроба стварыць аналаг ASH для PostgreSQL «.

У артыкуле будзе разгледжана і паказана на канкрэтных запытах і прыкладах - якую ж карысную інфармацыю можна атрымаць з дапамогай гісторыі прадстаўлення pg_locks.

Папярэджанне.
У сувязі з навізной тэмы і незавяршэннем перыяду тэсціравання, артыкул можа змяшчаць памылкі. Крытыка і заўвагі ўсяляк вітаюцца і чакаюцца.

Уваходныя дадзеныя

Гісторыя прадстаўлення pg_locks

archive_locking

CREATE TABLE archive_locking 
(       timepoint timestamp without time zone ,
	locktype text ,
	relation oid ,
	mode text ,
	tid xid ,
	vtid text ,
	pid integer ,
	blocking_pids integer[] ,
	granted boolean ,
        queryid bigint 
);

Па сутнасці, табліца аналагічная табліцы archive_pg_stat_activity, больш падрабязна апісанай тут pg_stat_statements + pg_stat_activity + loq_query = pg_ash? і тут - Спроба стварыць аналаг ASH для PostgreSQL.

Для запаўнення слупка queryid выкарыстоўваецца функцыя

update_history_locking_by_queryid

--update_history_locking_by_queryid.sql
CREATE OR REPLACE FUNCTION update_history_locking_by_queryid() RETURNS boolean AS $$
DECLARE
  result boolean ;
  current_minute double precision ; 
  
  start_minute integer ;
  finish_minute integer ;
  
  start_period timestamp without time zone ;
  finish_period timestamp without time zone ;
  
  lock_rec record ; 
  endpoint_rec record ; 
  
  current_hour_diff double precision ;
BEGIN
  RAISE NOTICE '***update_history_locking_by_queryid';
  
  result = TRUE ;
  
  current_minute = extract ( minute from now() );

  SELECT * FROM endpoint WHERE is_need_monitoring
  INTO endpoint_rec ;
  
  current_hour_diff = endpoint_rec.hour_diff ;
  
  IF current_minute < 5 
  THEN
	RAISE NOTICE 'Current time is less than 5 minute.';
	
	start_period = date_trunc('hour',now()) + (current_hour_diff * interval '1 hour');
    finish_period = start_period - interval '5 minute' ;
  ELSE 
    finish_minute =  extract ( minute from now() ) / 5 ;
    start_minute =  finish_minute - 1 ;
  
    start_period = date_trunc('hour',now()) + interval '1 minute'*start_minute*5+(current_hour_diff * interval '1 hour');
    finish_period = date_trunc('hour',now()) + interval '1 minute'*finish_minute*5+(current_hour_diff * interval '1 hour') ;
    
  END IF ;  
  
  RAISE NOTICE 'start_period = %', start_period;
  RAISE NOTICE 'finish_period = %', finish_period;

	FOR lock_rec IN   
	WITH act_queryid AS
	 (
		SELECT 
				pid , 
				timepoint ,
				query_start AS started ,			
				MAX(timepoint) OVER (PARTITION BY pid ,	query_start   ) AS finished ,			
				queryid 
		FROM 
				activity_hist.history_pg_stat_activity 			
		WHERE 			
				timepoint BETWEEN start_period and 
								  finish_period
		GROUP BY 
				pid , 
				timepoint ,  
				query_start ,
				queryid 
	 ),
	 lock_pids AS
		(
			SELECT
				hl.pid , 
				hl.locktype  ,
				hl.mode ,
				hl.timepoint , 
				MIN ( timepoint ) OVER (PARTITION BY pid , locktype  ,mode ) as started 
			FROM 
				activity_hist.history_locking hl
			WHERE 
				hl.timepoint between start_period and 
								     finish_period
			GROUP BY 
				hl.pid , 
				hl.locktype  ,
				hl.mode ,
				hl.timepoint 
		)
	SELECT 
		lp.pid , 
		lp.locktype  ,
		lp.mode ,
		lp.timepoint ,     
		aq.queryid 
	FROM lock_pids 	lp LEFT OUTER JOIN act_queryid aq ON ( lp.pid = aq.pid AND lp.started BETWEEN aq.started AND aq.finished )
	WHERE aq.queryid IS NOT NULL 
	GROUP BY  
		lp.pid , 
		lp.locktype  ,
		lp.mode ,
		lp.timepoint , 
		aq.queryid
	LOOP
		UPDATE activity_hist.history_locking SET queryid = lock_rec.queryid 
		WHERE pid = lock_rec.pid AND locktype = lock_rec.locktype AND mode = lock_rec.mode AND timepoint = lock_rec.timepoint ;	
	END LOOP;    
  
  RETURN result ;
END
$$ LANGUAGE plpgsql;

Тлумачэнне: значэнне слупка queryid абнаўляецца ў табліцы history_locking, і затым пры стварэнні новай секцыі для табліцы archive_locking, значэнне будзе захавана ў гістарычных значэннях.

выхадныя дадзеныя

Агульная інфармацыя, па працэсах у цэлым.

WAITING FOR LOCKS BY LOCKTYPES

Запыт

WITH
t AS
(
	SELECT 
		locktype  ,
		mode ,
		count(*) as total 
	FROM 
		activity_hist.archive_locking
	WHERE 
		timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
		NOT granted
	GROUP BY 
		locktype  ,
		mode  
)
SELECT 
	locktype  ,
	mode ,
	total * interval '1 second' as duration			
FROM t 		
ORDER BY 3 DESC 

Прыклад

| WAITING FOR LOCKS BY LOCKTYPES +--------------------+----------------------- -------+-------------------- | locktype| mode| duration +--------------------+--------------------------- ---+-------------------- | transactionid| ShareLock| 19:39:26 | tuple| AccessExclusiveLock| 00:03:35 +--------------------+--------------------------- -------+--------------------

TAKINGS OF LOCKS BY LOCKTYPES

Запыт

WITH
t AS
(
	SELECT 
		locktype  ,
		mode ,
		count(*) as total 
	FROM 
		activity_hist.archive_locking
	WHERE 
		timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
		granted
	GROUP BY 
		locktype  ,
		mode  
)
SELECT 
	locktype  ,
	mode ,
	total * interval '1 second' as duration			
FROM t 		
ORDER BY 3 DESC 

Прыклад

| TAKINGS OF LOCKS BY LOCKTYPES +--------------------+----------------------- -------+-------------------- | locktype| mode| duration +--------------------+--------------------------- ---+-------------------- | relation| RowExclusiveLock| 51:11:10 | virtualxid| ExclusiveLock| 48:10:43 | transactionid| ExclusiveLock| 44:24:53 | relation| AccessShareLock| 20:06:13 | tuple| AccessExclusiveLock| 17:58:47 | tuple| ExclusiveLock| 01:40:41 | relation| ShareUpdateExclusiveLock| 00:26:41 | object| RowExclusiveLock| 00:00:01 | transactionid| ShareLock| 00:00:01 | extend| ExclusiveLock| 00:00:01 +--------------------+--------------------------- -------+--------------------

Дэталёвая інфармацыя, па канкрэтных queryid запытаў

WAITING FOR LOCKS BY LOCKTYPES BY QUERYID

Запыт

WITH
lt AS
(
	SELECT
		pid , 
		locktype  ,
		mode ,
		timepoint , 
		queryid , 
		blocking_pids ,
                MIN ( timepoint ) OVER (PARTITION BY pid , locktype  ,mode ) as started  
	FROM 
		activity_hist.archive_locking
	WHERE 
		timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND 
			                  pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
		NOT granted AND
	       queryid IS NOT NULL 
	GROUP BY 
	        pid , 
		locktype  ,
		mode ,
		timepoint ,
		queryid ,
		blocking_pids 
)
SELECT 
        lt.pid , 
	lt.locktype  ,
	lt.mode ,			
        lt.started ,
	lt.queryid  ,
	lt.blocking_pids ,
	COUNT(*)  * interval '1 second'	 as duration		
FROM lt 	
GROUP BY 
	lt.pid , 
        lt.locktype  ,
	lt.mode ,			
        lt.started ,
        lt.queryid ,
	lt.blocking_pids 
ORDER BY 4

Прыклад

| WAITING FOR LOCKS BY LOCKTYPES BY QUERYID +----------+-------------------------+----- ---------------+------------------------------+--- -----------------+--------------------+----------- --------- | pid| locktype| mode| started| queryid| blocking_pids| duration +----------+-------------------------+----------- ---------+------------------------------+--------- -----------+--------------------+----------------- --- | 11288 | transactionid| ShareLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {11092} | 00:03:34 | 11626 | transactionid| ShareLock| 2019-09-17 10:00:21.380921 | 389015618226997618 | {12380} | 00:00:29 | 11626 | transactionid| ShareLock| 2019-09-17 10:00:21.380921 | 389015618226997618 | {11092} | 00:03:25 | 11626 | transactionid| ShareLock| 2019-09-17 10:00:21.380921 | 389015618226997618 | {12213} | 00:01:55 | 11626 | transactionid| ShareLock| 2019-09-17 10:00:21.380921 | 389015618226997618 | {12751} | 00:00:01 | 11629 | transactionid| ShareLock| 2019-09-17 10:00:24.331935 | 389015618226997618 | {11092} | 00:03:22 | 11629 | transactionid| ShareLock| 2019-09-17 10:00:24.331935 | 389015618226997618 | {12007} | 00:00:01 | 12007 | transactionid| ShareLock| 2019-09-17 10:05:03.327933 | 389015618226997618 | {11629} | 00:00:13 | 12007 | transactionid| ShareLock| 2019-09-17 10:05:03.327933 | 389015618226997618 | {11092} | 00:01:10 | 12007 | transactionid| ShareLock| 2019-09-17 10:05:03.327933 | 389015618226997618 | {11288} | 00:00:05 | 12213 | transactionid| ShareLock| 2019-09-17 10:06:07.328019 | 389015618226997618 | {12007} | 00:00:10

TAKING LOCKS BY LOCKTYPES BY QUERYID

Запыт

WITH
lt AS
(
	SELECT
		pid , 
		locktype  ,
		mode ,
		timepoint , 
		queryid , 
		blocking_pids ,
                MIN ( timepoint ) OVER (PARTITION BY pid , locktype  ,mode ) as started  
	FROM 
		activity_hist.archive_locking
	WHERE 
		timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND 
			                  pg_stat_history_end+(current_hour_diff * interval '1 hour') AND 
		granted AND
		queryid IS NOT NULL 
	GROUP BY 
	        pid , 
		locktype  ,
		mode ,
		timepoint ,
		queryid ,
		blocking_pids 
)
SELECT 
        lt.pid , 
	lt.locktype  ,
	lt.mode ,			
        lt.started ,
	lt.queryid  ,
	lt.blocking_pids ,
	COUNT(*)  * interval '1 second'	 as duration			
FROM lt 	
GROUP BY 
	lt.pid , 
	lt.locktype  ,
	lt.mode ,			
        lt.started ,
	lt.queryid ,
	lt.blocking_pids 
ORDER BY 4

Прыклад

| TAKING LOCKS BY LOCKTYPES BY QUERYID +----------+-------------------------+------ --------------+-------------------------------------- ----------------+--------------------+------------ -------- | pid| locktype| mode| started| queryid| blocking_pids| duration +----------+-------------------------+----------- ---------+------------------------------+--------- -----------+--------------------+----------------- --- | 11288 | relation| RowExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {11092} | 00:03:34 | 11092 | transactionid| ExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {}| 00:03:34 | 11288 | relation| RowExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {}| 00:00:10 | 11092 | relation| RowExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {}| 00:03:34 | 11092 | virtualxid| ExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {}| 00:03:34 | 11288 | virtualxid| ExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {11092} | 00:03:34 | 11288 | transactionid| ExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {11092} | 00:03:34 | 11288 | tuple| AccessExclusiveLock| 2019-09-17 10:00:00.302936 | 389015618226997618 | {11092} | 00:03:34

Выкарыстанне гісторыі блакіровак пры аналізе інцыдэнтаў прадукцыйнасці.

  1. Запыт з queryid=389015618226997618 выкананы працэсам з pid=11288 чакаў блакавання пачынаючы з 2019-09-17 10:00:00 на працягу 3-х хвілін.
  2. Блакаванне ўтрымліваў працэс з pid=11092
  3. Працэс з pid=11092 выконваючы запыт з queryid=389015618226997618 пачынаючы з 2019-09-17 10:00:00 утрымліваў блакаванне на працягу 3-х хвілін.

Вынік

Зараз, спадзяюся, пачнецца самае цікавае і карыснае-збор статыстыкі і аналіз кейсаў па гісторыі чаканняў і блакіровак.

У даляглядзе, жадаецца верыць, атрымаецца набор некаторых як бы note(па аналогу з металінкам Аракла).

Увогуле, менавіта з гэтай прычыны выкарыстоўваная методыка, максімальна хутка выводзіцца для ўсеагульнага азнаямлення.

У найбліжэйшы час пастараюся выкласці праект на github.

Крыніца: habr.com

Дадаць каментар