Un intento de crear un análogo ASH para PostgreSQL
Declaración de problemas
Para optimizar as consultas de PostgreSQL, é moi necesaria a capacidade de analizar o historial de actividade, en particular, as esperas, os bloqueos e as estatísticas da táboa.
extensión pgsentinel :
«Toda a información acumulada gárdase só na memoria RAM, e a cantidade de memoria consumida está regulada polo número dos últimos rexistros almacenados.
Engádese o campo queryid: o mesmo queryid da extensión pg_stat_statements (requírese a instalación previa).«
Isto, por suposto, axudaría moito, pero o máis problemático é o primeiro punto”.Toda a información acumulada só se almacena na memoria RAM ", é dicir. hai un impacto na base obxectivo. Ademais, non hai historial de bloqueo nin estatísticas de táboa. Eses. a solución é en xeral incompleta: "Aínda non hai ningún paquete preparado para a instalación. Suxírese descargar as fontes e montar vostede mesmo a biblioteca. Primeiro cómpre instalar o paquete "devel" para o seu servidor e establecer o camiño para pg_config na variable PATH.".
En xeral, hai moito alboroto e, no caso de bases de datos de produción serias, é posible que non se poida facer nada co servidor. Necesitamos inventar algo propio de novo.
Aviso
Debido ao volume bastante grande e debido ao período de proba incompleto, o artigo é principalmente con fins informativos, máis ben como un conxunto de teses e resultados intermedios.
Máis tarde prepararase material máis detallado, por partes
Borrador de requisitos para a solución
É necesario desenvolver unha ferramenta que che permita almacenar:
pg_stat_activity ver historial Historial de bloqueo de sesións usando a vista pg_locks
Requisito da solución– Minimizar o impacto na base de datos de destino.
Idea xeral– o axente de recollida de datos non se inicia na base de datos de destino, senón na base de datos de seguimento como un servizo de sistema. Si, é posible perda de datos, pero isto non é fundamental para informar, pero non hai ningún impacto na base de datos de destino en termos de memoria e espazo en disco. E no caso de utilizar un pool de conexións, o impacto nos procesos dos usuarios é mínimo.
Fases de implantación
1.Táboas de servizo
Utilízase un esquema separado para almacenar táboas, para non complicar a análise das principais táboas utilizadas.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Importante: o esquema non se crea na base de datos de destino, senón na base de datos de seguimento.
pg_stat_activity ver historial
Utilízase unha táboa para almacenar instantáneas actuais da vista pg_stat_activity
activity_hist.history_pg_stat_activity :
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
Para acelerar a inserción, sen índices nin restricións.
Para almacenar o propio historial, utilízase unha táboa particionada:
activity_hist.archive_pg_stat_activity :
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Dado que neste caso non existen requisitos de velocidade de inserción, creáronse algúns índices para axilizar a creación de informes.
Historial de bloqueo de sesións
Utilízase unha táboa para almacenar instantáneas actuais dos bloqueos de sesión:
activity_hist.history_locking :
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Ademais, para acelerar a inserción, non hai índices nin restricións.
Para almacenar o propio historial, utilízase unha táboa particionada:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Dado que neste caso non existen requisitos de velocidade de inserción, creáronse algúns índices para axilizar a creación de informes.
2.Enchendo o historial actual
Para recoller directamente as instantáneas de vista, úsase un script bash que executa a función plpgsql.
plpgsql A función dblink accede ás vistas da base de datos de destino e insire filas nas táboas de servizo da base de datos de seguimento.
get_current_activity.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'||
' user='||current_s_name||' password='||current_s_pass|| ' '')';
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
Para recoller instantáneas de vista, utilízanse o servizo systemd e dous scripts:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Comezamos o servizo:
# systemctl daemon-reload
# systemctl inicio pg_current_activity.service
Así, o historial das vistas recóllese en forma de instantáneas segundo a segundo. Por suposto, se todo se deixa como está, as mesas aumentarán moi rapidamente de tamaño e un traballo máis ou menos produtivo será imposible.
É necesario organizar o arquivo de datos.
3. Historial de arquivo
Para o arquivo utilízanse o arquivo de táboas particionadas*.
Cada hora créanse novas particións, mentres que os datos antigos son eliminados das táboas do historial*, polo que o tamaño das táboas do historial* non cambia moito e a velocidade de inserción non se degrada co paso do tempo.
A creación de novas seccións realízase pola función plpgsql activity_hist.archive_current_activity. O algoritmo de traballo é moi sinxelo (usando o exemplo da sección para a táboa archive_pg_stat_activity).
Crea e enche unha nova sección
EXECUTE format(
'CREATE TABLE ' || partition_name ||
' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_min_range ),'MM')||'-'||
to_char(date_trunc('day', partition_min_range ),'DD')||' '||
to_char(date_trunc('hour', partition_min_range ),'HH24')||':00',
to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_max_range ),'MM')||'-'||
to_char(date_trunc('day', partition_max_range ),'DD')||' '||
to_char(date_trunc('hour', partition_max_range ),'HH24')||':00'
);
INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
WHERE timepoint BETWEEN partition_min_range AND partition_max_range
);
Creación de índices
EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);
Eliminando datos antigos da táboa history_pg_stat_activity
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Por suposto, de cando en vez, elimínanse seccións antigas por ser innecesarias.
Informes básicos
En realidade, por que se fai todo isto? Para obter informes que recordan moi vagamente ao AWR de Oracle.
É importante engadir que para recibir informes, cómpre establecer unha conexión entre as vistas pg_stat_activity e pg_stat_statements. As táboas enlázanse engadindo unha columna "queryid" ás táboas "history_pg_stat_activity", "archive_pg_stat_activity". O método de engadir un valor de columna está fóra do alcance deste artigo e descríbese aquí − pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
TEMPO TOTAL DA CPU PARA CONSULTAS
Solicitude:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) ANDaa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL )AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
As consultas básicas mostradas e os informes resultantes xa facilitan moito a vida á hora de analizar incidentes de rendemento.
En base a consultas básicas, podes obter un informe que se asemella vagamente ao AWR de Oracle. Exemplo de informe resumo
+------------------------------------------------- ----------------------------------- | MEMORIA CONSOLIDADA DE ACTIVIDADE E ESPERAS.
Continuará. O seguinte na liña é a creación dun historial de bloqueos (pg_stat_locks), unha descrición máis detallada do proceso de enchido de táboas.