An attempt to create an ASH analog for PostgreSQL
Problem statement
To optimize PostgreSQL queries, the ability to analyze activity history, in particular waits, locks, and table statistics, is badly needed.
The pgsentinel extension:
«All accumulated information is stored only in RAM, and the amount of memory consumed is governed by the number of most recent records kept.
A queryid field is added: the same queryid as in the pg_stat_statements extension (which must be installed beforehand).»
This would certainly help a lot, but the most troublesome part is the first point, "All accumulated information is stored only in RAM", i.e. there is an impact on the target database. In addition, there is no lock history and there are no table statistics. In other words, the solution is, generally speaking, incomplete: "There is no ready-made package for installation yet. It is suggested to download the sources and build the library yourself. First you need to install the "devel" package for your server and set the path to pg_config in the PATH variable."
In general, there is a lot of fuss, and in the case of serious production databases it may not be possible to do anything to the server at all. We need to come up with something of our own again.
Warning.
Because of the rather large volume and the incomplete testing period, this article is mainly informational: more a set of theses and intermediate results than a finished guide.
More detailed material will be prepared later, in parts.
Draft requirements for the solution
A tool needs to be developed that can store:
The history of the pg_stat_activity view
The history of session locks, using the pg_locks view
Solution requirement: minimize the impact on the target database.
General idea: the data collection agent runs not in the target database but in the monitoring database, as a systemd service. Yes, some data loss is possible, but that is not critical for reporting, and in exchange there is no impact on the target database in terms of memory and disk space. And when a connection pool is used, the impact on user processes is minimal.
Implementation stages
1. Service tables
A separate schema is used to store the tables, so as not to complicate the analysis of the main tables in use.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Important: the schema is created not in the target database but in the monitoring database.
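The CREATE SCHEMA statement assumes that a role named monitor already exists in the monitoring database, and the collector described later relies on the dblink extension. A minimal preparation sketch, to be run before the schema is created; how the role is actually set up (password, privileges) is not described in the article and is left out here as an assumption-free minimum.

```sql
-- Run in the monitoring database, not in the target database.
-- The role name "monitor" is taken from the CREATE SCHEMA statement;
-- this fails if the role already exists, which is harmless.
CREATE ROLE monitor LOGIN;

-- dblink ships with the standard contrib modules and is needed
-- by the collector function to reach the target database.
CREATE EXTENSION IF NOT EXISTS dblink;
```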
History of the pg_stat_activity view
A table is used to store the current snapshots of the pg_stat_activity view:
activity_hist.history_pg_stat_activity :
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
To speed up inserts, the table has no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_pg_stat_activity :
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Since insert speed is not a requirement here, some indexes have been created to speed up report generation.
Session lock history
A table is used to store the current snapshots of session locks:
activity_hist.history_locking :
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Here too, to speed up inserts, there are no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Since insert speed is not a requirement here, some indexes have been created to speed up report generation.
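The blocking_pids array stored in the lock tables can be expanded into one row per waiting/blocking pair, which is often easier to read than the raw array. The query below is only an illustrative sketch against the history_locking table; it is not one of the article's reports.

```sql
-- Sessions that were waiting for a lock in the most recent snapshot,
-- one row per (waiting pid, blocking pid) pair.
SELECT hl.timepoint,
       hl.pid          AS waiting_pid,
       b.blocking_pid,
       hl.locktype,
       hl.mode
FROM activity_hist.history_locking hl
CROSS JOIN LATERAL unnest(hl.blocking_pids) AS b(blocking_pid)
WHERE NOT hl.granted
  AND hl.timepoint = (SELECT max(timepoint)
                      FROM activity_hist.history_locking)
ORDER BY hl.pid, b.blocking_pid;
```

All rows of one snapshot share the same timepoint, so filtering on the maximum timepoint selects exactly the latest snapshot.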
2. Populating the current history
To collect the view snapshots, a bash script is used that runs a plpgsql function.
The plpgsql function uses dblink to read the views in the target database and inserts the rows into the service tables in the monitoring database.
get_current_activity.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
-- Open a named dblink connection to the target database
PERFORM dblink_connect('LINK1',
format('host=%s port=5432 dbname=postgres user=%s password=%s',
current_host, current_s_name, current_s_pass));
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
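Before wiring the function into a service, it can be called once by hand from the monitoring database to confirm that the dblink connection works. The sketch below reuses the host and credentials that appear in the service unit of this article; the verification query is an illustrative addition.

```sql
-- One-off manual run from the monitoring database.
-- Arguments: target host, user name, password.
SELECT activity_hist.get_current_activity('10.124.70.40', 'postgres', 'postgres');

-- Quick check that rows arrived: number of sessions in the latest snapshot.
SELECT timepoint, count(*) AS sessions
FROM activity_hist.history_pg_stat_activity
GROUP BY timepoint
ORDER BY timepoint DESC
LIMIT 1;
```

The bash script mentioned above presumably issues the same call through a command-line client; its contents are not shown in the article.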
To collect the view snapshots on a schedule, a systemd service and timer are used, defined in two unit files:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Set permissions on the scripts:
# chmod 755 pg_current_activity.timer
# chmod 755 pg_current_activity.service
Start the service:
# systemctl daemon-reload
# systemctl start pg_current_activity.service
Thus, the view history is collected as second-by-second snapshots. Of course, if everything is left as is, the tables will grow very quickly and more or less productive work will become impossible.
Data archiving needs to be organized.
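Whether the collector keeps its one-second cadence, and how fast the tables actually grow, can be checked from the monitoring database. The two queries below are a sketch for that purpose and are not part of the original tool.

```sql
-- Snapshots and rows stored per minute: roughly 60 snapshots per minute
-- are expected if the timer fires every second.
SELECT date_trunc('minute', timepoint) AS minute,
       count(DISTINCT timepoint)       AS snapshots,
       count(*)                        AS rows_stored
FROM activity_hist.history_pg_stat_activity
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10;

-- On-disk size of every ordinary table (including partitions) in the schema.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'activity_hist'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC;
```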
3. Archiving the history
For archiving, the partitioned archive* tables are used.
New partitions are created every hour, while old data is deleted from the history* tables, so the size of the history* tables does not change much and insert speed does not degrade over time.
New partitions are created by the plpgsql function activity_hist.archive_current_activity. The algorithm is very simple (shown here for the archive_pg_stat_activity table).
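The fragments that follow use the variables partition_min_range, partition_max_range and partition_name, but the article does not show how they are computed. One plausible way, assuming each partition covers the previous full hour and is named after its start time (both are assumptions, not the author's confirmed logic):

```sql
-- Hourly bounds for the partition covering the previous full hour.
-- The naming pattern ..._YYYYMMDD_HH24 is hypothetical.
SELECT date_trunc('hour', now()::timestamp) - interval '1 hour' AS partition_min_range,
       date_trunc('hour', now()::timestamp)                     AS partition_max_range,
       'activity_hist.archive_pg_stat_activity_'
         || to_char(date_trunc('hour', now()::timestamp) - interval '1 hour',
                    'YYYYMMDD_HH24')                            AS partition_name;
```

Inside the plpgsql function the same expressions would be assigned to the three variables with SELECT ... INTO.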
Create and fill a new partition
EXECUTE format(
'CREATE TABLE %s PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
partition_name ,
-- both bounds are truncated to the hour, e.g. 2019-09-01 10:00
to_char( partition_min_range , 'YYYY-MM-DD HH24:00' ),
to_char( partition_max_range , 'YYYY-MM-DD HH24:00' )
);
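INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
-- half-open range: the upper bound of a partition is exclusive,
-- which also matches the DELETE condition used afterwards
WHERE timepoint >= partition_min_range AND timepoint < partition_max_range
);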
Creating indexes
EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);
Delete old data from the history_pg_stat_activity table
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Of course, from time to time old partitions are dropped as no longer needed.
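Dropping an old partition is a metadata operation and is much cheaper than deleting its rows. A minimal sketch follows; the partition name is hypothetical and the retention policy is not specified in the article.

```sql
-- List the existing partitions of the archive table.
SELECT inhrelid::regclass AS partition_name
FROM pg_inherits
WHERE inhparent = 'activity_hist.archive_pg_stat_activity'::regclass
ORDER BY inhrelid::regclass::text;

-- Detach and drop one partition that has aged out
-- (hypothetical name; substitute one returned by the query above).
ALTER TABLE activity_hist.archive_pg_stat_activity
    DETACH PARTITION activity_hist.archive_pg_stat_activity_20190901_00;
DROP TABLE activity_hist.archive_pg_stat_activity_20190901_00;
```

Detaching first keeps the data available as a standalone table until the DROP, which can be convenient if the partition should be exported before removal.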
Basic reports
So why is all this done? To obtain reports that are very vaguely reminiscent of Oracle's AWR.
It is important to add that, to produce the reports, a link has to be built between the pg_stat_activity and pg_stat_statements views. The tables are linked by adding a "queryid" column to the "history_pg_stat_activity" and "archive_pg_stat_activity" tables. How the column value is populated is beyond the scope of this article and is described here: pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
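As a side note that is not the method of the linked article: on PostgreSQL 14 and later, pg_stat_activity itself exposes a query_id column, so on those versions the identifier could be read directly in the snapshot query. A sketch, assuming such a server version on the target:

```sql
-- PostgreSQL 14+ only. query_id is populated when compute_query_id is enabled;
-- with the default setting 'auto' this happens once pg_stat_statements is loaded.
SELECT pid,
       query_id,
       state,
       query
FROM pg_stat_activity
WHERE backend_type = 'client backend';
```

On older versions the column does not exist, and the approach from the linked article remains necessary.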
TOTAL CPU TIME FOR QUERIES
Query:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) AND aa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION ALL
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL ) AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
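The report queries rest on a sampling assumption: snapshots are taken once per second, so each stored row stands for about one second of session time, which is why count(*) is multiplied by interval '1 second'. The identifiers pg_stat_history_begin, pg_stat_history_end and current_hour_diff appear to be variables of a surrounding plpgsql function that the article does not show. A standalone sketch of the CPU-time report with a literal, hypothetical time window, reading the archive table only:

```sql
-- Sampled "on CPU" time per query: active client backends with no wait event.
SELECT aa.queryid,
       aa.query,
       count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE aa.timepoint >= timestamp '2019-09-01 10:00'   -- hypothetical window start
  AND aa.timepoint <  timestamp '2019-09-01 11:00'   -- hypothetical window end
  AND aa.backend_type = 'client backend'
  AND aa.datname <> 'postgres'
  AND aa.wait_event_type IS NULL
  AND aa.state = 'active'
GROUP BY aa.queryid, aa.query
ORDER BY duration DESC
LIMIT 20;
```

Because the figure is derived from samples, queries that finish in well under a second are under-represented; the report is best read as a ranking, not as exact timings.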
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION ALL
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
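The query above returns only the names of the wait events seen in the window. Under the same one-row-per-second sampling assumption, a sampled duration per wait event can be added, which gives a quick ranking of where sessions spent their waiting time. A sketch with a literal, hypothetical time window against the archive table:

```sql
-- Sampled waiting time per wait event for client backends.
SELECT aa.wait_event_type,
       aa.wait_event,
       count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE aa.timepoint >= timestamp '2019-09-01 10:00'   -- hypothetical window start
  AND aa.timepoint <  timestamp '2019-09-01 11:00'   -- hypothetical window end
  AND aa.backend_type = 'client backend'
  AND aa.datname <> 'postgres'
  AND aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type, aa.wait_event
ORDER BY duration DESC;
```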
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION ALL
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
The basic queries shown here and the resulting reports already make life much easier when analyzing performance incidents.
Based on these basic queries, you can obtain a report that vaguely resembles Oracle's AWR. Example of a summary report:
+------------------------------------------------------------------------------------
| CONSOLIDATED REPORT FOR ACTIVITY AND WAITS.
To be continued. Next in line are building the lock history (pg_stat_locks) and a more detailed description of how the tables are populated.