An attempt to create an analog of ASH for PostgreSQL
Problem statement
To optimize PostgreSQL queries, you need to be able to analyze activity history, in particular waits, locks, and table statistics.
The pgsentinel extension:
«All accumulated information is stored only in RAM, and the amount of memory consumed is governed by the number of most recently stored records.
A queryid field is added: the same queryid as in the pg_stat_statements extension (pre-installation required).»
This would of course help a lot, but the most troublesome part is the first point, "All accumulated information is stored only in RAM", i.e. there is an impact on the target database. In addition, there is no lock history and no table statistics. In other words, the solution is, generally speaking, incomplete: "There is no ready-made package for installation yet. It is suggested to download the sources and build the library yourself. You first need to install the "devel" package for your server and add the path to pg_config to the PATH variable."
All in all, it is a lot of fuss, and with serious production databases it may not be possible to do anything with the server at all. Once again we have to come up with something of our own.
Warning
Because of the rather large volume and the incomplete testing period, this article is mainly informative in nature: a set of theses and intermediate results rather than a finished guide.
More detailed material will be prepared later, in parts.
Draft requirements for the solution
A tool needs to be developed that makes it possible to store:
pg_stat_activity view history
Session lock history, using the pg_locks view
Solution requirement: minimize the impact on the target database.
General idea: the data collection agent is started not in the target database but in the monitoring database, as a systemd service. Yes, some data loss is possible, but that is not critical for reporting, and there is no impact on the target database in terms of memory and disk space. And if a connection pool is used, the impact on user processes is minimal.
Implementation stages
1. Service tables
A separate schema is used to store the tables, so as not to complicate analysis of the main tables in use.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Important: the schema is created not in the target database but in the monitoring database.
pg_stat_activity view history
A table is used to store current snapshots of the pg_stat_activity view:
activity_hist.history_pg_stat_activity:
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
To speed up inserts, there are no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_pg_stat_activity:
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Since there are no insert-speed requirements in this case, some indexes are created to speed up report generation.
Session lock history
A table is used to store current snapshots of session locks:
activity_hist.history_locking:
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Likewise, to speed up inserts, there are no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Since there are no insert-speed requirements in this case, some indexes are created to speed up report generation.
2. Filling the current history
To collect the view snapshots directly, a bash script is used that runs a plpgsql function.
The plpgsql function uses dblink to access the views in the target database and inserts rows into the service tables in the monitoring database.
get_current_activity.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
-- Open a named dblink connection to the target database.
PERFORM dblink_connect('LINK1',
format('host=%s port=5432 dbname=postgres user=%s password=%s',
current_host, current_s_name, current_s_pass));
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
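The function above relies on the dblink extension being available in the monitoring database. A minimal sketch of preparing and calling it by hand, assuming the same host and credentials that appear in the systemd unit in this article (treat them as placeholders for your own):

```sql
-- Run in the monitoring database, not in the target database.
CREATE EXTENSION IF NOT EXISTS dblink;

-- Take one snapshot of pg_stat_activity and pg_locks from the target host.
-- Arguments: target host, user name, password (placeholders).
SELECT activity_hist.get_current_activity('10.124.70.40', 'postgres', 'postgres');

-- Check that rows have arrived in the service table.
SELECT count(*)       AS snapshot_rows,
       max(timepoint) AS last_snapshot
FROM activity_hist.history_pg_stat_activity;
```

Calling the function manually like this is a convenient way to verify connectivity and permissions before handing the job over to a timer.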
To collect snapshots, a systemd service and a systemd timer are used:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Grant permissions on the unit files:
# chmod 755 pg_current_activity.timer
# chmod 755 pg_current_activity.service
Let's start the service:
# systemctl daemon-reload
# systemctl start pg_current_activity.service
In this way, the history of the views is collected as second-by-second snapshots. Of course, if everything is left as it is, the tables will grow very quickly and more or less productive work will become impossible.
Data archiving needs to be organized.
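To see how quickly the service tables grow, their on-disk size can be checked from the monitoring database. This is only an illustrative sketch using standard catalog functions; the schema name is the one created earlier in this article:

```sql
-- Total size (table + indexes + TOAST) of every ordinary table
-- in the activity_hist schema, largest first.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'activity_hist'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC;
```

Running this a few minutes apart gives a rough growth rate for one-second snapshots on a given workload.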
3. Archiving history
For archiving, the partitioned archive* tables are used.
New partitions are created every hour, while old data is deleted from the history* tables, so the size of the history* tables does not change much and insert speed does not degrade over time.
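The archiving fragments in this section use the variables partition_min_range, partition_max_range and partition_name, but the article does not show how they are computed. One plausible way to derive them for the hour that has just finished is sketched below; the naming scheme for partitions is purely an assumption:

```sql
-- Assumption: archive the most recently completed hour.
-- localtimestamp is "timestamp without time zone", matching the timepoint column.
SELECT
    date_trunc('hour', localtimestamp) - interval '1 hour' AS partition_min_range,
    date_trunc('hour', localtimestamp)                     AS partition_max_range,
    -- Hypothetical naming scheme: one partition per hour, e.g. ..._20190902_19
    'activity_hist.archive_pg_stat_activity_' ||
        to_char(date_trunc('hour', localtimestamp) - interval '1 hour',
                'YYYYMMDD_HH24')                           AS partition_name;
```

Inside a plpgsql function the same expressions could be assigned to variables with SELECT ... INTO.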
New partitions are created by the plpgsql function activity_hist.archive_current_activity. The algorithm is very simple (using the partition for the archive_pg_stat_activity table as an example).
Create and fill a new partition
EXECUTE format(
'CREATE TABLE ' || partition_name ||
' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_min_range ),'MM')||'-'||
to_char(date_trunc('day', partition_min_range ),'DD')||' '||
to_char(date_trunc('hour', partition_min_range ),'HH24')||':00',
to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_max_range ),'MM')||'-'||
to_char(date_trunc('day', partition_max_range ),'DD')||' '||
to_char(date_trunc('hour', partition_max_range ),'HH24')||':00'
);
INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
-- Half-open range: the partition's upper bound is exclusive.
WHERE timepoint >= partition_min_range AND timepoint < partition_max_range
);
Create indexes
EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);
Remove old data from the history_pg_stat_activity table
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Of course, from time to time old partitions are dropped as no longer needed.
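The article does not show how old partitions are removed. A minimal sketch, assuming a hypothetical partition name (substitute one returned by the first query):

```sql
-- List the existing partitions of the archive table.
SELECT inhrelid::regclass AS partition_name
FROM pg_inherits
WHERE inhparent = 'activity_hist.archive_pg_stat_activity'::regclass
ORDER BY inhrelid::regclass::text;

-- Detach and drop one partition that is no longer needed.
-- The partition name below is hypothetical.
ALTER TABLE activity_hist.archive_pg_stat_activity
    DETACH PARTITION activity_hist.archive_pg_stat_activity_20190902_19;
DROP TABLE activity_hist.archive_pg_stat_activity_20190902_19;
```

Detaching first keeps the lock on the parent table short; the detached table can also be kept for a while before it is dropped.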
Basic reports
Actually, why is all this being done? To get reports that are very vaguely reminiscent of Oracle's AWR.
It is important to add that, to obtain the reports, you need to build a link between the pg_stat_activity and pg_stat_statements views. The tables are linked by adding a 'queryid' column to the 'history_pg_stat_activity' and 'archive_pg_stat_activity' tables. The method for filling in the column value is beyond the scope of this article and is described here: pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
TOTAL CPU TIME FOR QUERIES
Query:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NULL ) AND aa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NULL ) AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
TOTAL WAIT TIME FOR QUERIES
Query:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WAITS FOR QUERIES
Queries:
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
LOCKED PROCESSES HISTORY
Query:
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
BLOCKING PROCESSES HISTORY
Queries:
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
Example:
-------------------------------------------------------------------------------------------------------------------------------------
BLOCKING PROCESSES HISTORY
+----+-------+---------+------------------+---------+---------------------+----------+---------------------+------------------------+
|  # | pid   | usename | application_name | datname | started             | duration | state               | query                  |
+----+-------+---------+------------------+---------+---------------------+----------+---------------------+------------------------+
|  1 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:31:54 | 00:00:04 | idle                |                        |
|  2 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:31:58 | 00:00:06 | idle in transaction | begin;                 |
|  3 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:32:16 | 00:01:45 | idle in transaction | lock table wafer_data; |
|  4 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:35:54 | 00:01:23 | idle                | commit;                |
|  5 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:38:46 | 00:00:02 | idle in transaction | begin;                 |
|  6 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:38:54 | 00:00:08 | idle in transaction | lock table wafer_data; |
|  7 | 26211 | zitt    | psql             | tdb1    | 2019-09-02 19:39:08 | 00:42:42 | idle                | commit;                |
|  8 | 26211 | zitt    | psql             | tdb1    | 2019-09-03 07:12:07 | 00:00:52 | active              | select test_del();     |
Development.
The basic queries shown and the resulting reports already make life much easier when analyzing performance incidents.
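The report queries above use plpgsql variables (pg_stat_history_begin, pg_stat_history_end, current_hour_diff) and cannot be run as they stand. For ad hoc use, a simplified standalone variant might look like the sketch below; the time window is a placeholder taken from the example output, and only the archive table is queried:

```sql
-- Top wait events for client backends in a fixed one-hour window.
-- Each one-second snapshot row counts as one second of waiting.
SELECT wait_event_type,
       wait_event,
       count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity
WHERE timepoint >= timestamp '2019-09-02 19:00'
  AND timepoint <  timestamp '2019-09-02 20:00'
  AND backend_type = 'client backend'
  AND datname != 'postgres'
  AND wait_event IS NOT NULL
GROUP BY wait_event_type, wait_event
ORDER BY duration DESC
LIMIT 10;
```

Because snapshots can be lost, the durations are estimates rather than exact measurements.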
Based on the basic queries, you can get a report that vaguely resembles Oracle's AWR. Example of a summary report:
+-----------------------------------------------------------------------------------
| CONSOLIDATED REPORT FOR ACTIVITY AND WAITS
To be continued. Next in line are the creation of a lock history (pg_stat_locks) and a more detailed description of the process of filling the tables.