pgsentinel uitbreiding :
«Alle opgehoopte inligting word slegs in RAM gestoor, en die hoeveelheid geheue wat verbruik word, word gereguleer deur die aantal laaste gestoorde rekords.
Die queryid-veld word bygevoeg - dieselfde queryid van die pg_stat_statements-uitbreiding (voorafinstallasie vereis).«
Dit sal natuurlik baie help, maar die moeilikste ding is die eerste punt.”Alle opgehoopte inligting word slegs in RAM gestoor ”, d.w.s. daar is 'n impak op die teikenbasis. Daarbenewens is daar geen slotgeskiedenis en tabelstatistieke nie. Dié. die oplossing is oor die algemeen onvolledig: "Daar is nog geen gereedgemaakte pakket vir installasie nie. Dit word voorgestel om die bronne af te laai en die biblioteek self saam te stel. U moet eers die "devel"-pakket vir u bediener installeer en die pad na pg_config in die PATH-veranderlike stel.".
Oor die algemeen is daar baie ophef, en in die geval van ernstige produksiedatabasisse is dit dalk nie moontlik om enigiets met die bediener te doen nie. Ons moet weer met iets van ons eie vorendag kom.
Waarskuwing.
As gevolg van die taamlik groot volume en as gevolg van die onvolledige toetstydperk, is die artikel hoofsaaklik van inligtingsaard, eerder as 'n stel tesisse en tussenresultate.
Meer gedetailleerde materiaal sal later in dele voorberei word
Konsepvereistes vir die oplossing
Dit is nodig om 'n instrument te ontwikkel wat jou toelaat om te stoor:
pg_stat_activity sien geskiedenis Sessie slot geskiedenis met behulp van die pg_locks aansig
Oplossingsvereiste– minimaliseer die impak op die teikendatabasis.
Algemene idee– die data-insamelingsagent word nie in die teikendatabasis bekendgestel nie, maar in die moniteringsdatabasis as 'n stelseldiens. Ja, 'n mate van dataverlies is moontlik, maar dit is nie krities vir verslagdoening nie, maar daar is geen impak op die teikendatabasis in terme van geheue en skyfspasie nie. En in die geval van die gebruik van 'n verbindingspoel, is die impak op gebruikersprosesse minimaal.
Implementering stadiums
1.Dienstafels
'n Aparte skema word gebruik om tabelle te stoor, om nie die ontleding van die hooftabelle wat gebruik word, te bemoeilik nie.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Belangrik: Die skema word nie in die teikendatabasis geskep nie, maar in die moniteringsdatabasis.
pg_stat_activity sien geskiedenis
'n Tabel word gebruik om huidige foto's van die pg_stat_activity-aansig te stoor
activity_hist.history_pg_stat_activity :
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
Om invoeging te bespoedig - geen indekse of beperkings nie.
Om die geskiedenis self te stoor, word 'n gepartisioneerde tabel gebruik:
activity_hist.archive_pg_stat_activity :
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Aangesien daar in hierdie geval geen vereistes vir invoegspoed is nie, is sommige indekse geskep om die skep van verslae te bespoedig.
Sessie blokkeer geskiedenis
'n Tabel word gebruik om huidige foto's van sessieslotte te stoor:
activity_hist.history_locking :
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Ook, om invoeging te bespoedig, is daar geen indekse of beperkings nie.
Om die geskiedenis self te stoor, word 'n gepartisioneerde tabel gebruik:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Aangesien daar in hierdie geval geen vereistes vir invoegspoed is nie, is sommige indekse geskep om die skep van verslae te bespoedig.
2.Vul die huidige geskiedenis in
Om direkte besigtigingsfoto's te versamel, word 'n bash-skrip gebruik wat die plpgsql-funksie laat loop.
plpgsql Die dblink-funksie kry toegang tot aansigte in die teikendatabasis en voeg rye in dienstabelle in die moniteringsdatabasis in.
kry_huidige_aktiwiteit.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'||
' user='||current_s_name||' password='||current_s_pass|| ' '')';
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
Om aansigfoto's te versamel, word die systemd-diens en twee skrifte gebruik:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Ken regte aan skrifte toe:
# chmod 755 pg_current_activity.timer
# chmod 755 pg_current_activity.service
Ons begin die diens:
# systemctl daemon-herlaai
# systemctl begin pg_current_activity.service
So word die geskiedenis van sienings versamel in die vorm van sekonde-vir-sekonde kiekies. Natuurlik, as alles net so gelaat word, sal die tafels baie vinnig groter word en sal min of meer produktiewe werk onmoontlik word.
Dit is nodig om data-argivering te organiseer.
3. Argiveer geskiedenis
Vir argivering word gepartisioneerde tabel-argief* gebruik.
Nuwe partisies word elke uur geskep, terwyl ou data uit die geskiedenis*-tabelle verwyder word, so die grootte van die geskiedenis*-tabelle verander nie veel nie en die invoegspoed verswak nie met verloop van tyd nie.
Die skepping van nuwe afdelings word uitgevoer deur die plpgsql-funksie activity_hist.archive_current_activity. Die algoritme van werk is baie eenvoudig (gebruik die voorbeeld van die afdeling vir die archive_pg_stat_activity-tabel).
Skep en vul 'n nuwe afdeling in
EXECUTE format(
'CREATE TABLE ' || partition_name ||
' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_min_range ),'MM')||'-'||
to_char(date_trunc('day', partition_min_range ),'DD')||' '||
to_char(date_trunc('hour', partition_min_range ),'HH24')||':00',
to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_max_range ),'MM')||'-'||
to_char(date_trunc('day', partition_max_range ),'DD')||' '||
to_char(date_trunc('hour', partition_max_range ),'HH24')||':00'
);
INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
WHERE timepoint BETWEEN partition_min_range AND partition_max_range
);
Die skep van indekse
EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);
Verwyder ou data uit die history_pg_stat_activity-tabel
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Natuurlik word ou afdelings van tyd tot tyd as onnodig uitgevee.
Basiese verslae
Eintlik, hoekom word dit alles gedoen? Om verslae te kry wat baie vaagweg herinner aan Oracle se AWR.
Dit is belangrik om by te voeg dat jy 'n verband moet bou tussen die pg_stat_activity- en pg_stat_statements-aansigte om verslae te ontvang. Die tabelle word gekoppel deur 'n 'queryid'-kolom by die 'history_pg_stat_activity', 'archive_pg_stat_activity'-tabelle te voeg. Die metode om 'n kolomwaarde by te voeg, val buite die bestek van hierdie artikel en word hier beskryf - pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
TOTALE SVE-TYD VIR NAVRAE
Versoek:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) ANDaa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL )AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
Die basiese navrae wat gewys word en die gevolglike verslae maak die lewe reeds baie makliker wanneer prestasie-insidente ontleed word.
Op grond van basiese navrae kan u 'n verslag kry wat vaagweg soos Oracle se AWR lyk. Samevattende verslag voorbeeld
+------------------------------------------------ ---------------------------------- | GEKONSOLIDEERDE VERSLAG VIR AKTIWITEIT EN WAG.
Vervolg. Volgende aan die beurt is die skepping van 'n slotgeskiedenis (pg_stat_locks), 'n meer gedetailleerde beskrywing van die proses om tabelle in te vul.