razširitev pgsentinel :
«Vse zbrane informacije so shranjene samo v RAM-u, porabljena količina pomnilnika pa se uravnava s številom zadnjih shranjenih zapisov.
Dodano je polje queryid – isti queryid iz razširitve pg_stat_statements (potrebna je predhodna namestitev).«
To bi seveda zelo pomagalo, a najbolj moteča je prva točka.«Vse zbrane informacije so shranjene samo v RAM-u «, tj. obstaja vpliv na ciljno bazo. Poleg tega ni zgodovine zaklepanja in statistike tabele. Tisti. rešitev je na splošno nepopolna: “Za namestitev še ni pripravljenega paketa. Predlagamo, da si prenesete vire in sami sestavite knjižnico. Najprej morate namestiti paket »devel« za vaš strežnik in nastaviti pot do pg_config v spremenljivki PATH.".
Nasploh je nagajanja veliko in pri resnih produkcijskih bazah se morda ne da nič narediti s strežnikom. Spet se moramo domisliti nečesa svojega.
Opozorilo.
Zaradi precej velikega obsega in zaradi nepopolnega obdobja testiranja je članek pretežno informativne narave, bolj kot nabor tez in vmesnih rezultatov.
Podrobnejše gradivo bo pripravljeno naknadno, po delih
Osnutek zahtev za rešitev
Treba je razviti orodje, ki vam omogoča shranjevanje:
pg_stat_activity ogled zgodovine Zgodovina zaklepanja seje s pogledom pg_locks
Zahteva za rešitev– zmanjšati vpliv na ciljno bazo podatkov.
Splošna ideja– agent za zbiranje podatkov se ne zažene v ciljni bazi podatkov, temveč v nadzorni bazi podatkov kot storitev systemd. Da, možna je izguba nekaterih podatkov, vendar to ni kritično za poročanje, vendar ni vpliva na ciljno bazo podatkov v smislu pomnilnika in prostora na disku. In v primeru uporabe bazena povezav je vpliv na uporabniške procese minimalen.
Faze izvajanja
1. Servisne mize
Za shranjevanje tabel se uporablja ločena shema, da ne bi otežili analize glavnih uporabljenih tabel.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Pomembno: Shema ni ustvarjena v ciljni bazi podatkov, temveč v nadzorni bazi podatkov.
pg_stat_activity ogled zgodovine
Tabela se uporablja za shranjevanje trenutnih posnetkov pogleda pg_stat_activity
activity_hist.history_pg_stat_activity:
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
Za pospešitev vstavljanja - brez indeksov ali omejitev.
Za shranjevanje same zgodovine se uporablja particionirana tabela:
activity_hist.archive_pg_stat_activity:
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Ker v tem primeru ni nobenih zahtev glede hitrosti vstavljanja, so bili ustvarjeni nekateri indeksi za pospešitev ustvarjanja poročil.
Zgodovina blokiranja sej
Tabela se uporablja za shranjevanje trenutnih posnetkov zaklepanja seje:
activity_hist.history_locking:
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Tudi za pospešitev vstavljanja ni indeksov ali omejitev.
Za shranjevanje same zgodovine se uporablja particionirana tabela:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Ker v tem primeru ni nobenih zahtev glede hitrosti vstavljanja, so bili ustvarjeni nekateri indeksi za pospešitev ustvarjanja poročil.
2. Izpolnjevanje trenutne zgodovine
Za neposredno zbiranje posnetkov pogleda se uporablja skript bash, ki izvaja funkcijo plpgsql.
plpgsql Funkcija dblink dostopa do pogledov v ciljni zbirki podatkov in vstavlja vrstice v storitvene tabele v zbirki podatkov za spremljanje.
get_current_activity.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'||
' user='||current_s_name||' password='||current_s_pass|| ' '')';
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
Za zbiranje posnetkov pogleda se uporablja storitev systemd in dva skripta:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
[Service]
Type=forking
StartLimitIntervalSec=0
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Začnimo s storitvijo:
# systemctl daemon-reload
# systemctl zaženi pg_current_activity.service
Tako se zgodovina ogledov zbira v obliki posnetkov sekunde za sekundo. Seveda, če ostane vse tako, kot je, se bodo mize zelo hitro povečale in bolj ali manj produktivno delo bo postalo nemogoče.
Potrebno je organizirati arhiviranje podatkov.
3. Zgodovina arhiviranja
Za arhiviranje se uporabljajo particionirane tabele archive*.
Nove particije se ustvarijo vsako uro, medtem ko se stari podatki odstranijo iz tabel zgodovine*, tako da se velikost tabel zgodovine* ne spremeni veliko in hitrost vstavljanja se sčasoma ne zmanjša.
Ustvarjanje novih razdelkov izvede plpgsql funkcija activity_hist.archive_current_activity. Algoritem dela je zelo preprost (z uporabo primera razdelka za tabelo archive_pg_stat_activity).
Ustvarite in izpolnite nov razdelek
EXECUTE format(
'CREATE TABLE ' || partition_name ||
' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_min_range ),'MM')||'-'||
to_char(date_trunc('day', partition_min_range ),'DD')||' '||
to_char(date_trunc('hour', partition_min_range ),'HH24')||':00',
to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'||
to_char(date_trunc('month', partition_max_range ),'MM')||'-'||
to_char(date_trunc('day', partition_max_range ),'DD')||' '||
to_char(date_trunc('hour', partition_max_range ),'HH24')||':00'
);
INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
WHERE timepoint BETWEEN partition_min_range AND partition_max_range
);
Ustvarjanje indeksov
EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);
Odstranjevanje starih podatkov iz tabele history_pg_stat_activity
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Seveda se od časa do časa stari razdelki izbrišejo kot nepotrebni.
Osnovna poročila
Pravzaprav, zakaj se vse to počne? Za pridobivanje poročil, ki zelo nejasno spominjajo na Oraclov AWR.
Pomembno je dodati, da morate za prejemanje poročil vzpostaviti povezavo med pogledoma pg_stat_activity in pg_stat_statements. Tabele so povezane z dodajanjem stolpca 'queryid' v tabele 'history_pg_stat_activity', 'archive_pg_stat_activity'. Metoda dodajanja vrednosti stolpca presega obseg tega članka in je opisana tukaj − pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
SKUPNI CPE CPE ZA POIZVEDBE
Prošnja :
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) ANDaa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL )AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
Prikazane osnovne poizvedbe in posledična poročila že močno olajšajo življenje pri analizi incidentov pri delovanju.
Na podlagi osnovnih poizvedb lahko dobite poročilo, ki nejasno spominja na Oraclov AWR. Primer zbirnega poročila
+------------------------------------------------ ---------------------------------- | KONSOLIDIRANO POROČILO ZA AKTIVNOST IN ČAKANJA.
Se nadaljuje. Naslednja je na vrsti izdelava zgodovine zaklepanja (pg_stat_locks), podrobnejši opis postopka izpolnjevanja tabel.