An attempt to create an ASH analog for PostgreSQL
Problem statement
To optimize PostgreSQL queries, you badly need the ability to analyze the history of activity: in particular waits, locks, and table statistics.
The pgsentinel extension:
"All accumulated information is stored only in RAM, and the amount of memory consumed is governed by the number of most recent records kept.
A queryid field has been added: the same queryid as in the pg_stat_statements extension (which must be installed beforehand)."
This would, of course, help a lot, but the most troublesome point is the first one, "All accumulated information is stored only in RAM", i.e. there is an impact on the target database. In addition, there is no lock history and no table statistics. In other words, the solution is, generally speaking, incomplete: "There is no ready-made package for installation yet. It is suggested to download the sources and build the library yourself. First you need to install the 'devel' package for your server and set the path to pg_config in the PATH variable."
All in all, that is a lot of fuss, and with a serious production database you may not be allowed to do anything to the server at all. Once again we have to build something of our own.
Warning
Because of the fairly large volume and the unfinished testing period, this article is mainly informational, more a set of theses and intermediate results.
More detailed material will be prepared later, in parts.
Draft requirements for the solution
A tool needs to be developed that lets you store:
the history of the pg_stat_activity view;
the history of session locks, using the pg_locks view.
Solution requirement: minimize the impact on the target database.
General idea: the data collection agent is launched not in the target database but in the monitoring database, as a systemd service. Yes, some data loss is possible, but that is not critical for reporting, and in return there is no impact on the target database in terms of memory and disk space. And when a connection pool is used, the impact on user processes is minimal.
Implementation stages
1. Service tables
A separate schema is used to store the tables, so as not to complicate analysis of the main tables in use.
DROP SCHEMA IF EXISTS activity_hist ;
CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
Important: the schema is created not in the target database but in the monitoring database.
History of the pg_stat_activity view
A table is used to store the current snapshots of the pg_stat_activity view:
activity_hist.history_pg_stat_activity :
--ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY
DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity;
CREATE TABLE activity_hist.history_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
);
To speed up inserts, the table has no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_pg_stat_activity :
DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity;
CREATE TABLE activity_hist.archive_pg_stat_activity
(
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text ,
queryid bigint
)
PARTITION BY RANGE (timepoint);
Since insert speed is not a requirement here, several indexes have been created to speed up report generation.
Session lock history
A table is used to store the current snapshots of session locks:
activity_hist.history_locking :
--ACTIVITY_HIST.HISTORY_LOCKING
DROP TABLE IF EXISTS activity_hist.history_locking;
CREATE TABLE activity_hist.history_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
);
Here too, to speed up inserts, the table has no indexes or constraints.
To store the history itself, a partitioned table is used:
activity_hist.archive_locking:
DROP TABLE IF EXISTS activity_hist.archive_locking;
CREATE TABLE activity_hist.archive_locking
(
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
PARTITION BY RANGE (timepoint);
Since insert speed is not a requirement here, several indexes have been created to speed up report generation.
2. Populating the current history
To take the view snapshots themselves, a bash script is used that runs a plpgsql function.
The plpgsql function uses dblink to access the views in the target database and inserts the rows into the service tables in the monitoring database.
get_current_activity.sql
CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$
DECLARE
database_rec record;
dblink_str text ;
BEGIN
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'||
' user='||current_s_name||' password='||current_s_pass|| ' '')';
--------------------------------------------------------------------
--GET pg_stat_activity stats
INSERT INTO activity_hist.history_pg_stat_activity
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
datid ,
datname ,
pid ,
usesysid ,
usename ,
application_name ,
client_addr ,
client_hostname ,
client_port ,
backend_start ,
xact_start ,
query_start ,
state_change ,
wait_event_type ,
wait_event ,
state ,
backend_xid ,
backend_xmin ,
query ,
backend_type
FROM pg_stat_activity
')
AS t (
timepoint timestamp without time zone ,
datid oid ,
datname name ,
pid integer,
usesysid oid ,
usename name ,
application_name text ,
client_addr inet ,
client_hostname text ,
client_port integer,
backend_start timestamp with time zone ,
xact_start timestamp with time zone ,
query_start timestamp with time zone ,
state_change timestamp with time zone ,
wait_event_type text ,
wait_event text ,
state text ,
backend_xid xid ,
backend_xmin xid ,
query text ,
backend_type text
)
);
---------------------------------------
--ACTIVITY_HIST.HISTORY_LOCKING
INSERT INTO activity_hist.history_locking
(
SELECT * FROM dblink('LINK1',
'SELECT
now() ,
lock.locktype,
lock.relation,
lock.mode,
lock.transactionid as tid,
lock.virtualtransaction as vtid,
lock.pid,
pg_blocking_pids(lock.pid),
lock.granted
FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database
WHERE NOT lock.pid = pg_backend_pid()
')
AS t (
timepoint timestamp without time zone ,
locktype text ,
relation oid ,
mode text ,
tid xid ,
vtid text ,
pid integer ,
blocking_pids integer[] ,
granted boolean
)
);
PERFORM dblink_disconnect('LINK1');
RETURN TRUE ;
END
$$ LANGUAGE plpgsql;
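A minimal sketch of how the function might be called by hand from the monitoring database, assuming the dblink extension is available there. The host and credentials are the example values used in the service unit in this article; the check query at the end is only an illustration.

```sql
-- Run in the monitoring database, not in the target database.
-- dblink ships with the PostgreSQL contrib modules and is installed once per database.
CREATE EXTENSION IF NOT EXISTS dblink;

-- Take one snapshot of pg_stat_activity and pg_locks from the target host.
-- Arguments: target host, user name, password (illustrative values).
SELECT activity_hist.get_current_activity('10.124.70.40', 'postgres', 'postgres');

-- Look at what the most recent snapshots stored: one timepoint per call,
-- one row per session seen on the target at that moment.
SELECT timepoint, count(*) AS sessions
FROM activity_hist.history_pg_stat_activity
GROUP BY timepoint
ORDER BY timepoint DESC
LIMIT 5;
```

Calling the function manually like this is a convenient way to check connectivity and permissions before handing the job over to a scheduler.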
To collect the view snapshots, a systemd service and two scripts are used:
pg_current_activity.service
# /etc/systemd/system/pg_current_activity.service
[Unit]
Description=Collect history of pg_stat_activity , pg_locks
Wants=pg_current_activity.timer
# StartLimitIntervalSec= is a [Unit] option in current systemd versions
StartLimitIntervalSec=0
[Service]
Type=forking
ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres
[Install]
WantedBy=multi-user.target
pg_current_activity.timer
# /etc/systemd/system/pg_current_activity.timer
[Unit]
Description=Run pg_current_activity.sh every 1 second
Requires=pg_current_activity.service
[Timer]
Unit=pg_current_activity.service
OnCalendar=*:*:0/1
AccuracySec=1
[Install]
WantedBy=timers.target
Grant permissions on the scripts:
# chmod 755 pg_current_activity.timer
# chmod 755 pg_current_activity.service
Let's start the service:
# systemctl daemon-reload
# systemctl start pg_current_activity.service
Thus, the history of the views is collected as second-by-second snapshots. Of course, if everything is left as it is, the tables will grow very quickly and more or less productive work will become impossible.
Data archiving needs to be organized.
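To see how fast the staging tables actually grow, something like the following could be run in the monitoring database. This is only a sketch: it assumes the tables from this article exist in the activity_hist schema and that the target and monitoring servers use the same time zone, since timepoint is stored without a time zone.

```sql
-- Current size of the history* staging tables.
SELECT c.relname AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'activity_hist'
  AND c.relkind = 'r'              -- ordinary tables only
  AND c.relname LIKE 'history%'
ORDER BY pg_total_relation_size(c.oid) DESC;

-- Rows collected per minute over the last ten minutes.
SELECT date_trunc('minute', timepoint) AS minute,
       count(*) AS rows_collected
FROM activity_hist.history_pg_stat_activity
WHERE timepoint > now()::timestamp - interval '10 minutes'
GROUP BY 1
ORDER BY 1;
```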
3. Archiving the history
For archiving, the partitioned archive* tables are used.
New partitions are created every hour, while old data is removed from the history* tables, so the size of the history* tables does not change much and insert speed does not degrade over time.
New partitions are created by the plpgsql function activity_hist.archive_current_activity. The algorithm is very simple (shown here for a partition of the archive_pg_stat_activity table).
Create and populate a new partition
EXECUTE format(
'CREATE TABLE %s PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' ,
partition_name ,
-- both bounds are truncated to the hour, formatted as YYYY-MM-DD HH24:00
to_char(partition_min_range , 'YYYY-MM-DD HH24') || ':00' ,
to_char(partition_max_range , 'YYYY-MM-DD HH24') || ':00'
);
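INSERT INTO activity_hist.archive_pg_stat_activity
(
SELECT *
FROM activity_hist.history_pg_stat_activity
-- half-open range: the upper bound of a range partition is exclusive,
-- so a row exactly at partition_max_range belongs to the next partition
WHERE timepoint >= partition_min_range AND timepoint < partition_max_range
);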
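The fragment above relies on partition_name, partition_min_range and partition_max_range being set beforehand. How the author's function derives them is not shown, so the following is only a sketch of one possible derivation, assuming the hour that has just ended is archived and using a hypothetical naming scheme for the partition.

```sql
DO $$
DECLARE
    -- Assumption: archive the last complete hour.
    partition_max_range timestamp := date_trunc('hour', now()::timestamp);
    partition_min_range timestamp := partition_max_range - interval '1 hour';
    partition_name      text;
BEGIN
    -- Hypothetical naming scheme: parent table name plus date and hour.
    partition_name := 'activity_hist.archive_pg_stat_activity_'
                      || to_char(partition_min_range, 'YYYYMMDD_HH24');

    -- Only print the result; nothing is created by this sketch.
    RAISE NOTICE 'partition % would cover [%, %)',
        partition_name, partition_min_range, partition_max_range;
END
$$;
```

Deriving both bounds from a single date_trunc call keeps them exactly one hour apart and aligned to the hour, which is what the FOR VALUES FROM ... TO ... clause expects.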
Creating indexes
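EXECUTE format (
'CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )'
);
-- index_name must hold a different name at this point, otherwise the second
-- CREATE INDEX fails because an index with that name already exists
EXECUTE format ('CREATE INDEX '||index_name||
' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )'
);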
Removing old data from the history_pg_stat_activity table
DELETE
FROM activity_hist.history_pg_stat_activity
WHERE timepoint < partition_max_range;
Of course, from time to time old partitions are dropped as no longer needed.
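The article does not show how old partitions are removed. One possible approach, given here purely as a sketch, is to read each partition's upper bound from the system catalogs and drop partitions older than a chosen retention period. The seven-day retention is a made-up value.

```sql
DO $$
DECLARE
    retention interval := interval '7 days';   -- hypothetical retention period
    part      record;
BEGIN
    FOR part IN
        SELECT c.oid::regclass AS partition_name,
               -- pg_get_expr returns text such as
               -- FOR VALUES FROM ('...') TO ('...'); extract the TO value
               substring(pg_get_expr(c.relpartbound, c.oid)
                         from 'TO \(''([^'']+)''\)')::timestamp AS upper_bound
        FROM pg_inherits i
        JOIN pg_class c ON c.oid = i.inhrelid
        WHERE i.inhparent = 'activity_hist.archive_pg_stat_activity'::regclass
    LOOP
        -- Partitions without a parsable upper bound yield NULL and are skipped.
        IF part.upper_bound < now()::timestamp - retention THEN
            RAISE NOTICE 'dropping %', part.partition_name;
            EXECUTE format('DROP TABLE %s', part.partition_name);
        END IF;
    END LOOP;
END
$$;
```

Dropping a whole partition is much cheaper than deleting its rows, which is one of the main reasons for partitioning the archive by time in the first place.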
Basic reports
So why is all of this being done? To obtain reports that very vaguely resemble Oracle's AWR.
It is important to add that, to produce the reports, you need to build a link between the pg_stat_activity and pg_stat_statements views. The tables are linked by adding a 'queryid' column to the 'history_pg_stat_activity' and 'archive_pg_stat_activity' tables. How the column values are populated is beyond the scope of this article and is described here: pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .
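The report queries that follow all use the same sampling idea: because a snapshot is taken once per second, every stored row is treated as roughly one second of session time. Rows with no wait event count as time on CPU, rows with a wait event count as waiting. The self-contained sketch below illustrates this on a few made-up sample rows; it does not touch the real tables.

```sql
-- Made-up samples: one row per session per one-second snapshot.
WITH samples (timepoint, queryid, wait_event_type, state) AS (
    VALUES
        (timestamp '2024-01-01 10:00:00', 101::bigint, NULL::text, 'active'),
        (timestamp '2024-01-01 10:00:01', 101,         NULL,       'active'),
        (timestamp '2024-01-01 10:00:02', 101,         'Lock',     'active'),
        (timestamp '2024-01-01 10:00:00', 202,         NULL,       'active'),
        (timestamp '2024-01-01 10:00:01', 202,         'IO',       'active')
)
SELECT queryid,
       -- samples with no wait event are attributed to CPU
       count(*) FILTER (WHERE wait_event_type IS NULL)
           * interval '1 second' AS on_cpu,
       -- samples with a wait event are attributed to waiting
       count(*) FILTER (WHERE wait_event_type IS NOT NULL)
           * interval '1 second' AS waiting
FROM samples
WHERE state = 'active'
GROUP BY queryid
ORDER BY queryid;

--  queryid |  on_cpu  | waiting
-- ---------+----------+----------
--      101 | 00:00:02 | 00:00:01
--      202 | 00:00:01 | 00:00:01
```

The durations are estimates: anything that starts and finishes between two snapshots is never seen, so short queries are under-represented.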
TOTAL CPU TIME FOR QUERIES
Query:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) AND aa.state = 'active'
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
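UNION ALL -- plain UNION would merge identical (query, queryid, duration) rows from the two tables and undercount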
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL ) AND ha.state = 'active'
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
TOTAL WAIT TIME FOR QUERIES
Query:
WITH hist AS
(
SELECT
aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type IS NOT NULL )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
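UNION ALL -- plain UNION would merge identical (query, queryid, duration) rows from the two tables and undercount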
SELECT
ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type IS NOT NULL )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
WAIT EVENTS FOR QUERIES
Queries:
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
aa.wait_event IS NOT NULL
GROUP BY aa.wait_event_type , aa.wait_event
UNION
SELECT
ha.wait_event_type , ha.wait_event
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
ha.wait_event IS NOT NULL
GROUP BY ha.wait_event_type , ha.wait_event
)
SELECT wait_event_type , wait_event
FROM hist
GROUP BY wait_event_type , wait_event
ORDER BY 1 ASC,2 ASC
----------------------------------------------------------------------
WITH hist AS
(
SELECT
aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid ,
count(*) * interval '1 second' AS duration
FROM activity_hist.archive_pg_stat_activity aa
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event )
GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid
UNION ALL -- plain UNION would merge identical rows from the two tables and undercount
SELECT
ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid,
count(*) * interval '1 second' AS duration
FROM activity_hist.history_pg_stat_activity_for_reports ha
WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
backend_type = 'client backend' AND datname != 'postgres' AND
( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event )
GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid
)
SELECT query , queryid , SUM( duration ) as duration
FROM hist
GROUP BY query , queryid
ORDER BY 3 DESC
LOCKED PROCESSES HISTORY
Query:
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
UNION
SELECT
MIN(date_trunc('second',timepoint)) AS started ,
count(*) * interval '1 second' as duration ,
pid , blocking_pids , relation , mode , locktype
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY pid , blocking_pids , relation , mode , locktype
ORDER BY 1
BLOCKING PROCESSES HISTORY
Queries:
SELECT
blocking_pids
FROM
activity_hist.archive_locking al
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
UNION
SELECT
blocking_pids
FROM
activity_hist.history_locking
WHERE
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND
NOT granted AND
locktype = 'relation'
GROUP BY blocking_pids
ORDER BY 1
---------------------------------------------------------------
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.archive_pg_stat_activity
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
UNION
SELECT
pid , usename , application_name , datname ,
MIN(date_trunc('second',timepoint)) as started ,
count(*) * interval '1 second' as duration ,
state ,
query
FROM activity_hist.history_pg_stat_activity_for_reports
WHERE pid= current_pid AND
timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour')
GROUP BY pid , usename , application_name ,
datname ,
state_change,
state ,
query
ORDER BY 5 , 1
The basic queries shown and the resulting reports already make life much easier when analyzing performance incidents.
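The queries above refer to variables such as pg_stat_history_begin and current_hour_diff, which suggests they run inside a reporting function that is not shown. As a sketch of how one of them could be packaged for ad hoc use, the function below wraps the CPU-time query. The function name and its parameters are hypothetical, and it reads only the archive table, because history_pg_stat_activity_for_reports is not defined in this article.

```sql
-- Hypothetical wrapper: top queries by sampled CPU time in [p_begin, p_end).
CREATE OR REPLACE FUNCTION activity_hist.top_cpu_queries(
    p_begin timestamp,
    p_end   timestamp,
    p_limit integer DEFAULT 10
)
RETURNS TABLE (query text, queryid bigint, duration interval)
LANGUAGE sql
STABLE
AS $$
    SELECT aa.query,
           aa.queryid,
           count(*) * interval '1 second' AS duration   -- one row is about one second
    FROM activity_hist.archive_pg_stat_activity aa
    WHERE aa.timepoint >= p_begin
      AND aa.timepoint <  p_end
      AND aa.backend_type = 'client backend'
      AND aa.datname != 'postgres'
      AND aa.wait_event_type IS NULL      -- not waiting, so counted as CPU
      AND aa.state = 'active'
    GROUP BY aa.query, aa.queryid
    ORDER BY 3 DESC
    LIMIT p_limit;
$$;

-- Example call for a one-hour window (illustrative timestamps).
SELECT *
FROM activity_hist.top_cpu_queries('2024-01-01 10:00', '2024-01-01 11:00');
```

The half-open time range is a deliberate choice in this sketch: consecutive report windows then never count the same snapshot twice.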
Based on the basic queries, you can obtain a report that vaguely resembles Oracle's AWR. Example of a summary report:
+-----------------------------------------------------------------------------------
| CONSOLIDATED REPORT FOR ACTIVITY AND WAITS.
To be continued. Next in line are creating the lock history (pg_stat_locks) and a more detailed description of how the tables are populated.