Or: a bit of applied Tetrisology.
Everything new is well-forgotten old.
Epigraph.
Problem statement
You need to periodically download the current PostgreSQL log file from the AWS cloud to your local Linux host. Not in real time, but, shall we say, with a small delay.
The period for downloading log file updates is 5 minutes.
The log file on the AWS side is rotated every hour.
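As a sketch of the 5-minute schedule, a cron entry along these lines could drive the download (the installation path, user, and result file here are hypothetical; the script itself appears further down):

```shell
# Hypothetical /etc/cron.d entry: every 5 minutes, fetch the next portion
# of the current hour's log. Arguments: log timestamp (YYYY-MM-DD-HH24),
# database id, result file. Note that % must be escaped in crontab syntax.
*/5 * * * * postgres /opt/monitor/download_aws_piece.sh $(date +\%Y-\%m-\%d-\%H) 1 /tmp/postgresql.log.part
```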
Tools used
To download the log file to the host, a bash script is used that calls the AWS API "download-db-log-file-portion".
Parameters:
- --db-instance-identifier: the AWS instance name;
- --log-file-name: the name of the currently generated log file;
- --max-items: the total number of items returned in the command output, i.e. the size of the downloaded portion of the file;
- --starting-token: the token from which to start.
So, a simple and interesting task arose, good for practice and variety during working hours.
I assumed the problem had already been solved many times over, given how mundane it is. But a quick Google search suggested no solutions, and I had no great desire to dig any deeper. Either way, it's a good workout.
Formalizing the task
The final log file consists of many lines of variable length. Graphically, the log file can be represented something like this:
Does this already remind you of something? What does Tetris have to do with it? Here's what.
If we picture graphically the cases that can arise when loading the next file (for simplicity, let the lines all have the same length here), we get the standard Tetris pieces:
1) The file is downloaded in its entirety and is final. The portion size is larger than the final file size:
2) The file has a continuation. The portion size is smaller than the final file size:
3) The file is a continuation of the previous file and has a continuation. The portion size is smaller than the size of the remainder of the final file:
4) The file is a continuation of the previous file and is final. The portion size is larger than the size of the remainder of the final file:
The task is to assemble the rectangle, that is, to play Tetris at a new level.
Problems that arise along the way
1) Gluing a line from 2 portions
In general, there were no particular problems here. A standard exercise from an introductory programming course.
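The gluing can be sketched in a few lines of bash: keep the last line of the previous portion, take the first line of the next one, and concatenate them (the sample log lines below are made up):

```shell
#!/bin/bash
# A log line is split across two downloaded portions:
portion1=$'line A\nline B\n2021-01-01 00:00:01 MSK ERROR: can'
portion2=$'celing statement due to statement timeout\nline C'

first_str=$(echo "$portion1" | tail -1)   # tail of the previous portion
last_str=$(echo "$portion2" | head -1)    # head of the next portion
concat_str=$first_str$last_str            # the restored full line
echo "$concat_str"
```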
Optimal portion size
This, however, is a bit more interesting.
Unfortunately, there is no way to use an offset after the starting-token label:
"As you already know the option --starting-token is used to specify where to start paginating. This option takes String values which would mean that if you try to add an offset value in front of the Next Token string, the option will not be taken into account as an offset."
So you have to read in portions.
If you read in large portions, the number of reads is minimal, but the volume is maximal.
If you read in small portions, then, on the contrary, the number of reads is maximal, but the volume is minimal.
Therefore, in order to reduce traffic, and for the overall elegance of the solution, I had to come up with something that, unfortunately, looks a bit like a crutch.
As an illustration, let's consider the process of downloading a log file in 2 greatly simplified variants. The number of reads in both cases depends on the portion size.
1) Loading in small portions:
2) Loading in large portions:
As usual, the optimal solution is somewhere in the middle.
The portion size starts out minimal, but in the course of reading it can be increased in order to reduce the number of reads.
It should be noted that the problem of choosing the optimal size of the portion to read has not yet been fully solved and requires deeper study and analysis. Maybe a little later.
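The growth rule itself is simple enough to sketch separately: every growth_counter_max reads, the portion size is multiplied by growth_factor, capped at max_item_size. The constants are the ones used in the script below; the loop merely simulates 12 reads:

```shell
#!/bin/bash
# Simulation of the portion-size growth used in download_aws_piece.sh.
let min_item_size=1024
let max_item_size=1048576
let growth_factor=3
let growth_counter=1
let growth_counter_max=3

let size=$min_item_size
for read_no in 1 2 3 4 5 6 7 8 9 10 11 12
do
    # ... one portion of $size items would be downloaded here ...
    let growth_counter=$growth_counter+1
    if [[ $growth_counter -gt $growth_counter_max ]]; then
        let size=$size*$growth_factor
        let growth_counter=1
    fi
    if [[ $size -gt $max_item_size ]]; then
        let size=$max_item_size
    fi
    echo "read=$read_no portion_size=$size"
done
```

With these constants the size grows on reads 3, 6, 9 and 12, reaching 1024*3^4 items by the twelfth read while staying under the cap.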
General description of the implementation
Service tables used
CREATE TABLE endpoint
(
id SERIAL ,
host text
);
TABLE database
(
id SERIAL ,
…
last_aws_log_time text ,
last_aws_nexttoken text ,
aws_max_item_size integer
);
last_aws_log_time: timestamp of the last downloaded log file, in the format YYYY-MM-DD-HH24.
last_aws_nexttoken: text token of the last downloaded portion.
aws_max_item_size: the empirically chosen initial portion size.
The full text of the script
download_aws_piece.sh
#!/bin/bash
#########################################################
# download_aws_piece.sh
# download piece of log from AWS
# version HABR
let min_item_size=1024
let max_item_size=1048576
let growth_factor=3
let growth_counter=1
let growth_counter_max=3
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:''STARTED'
AWS_LOG_TIME=$1
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:AWS_LOG_TIME='$AWS_LOG_TIME
database_id=$2
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:database_id='$database_id
RESULT_FILE=$3
endpoint=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE_DATABASE -A -t -c "select e.host from endpoint e join database d on e.id = d.endpoint_id where d.id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:endpoint='$endpoint
db_instance=`echo $endpoint | awk -F"." '{print toupper($1)}'`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:db_instance='$db_instance
LOG_FILE=$RESULT_FILE'.tmp_log'
TMP_FILE=$LOG_FILE'.tmp'
TMP_MIDDLE=$LOG_FILE'.tmp_mid'
TMP_MIDDLE2=$LOG_FILE'.tmp_mid2'
current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:current_aws_log_time='$current_aws_log_time
if [[ $current_aws_log_time != $AWS_LOG_TIME ]];
then
is_new_log='1'
if ! psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - update database set last_aws_log_time .'
exit 1
fi
else
is_new_log='0'
fi
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:is_new_log='$is_new_log
let last_aws_max_item_size=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select aws_max_item_size from database where id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: last_aws_max_item_size='$last_aws_max_item_size
let count=1
if [[ $is_new_log == '1' ]];
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF NEW AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 2
fi
else
next_token=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "select last_aws_nexttoken from database where id = $database_id "`
if [[ $next_token == '' ]];
then
next_token='0'
fi
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: CONTINUE DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 3
fi
line_count=`cat $LOG_FILE | wc -l`
let lines=$line_count-1
tail -$lines $LOG_FILE > $TMP_MIDDLE
mv -f $TMP_MIDDLE $LOG_FILE
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
if [[ $next_token == '' ]];
then
cp $TMP_FILE $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
else
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
fi
first_str=`tail -1 $TMP_FILE`
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
head -$lines $TMP_FILE > $RESULT_FILE
###############################################
# MAIN CIRCLE
let count=2
while [[ $next_token != '' ]];
do
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: count='$count
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 4
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
TMP_FILE=$LOG_FILE'.tmp'
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
last_str=`head -1 $TMP_FILE`
if [[ $next_token == '' ]];
then
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_FILE >> $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
fi
if [[ $next_token != '' ]];
then
let growth_counter=$growth_counter+1
if [[ $growth_counter -gt $growth_counter_max ]];
then
let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
let growth_counter=1
fi
if [[ $last_aws_max_item_size -gt $max_item_size ]];
then
let last_aws_max_item_size=$max_item_size
fi
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
#############################
#Get middle of file
head -$lines $TMP_FILE > $TMP_MIDDLE
line_count=`cat $TMP_MIDDLE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
cat $TMP_MIDDLE2 >> $RESULT_FILE
first_str=`tail -1 $TMP_FILE`
fi
let count=$count+1
done
#
#################################################################
exit 0
Fragments of the script, with some explanations:
Script input parameters:
- Log file name timestamp, in the format YYYY-MM-DD-HH24: AWS_LOG_TIME=$1
- Database ID: database_id=$2
- Name of the assembled log file: RESULT_FILE=$3
Get the timestamp of the last downloaded log file:
current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`
If the timestamp of the last downloaded log file does not match the input parameter, a new log file is downloaded:
if [[ $current_aws_log_time != $AWS_LOG_TIME ]];
then
is_new_log='1'
if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
then
echo '***download_aws_piece.sh -FATAL_ERROR - update database set last_aws_log_time .'
exit 1
fi
else
is_new_log='0'
fi
Get the nexttoken label value from the downloaded file:
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
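To make the extraction concrete: the text-format output ends with a NEXTTOKEN line, which the grep/awk pair above picks apart. The sample file content and token value here are made up:

```shell
#!/bin/bash
# Simulated tail of the aws CLI text output: log lines, then the marker line.
LOG_FILE=$(mktemp)
printf 'line 1\nline 2\nNEXTTOKEN token123\n' > "$LOG_FILE"

next_token_str=$(cat "$LOG_FILE" | grep NEXTTOKEN)
next_token=$(echo $next_token_str | awk -F" " '{ print $2}')
echo "next_token=$next_token"   # an empty value would mean: download finished

rm "$LOG_FILE"
```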
An empty nexttoken value serves as the sign that the download is finished.
In a loop, we count off portions of the file, gluing lines along the way and growing the portion size:
Main loop
# MAIN CIRCLE
let count=2
while [[ $next_token != '' ]];
do
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: count='$count
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 4
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
TMP_FILE=$LOG_FILE'.tmp'
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
last_str=`head -1 $TMP_FILE`
if [[ $next_token == '' ]];
then
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_FILE >> $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
fi
if [[ $next_token != '' ]];
then
let growth_counter=$growth_counter+1
if [[ $growth_counter -gt $growth_counter_max ]];
then
let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
let growth_counter=1
fi
if [[ $last_aws_max_item_size -gt $max_item_size ]];
then
let last_aws_max_item_size=$max_item_size
fi
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
#############################
#Get middle of file
head -$lines $TMP_FILE > $TMP_MIDDLE
line_count=`cat $TMP_MIDDLE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
cat $TMP_MIDDLE2 >> $RESULT_FILE
first_str=`tail -1 $TMP_FILE`
fi
let count=$count+1
done
What next?
So, the first intermediate task, "download the log file from the cloud", is solved. What to do with the downloaded log?
First, you need to parse the log file and extract the actual queries from it.
The task is not very difficult. The simplest bash script does the job quite well.
upload_log_query.sh
#!/bin/bash
#########################################################
# upload_log_query.sh
# Upload log_query table from downloaded aws file
# version HABR
###########################################################
echo 'TIMESTAMP:'$(date +%c)' Upload log_query table '
source_file=$1
echo 'source_file='$source_file
database_id=$2
echo 'database_id='$database_id
beginer=' '
first_line='1'
let "line_count=0"
sql_line=' '
sql_flag=' '
space=' '
cat $source_file | while read line
do
line="$space$line"
if [[ $first_line == "1" ]]; then
beginer=`echo $line | awk -F" " '{ print $1}' `
first_line='0'
fi
current_beginer=`echo $line | awk -F" " '{ print $1}' `
if [[ $current_beginer == $beginer ]]; then
if [[ $sql_flag == '1' ]]; then
sql_flag='0'
log_date=`echo $sql_line | awk -F" " '{ print $1}' `
log_time=`echo $sql_line | awk -F" " '{ print $2}' `
duration=`echo $sql_line | awk -F" " '{ print $5}' `
#replace ' to ''
sql_modline=`echo "$sql_line" | sed "s/'/''/g"`
sql_line=' '
################
#PROCESSING OF THE SQL-SELECT IS HERE
if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d DATABASE -v ON_ERROR_STOP=1 -A -t -c "select log_query('$ip_port',$database_id , '$log_date' , '$log_time' , '$duration' , '$sql_modline' )"
then
echo 'FATAL_ERROR - log_query '
exit 1
fi
################
fi #if [[ $sql_flag == '1' ]]; then
let "line_count=line_count+1"
check=`echo $line | awk -F" " '{ print $8}' `
check_sql=${check^^}
#echo 'check_sql='$check_sql
if [[ $check_sql == 'SELECT' ]]; then
sql_flag='1'
sql_line="$sql_line$line"
ip_port=`echo $sql_line | awk -F":" '{ print $4}' `
fi
else
if [[ $sql_flag == '1' ]]; then
sql_line="$sql_line$line"
fi
fi #if [[ $current_beginer == $beginer ]]; then
done
Now you can work with the query extracted from the log file.
And a number of useful opportunities open up.
Parsed queries need to be stored somewhere. The service table log_query is used for this:
CREATE TABLE log_query
(
id SERIAL ,
queryid bigint ,
query_md5hash text not null ,
database_id integer not null ,
timepoint timestamp without time zone not null,
duration double precision not null ,
query text not null ,
explained_plan text[],
plan_md5hash text ,
explained_plan_wo_costs text[],
plan_hash_value text ,
baseline_id integer ,
ip text ,
port text
);
ALTER TABLE log_query ADD PRIMARY KEY (id);
ALTER TABLE log_query ADD CONSTRAINT queryid_timepoint_unique_key UNIQUE (queryid, timepoint );
ALTER TABLE log_query ADD CONSTRAINT query_md5hash_timepoint_unique_key UNIQUE (query_md5hash, timepoint );
CREATE INDEX log_query_timepoint_idx ON log_query (timepoint);
CREATE INDEX log_query_queryid_idx ON log_query (queryid);
ALTER TABLE log_query ADD CONSTRAINT database_id_fk FOREIGN KEY (database_id) REFERENCES database (id) ON DELETE CASCADE ;
The parsed query is processed by the plpgsql function "log_query".
log_query.sql
--log_query.sql
--version HABR
CREATE OR REPLACE FUNCTION log_query( ip_port text ,log_database_id integer , log_date text , log_time text , duration text , sql_line text ) RETURNS boolean AS $$
DECLARE
result boolean ;
log_timepoint timestamp without time zone ;
log_duration double precision ;
pos integer ;
log_query text ;
activity_string text ;
log_md5hash text ;
log_explain_plan text[] ;
log_planhash text ;
log_plan_wo_costs text[] ;
database_rec record ;
pg_stat_query text ;
test_log_query text ;
log_query_rec record;
found_flag boolean;
pg_stat_history_rec record ;
port_start integer ;
port_end integer ;
client_ip text ;
client_port text ;
log_queryid bigint ;
log_query_text text ;
pg_stat_query_text text ;
BEGIN
result = TRUE ;
RAISE NOTICE '***log_query';
port_start = position('(' in ip_port);
port_end = position(')' in ip_port);
client_ip = substring( ip_port from 1 for port_start-1 );
client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );
SELECT e.host , d.name , d.owner_pwd
INTO database_rec
FROM database d JOIN endpoint e ON e.id = d.endpoint_id
WHERE d.id = log_database_id ;
log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
log_duration = duration:: double precision;
pos = position ('SELECT' in UPPER(sql_line) );
log_query = substring( sql_line from pos for LENGTH(sql_line));
log_query = regexp_replace(log_query,' +',' ','g');
log_query = regexp_replace(log_query,';+','','g');
log_query = trim(trailing ' ' from log_query);
log_md5hash = md5( log_query::text );
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
--------------------------
BEGIN
INSERT INTO log_query
(
query_md5hash ,
database_id ,
timepoint ,
duration ,
query ,
explained_plan ,
plan_md5hash ,
explained_plan_wo_costs ,
plan_hash_value ,
ip ,
port
)
VALUES
(
log_md5hash ,
log_database_id ,
log_timepoint ,
log_duration ,
log_query ,
log_explain_plan ,
md5(log_explain_plan::text) ,
log_plan_wo_costs ,
md5(log_plan_wo_costs::text),
client_ip ,
client_port
);
activity_string = 'New query has logged '||
' database_id = '|| log_database_id ||
' query_md5hash='||log_md5hash||
' , timepoint = '||to_char(log_timepoint,'YYYYMMDD HH24:MI:SS');
RAISE NOTICE '%',activity_string;
PERFORM pg_log( log_database_id , 'log_query' , activity_string);
EXCEPTION
WHEN unique_violation THEN
RAISE NOTICE '*** unique_violation *** query already has logged';
END;
SELECT queryid
INTO log_queryid
FROM log_query
WHERE query_md5hash = log_md5hash AND
timepoint = log_timepoint;
IF log_queryid IS NOT NULL
THEN
RAISE NOTICE 'log_query with query_md5hash = % and timepoint = % has already has a QUERYID = %',log_md5hash,log_timepoint , log_queryid ;
RETURN result;
END IF;
------------------------------------------------
RAISE NOTICE 'Update queryid';
SELECT *
INTO log_query_rec
FROM log_query
WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
FOR pg_stat_history_rec IN
SELECT
queryid ,
query
FROM
pg_stat_db_queries
WHERE
database_id = log_database_id AND
queryid is not null
LOOP
pg_stat_query = pg_stat_history_rec.query ;
pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
log_query_text = trim(trailing ' ' from log_query_rec.query);
pg_stat_query_text = pg_stat_query;
--SELECT log_query_rec.query like pg_stat_query INTO found_flag ;
IF (log_query_text LIKE pg_stat_query_text) THEN
found_flag = TRUE ;
ELSE
found_flag = FALSE ;
END IF;
IF found_flag THEN
UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
activity_string = ' updated queryid = '||pg_stat_history_rec.queryid||
' for log_query with id = '||log_query_rec.id
;
RAISE NOTICE '%',activity_string;
EXIT ;
END IF ;
END LOOP ;
RETURN result ;
END
$$ LANGUAGE plpgsql;
During processing, the service table pg_stat_db_queries is used; it contains a snapshot of the current queries from the pg_stat_history table (the use of that table is described here −
TABLE pg_stat_db_queries
(
database_id integer,
queryid bigint ,
query text ,
max_time double precision
);
TABLE pg_stat_history
(
…
database_id integer ,
…
queryid bigint ,
…
max_time double precision ,
…
);
The function makes it possible to implement a number of useful capabilities for processing queries from the log file. Namely:
Opportunity #1: the query execution history
Very useful when starting to investigate a performance incident. First, get acquainted with the history: when did the slowdown begin?
Then, as the classics teach, look for external causes. Maybe the database load simply grew drastically and the specific query has nothing to do with it.
Adding a new entry to the log_query table
port_start = position('(' in ip_port);
port_end = position(')' in ip_port);
client_ip = substring( ip_port from 1 for port_start-1 );
client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );
SELECT e.host , d.name , d.owner_pwd
INTO database_rec
FROM database d JOIN endpoint e ON e.id = d.endpoint_id
WHERE d.id = log_database_id ;
log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
log_duration = to_number(duration,'99999999999999999999D9999999999');
pos = position ('SELECT' in UPPER(sql_line) );
log_query = substring( sql_line from pos for LENGTH(sql_line));
log_query = regexp_replace(log_query,' +',' ','g');
log_query = regexp_replace(log_query,';+','','g');
log_query = trim(trailing ' ' from log_query);
RAISE NOTICE 'log_query=%',log_query ;
log_md5hash = md5( log_query::text );
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
--------------------------
BEGIN
INSERT INTO log_query
(
query_md5hash ,
database_id ,
timepoint ,
duration ,
query ,
explained_plan ,
plan_md5hash ,
explained_plan_wo_costs ,
plan_hash_value ,
ip ,
port
)
VALUES
(
log_md5hash ,
log_database_id ,
log_timepoint ,
log_duration ,
log_query ,
log_explain_plan ,
md5(log_explain_plan::text) ,
log_plan_wo_costs ,
md5(log_plan_wo_costs::text),
client_ip ,
client_port
);
Opportunity #2: saving query execution plans
At this point an objection/clarification/comment may arise: "But there's already auto_explain." Yes, there is, but what's the point if the execution plan is stored in the same log file, and in order to save it for further analysis you have to parse that log file anyway?
What I needed was:
first: to store the execution plan in a service table of the monitoring database;
second: to be able to compare execution plans with each other, so that a change in a query's execution plan is immediately visible.
We have a query with specific execution parameters. Obtaining and saving its execution plan using EXPLAIN is an elementary task.
Moreover, using the EXPLAIN (COSTS FALSE) expression, you can get the skeleton of the plan, which is used to obtain the plan's hash value; that helps in the subsequent analysis of the history of execution plan changes.
Getting the execution plan template
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
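The point of the costs-free skeleton is that its hash stays stable when only the cost estimates change. A small illustration (the plan texts are made-up EXPLAIN output, and the sed strip stands in for what EXPLAIN (COSTS FALSE) does):

```shell
#!/bin/bash
# Two plans that differ only in cost estimates:
plan_a='Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)'
plan_b='Seq Scan on t  (cost=0.00..42.10 rows=3000 width=4)'

# Strip the cost annotations to get the plan skeleton, then hash it,
# as the function does with md5(log_plan_wo_costs::text).
skeleton_a=$(echo "$plan_a" | sed 's/ *(cost=[^)]*)//')
skeleton_b=$(echo "$plan_b" | sed 's/ *(cost=[^)]*)//')
hash_a=$(echo "$skeleton_a" | md5sum | awk '{print $1}')
hash_b=$(echo "$skeleton_b" | md5sum | awk '{print $1}')

[ "$hash_a" = "$hash_b" ] && echo 'same plan structure'
```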
Opportunity #3: using the query log for monitoring
Since the performance metrics are configured not on the query text but on its ID, you need to associate queries from the log file with the queries for which the performance metrics are configured.
Well, at the very least in order to have the exact time a performance incident occurred.
That way, when a performance incident occurs for a query ID, there is a reference to the specific query with specific parameter values, plus the exact execution time and duration of the query. It is impossible to get this information using the pg_stat_statements view alone.
Finding the query's queryid and updating the entry in the log_query table
SELECT *
INTO log_query_rec
FROM log_query
WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
FOR pg_stat_history_rec IN
SELECT
queryid ,
query
FROM
pg_stat_db_queries
WHERE
database_id = log_database_id AND
queryid is not null
LOOP
pg_stat_query = pg_stat_history_rec.query ;
pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
log_query_text = trim(trailing ' ' from log_query_rec.query);
pg_stat_query_text = pg_stat_query;
--SELECT log_query_rec.query like pg_stat_query INTO found_flag ;
IF (log_query_text LIKE pg_stat_query_text) THEN
found_flag = TRUE ;
ELSE
found_flag = FALSE ;
END IF;
IF found_flag THEN
UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
activity_string = ' updated queryid = '||pg_stat_history_rec.queryid||
' for log_query with id = '||log_query_rec.id
;
RAISE NOTICE '%',activity_string;
EXIT ;
END IF ;
END LOOP ;
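The matching idea in the loop above can be sketched in shell: turn the $n placeholders of the normalized pg_stat text into wildcards and match the literal query from the log against the resulting pattern. The queries below are made up; the plpgsql version does the same with regexp_replace('\$.','%') and LIKE:

```shell
#!/bin/bash
# Normalized query text as pg_stat_statements stores it:
pg_stat_query='SELECT * FROM orders WHERE id = $1 AND status = $2'
# Literal query as it appears in the log file:
log_query="SELECT * FROM orders WHERE id = 42 AND status = 'open'"

# Replace each $<n> placeholder with a glob wildcard:
pattern=$(echo "$pg_stat_query" | sed 's/\$[0-9][0-9]*/*/g')

case "$log_query" in
    $pattern) found_flag='1' ;;
    *)        found_flag='0' ;;
esac
echo "found_flag=$found_flag"
```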
Afterword
The described technique eventually found application in
Although, of course, in my personal opinion, more work is still needed on the algorithm for selecting and changing the size of the downloaded portion. The problem has not yet been solved in the general case. It will probably be interesting.
But that is a completely different story...
Source: www.habr.com