Or a bit of applied tetrisology.
Everything new is well-forgotten old.
Epigraph
Formulation of the problem
It is necessary to periodically download the current PostgreSQL log files from the AWS cloud to a local Linux host. Not in real time, but, let's say, with a small delay.
The log file download period is 5 minutes.
The log file in AWS is rotated every hour.
Tools used
To download a log file to the host, a bash script is used that calls the AWS API "download-db-log-file-portion".
Parameters:
- --db-instance-identifier: AWS instance name;
- --log-file-name: name of the currently generated log file;
- --max-items: the total number of items returned in the command output, i.e. the chunk size of the downloaded file;
- --starting-token: the starting token.
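Taken together, these flags define a simple pagination contract: each call returns at most --max-items worth of output plus a NEXTTOKEN line, and the token from one response is fed into --starting-token of the next call. A minimal sketch of that loop, with the AWS CLI replaced by a stub (`aws_stub`, its page contents and token names are invented purely for illustration):

```shell
#!/bin/bash
# Stub standing in for "aws rds download-db-log-file-portion":
# two portions carry a NEXTTOKEN line, the last one does not.
aws_stub() {
  case "$1" in
    0)       printf 'line-1\nline-2\nNEXTTOKEN token-A\n' ;;
    token-A) printf 'line-3\nNEXTTOKEN token-B\n' ;;
    token-B) printf 'line-4\n' ;;
  esac
}

next_token='0'   # '0' conventionally requests the first portion
result=''
while [[ -n $next_token ]]; do
  page=$(aws_stub "$next_token")
  # an absent NEXTTOKEN yields an empty token and ends the loop
  next_token=$(echo "$page" | grep NEXTTOKEN | awk '{print $2}')
  result="$result$(echo "$page" | grep -v NEXTTOKEN)"$'\n'
done
printf '%s' "$result"
```

The real script below follows the same shape, with the added complications of persisting the token between runs and gluing lines across portion boundaries.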
Overall, a simple but interesting task: good practice and some variety during working hours.
One would assume that the problem has long been solved, given how everyday it is. But a quick Google search did not suggest any ready solutions, and there was no particular desire to dig deeper. Either way, it is a good workout.
Formalization of the task
The final log file consists of many lines of variable length. Graphically, a log file can be represented something like this:
Does it remind you of something? What does Tetris have to do with it? Here is what.
If we represent the options that can arise when downloading the next file graphically (for simplicity, assume the lines are of equal length), we get standard Tetris pieces:
1) The file is downloaded in its entirety and is final. The chunk size is larger than the final file size:
2) The file has a continuation. The chunk size is smaller than the final file size:
3) The file is a continuation of a previous file and has a continuation. The chunk size is smaller than the remainder of the final file:
4) The file is a continuation of a previous file and is final. The chunk size is larger than the remainder of the final file:
The task is to assemble the rectangle, or, in other words, to play Tetris at a new level.
Problems arising along the way
1) Gluing a line from two chunks
In general, nothing special. A standard problem from any introductory programming course.
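A minimal sketch of the gluing itself (the log lines here are invented for illustration): the last, possibly unfinished, line of the current chunk is carried over and concatenated with the first line of the next chunk, exactly what the script below does with first_str and last_str:

```shell
#!/bin/bash
# A line can be cut at a chunk boundary: the tail of chunk 1 and the
# head of chunk 2 are halves of the same log line and must be glued.
chunk1=$'2021-01-01 10:00:00 UTC:LOG: dura'                           # incomplete tail
chunk2=$'tion: 12.3 ms\n2021-01-01 10:00:01 UTC:LOG: next line'       # rest + next line

first_str=$(echo "$chunk1" | tail -1)   # carry over the unfinished tail
last_str=$(echo "$chunk2" | head -1)    # head of the next chunk
glued="$first_str$last_str"
echo "$glued"   # the restored full line
```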
Optimal chunk size
But this is a bit more interesting.
Unfortunately, there is no way to use an offset after the starting-token label:
As you already know, the option --starting-token is used to specify where to start paginating. This option takes String values, which means that if you try to add an offset value in front of the Next Token string, the option will not be taken into account as an offset.
So you have to read in chunks.
If you read in large chunks, the number of reads is minimal, but the volume is maximal.
If you read in small chunks, then, on the contrary, the number of reads is maximal, but the volume is minimal.
Therefore, to reduce traffic and for the overall elegance of the solution, I had to come up with a solution that, unfortunately, looks somewhat like a crutch.
As an illustration, let's consider the process of downloading a log file in 2 greatly simplified variants. The number of reads in both cases depends on the chunk size.
1) Download in small chunks:
2) Download in large chunks:
As usual, the optimal solution is somewhere in the middle.
The chunk size starts out minimal, but in the course of reading it can be increased in order to reduce the number of reads.
It should be noted that the problem of choosing the optimal chunk size has not yet been fully solved and requires deeper study and analysis. Maybe a little later.
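The growth policy used by the script below can be isolated into a few lines: the size starts at the minimum, triples after every three reads, and is clamped at the maximum. The variable names mirror the script; the 12-read run is just for demonstration:

```shell
#!/bin/bash
# Adaptive chunk size: geometric growth with a cap.
let min_item_size=1024
let max_item_size=1048576
let growth_factor=3
let growth_counter_max=3

let size=$min_item_size
let growth_counter=1
sizes=''
for read_no in $(seq 1 12); do
  sizes="$sizes $size"                 # size used for this read
  let growth_counter=$growth_counter+1
  if [[ $growth_counter -gt $growth_counter_max ]]; then
    let size=$size*$growth_factor      # grow after every 3 reads
    let growth_counter=1
  fi
  if [[ $size -gt $max_item_size ]]; then
    let size=$max_item_size            # never exceed the cap
  fi
done
echo $sizes   # 1024 1024 1024 3072 3072 3072 9216 9216 9216 27648 27648 27648
```

Small early reads keep the traffic low when the file turns out to be short; the geometric growth keeps the number of reads logarithmic when it turns out to be long.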
Detailed description of the implementation
Service tables used
CREATE TABLE endpoint
(
id SERIAL ,
host text
);
CREATE TABLE database
(
id SERIAL ,
…
last_aws_log_time text ,
last_aws_nexttoken text ,
aws_max_item_size integer
);
last_aws_log_time — timestamp of the last downloaded log file in the format YYYY-MM-DD-HH24.
last_aws_nexttoken — text label of the last downloaded portion.
aws_max_item_size — the empirically chosen initial chunk size.
Full text of the script
download_aws_piece.sh
#!/bin/bash
#########################################################
# download_aws_piece.sh
# download a piece of log from AWS
# version HABR
let min_item_size=1024
let max_item_size=1048576
let growth_factor=3
let growth_counter=1
let growth_counter_max=3
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:''STARTED'
AWS_LOG_TIME=$1
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:AWS_LOG_TIME='$AWS_LOG_TIME
database_id=$2
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:database_id='$database_id
RESULT_FILE=$3
endpoint=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select e.host from endpoint e join database d on e.id = d.endpoint_id where d.id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:endpoint='$endpoint
db_instance=`echo $endpoint | awk -F"." '{print toupper($1)}'`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:db_instance='$db_instance
LOG_FILE=$RESULT_FILE'.tmp_log'
TMP_FILE=$LOG_FILE'.tmp'
TMP_MIDDLE=$LOG_FILE'.tmp_mid'
TMP_MIDDLE2=$LOG_FILE'.tmp_mid2'
current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:current_aws_log_time='$current_aws_log_time
if [[ $current_aws_log_time != $AWS_LOG_TIME ]];
then
is_new_log='1'
if ! psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - update database set last_aws_log_time .'
exit 1
fi
else
is_new_log='0'
fi
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh:is_new_log='$is_new_log
let last_aws_max_item_size=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select aws_max_item_size from database where id = $database_id "`
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: last_aws_max_item_size='$last_aws_max_item_size
let count=1
if [[ $is_new_log == '1' ]];
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF NEW AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 2
fi
else
next_token=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "select last_aws_nexttoken from database where id = $database_id "`
if [[ $next_token == '' ]];
then
next_token='0'
fi
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: CONTINUE DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 3
fi
line_count=`cat $LOG_FILE | wc -l`
let lines=$line_count-1
tail -$lines $LOG_FILE > $TMP_MIDDLE
mv -f $TMP_MIDDLE $LOG_FILE
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
if [[ $next_token == '' ]];
then
cp $TMP_FILE $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
else
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
fi
first_str=`tail -1 $TMP_FILE`
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
head -$lines $TMP_FILE > $RESULT_FILE
###############################################
# MAIN CIRCLE
let count=2
while [[ $next_token != '' ]];
do
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: count='$count
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 4
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
TMP_FILE=$LOG_FILE'.tmp'
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
last_str=`head -1 $TMP_FILE`
if [[ $next_token == '' ]];
then
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_FILE >> $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
fi
if [[ $next_token != '' ]];
then
let growth_counter=$growth_counter+1
if [[ $growth_counter -gt $growth_counter_max ]];
then
let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
let growth_counter=1
fi
if [[ $last_aws_max_item_size -gt $max_item_size ]];
then
let last_aws_max_item_size=$max_item_size
fi
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
#############################
#Get middle of file
head -$lines $TMP_FILE > $TMP_MIDDLE
line_count=`cat $TMP_MIDDLE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
cat $TMP_MIDDLE2 >> $RESULT_FILE
first_str=`tail -1 $TMP_FILE`
fi
let count=$count+1
done
#
#################################################################
exit 0
Script fragments with some explanations.
Script input parameters:
- Timestamp of the log file name in the format YYYY-MM-DD-HH24: AWS_LOG_TIME=$1
- Database ID: database_id=$2
- Name of the collected log file: RESULT_FILE=$3
Get the timestamp of the last downloaded log file:
current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`
If the timestamp of the last downloaded log file does not match the input parameter, a new log file is downloaded:
if [[ $current_aws_log_time != $AWS_LOG_TIME ]];
then
is_new_log='1'
if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
then
echo '***download_aws_piece.sh -FATAL_ERROR - update database set last_aws_log_time .'
exit 1
fi
else
is_new_log='0'
fi
Get the value of the nexttoken label from the downloaded file:
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
An empty nexttoken value serves as the marker of the end of the download.
In a loop, we read the file portion by portion, gluing lines along the way and increasing the portion size:
The main loop
# MAIN CIRCLE
let count=2
while [[ $next_token != '' ]];
do
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: count='$count
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
if ! aws rds download-db-log-file-portion \
--max-items $last_aws_max_item_size \
--starting-token $next_token \
--region REGION \
--db-instance-identifier $db_instance \
--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
then
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
exit 4
fi
next_token_str=`cat $LOG_FILE | grep NEXTTOKEN`
next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
TMP_FILE=$LOG_FILE'.tmp'
grep -v NEXTTOKEN $LOG_FILE > $TMP_FILE
last_str=`head -1 $TMP_FILE`
if [[ $next_token == '' ]];
then
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_FILE >> $RESULT_FILE
echo $(date +%Y%m%d%H%M)': download_aws_piece.sh: NEXTTOKEN NOT FOUND - FINISH '
rm $LOG_FILE
rm $TMP_FILE
rm $TMP_MIDDLE
rm $TMP_MIDDLE2
exit 0
fi
if [[ $next_token != '' ]];
then
let growth_counter=$growth_counter+1
if [[ $growth_counter -gt $growth_counter_max ]];
then
let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
let growth_counter=1
fi
if [[ $last_aws_max_item_size -gt $max_item_size ]];
then
let last_aws_max_item_size=$max_item_size
fi
psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
concat_str=$first_str$last_str
echo $concat_str >> $RESULT_FILE
line_count=`cat $TMP_FILE | wc -l`
let lines=$line_count-1
#############################
#Get middle of file
head -$lines $TMP_FILE > $TMP_MIDDLE
line_count=`cat $TMP_MIDDLE | wc -l`
let lines=$line_count-1
tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
cat $TMP_MIDDLE2 >> $RESULT_FILE
first_str=`tail -1 $TMP_FILE`
fi
let count=$count+1
done
What next?
So the first intermediate task — "download the log file from the cloud" — is solved. What do we do with the downloaded log?
First, the log file has to be parsed and the actual queries extracted from it.
The task is not very difficult. The simplest bash script handles it quite well.
upload_log_query.sh
#!/bin/bash
#########################################################
# upload_log_query.sh
# Upload log_query table from downloaded aws file
# version HABR
###########################################################
echo 'TIMESTAMP:'$(date +%c)' Upload log_query table '
source_file=$1
echo 'source_file='$source_file
database_id=$2
echo 'database_id='$database_id
beginer=' '
first_line='1'
let "line_count=0"
sql_line=' '
sql_flag=' '
space=' '
cat $source_file | while read line
do
line="$space$line"
if [[ $first_line == "1" ]]; then
beginer=`echo $line | awk -F" " '{ print $1}' `
first_line='0'
fi
current_beginer=`echo $line | awk -F" " '{ print $1}' `
if [[ $current_beginer == $beginer ]]; then
if [[ $sql_flag == '1' ]]; then
sql_flag='0'
log_date=`echo $sql_line | awk -F" " '{ print $1}' `
log_time=`echo $sql_line | awk -F" " '{ print $2}' `
duration=`echo $sql_line | awk -F" " '{ print $5}' `
#replace ' to ''
sql_modline=`echo "$sql_line" | sed 's/'''/''''''/g'`
sql_line=' '
################
#PROCESSING OF THE SQL-SELECT IS HERE
if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d DATABASE -v ON_ERROR_STOP=1 -A -t -c "select log_query('$ip_port',$database_id , '$log_date' , '$log_time' , '$duration' , '$sql_modline' )"
then
echo 'FATAL_ERROR - log_query '
exit 1
fi
################
fi #if [[ $sql_flag == '1' ]]; then
let "line_count=line_count+1"
check=`echo $line | awk -F" " '{ print $8}' `
check_sql=${check^^}
#echo 'check_sql='$check_sql
if [[ $check_sql == 'SELECT' ]]; then
sql_flag='1'
sql_line="$sql_line$line"
ip_port=`echo $sql_line | awk -F":" '{ print $4}' `
fi
else
if [[ $sql_flag == '1' ]]; then
sql_line="$sql_line$line"
fi
fi #if [[ $current_beginer == $beginer ]]; then
done
Now you can work with the queries extracted from the log file.
And quite a few useful possibilities open up.
The parsed queries need to be stored somewhere. A service table log_query is used for this:
CREATE TABLE log_query
(
id SERIAL ,
queryid bigint ,
query_md5hash text not null ,
database_id integer not null ,
timepoint timestamp without time zone not null,
duration double precision not null ,
query text not null ,
explained_plan text[],
plan_md5hash text ,
explained_plan_wo_costs text[],
plan_hash_value text ,
baseline_id integer ,
ip text ,
port text
);
ALTER TABLE log_query ADD PRIMARY KEY (id);
ALTER TABLE log_query ADD CONSTRAINT queryid_timepoint_unique_key UNIQUE (queryid, timepoint );
ALTER TABLE log_query ADD CONSTRAINT query_md5hash_timepoint_unique_key UNIQUE (query_md5hash, timepoint );
CREATE INDEX log_query_timepoint_idx ON log_query (timepoint);
CREATE INDEX log_query_queryid_idx ON log_query (queryid);
ALTER TABLE log_query ADD CONSTRAINT database_id_fk FOREIGN KEY (database_id) REFERENCES database (id) ON DELETE CASCADE ;
The parsed query is processed by the plpgsql function log_query.
log_query.sql
--log_query.sql
--version HABR
CREATE OR REPLACE FUNCTION log_query( ip_port text ,log_database_id integer , log_date text , log_time text , duration text , sql_line text ) RETURNS boolean AS $$
DECLARE
result boolean ;
log_timepoint timestamp without time zone ;
log_duration double precision ;
pos integer ;
log_query text ;
activity_string text ;
log_md5hash text ;
log_explain_plan text[] ;
log_planhash text ;
log_plan_wo_costs text[] ;
database_rec record ;
pg_stat_query text ;
test_log_query text ;
log_query_rec record;
found_flag boolean;
pg_stat_history_rec record ;
port_start integer ;
port_end integer ;
client_ip text ;
client_port text ;
log_queryid bigint ;
log_query_text text ;
pg_stat_query_text text ;
BEGIN
result = TRUE ;
RAISE NOTICE '***log_query';
port_start = position('(' in ip_port);
port_end = position(')' in ip_port);
client_ip = substring( ip_port from 1 for port_start-1 );
client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );
SELECT e.host , d.name , d.owner_pwd
INTO database_rec
FROM database d JOIN endpoint e ON e.id = d.endpoint_id
WHERE d.id = log_database_id ;
log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
log_duration = duration:: double precision;
pos = position ('SELECT' in UPPER(sql_line) );
log_query = substring( sql_line from pos for LENGTH(sql_line));
log_query = regexp_replace(log_query,' +',' ','g');
log_query = regexp_replace(log_query,';+','','g');
log_query = trim(trailing ' ' from log_query);
log_md5hash = md5( log_query::text );
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
--------------------------
BEGIN
INSERT INTO log_query
(
query_md5hash ,
database_id ,
timepoint ,
duration ,
query ,
explained_plan ,
plan_md5hash ,
explained_plan_wo_costs ,
plan_hash_value ,
ip ,
port
)
VALUES
(
log_md5hash ,
log_database_id ,
log_timepoint ,
log_duration ,
log_query ,
log_explain_plan ,
md5(log_explain_plan::text) ,
log_plan_wo_costs ,
md5(log_plan_wo_costs::text),
client_ip ,
client_port
);
activity_string = 'New query has logged '||
' database_id = '|| log_database_id ||
' query_md5hash='||log_md5hash||
' , timepoint = '||to_char(log_timepoint,'YYYYMMDD HH24:MI:SS');
RAISE NOTICE '%',activity_string;
PERFORM pg_log( log_database_id , 'log_query' , activity_string);
EXCEPTION
WHEN unique_violation THEN
RAISE NOTICE '*** unique_violation *** query already has logged';
END;
SELECT queryid
INTO log_queryid
FROM log_query
WHERE query_md5hash = log_md5hash AND
timepoint = log_timepoint;
IF log_queryid IS NOT NULL
THEN
RAISE NOTICE 'log_query with query_md5hash = % and timepoint = % has already has a QUERYID = %',log_md5hash,log_timepoint , log_queryid ;
RETURN result;
END IF;
------------------------------------------------
RAISE NOTICE 'Update queryid';
SELECT *
INTO log_query_rec
FROM log_query
WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
FOR pg_stat_history_rec IN
SELECT
queryid ,
query
FROM
pg_stat_db_queries
WHERE
database_id = log_database_id AND
queryid is not null
LOOP
pg_stat_query = pg_stat_history_rec.query ;
pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
log_query_text = trim(trailing ' ' from log_query_rec.query);
pg_stat_query_text = pg_stat_query;
--SELECT log_query_rec.query like pg_stat_query INTO found_flag ;
IF (log_query_text LIKE pg_stat_query_text) THEN
found_flag = TRUE ;
ELSE
found_flag = FALSE ;
END IF;
IF found_flag THEN
UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
activity_string = ' updated queryid = '||pg_stat_history_rec.queryid||
' for log_query with id = '||log_query_rec.id
;
RAISE NOTICE '%',activity_string;
EXIT ;
END IF ;
END LOOP ;
RETURN result ;
END
$$ LANGUAGE plpgsql;
A service table pg_stat_db_queries is used during processing; it contains a snapshot of the current queries from the pg_stat_history table (the use of the table is described here -
CREATE TABLE pg_stat_db_queries
(
database_id integer,
queryid bigint ,
query text ,
max_time double precision
);
CREATE TABLE pg_stat_history
(
…
database_id integer ,
…
queryid bigint ,
…
max_time double precision ,
…
);
The function lets you implement a number of useful capabilities for processing the queries from the log file. Namely:
Possibility #1 — query execution history
Very useful for starting to investigate performance incidents. First, get acquainted with the history: when did the slowdown begin?
Then, as the classics say, look for external causes. Maybe the database load has increased dramatically and the particular query has nothing to do with it.
Adding a new entry to the log_query table
port_start = position('(' in ip_port);
port_end = position(')' in ip_port);
client_ip = substring( ip_port from 1 for port_start-1 );
client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );
SELECT e.host , d.name , d.owner_pwd
INTO database_rec
FROM database d JOIN endpoint e ON e.id = d.endpoint_id
WHERE d.id = log_database_id ;
log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
log_duration = to_number(duration,'99999999999999999999D9999999999');
pos = position ('SELECT' in UPPER(sql_line) );
log_query = substring( sql_line from pos for LENGTH(sql_line));
log_query = regexp_replace(log_query,' +',' ','g');
log_query = regexp_replace(log_query,';+','','g');
log_query = trim(trailing ' ' from log_query);
RAISE NOTICE 'log_query=%',log_query ;
log_md5hash = md5( log_query::text );
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
--------------------------
BEGIN
INSERT INTO log_query
(
query_md5hash ,
database_id ,
timepoint ,
duration ,
query ,
explained_plan ,
plan_md5hash ,
explained_plan_wo_costs ,
plan_hash_value ,
ip ,
port
)
VALUES
(
log_md5hash ,
log_database_id ,
log_timepoint ,
log_duration ,
log_query ,
log_explain_plan ,
md5(log_explain_plan::text) ,
log_plan_wo_costs ,
md5(log_plan_wo_costs::text),
client_ip ,
client_port
);
Possibility #2 — storing query execution plans
At this point an objection may arise: "But there is already auto_explain." Yes, there is, but what is the point if the execution plan is stored in the same log file and, in order to save it for further analysis, you have to parse that log file?
What I needed was:
first: to store the execution plan in a service table of the monitoring database;
second: to be able to compare execution plans with each other, in order to see immediately that a query's execution plan has changed.
We have a query with specific parameter values. Obtaining and storing its execution plan with EXPLAIN is an elementary task.
Moreover, using EXPLAIN (COSTS FALSE), you can obtain the skeleton of the plan, which is used to compute a hash value of the plan; this helps in the subsequent analysis of the history of changes in the execution plan.
Getting an execution plan template
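The effect is easy to demonstrate without a database: two plans that differ only in cost and row estimates produce different md5 hashes of the full text, while their COSTS FALSE skeletons hash identically. The plan strings below are hand-written to mimic EXPLAIN output, purely for illustration:

```shell
#!/bin/bash
# Full plans differ in estimates; the COSTS FALSE skeleton does not.
plan_a='Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)'
plan_b='Seq Scan on t  (cost=0.00..71.00 rows=5100 width=4)'
skeleton_a='Seq Scan on t'   # what EXPLAIN (COSTS FALSE) would keep
skeleton_b='Seq Scan on t'

hash_full_a=$(echo "$plan_a" | md5sum | awk '{print $1}')
hash_full_b=$(echo "$plan_b" | md5sum | awk '{print $1}')
hash_skel_a=$(echo "$skeleton_a" | md5sum | awk '{print $1}')
hash_skel_b=$(echo "$skeleton_b" | md5sum | awk '{print $1}')

# same logical plan -> same skeleton hash, despite different estimates
echo "full hashes equal:     $([ "$hash_full_a" = "$hash_full_b" ] && echo yes || echo no)"
echo "skeleton hashes equal: $([ "$hash_skel_a" = "$hash_skel_b" ] && echo yes || echo no)"
```

This is exactly why the function stores both md5(log_explain_plan) and md5(log_plan_wo_costs): the latter is stable across runs with different data volumes.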
--Explain execution plan--
EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')';
log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
PERFORM dblink_disconnect('LINK1');
Possibility #3 — using the query log for monitoring
Since the performance metrics are configured not for the query text but for its ID, you need to associate the queries from the log file with the queries for which the performance metrics are configured.
Well, at the very least, in order to know the exact time of a performance incident.
This way, when a performance incident occurs for a query ID, there is a reference to the specific query with specific parameter values, plus the exact execution time and duration of the query. Getting this information using only the pg_stat_statements view is impossible.
Find the queryid of the query and update the entry in the log_query table
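The matching trick used in log_query can be sketched in isolation: the normalized pg_stat_statements text contains $1, $2 placeholders, and replacing each with a wildcard yields a pattern that the concrete logged query matches. The queries below are invented, and bash globbing (*) stands in for SQL LIKE (%):

```shell
#!/bin/bash
# Normalized pg_stat_statements text vs. a concrete query from the log.
pg_stat_query='SELECT * FROM orders WHERE id = $1 AND status = $2'
logged_query="SELECT * FROM orders WHERE id = 42 AND status = 'paid'"

# replace each $N placeholder with a wildcard
# (the plpgsql function uses '%' for its LIKE comparison)
pattern=$(echo "$pg_stat_query" | sed 's/\$[0-9][0-9]*/*/g')

if [[ $logged_query == $pattern ]]; then
  found_flag=1   # queryid of this pg_stat entry belongs to the logged query
else
  found_flag=0
fi
echo "found_flag=$found_flag"
```

In the function, the same idea is applied per candidate row of pg_stat_db_queries, and the first match supplies the queryid written back into log_query.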
SELECT *
INTO log_query_rec
FROM log_query
WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
FOR pg_stat_history_rec IN
SELECT
queryid ,
query
FROM
pg_stat_db_queries
WHERE
database_id = log_database_id AND
queryid is not null
LOOP
pg_stat_query = pg_stat_history_rec.query ;
pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
log_query_text = trim(trailing ' ' from log_query_rec.query);
pg_stat_query_text = pg_stat_query;
--SELECT log_query_rec.query like pg_stat_query INTO found_flag ;
IF (log_query_text LIKE pg_stat_query_text) THEN
found_flag = TRUE ;
ELSE
found_flag = FALSE ;
END IF;
IF found_flag THEN
UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
activity_string = ' updated queryid = '||pg_stat_history_rec.queryid||
' for log_query with id = '||log_query_rec.id
;
RAISE NOTICE '%',activity_string;
EXIT ;
END IF ;
END LOOP ;
Afterword
The described technique eventually found its application in
Although, of course, in my opinion, it would be worth doing more work on the algorithm for choosing and changing the size of the downloaded chunk. The problem is not solved in the general case. It will probably be interesting.
But that is an entirely different story...
source: www.habr.com