የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

ወይም ትንሽ ተተግብሯል tetrisology.
አዲስ ነገር ሁሉ በደንብ የተረሳ አሮጌ ነው።
ኢፒግራፎች።
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

የችግሩ ቀመር

የአሁኑን የPostgreSQL ምዝግብ ማስታወሻ ፋይል ከAWS ደመና ወደ አካባቢያዊ ሊኑክስ አስተናጋጅ ማውረድ አለቦት። በእውነተኛ ጊዜ አይደለም፣ ነገር ግን፣ እንበል፣ በትንሹ መዘግየት።
የምዝግብ ማስታወሻው የማውረድ ጊዜ 5 ደቂቃ ነው።
በAWS ውስጥ ያለው የምዝግብ ማስታወሻ ፋይል በየሰዓቱ ይሽከረከራል።

ያገለገሉ መሣሪያዎች

የምዝግብ ማስታወሻ ፋይሉን ወደ አስተናጋጁ ለማውረድ AWS ኤፒአይን የሚጠራ የባሽ ስክሪፕት ጥቅም ላይ ይውላል።aws rds አውርድ-db-log-file-portion».

መለኪያዎች

  • -db-intance-መለያ፡ AWS ምሳሌ ስም;
  • --log-file-name: በአሁኑ ጊዜ የተፈጠረው የምዝግብ ማስታወሻ ፋይል ስም
  • --max-item፡- በትእዛዝ ውፅዓት ውስጥ የተመለሱት ዕቃዎች ጠቅላላ ብዛት።የወረደው ፋይል ክፍል መጠን።
  • --starting-token፡ ማስመሰያ ማስጀመሪያ

በዚህ ሁኔታ, ምዝግብ ማስታወሻዎችን የመጫን ተግባር በስራው ወቅት ተነሳ የ PostgreSQL መጠይቅ አፈጻጸምን መከታተል።

እና ቀላል ነው - አስደሳች ተግባር ለሥልጠና እና በስራ ሰዓታት ውስጥ ልዩነት።
በዕለት ተዕለት ሕይወት ምክንያት ችግሩ ቀድሞውኑ እንደተፈታ እገምታለሁ. ግን ፈጣን Google ምንም መፍትሄዎችን አልሰጠም, እና በጥልቀት ለመፈለግ ብዙ ፍላጎት አልነበረኝም. ያም ሆነ ይህ ጥሩ የአካል ብቃት እንቅስቃሴ ነው።

ተግባሩን መደበኛ ማድረግ

የመጨረሻው የምዝግብ ማስታወሻ ፋይል ተለዋዋጭ ርዝመት ብዙ መስመሮችን ያካትታል. በግራፊክ፣ የምዝግብ ማስታወሻው ፋይል እንደዚህ ያለ ነገር ሊወከል ይችላል።
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

አስቀድሞ የሆነ ነገር ያስታውሰዎታል? Tetris ከእሱ ጋር ምን ግንኙነት አለው? እና ከሱ ጋር ምን ግንኙነት እንዳለው እነሆ።
የሚቀጥለውን ፋይል በግራፊክ ሲጫኑ ሊፈጠሩ የሚችሉ አማራጮችን ካሰብን (ለቀላልነት ፣ በዚህ ሁኔታ ፣ መስመሮቹ ተመሳሳይ ርዝመት እንዲኖራቸው ያድርጉ) ። መደበኛ Tetris ቁርጥራጮች;

1) ፋይሉ ሙሉ በሙሉ ወርዷል እና የመጨረሻ ነው። የክፍሉ መጠን ከመጨረሻው የፋይል መጠን ይበልጣል፡-
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

2) ፋይሉ ይቀጥላል። የክፈፉ መጠን ከመጨረሻው የፋይል መጠን ያነሰ ነው፡-
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

3) ፋይሉ የቀደመው ፋይል ቀጣይ እና ቀጣይ ነው። የጭራሹ መጠን ከመጨረሻው ፋይል መጠን ያነሰ ነው፡-
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

4) ፋይሉ የቀደመው ፋይል ቀጣይ እና የመጨረሻው ነው። የጭራሹ መጠን ከቀረው የመጨረሻው ፋይል መጠን ይበልጣል፡-
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

ስራው አራት ማእዘንን መሰብሰብ ወይም ቴትሪስን በአዲስ ደረጃ መጫወት ነው.
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

ችግር በሚፈታበት ጊዜ የሚከሰቱ ችግሮች

1) የ 2 ቁርጥራጮችን መስመር ይለጥፉ

የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ
በአጠቃላይ, ምንም ልዩ ችግሮች አልነበሩም. ከመጀመሪያው የፕሮግራም ኮርስ መደበኛ ችግር.

ምርጥ የአገልግሎት መጠን

ግን ይህ ትንሽ የበለጠ አስደሳች ነው።
እንደ አለመታደል ሆኖ ከጅምር ክፍል መለያ በኋላ ማካካሻን ለመጠቀም ምንም መንገድ የለም፡

እርስዎ አስቀድመው እንደሚያውቁት አማራጩ -starting-token የት እንደሚጀመር ለመለየት ይጠቅማል። ይህ አማራጭ የ String እሴቶችን ይወስዳል ይህም ማለት በሚቀጥለው Token ሕብረቁምፊ ፊት ለፊት የማካካሻ እሴት ለመጨመር ከሞከሩ አማራጩ እንደ ማካካሻ አይወሰድም.

እና ስለዚህ, በክፍል ውስጥ ማንበብ አለብዎት.
በትላልቅ ክፍሎች ውስጥ ካነበቡ, የንባብ ብዛት አነስተኛ ይሆናል, ነገር ግን መጠኑ ከፍተኛ ይሆናል.
በትንሽ ክፍሎች ካነበቡ, በተቃራኒው, የንባብ ብዛት ከፍተኛ ይሆናል, ነገር ግን መጠኑ አነስተኛ ይሆናል.
ስለዚህ, ትራፊክን ለመቀነስ እና ለመፍትሄው አጠቃላይ ውበት, አንድ መፍትሄ ማምጣት ነበረብኝ, በሚያሳዝን ሁኔታ, እንደ ክራንች ትንሽ ይመስላል.

ለማሳያ ያህል፣ የሎግ ፋይልን በ2 በጣም ቀላል ስሪቶች የማውረድ ሂደቱን እናስብ። በሁለቱም ሁኔታዎች ውስጥ ያለው የንባብ ብዛት በክፍል መጠን ይወሰናል.

1) በትንሽ ክፍሎች ይጫኑ;
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

2) በትላልቅ ክፍሎች መጫን;
የ PostgreSQL ምዝግብ ማስታወሻን ከAWS ደመና በመስቀል ላይ

እንደተለመደው, ጥሩው መፍትሄ በመሃል ላይ ነው.
የአገልግሎት መጠኑ አነስተኛ ነው, ነገር ግን በንባብ ሂደት ውስጥ, የንባብ ብዛትን ለመቀነስ መጠኑን መጨመር ይቻላል.

መታወቅ አለበት የሚነበበው ክፍል ትክክለኛውን መጠን የመምረጥ ችግር ገና አልተፈታም እና የበለጠ ጥልቅ ጥናት እና ትንተና ይጠይቃል። ምናልባት ትንሽ ቆይቶ ሊሆን ይችላል.

የአተገባበሩ አጠቃላይ መግለጫ

ጥቅም ላይ የዋሉ የአገልግሎት ጠረጴዛዎች

CREATE TABLE endpoint
(
id SERIAL ,
host text 
);

TABLE database
(
id SERIAL , 
…
last_aws_log_time text ,
last_aws_nexttoken text ,
aws_max_item_size integer 
);
last_aws_log_time — временная метка последнего загруженного лог-файла в формате YYYY-MM-DD-HH24.
last_aws_nexttoken — текстовая метка последней загруженной порции.
aws_max_item_size- эмпирическим путем, подобранный начальный размер порции.

ሙሉ ስክሪፕት ጽሑፍ

አውርድ_አውስ_ቁራጭ.sh

#!/bin/bash
#########################################################
# download_aws_piece.sh
# downloan piece of log from AWS
# version HABR
 let min_item_size=1024
 let max_item_size=1048576
 let growth_factor=3
 let growth_counter=1
 let growth_counter_max=3

 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:''STARTED'
 
 AWS_LOG_TIME=$1
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:AWS_LOG_TIME='$AWS_LOG_TIME
  
 database_id=$2
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:database_id='$database_id
 RESULT_FILE=$3 
  
 endpoint=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE_DATABASE -A -t -c "select e.host from endpoint e join database d on e.id = d.endpoint_id where d.id = $database_id "`
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:endpoint='$endpoint
  
 db_instance=`echo $endpoint | awk -F"." '{print toupper($1)}'`
 
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:db_instance='$db_instance

 LOG_FILE=$RESULT_FILE'.tmp_log'
 TMP_FILE=$LOG_FILE'.tmp'
 TMP_MIDDLE=$LOG_FILE'.tmp_mid'  
 TMP_MIDDLE2=$LOG_FILE'.tmp_mid2'  
  
 current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`

 echo $(date +%Y%m%d%H%M)':      download_aws_piece.sh:current_aws_log_time='$current_aws_log_time
  
  if [[ $current_aws_log_time != $AWS_LOG_TIME  ]];
  then
    is_new_log='1'
	if ! psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
	then
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - update database set last_aws_log_time .'
	  exit 1
	fi
  else
    is_new_log='0'
  fi
  
  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:is_new_log='$is_new_log
  
  let last_aws_max_item_size=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select aws_max_item_size from database where id = $database_id "`
  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: last_aws_max_item_size='$last_aws_max_item_size
  
  let count=1
  if [[ $is_new_log == '1' ]];
  then    
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF NEW AWS LOG'
	if ! aws rds download-db-log-file-portion 
		--max-items $last_aws_max_item_size 
		--region REGION 
		--db-instance-identifier  $db_instance 
		--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 2
	fi  	
  else
    next_token=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "select last_aws_nexttoken from database where id = $database_id "`
	
	if [[ $next_token == '' ]];
	then
	  next_token='0'	  
	fi
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: CONTINUE DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion 
	    --max-items $last_aws_max_item_size 
		--starting-token $next_token 
		--region REGION 
		--db-instance-identifier  $db_instance 
		--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 3
	fi       
	
	line_count=`cat  $LOG_FILE | wc -l`
	let lines=$line_count-1
	  
	tail -$lines $LOG_FILE > $TMP_MIDDLE 
	mv -f $TMP_MIDDLE $LOG_FILE
  fi
  
  next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
  next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
  
  grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE 
  
  if [[ $next_token == '' ]];
  then
	  cp $TMP_FILE $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
          rm $TMP_MIDDLE2	  
	  exit 0  
  else
	psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
  fi
  
  first_str=`tail -1 $TMP_FILE`
  
  line_count=`cat  $TMP_FILE | wc -l`
  let lines=$line_count-1    
  
  head -$lines $TMP_FILE  > $RESULT_FILE

###############################################
# MAIN CIRCLE
  let count=2
  while [[ $next_token != '' ]];
  do 
    echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: count='$count
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion 
             --max-items $last_aws_max_item_size 
             --starting-token $next_token 
             --region REGION 
             --db-instance-identifier  $db_instance 
             --log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 4
	fi

	next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
	next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

	TMP_FILE=$LOG_FILE'.tmp'
	grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE  
	
	last_str=`head -1 $TMP_FILE`
  
    if [[ $next_token == '' ]];
	then
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  tail -$lines $TMP_FILE >> $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
          rm $TMP_MIDDLE2	  
	  exit 0  
	fi
	
    if [[ $next_token != '' ]];
	then
		let growth_counter=$growth_counter+1
		if [[ $growth_counter -gt $growth_counter_max ]];
		then
			let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
			let growth_counter=1
		fi
	
		if [[ $last_aws_max_item_size -gt $max_item_size ]]; 
		then
			let last_aws_max_item_size=$max_item_size
		fi 

	  psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
	  
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  #############################
	  #Get middle of file
	  head -$lines $TMP_FILE > $TMP_MIDDLE
	  
	  line_count=`cat  $TMP_MIDDLE | wc -l`
	  let lines=$line_count-1
	  tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
	  
	  cat $TMP_MIDDLE2 >> $RESULT_FILE	  
	  
	  first_str=`tail -1 $TMP_FILE`	  
	fi
	  
    let count=$count+1

  done
#
#################################################################

exit 0  

የስክሪፕት ቁርጥራጮች ከአንዳንድ ማብራሪያዎች ጋር፡-

የስክሪፕት ግቤት መለኪያዎች፡-

  • የጊዜ ማህተም የምዝግብ ማስታወሻ ፋይል ስም በዓዓዓዓ-ወወ-DD-HH24 ቅርጸት፡ AWS_LOG_TIME=$1
  • የውሂብ ጎታ መታወቂያ፡ ዳታቤዝ_id=$2
  • የተሰበሰበው የምዝግብ ማስታወሻ ፋይል ስም፡ RESULT_FILE=$3

የመጨረሻውን የተጫነ የምዝግብ ማስታወሻ ፋይል የጊዜ ማህተም ያግኙ፡-

current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`

የመጨረሻው የተጫነ የምዝግብ ማስታወሻ ፋይል የጊዜ ማህተም ከግቤት መለኪያው ጋር የማይዛመድ ከሆነ አዲስ የምዝግብ ማስታወሻ ፋይል ይጫናል፡

if [[ $current_aws_log_time != $AWS_LOG_TIME  ]];
  then
    is_new_log='1'
	if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
	then
	  echo '***download_aws_piece.sh -FATAL_ERROR - update database set last_aws_log_time .'
	  exit 1
	fi
  else
    is_new_log='0'
  fi

ከወረደው ፋይል የሚቀጥለውን ቶከን መለያ ዋጋ እናገኛለን፡-

  next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
  next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

ባዶ ቀጣይ ማስመሰያ ዋጋ እንደ ማውረዱ መጨረሻ ምልክት ሆኖ ያገለግላል።

በአንድ ዙር የፋይሉን ክፍሎች እንቆጥራለን፣ በመንገዱ ላይ መስመሮችን በማጣመር እና የክፍሉን መጠን እንጨምራለን፡
ዋና ዙር

# MAIN CIRCLE
  let count=2
  while [[ $next_token != '' ]];
  do 
    echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: count='$count
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion 
     --max-items $last_aws_max_item_size 
	 --starting-token $next_token 
     --region REGION 
     --db-instance-identifier  $db_instance 
     --log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 4
	fi

	next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
	next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

	TMP_FILE=$LOG_FILE'.tmp'
	grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE  
	
	last_str=`head -1 $TMP_FILE`
  
    if [[ $next_token == '' ]];
	then
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  tail -$lines $TMP_FILE >> $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
         rm $TMP_MIDDLE2	  
	  exit 0  
	fi
	
    if [[ $next_token != '' ]];
	then
		let growth_counter=$growth_counter+1
		if [[ $growth_counter -gt $growth_counter_max ]];
		then
			let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
			let growth_counter=1
		fi
	
		if [[ $last_aws_max_item_size -gt $max_item_size ]]; 
		then
			let last_aws_max_item_size=$max_item_size
		fi 

	  psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
	  
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  #############################
	  #Get middle of file
	  head -$lines $TMP_FILE > $TMP_MIDDLE
	  
	  line_count=`cat  $TMP_MIDDLE | wc -l`
	  let lines=$line_count-1
	  tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
	  
	  cat $TMP_MIDDLE2 >> $RESULT_FILE	  
	  
	  first_str=`tail -1 $TMP_FILE`	  
	fi
	  
    let count=$count+1

  done

ቀጥሎ ምን አለ?

ስለዚህ, የመጀመሪያው መካከለኛ ተግባር - "የመዝገብ ፋይልን ከደመናው ያውርዱ" ተፈትቷል. በወረደው መዝገብ ምን ይደረግ?
በመጀመሪያ, የምዝግብ ማስታወሻ ፋይሉን መተንተን እና ትክክለኛ ጥያቄዎችን ከእሱ ማውጣት ያስፈልግዎታል.
ስራው በጣም አስቸጋሪ አይደለም. በጣም ቀላሉ የባሽ ስክሪፕት ስራውን በጥሩ ሁኔታ ይሰራል።
upload_log_query.sh

#!/bin/bash
#########################################################
# upload_log_query.sh
# Upload table table from dowloaded aws file 
# version HABR
###########################################################  
echo 'TIMESTAMP:'$(date +%c)' Upload log_query table '
source_file=$1
echo 'source_file='$source_file
database_id=$2
echo 'database_id='$database_id

beginer=' '
first_line='1'
let "line_count=0"
sql_line=' '
sql_flag=' '    
space=' '
cat $source_file | while read line
do
  line="$space$line"

  if [[ $first_line == "1" ]]; then
    beginer=`echo $line | awk -F" " '{ print $1}' `
    first_line='0'
  fi

  current_beginer=`echo $line | awk -F" " '{ print $1}' `

  if [[ $current_beginer == $beginer ]]; then
    if [[ $sql_flag == '1' ]]; then
     sql_flag='0' 
     log_date=`echo $sql_line | awk -F" " '{ print $1}' `
     log_time=`echo $sql_line | awk -F" " '{ print $2}' `
     duration=`echo $sql_line | awk -F" " '{ print $5}' `

     #replace ' to ''
     sql_modline=`echo "$sql_line" | sed 's/'''/''''''/g'`
     sql_line=' '

	 ################
	 #PROCESSING OF THE SQL-SELECT IS HERE
     if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d DATABASE -v ON_ERROR_STOP=1 -A -t -c "select log_query('$ip_port',$database_id , '$log_date' , '$log_time' , '$duration' , '$sql_modline' )" 
     then
        echo 'FATAL_ERROR - log_query '
        exit 1
     fi
	 ################

    fi #if [[ $sql_flag == '1' ]]; then

    let "line_count=line_count+1"

    check=`echo $line | awk -F" " '{ print $8}' `
    check_sql=${check^^}    

    #echo 'check_sql='$check_sql
    
    if [[ $check_sql == 'SELECT' ]]; then
     sql_flag='1'    
     sql_line="$sql_line$line"
	 ip_port=`echo $sql_line | awk -F":" '{ print $4}' `
    fi
  else       

    if [[ $sql_flag == '1' ]]; then
      sql_line="$sql_line$line"
    fi   
    
  fi #if [[ $current_beginer == $beginer ]]; then

done

አሁን ከመዝገብ ፋይሉ ከተመረጠው ጥያቄ ጋር መስራት ይችላሉ.

እና ብዙ ጠቃሚ እድሎች ይከፈታሉ.

የተተነተኑ መጠይቆች የሆነ ቦታ መቀመጥ አለባቸው። ለዚህ አገልግሎት ሰንጠረዥ ጥቅም ላይ ይውላል የምዝግብ ማስታወሻ

CREATE TABLE log_query
(
   id SERIAL ,
   queryid bigint ,
   query_md5hash text not null ,
   database_id integer not null ,  
   timepoint timestamp without time zone not null,
   duration double precision not null ,
   query text not null ,
   explained_plan text[],
   plan_md5hash text  , 
   explained_plan_wo_costs text[],
   plan_hash_value text  ,
   baseline_id integer ,
   ip text ,
   port text 
);
ALTER TABLE log_query ADD PRIMARY KEY (id);
ALTER TABLE log_query ADD CONSTRAINT queryid_timepoint_unique_key UNIQUE (queryid, timepoint );
ALTER TABLE log_query ADD CONSTRAINT query_md5hash_timepoint_unique_key UNIQUE (query_md5hash, timepoint );

CREATE INDEX log_query_timepoint_idx ON log_query (timepoint);
CREATE INDEX log_query_queryid_idx ON log_query (queryid);
ALTER TABLE log_query ADD CONSTRAINT database_id_fk FOREIGN KEY (database_id) REFERENCES database (id) ON DELETE CASCADE ;

የተተነተነው ጥያቄ በሂደት ላይ ነው። plpgsql ተግባራት "የምዝግብ ማስታወሻ».
log_query.sql

--log_query.sql
--verison HABR
CREATE OR REPLACE FUNCTION log_query( ip_port text ,log_database_id integer , log_date text , log_time text , duration text , sql_line text   ) RETURNS boolean AS $$
DECLARE
  result boolean ;
  log_timepoint timestamp without time zone ;
  log_duration double precision ; 
  pos integer ;
  log_query text ;
  activity_string text ;
  log_md5hash text ;
  log_explain_plan text[] ;
  
  log_planhash text ;
  log_plan_wo_costs text[] ; 
  
  database_rec record ;
  
  pg_stat_query text ; 
  test_log_query text ;
  log_query_rec record;
  found_flag boolean;
  
  pg_stat_history_rec record ;
  port_start integer ;
  port_end integer ;
  client_ip text ;
  client_port text ;
  log_queryid bigint ;
  log_query_text text ;
  pg_stat_query_text text ; 
BEGIN
  result = TRUE ;

  RAISE NOTICE '***log_query';
  
  port_start = position('(' in ip_port);
  port_end = position(')' in ip_port);
  client_ip = substring( ip_port from 1 for port_start-1 );
  client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );

  SELECT e.host , d.name , d.owner_pwd 
  INTO database_rec
  FROM database d JOIN endpoint e ON e.id = d.endpoint_id
  WHERE d.id = log_database_id ;
  
  log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
  log_duration = duration:: double precision; 

  
  pos = position ('SELECT' in UPPER(sql_line) );
  log_query = substring( sql_line from pos for LENGTH(sql_line));
  log_query = regexp_replace(log_query,' +',' ','g');
  log_query = regexp_replace(log_query,';+','','g');
  log_query = trim(trailing ' ' from log_query);
 

  log_md5hash = md5( log_query::text );
  
  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');
  --------------------------
  BEGIN
	INSERT INTO log_query
	(
		query_md5hash ,
		database_id , 
		timepoint ,
		duration ,
		query ,
		explained_plan ,
		plan_md5hash , 
		explained_plan_wo_costs , 
		plan_hash_value , 
		ip , 
		port
	) 
	VALUES 
	(
		log_md5hash ,
		log_database_id , 
		log_timepoint , 
		log_duration , 
		log_query ,
		log_explain_plan , 
		md5(log_explain_plan::text) ,
		log_plan_wo_costs , 
		md5(log_plan_wo_costs::text),
		client_ip , 
		client_port		
	);
	activity_string = 	'New query has logged '||
						' database_id = '|| log_database_id ||
						' query_md5hash='||log_md5hash||
						' , timepoint = '||to_char(log_timepoint,'YYYYMMDD HH24:MI:SS');
					
	RAISE NOTICE '%',activity_string;					
					 
	PERFORM pg_log( log_database_id , 'log_query' , activity_string);  

	EXCEPTION
	  WHEN unique_violation THEN
		RAISE NOTICE '*** unique_violation *** query already has logged';
	END;

	SELECT 	queryid
	INTO   	log_queryid
	FROM 	log_query 
	WHERE 	query_md5hash = log_md5hash AND
			timepoint = log_timepoint;

	IF log_queryid IS NOT NULL 
	THEN 
	  RAISE NOTICE 'log_query with query_md5hash = % and timepoint = % has already has a QUERYID = %',log_md5hash,log_timepoint , log_queryid ;
	  RETURN result;
	END IF;
	
	------------------------------------------------
	RAISE NOTICE 'Update queryid';	
	
	SELECT * 
	INTO log_query_rec
	FROM log_query
	WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ; 
	
	log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
	
	FOR pg_stat_history_rec IN
	 SELECT 
         queryid ,
	  query 
	 FROM 
         pg_stat_db_queries 
     WHERE  
      database_id = log_database_id AND
       queryid is not null 
	LOOP
	  pg_stat_query = pg_stat_history_rec.query ; 
	  pg_stat_query=regexp_replace(pg_stat_query,'n+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'t+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'$.','%','g');
	
	  log_query_text = trim(trailing ' ' from log_query_rec.query);
	  pg_stat_query_text = pg_stat_query; 
	
	  
	  --SELECT log_query_rec.query like pg_stat_query INTO found_flag ; 
	  IF (log_query_text LIKE pg_stat_query_text) THEN
		found_flag = TRUE ;
	  ELSE
		found_flag = FALSE ;
	  END IF;	  
	  
	  
	  IF found_flag THEN
	    
		UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
		activity_string = 	' updated queryid = '||pg_stat_history_rec.queryid||
		                    ' for log_query with id = '||log_query_rec.id               
		   				    ;						
	    RAISE NOTICE '%',activity_string;	
		EXIT ;
	  END IF ;
	  
	END LOOP ;
	
  RETURN result ;
END
$$ LANGUAGE plpgsql;

በሂደቱ ወቅት የአገልግሎት ሰንጠረዥ ጥቅም ላይ ይውላል pg_stat_db_ጥያቄዎችከሰንጠረዡ ላይ የወቅታዊ መጠይቆችን ቅጽበታዊ ፎቶ የያዘ pg_stat_ታሪክ (የሰንጠረዡ አጠቃቀም እዚህ ተብራርቷል - የPostgreSQL መጠይቆችን የአፈጻጸም ክትትል። ክፍል 1 - ሪፖርት ማድረግ)

TABLE pg_stat_db_queries
(
   database_id integer,  
   queryid bigint ,  
   query text , 
   max_time double precision 
);

TABLE pg_stat_history 
(
…
database_id integer ,
…
queryid bigint ,
…
max_time double precision	 , 	
…
);

ተግባሩ ከሎግ ፋይል ጥያቄዎችን ለማስኬድ ብዙ ጠቃሚ ችሎታዎችን እንዲተገብሩ ያስችልዎታል። ይኸውም፡-

ዕድል #1 - የጥያቄ አፈፃፀም ታሪክ

የአፈጻጸም ችግርን ለመፍታት ለመጀመር በጣም ጠቃሚ ነው። በመጀመሪያ ከታሪክ ጋር መተዋወቅ - መቀዛቀዝ የጀመረው መቼ ነው?
ከዚያም, እንደ ክላሲኮች, ውጫዊ ምክንያቶችን ይፈልጉ. ምናልባት የመረጃ ቋቱ ጭነት በቀላሉ በከፍተኛ ሁኔታ ጨምሯል እና የተለየ ጥያቄ ከእሱ ጋር ምንም ግንኙነት የለውም.
ወደ log_query ሰንጠረዥ አዲስ ግቤት ያክሉ

  port_start = position('(' in ip_port);
  port_end = position(')' in ip_port);
  client_ip = substring( ip_port from 1 for port_start-1 );
  client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );

  SELECT e.host , d.name , d.owner_pwd 
  INTO database_rec
  FROM database d JOIN endpoint e ON e.id = d.endpoint_id
  WHERE d.id = log_database_id ;
  
  log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
  log_duration = to_number(duration,'99999999999999999999D9999999999'); 

  
  pos = position ('SELECT' in UPPER(sql_line) );
  log_query = substring( sql_line from pos for LENGTH(sql_line));
  log_query = regexp_replace(log_query,' +',' ','g');
  log_query = regexp_replace(log_query,';+','','g');
  log_query = trim(trailing ' ' from log_query);
 
  RAISE NOTICE 'log_query=%',log_query ;   

  log_md5hash = md5( log_query::text );
  
  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');
  --------------------------
  BEGIN
	INSERT INTO log_query
	(
		query_md5hash ,
		database_id , 
		timepoint ,
		duration ,
		query ,
		explained_plan ,
		plan_md5hash , 
		explained_plan_wo_costs , 
		plan_hash_value , 
		ip , 
		port
	) 
	VALUES 
	(
		log_md5hash ,
		log_database_id , 
		log_timepoint , 
		log_duration , 
		log_query ,
		log_explain_plan , 
		md5(log_explain_plan::text) ,
		log_plan_wo_costs , 
		md5(log_plan_wo_costs::text),
		client_ip , 
		client_port		
	);

ዕድል #2 - የመጠይቅ አፈፃፀም እቅዶችን ያስቀምጡ

በዚህ ጊዜ ተቃውሞ-ማብራሪያ-አስተያየት ሊነሳ ይችላል፡- “ግን አስቀድሞ በራስ ማብራራት አለ።" አዎ፣ እዚያ አለ፣ ነገር ግን የማስፈጸሚያ ዕቅዱ በተመሳሳይ የመዝገብ መዝገብ ውስጥ ከተከማቸ እና ለበለጠ ትንተና ለማስቀመጥ የሎግ ፋይሉን መተንተን ካለቦት ምን ዋጋ አለው?

የሚያስፈልገኝ፡-
በመጀመሪያ: የአፈፃፀም እቅዱን በክትትል ዳታቤዝ የአገልግሎት ሠንጠረዥ ውስጥ ማከማቸት;
በሁለተኛ ደረጃ: የጥያቄ አፈፃፀም እቅድ እንደተለወጠ ወዲያውኑ ለማየት የማስፈጸሚያ እቅዶችን እርስ በርስ ማወዳደር መቻል.

የተወሰኑ የማስፈጸሚያ መለኪያዎች ያለው ጥያቄ አለ። EXPLAINን በመጠቀም የማስፈጸሚያ እቅዱን ማግኘት እና ማስቀመጥ የመጀመሪያ ደረጃ ተግባር ነው።
ከዚህም በላይ ኤክስፕላይን (ወጪ የውሸት) አገላለጽ በመጠቀም የዕቅዱን ሃሽ ዋጋ ለማግኘት የሚጠቅመውን የዕቅዱ አጽም ማግኘት ይችላሉ፣ ይህም በአፈጻጸም ዕቅዱ ላይ የተደረጉ ለውጦችን ታሪክ በቀጣይ ትንተና ይረዳል።
የማስፈጸሚያ እቅድ አብነት ያግኙ

  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');

ዕድል #3 - ለመከታተል የጥያቄ ምዝግብ ማስታወሻን መጠቀም

የአፈጻጸም መለኪያዎች የሚዋቀሩት በጥያቄው ጽሑፍ ላይ ሳይሆን በመታወቂያው ላይ ስለሆነ፣ ከሎግ ፋይል የሚቀርቡ ጥያቄዎችን የአፈጻጸም መለኪያዎች ከተዋቀሩ ጥያቄዎች ጋር ማያያዝ አለብዎት።
ደህና፣ ቢያንስ የአፈጻጸም ክስተት የተከሰተበትን ትክክለኛ ጊዜ ለማግኘት።

በዚህ መንገድ፣ ለጥያቄ መታወቂያ የአፈጻጸም ክስተት ሲከሰት፣ ከተወሰኑ መለኪያዎች እሴቶች እና የጥያቄው ትክክለኛ የማስፈጸሚያ ጊዜ እና የቆይታ ጊዜ ጋር ለአንድ የተወሰነ ጥያቄ አገናኝ ይኖራል። እይታውን ብቻ በመጠቀም ይህንን መረጃ ያግኙ pg_stat_አረፍተ ነገሮች - የተከለከለ ነው።
የጥያቄውን መጠየቂያ ይፈልጉ እና በ log_query ሠንጠረዥ ውስጥ ያለውን ግቤት ያዘምኑ

SELECT * 
	INTO log_query_rec
	FROM log_query
	WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ; 
	
	log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
	
	FOR pg_stat_history_rec IN
	 SELECT 
      queryid ,
	  query 
	 FROM 
       pg_stat_db_queries 
     WHERE  
	   database_id = log_database_id AND
       queryid is not null 
	LOOP
	  pg_stat_query = pg_stat_history_rec.query ; 
	  pg_stat_query=regexp_replace(pg_stat_query,'n+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'t+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'$.','%','g');
	
	  log_query_text = trim(trailing ' ' from log_query_rec.query);
	  pg_stat_query_text = pg_stat_query; 
	  
	  --SELECT log_query_rec.query like pg_stat_query INTO found_flag ; 
	  IF (log_query_text LIKE pg_stat_query_text) THEN
		found_flag = TRUE ;
	  ELSE
		found_flag = FALSE ;
	  END IF;	  
	  
	  
	  IF found_flag THEN
	    
		UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
		activity_string = 	' updated queryid = '||pg_stat_history_rec.queryid||
		                    ' for log_query with id = '||log_query_rec.id		                    
		   				    ;						
					
	    RAISE NOTICE '%',activity_string;	
		EXIT ;
	  END IF ;
	  
	END LOOP ;

ከቃል በኋላ

የተገለጸው ቴክኒክ በመጨረሻ ትግበራ ተገኝቷል የተገነባው የ PostgreSQL መጠይቅ የአፈፃፀም ክትትል ስርዓትብቅ ያሉ የመጠይቅ ክንዋኔዎችን ሲፈቱ ለመተንተን ተጨማሪ መረጃ እንዲኖርዎት ያስችላል።

ምንም እንኳን በእርግጥ, በግሌ አስተያየት, የወረደውን ክፍል መጠን ለመምረጥ እና ለመለወጥ በአልጎሪዝም ላይ የበለጠ መስራት አስፈላጊ ይሆናል. በአጠቃላይ ችግሩ እስካሁን አልተፈታም. ምናልባት አስደሳች ይሆናል.

ግን ይህ ፈጽሞ የተለየ ታሪክ ነው ...

ምንጭ: hab.com

አስተያየት ያክሉ