AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

๋˜๋Š” ์•ฝ๊ฐ„์˜ ์‘์šฉ Tetrisology.
์ƒˆ๋กœ์šด ๋ชจ๋“  ๊ฒƒ์€ ์ž˜ ์žŠํ˜€์ง„ ์˜ค๋ž˜๋œ ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์„œ๋ฌธ.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

Problem statement

AWS ํด๋ผ์šฐ๋“œ์—์„œ ๋กœ์ปฌ Linux ํ˜ธ์ŠคํŠธ๋กœ ํ˜„์žฌ PostgreSQL ๋กœ๊ทธ ํŒŒ์ผ์„ ์ฃผ๊ธฐ์ ์œผ๋กœ ๋‹ค์šด๋กœ๋“œํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์‹ค์‹œ๊ฐ„์€ ์•„๋‹ˆ์ง€๋งŒ ์•ฝ๊ฐ„์˜ ์ง€์—ฐ์ด ์žˆ์Šต๋‹ˆ๋‹ค.
๋กœ๊ทธ ํŒŒ์ผ ์—…๋ฐ์ดํŠธ ๋‹ค์šด๋กœ๋“œ ์‹œ๊ฐ„์€ 5๋ถ„์ž…๋‹ˆ๋‹ค.
AWS์˜ ๋กœ๊ทธ ํŒŒ์ผ์€ ๋งค์‹œ๊ฐ„ ๊ต์ฒด๋ฉ๋‹ˆ๋‹ค.

์‚ฌ์šฉ ๋œ ๋„๊ตฌ

To upload the log file to the host, a bash script is used that calls the AWS API command "aws rds download-db-log-file-portion" (a sample invocation is sketched after the parameter list).

Parameters:

  • --db-instance-identifier: the name of the instance in AWS.
  • --log-file-name: the name of the log file currently being generated.
  • --max-items: the total number of items returned in the command output, i.e. the chunk size of the downloaded portion.
  • --starting-token: the token from which to start returning the portion.
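
For orientation, a single call looks roughly like this (the region, instance name, log file name and chunk size are placeholders; --starting-token is omitted on the very first call, and text output is assumed, since the script below greps a NEXTTOKEN line from the output):

aws rds download-db-log-file-portion \
    --region eu-west-1 \
    --db-instance-identifier MY-INSTANCE \
    --log-file-name error/postgresql.log.2019-01-01-10 \
    --max-items 1000 \
    --output text > postgresql.log.part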

In this particular case, the log download task arose in the course of work on PostgreSQL query performance monitoring.

Yes, and in general it is an instructive and interesting little task for practice and variety during working hours.
I will assume that, being routine, the problem has already been solved by someone. But a quick Google search suggested no solutions, and there was no particular desire to dig deeper. In any case, it is a good workout.

์ž‘์—…์˜ ๊ณต์‹ํ™”

The final log file is a set of lines of variable length. Graphically, the log file can be pictured as a stack of lines of varying length.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

Does this remind you of something already? What does "Tetris" have to do with it? Here is what.
If we picture the possible situations that arise when loading the next file graphically (for simplicity, assuming the lines all have the same length), we get the standard Tetris pieces:

1) ํŒŒ์ผ์€ ์ „์ฒด๊ฐ€ ๋‹ค์šด๋กœ๋“œ๋˜๋ฉฐ ์ตœ์ข… ํŒŒ์ผ์ž…๋‹ˆ๋‹ค. ์ฒญํฌ ํฌ๊ธฐ๊ฐ€ ์ตœ์ข… ํŒŒ์ผ ํฌ๊ธฐ๋ณด๋‹ค ํฝ๋‹ˆ๋‹ค.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

2) ํŒŒ์ผ์— ์—ฐ์†์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ฒญํฌ ํฌ๊ธฐ๊ฐ€ ์ตœ์ข… ํŒŒ์ผ ํฌ๊ธฐ๋ณด๋‹ค ์ž‘์Šต๋‹ˆ๋‹ค.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

3) ํŒŒ์ผ์€ ์ด์ „ ํŒŒ์ผ์˜ ์—ฐ์†์ด๋ฉฐ ์—ฐ์†์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ฒญํฌ ํฌ๊ธฐ๋Š” ์ตœ์ข… ํŒŒ์ผ์˜ ๋‚˜๋จธ์ง€ ํฌ๊ธฐ๋ณด๋‹ค ์ž‘์Šต๋‹ˆ๋‹ค.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

4) ํŒŒ์ผ์€ ์ด์ „ ํŒŒ์ผ์˜ ์—ฐ์†์ด๋ฉฐ ์ตœ์ข… ํŒŒ์ผ์ž…๋‹ˆ๋‹ค. ์ฒญํฌ ํฌ๊ธฐ๋Š” ์ตœ์ข… ํŒŒ์ผ์˜ ๋‚˜๋จธ์ง€ ํฌ๊ธฐ๋ณด๋‹ค ํฝ๋‹ˆ๋‹ค.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

์ž‘์—…์€ ์ง์‚ฌ๊ฐํ˜•์„ ์กฐ๋ฆฝํ•˜๊ฑฐ๋‚˜ ์ƒˆ๋กœ์šด ์ˆ˜์ค€์—์„œ ํ…ŒํŠธ๋ฆฌ์Šค๋ฅผ ํ”Œ๋ ˆ์ดํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

Problems arising in the course of the solution

1) 2 ๋ถ€๋ถ„์˜ ๋ˆ์„ ๋ถ™์ž…๋‹ˆ๋‹ค.

AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ
์ผ๋ฐ˜์ ์œผ๋กœ ํŠน๋ณ„ํ•œ ๋ฌธ์ œ๋Š” ์—†์—ˆ์Šต๋‹ˆ๋‹ค. ์ดˆ๊ธฐ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ๊ณผ์ •์˜ ํ‘œ์ค€ ์ž‘์—…์ž…๋‹ˆ๋‹ค.

์ตœ์ ์˜ ์„œ๋น™ ํฌ๊ธฐ

๊ทธ๋Ÿฌ๋‚˜ ์ด๊ฒƒ์€ ์กฐ๊ธˆ ๋” ํฅ๋ฏธ ๋กญ์Šต๋‹ˆ๋‹ค.
์•ˆํƒ€๊น๊ฒŒ๋„ ์‹œ์ž‘ ์ฒญํฌ ๋ ˆ์ด๋ธ” ๋’ค์— ์˜คํ”„์…‹์„ ์‚ฌ์šฉํ•  ๋ฐฉ๋ฒ•์ด ์—†์Šต๋‹ˆ๋‹ค.

As you already know, the --starting-token option is used to specify where to start paginating. This option takes String values, which means that if you try to put an offset value in front of the next-token string, the option will not treat it as an offset.

๋”ฐ๋ผ์„œ ์ฒญํฌ ๋ถ€๋ถ„์„ ์ฝ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
๋งŽ์€ ๋ถ€๋ถ„์„ ์ฝ์œผ๋ฉด ํŒ๋… ํšŸ์ˆ˜๋Š” ์ตœ์†Œํ™”๋˜์ง€๋งŒ ๋ณผ๋ฅจ์€ ์ตœ๋Œ€๊ฐ€ ๋ฉ๋‹ˆ๋‹ค.
์ž‘์€ ๋ถ€๋ถ„์„ ์ฝ์œผ๋ฉด ๋ฐ˜๋Œ€๋กœ ์ฝ๊ธฐ ํšŸ์ˆ˜๋Š” ์ตœ๋Œ€๊ฐ€๋˜์ง€๋งŒ ๋ณผ๋ฅจ์€ ์ตœ์†Œํ™”๋ฉ๋‹ˆ๋‹ค.
๋”ฐ๋ผ์„œ ํŠธ๋ž˜ํ”ฝ์„ ์ค„์ด๊ณ  ์†”๋ฃจ์…˜์˜ ์ „๋ฐ˜์ ์ธ ์•„๋ฆ„๋‹ค์›€์„ ์œ„ํ•ด ๋ถˆํ–‰ํžˆ๋„ ๋ชฉ๋ฐœ์ฒ˜๋Ÿผ ๋ณด์ด๋Š” ์ผ์ข…์˜ ์†”๋ฃจ์…˜์„ ์ƒ๊ฐํ•ด ๋‚ด์•ผํ–ˆ์Šต๋‹ˆ๋‹ค.

์„ค๋ช…์„ ์œ„ํ•ด ํฌ๊ฒŒ ๋‹จ์ˆœํ™”๋œ ๋‘ ๊ฐ€์ง€ ๋ฒ„์ „์˜ ๋กœ๊ทธ ํŒŒ์ผ์„ ๋‹ค์šด๋กœ๋“œํ•˜๋Š” ๊ณผ์ •์„ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ๋‘ ๊ฒฝ์šฐ ๋ชจ๋‘ ํŒ๋… ํšŸ์ˆ˜๋Š” ๋ถ€๋ถ„ ํฌ๊ธฐ์— ๋”ฐ๋ผ ๋‹ค๋ฆ…๋‹ˆ๋‹ค.

1) ์ ์€ ์–‘์˜ ํ•˜์ค‘:
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

2) ๋งŽ์€ ์–‘์˜ ํ•˜์ค‘:
AWS ํด๋ผ์šฐ๋“œ์—์„œ PostgreSQL ๋กœ๊ทธ ์—…๋กœ๋“œ

๋Š˜ ๊ทธ๋ ‡๋“ฏ์ด ์ตœ์ ์˜ ์†”๋ฃจ์…˜์€ ์ค‘๊ฐ„์— ์žˆ์Šต๋‹ˆ๋‹ค..
๋ถ€๋ถ„ ํฌ๊ธฐ๋Š” ์ตœ์†Œ์ด์ง€๋งŒ ์ฝ๋Š” ๊ณผ์ •์—์„œ ํฌ๊ธฐ๋ฅผ ๋Š˜๋ ค ํŒ๋… ํšŸ์ˆ˜๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ทธ๊ฒƒ์€ ์ฃผ๋ชฉํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค ์ฝ๊ธฐ ๋ถ€๋ถ„์˜ ์ตœ์  ํฌ๊ธฐ๋ฅผ ์„ ํƒํ•˜๋Š” ๋ฌธ์ œ๋Š” ์•„์ง ์™„์ „ํžˆ ํ•ด๊ฒฐ๋˜์ง€ ์•Š์•˜์œผ๋ฉฐ ๋” ๊นŠ์€ ์—ฐ๊ตฌ์™€ ๋ถ„์„์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ์•„๋งˆ ์กฐ๊ธˆ ํ›„์—.

๊ตฌํ˜„์— ๋Œ€ํ•œ ์ผ๋ฐ˜์ ์ธ ์„ค๋ช…

Service tables used

CREATE TABLE endpoint
(
id SERIAL ,
host text 
);

TABLE database
(
id SERIAL , 
โ€ฆ
last_aws_log_time text ,
last_aws_nexttoken text ,
aws_max_item_size integer 
);
last_aws_log_time: timestamp of the last loaded log file, in YYYY-MM-DD-HH24 format.
last_aws_nexttoken: text label (token) of the last loaded portion.
aws_max_item_size: empirically chosen initial portion size.

์Šคํฌ๋ฆฝํŠธ ์ „๋ฌธ

download_aws_piece.sh

#!/bin/bash
#########################################################
# download_aws_piece.sh
# download a piece of the log from AWS
# version HABR
 let min_item_size=1024
 let max_item_size=1048576
 let growth_factor=3
 let growth_counter=1
 let growth_counter_max=3

 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:''STARTED'
 
 AWS_LOG_TIME=$1
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:AWS_LOG_TIME='$AWS_LOG_TIME
  
 database_id=$2
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:database_id='$database_id
 RESULT_FILE=$3 
  
 endpoint=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE_DATABASE -A -t -c "select e.host from endpoint e join database d on e.id = d.endpoint_id where d.id = $database_id "`
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:endpoint='$endpoint
  
 db_instance=`echo $endpoint | awk -F"." '{print toupper($1)}'`
 
 echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:db_instance='$db_instance

 LOG_FILE=$RESULT_FILE'.tmp_log'
 TMP_FILE=$LOG_FILE'.tmp'
 TMP_MIDDLE=$LOG_FILE'.tmp_mid'  
 TMP_MIDDLE2=$LOG_FILE'.tmp_mid2'  
  
 current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`

 echo $(date +%Y%m%d%H%M)':      download_aws_piece.sh:current_aws_log_time='$current_aws_log_time
  
  if [[ $current_aws_log_time != $AWS_LOG_TIME  ]];
  then
    is_new_log='1'
	if ! psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
	then
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - update database set last_aws_log_time .'
	  exit 1
	fi
  else
    is_new_log='0'
  fi
  
  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:is_new_log='$is_new_log
  
  let last_aws_max_item_size=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select aws_max_item_size from database where id = $database_id "`
  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: last_aws_max_item_size='$last_aws_max_item_size
  
  let count=1
  if [[ $is_new_log == '1' ]];
  then    
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF NEW AWS LOG'
	if ! aws rds download-db-log-file-portion \
		--max-items $last_aws_max_item_size \
		--region REGION \
		--db-instance-identifier  $db_instance \
		--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 2
	fi  	
  else
    next_token=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "select last_aws_nexttoken from database where id = $database_id "`
	
	if [[ $next_token == '' ]];
	then
	  next_token='0'	  
	fi
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: CONTINUE DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion \
	    --max-items $last_aws_max_item_size \
		--starting-token $next_token \
		--region REGION \
		--db-instance-identifier  $db_instance \
		--log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 3
	fi       
	
	line_count=`cat  $LOG_FILE | wc -l`
	let lines=$line_count-1
	  
	tail -$lines $LOG_FILE > $TMP_MIDDLE 
	mv -f $TMP_MIDDLE $LOG_FILE
  fi
  
  next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
  next_token=`echo $next_token_str | awk -F" " '{ print $2}' `
  
  grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE 
  
  if [[ $next_token == '' ]];
  then
	  cp $TMP_FILE $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
          rm $TMP_MIDDLE2	  
	  exit 0  
  else
	psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
  fi
  
  first_str=`tail -1 $TMP_FILE`
  
  line_count=`cat  $TMP_FILE | wc -l`
  let lines=$line_count-1    
  
  head -$lines $TMP_FILE  > $RESULT_FILE

###############################################
# MAIN CIRCLE
  let count=2
  while [[ $next_token != '' ]];
  do 
    echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: count='$count
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion \
             --max-items $last_aws_max_item_size \
             --starting-token $next_token \
             --region REGION \
             --db-instance-identifier  $db_instance \
             --log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 4
	fi

	next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
	next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

	TMP_FILE=$LOG_FILE'.tmp'
	grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE  
	
	last_str=`head -1 $TMP_FILE`
  
    if [[ $next_token == '' ]];
	then
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  tail -$lines $TMP_FILE >> $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
          rm $TMP_MIDDLE2	  
	  exit 0  
	fi
	
    if [[ $next_token != '' ]];
	then
		let growth_counter=$growth_counter+1
		if [[ $growth_counter -gt $growth_counter_max ]];
		then
			let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
			let growth_counter=1
		fi
	
		if [[ $last_aws_max_item_size -gt $max_item_size ]]; 
		then
			let last_aws_max_item_size=$max_item_size
		fi 

	  psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
	  
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  #############################
	  #Get middle of file
	  head -$lines $TMP_FILE > $TMP_MIDDLE
	  
	  line_count=`cat  $TMP_MIDDLE | wc -l`
	  let lines=$line_count-1
	  tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
	  
	  cat $TMP_MIDDLE2 >> $RESULT_FILE	  
	  
	  first_str=`tail -1 $TMP_FILE`	  
	fi
	  
    let count=$count+1

  done
#
#################################################################

exit 0  

Fragments of the script with some explanations:

์Šคํฌ๋ฆฝํŠธ ์ž…๋ ฅ ๋งค๊ฐœ๋ณ€์ˆ˜:

  • the log file name timestamp, in YYYY-MM-DD-HH24 format: AWS_LOG_TIME=$1
  • the database ID: database_id=$2
  • the name of the collected log file: RESULT_FILE=$3

Get the timestamp of the last uploaded log file:

current_aws_log_time=`psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -c "select last_aws_log_time from database where id = $database_id "`

๋งˆ์ง€๋ง‰์œผ๋กœ ๋กœ๋“œ๋œ ๋กœ๊ทธ ํŒŒ์ผ์˜ ํƒ€์ž„์Šคํƒฌํ”„๊ฐ€ ์ž…๋ ฅ ๋งค๊ฐœ๋ณ€์ˆ˜์™€ ์ผ์น˜ํ•˜์ง€ ์•Š์œผ๋ฉด ์ƒˆ ๋กœ๊ทธ ํŒŒ์ผ์ด ๋กœ๋“œ๋ฉ๋‹ˆ๋‹ค.

if [[ $current_aws_log_time != $AWS_LOG_TIME  ]];
  then
    is_new_log='1'
	if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -v ON_ERROR_STOP=1 -A -t -c "update database set last_aws_log_time = '$AWS_LOG_TIME' where id = $database_id "
	then
	  echo '***download_aws_piece.sh -FATAL_ERROR - update database set last_aws_log_time .'
	  exit 1
	fi
  else
    is_new_log='0'
  fi

๋กœ๋“œ๋œ ํŒŒ์ผ์—์„œ nexttoken ๋ ˆ์ด๋ธ”์˜ ๊ฐ’์„ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.

  next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
  next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

An empty nexttoken value serves as the signal that the download is finished.

๋ฃจํ”„์—์„œ ์šฐ๋ฆฌ๋Š” ๋„์ค‘์— ํŒŒ์ผ์˜ ๋ถ€๋ถ„์„ ์„ธ๊ณ  ์ค„์„ ์—ฐ๊ฒฐํ•˜๊ณ  ๋ถ€๋ถ„์˜ ํฌ๊ธฐ๋ฅผ ๋Š˜๋ฆฝ๋‹ˆ๋‹ค.
๋ฉ”์ธ ๋ฃจํ”„

# MAIN CIRCLE
  let count=2
  while [[ $next_token != '' ]];
  do 
    echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: count='$count
	
	echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: START DOWNLOADING OF AWS LOG'
	if ! aws rds download-db-log-file-portion \
     --max-items $last_aws_max_item_size \
	 --starting-token $next_token \
     --region REGION \
     --db-instance-identifier  $db_instance \
     --log-file-name error/postgresql.log.$AWS_LOG_TIME > $LOG_FILE
	then
		echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh: FATAL_ERROR - Could not get log from AWS .'
		exit 4
	fi

	next_token_str=`cat $LOG_FILE | grep NEXTTOKEN` 
	next_token=`echo $next_token_str | awk -F" " '{ print $2}' `

	TMP_FILE=$LOG_FILE'.tmp'
	grep -v NEXTTOKEN $LOG_FILE  > $TMP_FILE  
	
	last_str=`head -1 $TMP_FILE`
  
    if [[ $next_token == '' ]];
	then
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  tail -$lines $TMP_FILE >> $RESULT_FILE
	  
	  echo $(date +%Y%m%d%H%M)':    download_aws_piece.sh:  NEXTTOKEN NOT FOUND - FINISH '
	  rm $LOG_FILE 
	  rm $TMP_FILE
	  rm $TMP_MIDDLE
         rm $TMP_MIDDLE2	  
	  exit 0  
	fi
	
    if [[ $next_token != '' ]];
	then
		let growth_counter=$growth_counter+1
		if [[ $growth_counter -gt $growth_counter_max ]];
		then
			let last_aws_max_item_size=$last_aws_max_item_size*$growth_factor
			let growth_counter=1
		fi
	
		if [[ $last_aws_max_item_size -gt $max_item_size ]]; 
		then
			let last_aws_max_item_size=$max_item_size
		fi 

	  psql -h MONITOR_ENDPOINT.rds.amazonaws.com -U USER -d MONITOR_DATABASE -A -t -q -c "update database set last_aws_nexttoken = '$next_token' where id = $database_id "
	  
	  concat_str=$first_str$last_str
	  	  
	  echo $concat_str >> $RESULT_FILE
		 
	  line_count=`cat  $TMP_FILE | wc -l`
	  let lines=$line_count-1
	  
	  #############################
	  #Get middle of file
	  head -$lines $TMP_FILE > $TMP_MIDDLE
	  
	  line_count=`cat  $TMP_MIDDLE | wc -l`
	  let lines=$line_count-1
	  tail -$lines $TMP_MIDDLE > $TMP_MIDDLE2
	  
	  cat $TMP_MIDDLE2 >> $RESULT_FILE	  
	  
	  first_str=`tail -1 $TMP_FILE`	  
	fi
	  
    let count=$count+1

  done

๋‹ค์Œ์€?

๋”ฐ๋ผ์„œ ์ฒซ ๋ฒˆ์งธ ์ค‘๊ฐ„ ์ž‘์—…์ธ "ํด๋ผ์šฐ๋“œ์—์„œ ๋กœ๊ทธ ํŒŒ์ผ ๋‹ค์šด๋กœ๋“œ"๊ฐ€ ํ•ด๊ฒฐ๋ฉ๋‹ˆ๋‹ค. ๋‹ค์šด๋กœ๋“œํ•œ ๋กœ๊ทธ๋Š” ์–ด๋–ป๊ฒŒ ํ•ด์•ผ ํ•ฉ๋‹ˆ๊นŒ?
๋จผ์ € ๋กœ๊ทธ ํŒŒ์ผ์„ ๊ตฌ๋ฌธ ๋ถ„์„ํ•˜๊ณ  ์—ฌ๊ธฐ์—์„œ ์‹ค์ œ ์š”์ฒญ์„ ์ถ”์ถœํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
์ž‘์—…์€ ๊ทธ๋ฆฌ ์–ด๋ ต์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ๊ฐ€์žฅ ๊ฐ„๋‹จํ•œ bash ์Šคํฌ๋ฆฝํŠธ๋Š” ์ž˜ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.
upload_log_query.sh

#!/bin/bash
#########################################################
# upload_log_query.sh
# Upload the log_query table from the downloaded AWS file
# version HABR
###########################################################  
echo 'TIMESTAMP:'$(date +%c)' Upload log_query table '
source_file=$1
echo 'source_file='$source_file
database_id=$2
echo 'database_id='$database_id

beginer=' '
first_line='1'
let "line_count=0"
sql_line=' '
sql_flag=' '    
space=' '
cat $source_file | while read line
do
  line="$space$line"

  if [[ $first_line == "1" ]]; then
    beginer=`echo $line | awk -F" " '{ print $1}' `
    first_line='0'
  fi

  current_beginer=`echo $line | awk -F" " '{ print $1}' `

  if [[ $current_beginer == $beginer ]]; then
    if [[ $sql_flag == '1' ]]; then
     sql_flag='0' 
     log_date=`echo $sql_line | awk -F" " '{ print $1}' `
     log_time=`echo $sql_line | awk -F" " '{ print $2}' `
     duration=`echo $sql_line | awk -F" " '{ print $5}' `

     #replace ' to ''
     sql_modline=`echo "$sql_line" | sed "s/'/''/g"`
     sql_line=' '

	 ################
	 #PROCESSING OF THE SQL-SELECT IS HERE
     if ! psql -h ENDPOINT.rds.amazonaws.com -U USER -d DATABASE -v ON_ERROR_STOP=1 -A -t -c "select log_query('$ip_port',$database_id , '$log_date' , '$log_time' , '$duration' , '$sql_modline' )" 
     then
        echo 'FATAL_ERROR - log_query '
        exit 1
     fi
	 ################

    fi #if [[ $sql_flag == '1' ]]; then

    let "line_count=line_count+1"

    check=`echo $line | awk -F" " '{ print $8}' `
    check_sql=${check^^}    

    #echo 'check_sql='$check_sql
    
    if [[ $check_sql == 'SELECT' ]]; then
     sql_flag='1'    
     sql_line="$sql_line$line"
	 ip_port=`echo $sql_line | awk -F":" '{ print $4}' `
    fi
  else       

    if [[ $sql_flag == '1' ]]; then
      sql_line="$sql_line$line"
    fi   
    
  fi #if [[ $current_beginer == $beginer ]]; then

done

Now we can work with the queries extracted from the log file.

And several useful possibilities open up.

๊ตฌ๋ฌธ ๋ถ„์„๋œ ์ฟผ๋ฆฌ๋Š” ์–ด๋”˜๊ฐ€์— ์ €์žฅํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด ์„œ๋น„์Šค ํ…Œ์ด๋ธ”์ด ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ๋กœ๊ทธ_์ฟผ๋ฆฌ

CREATE TABLE log_query
(
   id SERIAL ,
   queryid bigint ,
   query_md5hash text not null ,
   database_id integer not null ,  
   timepoint timestamp without time zone not null,
   duration double precision not null ,
   query text not null ,
   explained_plan text[],
   plan_md5hash text  , 
   explained_plan_wo_costs text[],
   plan_hash_value text  ,
   baseline_id integer ,
   ip text ,
   port text 
);
ALTER TABLE log_query ADD PRIMARY KEY (id);
ALTER TABLE log_query ADD CONSTRAINT queryid_timepoint_unique_key UNIQUE (queryid, timepoint );
ALTER TABLE log_query ADD CONSTRAINT query_md5hash_timepoint_unique_key UNIQUE (query_md5hash, timepoint );

CREATE INDEX log_query_timepoint_idx ON log_query (timepoint);
CREATE INDEX log_query_queryid_idx ON log_query (queryid);
ALTER TABLE log_query ADD CONSTRAINT database_id_fk FOREIGN KEY (database_id) REFERENCES database (id) ON DELETE CASCADE ;

๊ตฌ๋ฌธ ๋ถ„์„๋œ ์š”์ฒญ์€ ๋‹ค์Œ์—์„œ ์ฒ˜๋ฆฌ๋ฉ๋‹ˆ๋‹ค. plpgsql ๊ธฐ๋Šฅ "๋กœ๊ทธ_์ฟผ๋ฆฌ".
log_query.sql

--log_query.sql
--version HABR
CREATE OR REPLACE FUNCTION log_query( ip_port text ,log_database_id integer , log_date text , log_time text , duration text , sql_line text   ) RETURNS boolean AS $$
DECLARE
  result boolean ;
  log_timepoint timestamp without time zone ;
  log_duration double precision ; 
  pos integer ;
  log_query text ;
  activity_string text ;
  log_md5hash text ;
  log_explain_plan text[] ;
  
  log_planhash text ;
  log_plan_wo_costs text[] ; 
  
  database_rec record ;
  
  pg_stat_query text ; 
  test_log_query text ;
  log_query_rec record;
  found_flag boolean;
  
  pg_stat_history_rec record ;
  port_start integer ;
  port_end integer ;
  client_ip text ;
  client_port text ;
  log_queryid bigint ;
  log_query_text text ;
  pg_stat_query_text text ; 
BEGIN
  result = TRUE ;

  RAISE NOTICE '***log_query';
  
  port_start = position('(' in ip_port);
  port_end = position(')' in ip_port);
  client_ip = substring( ip_port from 1 for port_start-1 );
  client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );

  SELECT e.host , d.name , d.owner_pwd 
  INTO database_rec
  FROM database d JOIN endpoint e ON e.id = d.endpoint_id
  WHERE d.id = log_database_id ;
  
  log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
  log_duration = duration:: double precision; 

  
  pos = position ('SELECT' in UPPER(sql_line) );
  log_query = substring( sql_line from pos for LENGTH(sql_line));
  log_query = regexp_replace(log_query,' +',' ','g');
  log_query = regexp_replace(log_query,';+','','g');
  log_query = trim(trailing ' ' from log_query);
 

  log_md5hash = md5( log_query::text );
  
  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');
  --------------------------
  BEGIN
	INSERT INTO log_query
	(
		query_md5hash ,
		database_id , 
		timepoint ,
		duration ,
		query ,
		explained_plan ,
		plan_md5hash , 
		explained_plan_wo_costs , 
		plan_hash_value , 
		ip , 
		port
	) 
	VALUES 
	(
		log_md5hash ,
		log_database_id , 
		log_timepoint , 
		log_duration , 
		log_query ,
		log_explain_plan , 
		md5(log_explain_plan::text) ,
		log_plan_wo_costs , 
		md5(log_plan_wo_costs::text),
		client_ip , 
		client_port		
	);
	activity_string = 	'New query has logged '||
						' database_id = '|| log_database_id ||
						' query_md5hash='||log_md5hash||
						' , timepoint = '||to_char(log_timepoint,'YYYYMMDD HH24:MI:SS');
					
	RAISE NOTICE '%',activity_string;					
					 
	PERFORM pg_log( log_database_id , 'log_query' , activity_string);  

	EXCEPTION
	  WHEN unique_violation THEN
		RAISE NOTICE '*** unique_violation *** query already has logged';
	END;

	SELECT 	queryid
	INTO   	log_queryid
	FROM 	log_query 
	WHERE 	query_md5hash = log_md5hash AND
			timepoint = log_timepoint;

	IF log_queryid IS NOT NULL 
	THEN 
	  RAISE NOTICE 'log_query with query_md5hash = % and timepoint = % has already has a QUERYID = %',log_md5hash,log_timepoint , log_queryid ;
	  RETURN result;
	END IF;
	
	------------------------------------------------
	RAISE NOTICE 'Update queryid';	
	
	SELECT * 
	INTO log_query_rec
	FROM log_query
	WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ; 
	
	log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
	
	FOR pg_stat_history_rec IN
	 SELECT 
         queryid ,
	  query 
	 FROM 
         pg_stat_db_queries 
     WHERE  
      database_id = log_database_id AND
       queryid is not null 
	LOOP
	  pg_stat_query = pg_stat_history_rec.query ; 
	  pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
	
	  log_query_text = trim(trailing ' ' from log_query_rec.query);
	  pg_stat_query_text = pg_stat_query; 
	
	  
	  --SELECT log_query_rec.query like pg_stat_query INTO found_flag ; 
	  IF (log_query_text LIKE pg_stat_query_text) THEN
		found_flag = TRUE ;
	  ELSE
		found_flag = FALSE ;
	  END IF;	  
	  
	  
	  IF found_flag THEN
	    
		UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
		activity_string = 	' updated queryid = '||pg_stat_history_rec.queryid||
		                    ' for log_query with id = '||log_query_rec.id               
		   				    ;						
	    RAISE NOTICE '%',activity_string;	
		EXIT ;
	  END IF ;
	  
	END LOOP ;
	
  RETURN result ;
END
$$ LANGUAGE plpgsql;

The processing uses the service table pg_stat_db_queries, which contains a snapshot of the current queries from the pg_stat_history table (the use of that table is described in "PostgreSQL query performance monitoring. Part 1 - reporting"):

TABLE pg_stat_db_queries
(
   database_id integer,  
   queryid bigint ,  
   query text , 
   max_time double precision 
);

TABLE pg_stat_history 
(
โ€ฆ
database_id integer ,
โ€ฆ
queryid bigint ,
โ€ฆ
max_time double precision	 , 	
โ€ฆ
);

์ด ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•˜๋ฉด ๋กœ๊ทธ ํŒŒ์ผ์˜ ์š”์ฒญ์„ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•œ ์—ฌ๋Ÿฌ ๊ฐ€์ง€ ์œ ์šฉํ•œ ๊ธฐ๋Šฅ์„ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฆ‰:

Capability #1 - query execution history

Very useful when starting to investigate a performance incident. First, get acquainted with the history: when did the slowdown start?
Then, as the classics teach, look for external causes. It may be that the database load increased sharply and the particular query has nothing to do with it.
Adding a new entry to the log_query table

  port_start = position('(' in ip_port);
  port_end = position(')' in ip_port);
  client_ip = substring( ip_port from 1 for port_start-1 );
  client_port = substring( ip_port from port_start+1 for port_end-port_start-1 );

  SELECT e.host , d.name , d.owner_pwd 
  INTO database_rec
  FROM database d JOIN endpoint e ON e.id = d.endpoint_id
  WHERE d.id = log_database_id ;
  
  log_timepoint = to_timestamp(log_date||' '||log_time,'YYYY-MM-DD HH24-MI-SS');
  log_duration = to_number(duration,'99999999999999999999D9999999999'); 

  
  pos = position ('SELECT' in UPPER(sql_line) );
  log_query = substring( sql_line from pos for LENGTH(sql_line));
  log_query = regexp_replace(log_query,' +',' ','g');
  log_query = regexp_replace(log_query,';+','','g');
  log_query = trim(trailing ' ' from log_query);
 
  RAISE NOTICE 'log_query=%',log_query ;   

  log_md5hash = md5( log_query::text );
  
  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');
  --------------------------
  BEGIN
	INSERT INTO log_query
	(
		query_md5hash ,
		database_id , 
		timepoint ,
		duration ,
		query ,
		explained_plan ,
		plan_md5hash , 
		explained_plan_wo_costs , 
		plan_hash_value , 
		ip , 
		port
	) 
	VALUES 
	(
		log_md5hash ,
		log_database_id , 
		log_timepoint , 
		log_duration , 
		log_query ,
		log_explain_plan , 
		md5(log_explain_plan::text) ,
		log_plan_wo_costs , 
		md5(log_plan_wo_costs::text),
		client_ip , 
		client_port		
	);

Capability #2 - storing query execution plans

์ด ์‹œ์ ์—์„œ ์ด์˜ ์ œ๊ธฐ-์„ค๋ช…-์„ค๋ช…์ด ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.๊ทธ๋Ÿฌ๋‚˜ ์ด๋ฏธ autoexplain์ด ์žˆ์Šต๋‹ˆ๋‹ค.". ์˜ˆ, ๊ทธ๋ ‡์Šต๋‹ˆ๋‹ค. ํ•˜์ง€๋งŒ ์‹คํ–‰ ๊ณ„ํš์ด ๋™์ผํ•œ ๋กœ๊ทธ ํŒŒ์ผ์— ์ €์žฅ๋˜๊ณ  ์ถ”๊ฐ€ ๋ถ„์„์„ ์œ„ํ•ด ์ €์žฅํ•˜๊ธฐ ์œ„ํ•ด ๋กœ๊ทธ ํŒŒ์ผ์„ ๊ตฌ๋ฌธ ๋ถ„์„ํ•ด์•ผ ํ•˜๋Š” ๊ฒฝ์šฐ ์š”์ ์€ ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

๊ทธ๋Ÿฌ๋‚˜ ๋‹ค์Œ์ด ํ•„์š”ํ–ˆ์Šต๋‹ˆ๋‹ค.
์ฒซ์งธ: ๋ชจ๋‹ˆํ„ฐ๋ง ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์˜ ์„œ๋น„์Šค ํ…Œ์ด๋ธ”์— ์‹คํ–‰ ๊ณ„ํš์„ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
๋‘˜์งธ: ์ฟผ๋ฆฌ ์‹คํ–‰ ๊ณ„ํš์ด ๋ณ€๊ฒฝ๋˜์—ˆ์Œ์„ ์ฆ‰์‹œ ํ™•์ธํ•˜๊ธฐ ์œ„ํ•ด ์‹คํ–‰ ๊ณ„ํš์„ ์„œ๋กœ ๋น„๊ตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

A query with specific values of its execution parameters is available. Obtaining its execution plan with EXPLAIN and storing it is an elementary task.
Moreover, using the EXPLAIN (COSTS FALSE) expression, you can obtain the skeleton of the plan, which is used to compute the hash value of the plan; this helps in the subsequent analysis of how the execution plan changes over time.
Obtaining the execution plan template

  --Explain execution plan--
  EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' dbname='||database_rec.name||' user=DATABASE password='||database_rec.owner_pwd||' '')'; 
  
  log_explain_plan = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN '||log_query ) AS t (plan text) );
  log_plan_wo_costs = ARRAY ( SELECT * FROM dblink('LINK1', 'EXPLAIN ( COSTS FALSE ) '||log_query ) AS t (plan text) );
    
  PERFORM dblink_disconnect('LINK1');

Capability #3 - using the query log for monitoring

์„ฑ๋Šฅ ๋ฉ”ํŠธ๋ฆญ์€ ์š”์ฒญ ํ…์ŠคํŠธ๊ฐ€ ์•„๋‹ˆ๋ผ ํ•ด๋‹น ID์— ๋Œ€ํ•ด ๊ตฌ์„ฑ๋˜๋ฏ€๋กœ ๋กœ๊ทธ ํŒŒ์ผ์˜ ์š”์ฒญ์„ ์„ฑ๋Šฅ ๋ฉ”ํŠธ๋ฆญ์ด ๊ตฌ์„ฑ๋œ ์š”์ฒญ๊ณผ ์—ฐ๊ฒฐํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
์Œ, ์ ์–ด๋„ ์„ฑ๋Šฅ ์‚ฌ๊ณ ์˜ ์ •ํ™•ํ•œ ๋ฐœ์ƒ ์‹œ๊ฐ„์„ ๊ฐ–๊ธฐ ์œ„ํ•ด์„œ์ž…๋‹ˆ๋‹ค.

๋”ฐ๋ผ์„œ ์š”์ฒญ ID์— ๋Œ€ํ•ด ์„ฑ๋Šฅ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•˜๋ฉด ํŠน์ • ๋งค๊ฐœ ๋ณ€์ˆ˜ ๊ฐ’๊ณผ ์š”์ฒญ์˜ ์ •ํ™•ํ•œ ์‹คํ–‰ ์‹œ๊ฐ„ ๋ฐ ๊ธฐ๊ฐ„์„ ํฌํ•จํ•˜๋Š” ํŠน์ • ์š”์ฒญ์— ๋Œ€ํ•œ ๋งํฌ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. View ๋งŒ์„ ์‚ฌ์šฉํ•˜์—ฌ ์ฃผ์–ด์ง„ ์ •๋ณด ์–ป๊ธฐ pg_stat_statements - ๊ธˆ์ง€๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
์ฟผ๋ฆฌ์˜ queryid๋ฅผ ์ฐพ๊ณ  log_query ํ…Œ์ด๋ธ”์˜ ํ•ญ๋ชฉ์„ ์—…๋ฐ์ดํŠธํ•ฉ๋‹ˆ๋‹ค.

SELECT * 
	INTO log_query_rec
	FROM log_query
	WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ; 
	
	log_query_rec.query=regexp_replace(log_query_rec.query,';+','','g');
	
	FOR pg_stat_history_rec IN
	 SELECT 
      queryid ,
	  query 
	 FROM 
       pg_stat_db_queries 
     WHERE  
	   database_id = log_database_id AND
       queryid is not null 
	LOOP
	  pg_stat_query = pg_stat_history_rec.query ; 
	  pg_stat_query=regexp_replace(pg_stat_query,'\n+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'\t+',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,' +',' ','g');
	  pg_stat_query=regexp_replace(pg_stat_query,'\$.','%','g');
	
	  log_query_text = trim(trailing ' ' from log_query_rec.query);
	  pg_stat_query_text = pg_stat_query; 
	  
	  --SELECT log_query_rec.query like pg_stat_query INTO found_flag ; 
	  IF (log_query_text LIKE pg_stat_query_text) THEN
		found_flag = TRUE ;
	  ELSE
		found_flag = FALSE ;
	  END IF;	  
	  
	  
	  IF found_flag THEN
	    
		UPDATE log_query SET queryid = pg_stat_history_rec.queryid WHERE query_md5hash = log_md5hash AND timepoint = log_timepoint ;
		activity_string = 	' updated queryid = '||pg_stat_history_rec.queryid||
		                    ' for log_query with id = '||log_query_rec.id		                    
		   				    ;						
					
	    RAISE NOTICE '%',activity_string;	
		EXIT ;
	  END IF ;
	  
	END LOOP ;

Afterword

As a result, the described method has found its place in a system developed for monitoring the performance of PostgreSQL queries; it makes more information available for analysis when resolving incidents involving query performance.

๋ฌผ๋ก  ๊ฐœ์ธ์ ์ธ ์ƒ๊ฐ์œผ๋กœ๋Š” ๋‹ค์šด๋กœ๋“œ ๋ถ€๋ถ„์˜ ํฌ๊ธฐ๋ฅผ ์„ ํƒํ•˜๊ณ  ๋ณ€๊ฒฝํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์ž‘์—…์ด ์—ฌ์ „ํžˆ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ์ผ๋ฐ˜์ ์ธ ๊ฒฝ์šฐ์—๋Š” ๋ฌธ์ œ๊ฐ€ ์•„์ง ํ•ด๊ฒฐ๋˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ์•„๋งˆ ์žฌ๋ฏธ์žˆ์„ ๊ฒƒ์ž…๋‹ˆ๋‹ค.

But that is an entirely different story...

Source: habr.com
