Incremental postgresql backups with pgbackrest - a young fighter course from a developer

Disclaimer

I am a developer. I write code, interact with the database only as a user. By no means do I pretend to be a system administrator and, moreover, a dba. But…

It so happened that I needed to organize a backup of the postgresql database. No clouds - keep SSH and make everything work and not ask for money. What do we do in such cases? That's right, we shove pgdump into cron, back up everything to the archive every day, and if we completely disperse, we send this archive somewhere far away.

This time, the difficulty was that, according to the plans, the database was supposed to grow by about + - 100 MB per day. Of course, after a couple of weeks, the desire to back up everything with pgdump will disappear. This is where incremental backups come in handy.

Interesting? Welcome under cat.

An incremental backup is a kind of backup when not all source files are copied, but only new and changed ones since the previous copy was created.

Like any developer who was TOTALLY unwilling (at that time) to understand the intricacies of postgres, I wanted to find a green button. Well, you know, like in AWS, DigitalOcean: I pressed one button - I got replication, I pressed the second - I set up backups, the third - everything rolled back a couple of hours ago. I did not find a button and a beautiful GUI tool. If you know one (free or cheap) - write about it in the comments.

Googling I found two tools pgbarman ΠΈ pgbackrest. With the first one, I just didn’t succeed (very poor documentation, I tried to pick everything up according to old manuals), but the second documentation turned out to be at the level, but not without a flaw. To simplify the work of those who face a similar task, this article was written.

After reading this article, you will learn how to make incremental backups, save them to a remote server (repository with backups) and restore them in case of data loss or other problems on the main server.

Prepare

You will need two VPS to play the manual. The first will be the repository (the repository on which the backups will be), and the second, in fact, the server itself with postgres (in my case, version 11 of postgres).

It is assumed that on the server with postgres you have root, sudo user, postgres user and postgres itself installed (postgres user is created automatically when installing postgresql), and on the repository server you have root and sudo user (username pgbackrest will be used in the manual) .

So that you have less problems when reproducing the instructions - I write in italics where, with what user and with what rights I executed the command while writing and reviewing the article.

pgbackrest installation

Repository (pgbackrest user):

1. Download the archive from pgbackrest and transfer its contents to the /build folder:

sudo mkdir /build
sudo wget -q -O - 
       https://github.com/pgbackrest/pgbackrest/archive/release/2.18.tar.gz | 
       sudo tar zx -C /build

2. Install the dependencies necessary for the assembly:

sudo apt-get update
sudo apt-get install build-essential libssl-dev libxml2-dev libperl-dev zlib1g-dev 
       libpq-dev

3. We collect pgbackrest:

cd /build/pgbackrest-release-2.18/src && sudo ./configure
sudo make -s -C /build/pgbackrest-release-2.18/src

4. Copy the executable file to the /usr/bin directory:

sudo cp /build/pgbackrest-release-2.18/src/pgbackrest /usr/bin
sudo chmod 755 /usr/bin/pgbackrest

5. Pgbackrest requires perl. Install:

sudo apt-get install perl

6. Create directories for logs, give them certain rights:

sudo mkdir -p -m 770 /var/log/pgbackrest
sudo chown pgbackrest:pgbackrest /var/log/pgbackrest
sudo mkdir -p /etc/pgbackrest
sudo mkdir -p /etc/pgbackrest/conf.d
sudo touch /etc/pgbackrest/pgbackrest.conf
sudo chmod 640 /etc/pgbackrest/pgbackrest.conf
sudo chown pgbackrest:pgbackrest /etc/pgbackrest/pgbackrest.conf

7. Check:

pgbackrest version

Postgres server (sudo user or root):

The process of installing pgbackrest on a postgres server is similar to installing on a repository (yes, pgbackrest must be installed on both servers), but in the 6th paragraph, the second and last command:

sudo chown pgbackrest:pgbackrest /var/log/pgbackrest
sudo chown pgbackrest:pgbackrest /etc/pgbackrest/pgbackrest.conf

replace with:

sudo chown postgres:postgres /var/log/pgbackrest
sudo chown postgres:postgres /etc/pgbackrest/pgbackrest.conf

Setting up communication between servers via passwordless SSH

In order for pgbackrest to work correctly, it is necessary to configure the interaction between the postgres server and the repository using the key file.

Repository (pgbackrest user):

Create a pair of keys:

mkdir -m 750 /home/pgbackrest/.ssh
ssh-keygen -f /home/pgbackrest/.ssh/id_rsa 
       -t rsa -b 4096 -N ""

Attention! The above commands are executed without sudo.

Postgres server (sudo user or root):

Create a pair of keys:

sudo -u postgres mkdir -m 750 -p /var/lib/postgresql/.ssh
sudo -u postgres ssh-keygen -f /var/lib/postgresql/.ssh/id_rsa 
       -t rsa -b 4096 -N ""

Repository (sudo user):

Copy the public key of the postgres server to the repository server:

(echo -n 'no-agent-forwarding,no-X11-forwarding,no-port-forwarding,' && 
       echo -n 'command="/usr/bin/pgbackrest ${SSH_ORIGINAL_COMMAND#* }" ' && 
       sudo ssh root@<postgres_server_ip> cat /var/lib/postgresql/.ssh/id_rsa.pub) | 
       sudo -u pgbackrest tee -a /home/pgbackrest/.ssh/authorized_keys

At this step, it will ask for the password from the root user. You need to enter exactly the password of the root user of the postgres server!

Postgres server (sudo user):

Copy the public key of the repository to the server with postgres:

(echo -n 'no-agent-forwarding,no-X11-forwarding,no-port-forwarding,' && 
       echo -n 'command="/usr/bin/pgbackrest ${SSH_ORIGINAL_COMMAND#* }" ' && 
       sudo ssh root@<repository_server_ip> cat /home/pgbackrest/.ssh/id_rsa.pub) | 
       sudo -u postgres tee -a /var/lib/postgresql/.ssh/authorized_keys

At this step, it will ask for the password from the root user. You need to enter the password of the root user of the repository!

We check:

Repository (root user, for the purity of the experiment):

sudo -u pgbackrest ssh postgres@<postgres_server_ip>

Postgres server (root user, for the purity of the experiment):

sudo -u postgres ssh pgbackrest@<repository_server_ip>

We make sure that we get access without problems.

Setting up a postgres server

Postgres server (sudo user or root):

1. Let's allow "knocking" on the postgres server from external ip. To do this, edit the file postgresql.conf (located in the /etc/postgresql/11/main folder) by adding the following line to it:

listen_addresses = '*'

If such a line already exists, either uncomment it or set the parameter value to '*'.

In file pg_hba.conf (also located in the folder /etc/postgresql/11/main) add the following lines:

hostssl  all  all  0.0.0.0/0  md5
host  all  all  0.0.0.0/0  md5

where:

hostssl/host - ΠΏΠΎΠ΄ΠΊΠ»ΡŽΡ‡Π°Π΅ΠΌΡΡ Ρ‡Π΅Ρ€Π΅Π· SSL (ΠΈΠ»ΠΈ Π½Π΅Ρ‚)
all - Ρ€Π°Π·Ρ€Π΅ΡˆΠ°Π΅ΠΌ ΠΏΠΎΠ΄ΠΊΠ»ΡŽΡ‡Π΅Π½ΠΈΠ΅ ΠΊΠΎ всСм Π±Π°Π·Π°ΠΌ
all - имя ΠΏΠΎΠ»ΡŒΠ·ΠΎΠ²Π°Ρ‚Π΅Π»Ρ, ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠΌΡƒ Ρ€Π°Π·Ρ€Π΅ΡˆΠ°Π΅ΠΌ ΠΏΠΎΠ΄ΠΊΠ»ΡŽΡ‡Π΅Π½ΠΈΠ΅ (всСм)
0.0.0.0/0 - маска сСти с ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠΉ ΠΌΠΎΠΆΠ½ΠΎ ΠΏΠΎΠ΄ΠΊΠ»ΡŽΡ‡Π°Ρ‚ΡŒΡΡ
md5 - способ ΡˆΠΈΡ„Ρ€ΠΎΠ²Π°Π½ΠΈΡ пароля

2. Let's make the necessary settings in postgresql.conf (it's in the folder /etc/postgresql/11/main) for pgbackrest to work:

archive_command = 'pgbackrest --stanza=main archive-push %p' # Π“Π΄Π΅ main - Π½Π°Π·Π²Π°Π½ΠΈΠ΅ кластСра. ΠŸΡ€ΠΈ установкС postgres автоматичСски создаСт кластСр main.
archive_mode = on
max_wal_senders = 3
wal_level = replica

3. Make the necessary settings in the pgbackrest configuration file (/etc/pgbackrest/pgbackrest.conf):

[main]
pg1-path=/var/lib/postgresql/11/main

[global]
log-level-file=detail
repo1-host=<repository_server_ip>

4. Restart postgresql:

sudo service postgresql restart

Setting up a repository server

Repository (pgbackrest user):

Make the necessary settings in the configuration file pgbackrest
(/etc/pgbackrest/pgbackrest.conf):

[main]
pg1-host=<postgres_server_ip>
pg1-path=/var/lib/postgresql/11/main

[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2 # ΠŸΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€, ΡƒΠΊΠ°Π·Ρ‹Π²Π°ΡŽΡ‰ΠΈΠΉ сколько Ρ…Ρ€Π°Π½ΠΈΡ‚ΡŒ ΠΏΠΎΠ»Π½Ρ‹Ρ… бэкапов. Π’.Π΅. Ссли Ρƒ вас Π΅ΡΡ‚ΡŒ Π΄Π²Π° ΠΏΠΎΠ»Π½Ρ‹Ρ… бэкапа ΠΈ Π²Ρ‹ создаСтС Ρ‚Ρ€Π΅Ρ‚ΠΈΠΉ, Ρ‚ΠΎ самый старый бэкап Π±ΡƒΠ΄Π΅Ρ‚ ΡƒΠ΄Π°Π»Π΅Π½. МоТно ΠΏΡ€ΠΎΠΈΠ·Π½ΠΎΡΠΈΡ‚ΡŒ ΠΊΠ°ΠΊ "Ρ…Ρ€Π°Π½ΠΈΡ‚ΡŒ Π½Π΅ Π±ΠΎΠ»Π΅Π΅ Π΄Π²ΡƒΡ… бэкапов" - ΠΏΠΎ Π°Π½Π°Π»ΠΎΠ³ΠΈΠΈ с ротациями Π»ΠΎΠ³ΠΎΠ². Бпасибо @Aytuar Π·Π° исправлСниС ошибки.
start-fast=y # НачинаСт Ρ€Π΅Π·Π΅Ρ€Π²Π½ΠΎΠ΅ ΠΊΠΎΠΏΠΈΡ€ΠΎΠ²Π°Π½ΠΈΠ΅ Π½Π΅ΠΌΠ΅Π΄Π»Π΅Π½Π½ΠΎ, ΠΏΡ€ΠΎΡ‡ΠΈΡ‚Π°Ρ‚ΡŒ ΠΏΡ€ΠΎ этот ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€ ΠΌΠΎΠΆΠ½ΠΎ Ρ‚ΡƒΡ‚ https://postgrespro.ru/docs/postgrespro/9.5/continuous-archiving

Create a vault

Repository (pgbackrest user):

Create a new storage for the cluster main:

sudo mkdir -m 770 /var/lib/pgbackrest
sudo chown -R pgbackrest /var/lib/pgbackrest/
sudo -u pgbackrest pgbackrest --stanza=main stanza-create

inspection

Postgres server (sudo user or root):

Check on postgres server:

sudo -u postgres pgbackrest --stanza=main --log-level-console=info check

Repository (pgbackrest user):

Check on the repository server:

sudo -u pgbackrest pgbackrest --stanza=main --log-level-console=info check

We make sure that in the output we see the line "check command end: completed successfully".

Tired? Let's move on to the most interesting.

Making a backup

Repository (pgbackrest user):

1. Perform backup:

sudo -u pgbackrest pgbackrest --stanza=main backup

2. We make sure that the backup has been created:

ls /var/lib/pgbackrest/backup/main/

pgbackrest will create the first full backup. If you wish, you can run the backup command again and make sure that the system creates an incremental backup.

If you want to re-make a full backup, then specify an additional flag:

sudo -u pgbackrest pgbackrest --stanza=main --type=full backup

If you want verbose console output, then also specify:

sudo -u pgbackrest pgbackrest --stanza=main --type=full --log-level-console=info backup

Restoring a backup

Postgres server (sudo user or root):

1. Stop the running cluster:

sudo pg_ctlcluster 11 main stop

2. Restoring from a backup:

sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta --recovery-option=recovery_target=immediate restore

To restore the database to the state of the last FULL backup, use the command without specifying recovery_target:

sudo -u postgres pgbackrest --stanza=main --log-level-console=info --delta restore

Important! After recovery, it may turn out that the database hangs in recovery mode (there will be errors like ERROR: cannot execute DROP DATABASE in a read-only transaction). To be honest, I still do not understand what it is connected with. It is solved as follows (it will be necessary to wait a little after the execution of the command):

sudo -u postgres psql -c "select pg_wal_replay_resume()"

In fact, it is possible to restore a specific backup by its name. Here I am only I will indicate a link to the description of this feature in the documentation. The developers advise using this parameter with caution and explain why. From myself I can add that I used it. If you really need it, make sure that the database is out of recovery mode after recovery (select pg_is_in_recovery() should show β€œf”) and just in case, make a full backup after recovery.

3. Start the cluster:

sudo pg_ctlcluster 11 main start

After restoring the backup, we need to perform a second backup:

Repository (pgbackrest user):

sudo pgbackrest --stanza=main backup

That's all. In conclusion, I want to remind you that I am by no means trying to pretend to be a senior dba and will use the clouds at the slightest opportunity. Currently, I myself am starting to study various topics like backup, replication, monitoring, etc. and I write small reports about the results in order to make a small contribution to the community and leave small cheat sheets for myself.

In the following articles I will try to talk about additional features - restoring data to a clean cluster, encrypting backups and publishing to S3, backups via rsync.

Source: habr.com

Add a comment