Backup Part 4: Reviewing and testing zbackup, restic, borgbackup

This article covers backup tools that split the data stream into separate components (chunks) and store them in a repository.

Repository components can additionally be compressed and encrypted and, most importantly, reused by subsequent backup runs.

A backup in such a repository is a named chain of components linked to one another, for example by hash functions.
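As a minimal illustration of this idea (a toy sketch, not the actual on-disk format of any of the tools reviewed; all names here are invented), a backup can be modeled as a list of chunk hashes pointing into a shared store:

```python
import hashlib

def store_chunks(repo: dict, data: bytes, chunk_size: int = 4) -> list:
    """Split data into fixed-size chunks, store each chunk under its
    SHA-256 digest, and return the digest list (the 'backup')."""
    manifest = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        repo[digest] = chunk              # identical chunks are stored once
        manifest.append(digest)
    return manifest

def restore(repo: dict, manifest: list) -> bytes:
    """Reassemble the original stream by following the chain of hashes."""
    return b"".join(repo[digest] for digest in manifest)

repo = {}
first = store_chunks(repo, b"AAAABBBBAAAACCCC")   # 4 chunks, 3 unique
second = store_chunks(repo, b"AAAABBBBDDDDCCCC")  # reuses 3 of its 4 chunks
assert restore(repo, first) == b"AAAABBBBAAAACCCC"
print(len(repo))  # 4 unique chunks hold both 16-byte backups
```

Real tools use variable-size chunks and add compression and encryption per chunk, but the principle — a backup is a named list of hashes into a deduplicated store — is the same.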

There are several solutions of this kind; I will focus on three: zbackup, borgbackup and restic.

Expected results

Since all the candidates require creating a repository in one form or another, one of the most important factors will be repository size. Ideally, it should be no more than 13 GB under the accepted methodology, or even less with good optimization.

It is also highly desirable to be able to back up files directly, without archivers like tar, and to work over ssh/sftp without extra tools such as rsync or sshfs.

Behavior when creating backups:

  1. The repository size will be equal to the size of the changes, or smaller.
  2. High CPU usage is expected when compression and/or encryption is enabled, and a fairly heavy load on the network and disk subsystem is likely if the archiving and/or encryption process runs on the backup storage server.
  3. If the repository is damaged, a delayed error is likely both when creating new backups and when trying to restore. Additional measures must be planned to ensure repository integrity, or the built-in integrity checkers must be used.

Working with tar, as shown in one of the previous articles, is taken as the reference.

Testing zbackup

zbackup works as follows: the program finds areas of the input data stream that contain identical data, then optionally compresses and encrypts them, saving each area only once.

For deduplication, a 64-bit rolling hash over a sliding window is used, checking byte by byte for a match against already existing data blocks (similar to how it is implemented in rsync).
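The effect can be sketched with a toy content-defined chunker (a deliberately naive additive rolling hash over a small window — nothing like zbackup's real 64-bit function, but the boundary-resynchronization behavior is the same): after inserting a few bytes in the middle of the stream, only the chunks around the edit change, and the rest are found again.

```python
import random

WIN = 4        # sliding-window width in bytes
MASK = 0x0F    # cut a chunk when (hash & MASK) == 0, ~16-byte average

def chunks(data: bytes) -> list:
    """Content-defined chunking with a naive additive rolling hash."""
    out, h, start = [], 0, 0
    for i, byte in enumerate(data):
        h += byte
        if i >= WIN:
            h -= data[i - WIN]                    # slide the window
        if i - start + 1 >= WIN and (h & MASK) == 0:
            out.append(data[start:i + 1])         # boundary depends only
            start = i + 1                         # on local content
    if start < len(data):
        out.append(data[start:])
    return out

random.seed(1)
base = bytes(random.randrange(256) for _ in range(2000))
edited = base[:1000] + b"XXXX" + base[1000:]      # insert 4 bytes mid-stream
shared = set(chunks(base)) & set(chunks(edited))
print(len(shared), "of", len(chunks(edited)), "chunks reused after the edit")
```

A fixed-size chunker would instead shift every chunk after the insertion point and lose all deduplication past the edit.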

For compression, lzma and lzo are used with multithreaded execution; for encryption, aes. Recent versions make it possible to delete old data from the repository.

The program is written in C++ with minimal dependencies. The author was apparently inspired by the unix way: when creating backups the program reads data from stdin, and when restoring it writes a similar stream to stdout. zbackup can therefore serve as a very good building block for writing your own backup solutions; for the author of this article, for example, it has been the main backup tool for home machines since about 2014.

Plain tar will be used as the data stream unless noted otherwise.

Let's see what the results will be:

The work was tested in two variants:

  1. a repository is created and zbackup runs on the server with the original data, then the repository contents are transferred to the backup storage server;
  2. a repository is created on the backup storage server, zbackup is launched over ssh on that server, and data is fed to it through a pipe.

The first variant gave the following results: 43m11s with an unencrypted repository and the lzma compressor, and 19m13s after switching the compressor to lzo.

The load on the server with the original data looked like this (the lzma example is shown; with lzo the picture was about the same, but rsync took about a quarter of the time):

Clearly, such a backup process only suits relatively infrequent and small changes. It is also highly desirable to limit zbackup to one thread, otherwise CPU load will be very high, since the program parallelizes very well. The disk load was small and would generally go unnoticed on a modern ssd-based disk subsystem. The start of repository synchronization to the remote server is also clearly visible; its speed is comparable to plain rsync and is limited by the disk subsystem of the backup storage server. The drawback of this approach is keeping a local repository and, as a result, duplicating the data.

More interesting and applicable in practice is the second variant, running zbackup directly on the backup storage server.

First, let's test without encryption, using the lzma compressor:

Running time of each test run:

Launch 1    Launch 2    Launch 3
39m45s      40m20s      40m3s
7m36s       8m3s        7m48s
15m35s      15m48s      15m38s

If you enable encryption using aes, the results are pretty close:

Running time on the same data, with encryption:

Launch 1    Launch 2    Launch 3
43m40s      44m12s      44m3s
8m3s        8m15s       8m12s
15m0s       15m40s      15m25s

Combining encryption with lzo compression gives the following:

Running time:

Launch 1    Launch 2    Launch 3
18m2s       18m15s      18m12s
5m13s       5m24s       5m20s
8m48s       9m3s        8m51s

The resulting repository size was roughly the same in all cases, about 13 GB, which means deduplication works correctly. On already compressed data, lzo gives a tangible effect: in total running time zbackup comes very close to duplicity/duplicati, but lags behind the librsync-based tools by a factor of 2-5.

The benefits are obvious: saving disk space on the backup storage server. As for repository verification tools, zbackup's author does not provide any; it is recommended to use a fault-tolerant disk array or a cloud provider.

Overall, a very good impression, even though the project has been stalled for about 3 years (the last feature request was about a year ago, with no response).

borgbackup testing

Borgbackup is a fork of attic, another system similar to zbackup. It is written in python and has a feature list similar to zbackup's, but can additionally:

  • Mount backups with fuse
  • Check the contents of the repository
  • Work in client-server mode
  • Use various compressors, with heuristic file-type detection when compressing.
  • Two encryption options: aes and blake2.
  • A built-in performance-testing tool:

borgbackup benchmark crud ssh://backup_server/repo/path local_dir

The results came out like this:

CZ-BIG 96.51 MB/s (10 * 100.00 MB all-zero files: 10.36s)
RZ-BIG 57.22 MB/s (10 * 100.00 MB all-zero files: 17.48s)
UZ-BIG 253.63 MB/s (10 * 100.00 MB all-zero files: 3.94s)
DZ-BIG 351.06 MB/s (10 * 100.00 MB all-zero files: 2.85s)
CR-BIG 34.30 MB/s (10 * 100.00 MB random files: 29.15s)
RR-BIG 60.69 MB/s (10 * 100.00 MB random files: 16.48s)
UR-BIG 311.06 MB/s (10 * 100.00 MB random files: 3.21s)
DR-BIG 72.63 MB/s (10 * 100.00 MB random files: 13.77s)
CZ-MEDIUM 108.59 MB/s (1000 * 1.00 MB all-zero files: 9.21s)
RZ-MEDIUM 76.16 MB/s (1000 * 1.00 MB all-zero files: 13.13s)
UZ-MEDIUM 331.27 MB/s (1000 * 1.00 MB all-zero files: 3.02s)
DZ-MEDIUM 387.36 MB/s (1000 * 1.00 MB all-zero files: 2.58s)
CR-MEDIUM 37.80 MB/s (1000 * 1.00 MB random files: 26.45s)
RR-MEDIUM 68.90 MB/s (1000 * 1.00 MB random files: 14.51s)
UR-MEDIUM 347.24 MB/s (1000 * 1.00 MB random files: 2.88s)
DR-MEDIUM 48.80 MB/s (1000 * 1.00 MB random files: 20.49s)
CZ-SMALL 11.72 MB/s (10000 * 10.00 kB all-zero files: 8.53s)
RZ-SMALL 32.57 MB/s (10000 * 10.00 kB all-zero files: 3.07s)
UZ-SMALL 19.37 MB/s (10000 * 10.00 kB all-zero files: 5.16s)
DZ-SMALL 33.71 MB/s (10000 * 10.00 kB all-zero files: 2.97s)
CR-SMALL 6.85 MB/s (10000 * 10.00 kB random files: 14.60s)
RR-SMALL 31.27 MB/s (10000 * 10.00 kB random files: 3.20s)
UR-SMALL 12.28 MB/s (10000 * 10.00 kB random files: 8.14s)
DR-SMALL 18.78 MB/s (10000 * 10.00 kB random files: 5.32s)

Testing was done with heuristic file-type detection enabled during compression (compression auto).
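The idea behind such a heuristic can be sketched as follows (an illustration of the concept, not borg's actual code): compress a small sample of a chunk first, and if the ratio is poor — as it is for already-compressed or random data — store the chunk raw:

```python
import random
import zlib

def smart_compress(chunk: bytes, sample: int = 256, threshold: float = 0.97) -> bytes:
    """Cheap probe on a sample decides whether compressing the chunk pays off."""
    probe = zlib.compress(chunk[:sample], 1)       # fast level-1 test
    if len(probe) >= len(chunk[:sample]) * threshold:
        return b"raw:" + chunk                     # incompressible: keep as-is
    return b"zlib:" + zlib.compress(chunk, 6)

text = b"the quick brown fox jumps over the lazy dog " * 100   # compresses well
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(4000))      # does not

print(smart_compress(text)[:5])    # b'zlib:'
print(smart_compress(noise)[:4])   # b'raw:'
```

Skipping compression for incompressible chunks saves CPU time without noticeably growing the repository, which matters when much of the input is media or archives.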

First, let's check the work without encryption:

Running time:

Launch 1    Launch 2    Launch 3
4m6s        4m10s       4m5s
56s         58s         54s
1m26s       1m34s       1m30s

If repository authentication is enabled (authenticated mode), the results are close:

Running time:

Launch 1    Launch 2    Launch 3
4m11s       4m20s       4m12s
1m0s        1m3s        1m2s
1m30s       1m34s       1m31s

Activating aes encryption did not worsen the results by much:

Launch 1    Launch 2    Launch 3
4m55s       5m2s        4m58s
1m0s        1m2s        1m0s
1m49s       1m50s       1m50s

And replacing aes with blake2 improves things further:

Running time:

Launch 1    Launch 2    Launch 3
4m33s       4m43s       4m40s
59s         1m0s        1m0s
1m38s       1m43s       1m40s

As with zbackup, the repository size came out at 13 GB or even slightly less, which is expected. The running time was a very pleasant surprise: it is comparable to the librsync-based solutions while providing far more capabilities. Also pleasing is the ability to set various parameters through environment variables, a serious advantage when running borgbackup unattended, as well as the load during backup: judging by CPU usage, borgbackup works in a single thread.

No particular drawbacks came up in use.

restic testing

Although restic is a fairly new solution (the first two candidates have been around since 2013 or earlier), it has quite good characteristics. It is written in Go.

Compared to zbackup, it additionally gives:

  • Checking repository integrity (including piecewise checks).
  • A huge list of supported protocols and storage providers, plus rclone support (an rsync for cloud storage).
  • Comparing two backups with each other.
  • Mounting a repository with fuse.

Overall, the feature list is quite close to borgbackup's, in places richer, in places poorer. One peculiarity: encryption cannot be disabled, so backups will always be encrypted. Let's see in practice what can be squeezed out of this software:

The results are as follows:

Running time:

Launch 1    Launch 2    Launch 3
5m25s       5m50s       5m38s
35s         38s         36s
1m54s       2m2s        1m58s

The results are again comparable to the rsync-based solutions and, in general, very close to borgbackup, but CPU load is higher (several threads at work) and sawtooth-shaped.

Most likely, the program is limited by the disk subsystem performance of the storage server, as was the case with rsync. The repository size was 13 GB, just like zbackup and borgbackup; no obvious disadvantages showed up with this solution.

The results

In fact, all candidates achieved similar results, but at different cost. borgbackup performed best; restic was a little slower. zbackup is probably not worth adopting anew, and if it is already in use, it is worth trying to replace it with borgbackup or restic.

Conclusions

Restic looks like the most promising solution, since it has the best ratio of capabilities to speed, but let's not rush to general conclusions just yet.

Borgbackup is in principle no worse, and it is probably zbackup that should be replaced. That said, zbackup can still be made to work within the 3-2-1 rule, for example in addition to (lib)rsync-based backup tools.

Announcement

Backup, part 1: Why backup is needed, an overview of methods, technologies
Backup Part 2: Reviewing and testing rsync-based backup tools
Backup Part 3: Review and testing of duplicity, duplicati
Backup Part 4: Reviewing and testing zbackup, restic, borgbackup
Backup Part 5: Testing bacula and veeam backup for linux
Backup Part 6: Comparing Backup Tools
Backup Part 7: Conclusions

Post Author: Pavel Demkovich

Source: habr.com
