Backup at the ready: busting myths in honor of the holiday

Backup is not one of those trendy technologies you hear about on every corner. It simply has to exist in any serious company. In our bank we back up several thousand servers. It is complex, interesting work, and I would like to share some of its subtleties along with the typical misconceptions about backups.

I have been working in this field for almost 20 years, the last 2 of them at Promsvyazbank. When I started, I made backups almost by hand, with scripts that simply copied files. Then convenient tools appeared in Windows: the Robocopy utility for preparing files and NT Backup for copying them. Only after that came specialized software, primarily Veritas Backup Exec, which was later sold for a number of years as Symantec Backup Exec. So backups and I go back a long way.

Put simply, backup means regularly saving a copy of data (virtual machines, applications, databases and files) just in case. That case usually turns out to be a hardware or logical failure that leads to data loss. The purpose of a backup system is to minimize that loss. A hardware failure is, for example, the failure of the server or storage that hosts a database. A logical failure is the loss or modification of part of the data, often through human error: someone accidentally deleted a table or a file, or ran a faulty script. There are also regulatory requirements to store certain types of information for a long time, up to several years.

In practice, the most common use of backups is restoring a saved copy of a database to deploy test systems and clones for developers.

There are a few typical myths about backup that should have been dispelled long ago. Here are the best known of them.

Myth 1. Backup has long been just a small function inside security or storage systems

Backup systems remain a separate and very independent class of solutions. They have too much work to do to be a side feature. In effect, they are the last line of defense for data integrity. So backup runs at its own pace and on its own schedule. A daily report is generated for the servers, and certain events act as triggers for the monitoring system.

In addition, the backup system's role-based access model lets you delegate some backup management rights to the administrators of the target systems.

Myth 2. When there is a RAID, a backup is no longer needed.

Undoubtedly, RAID arrays and data replication are a good way to protect information systems from hardware failures, and if you have a standby server, you can quickly organize switching to it in case the main machine fails.

Redundancy and replication do not protect against logical errors made by users of the system. A standby server with delayed writes can help, but only if the error is detected before it has been synchronized. And if that moment is missed? Only a timely backup will help. If you know the data was changed yesterday, you can restore the system as of the day before yesterday and extract the data you need from it. Since logical errors are the most common kind, the good old backup remains a proven and necessary tool.
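
As a rough illustration of the "restore to the day before the change" logic, the sketch below picks the most recent backup taken before the moment the data was damaged. The function name and the sample timestamps are hypothetical and are not tied to any particular backup product.

```python
from datetime import datetime
from typing import Optional


def latest_backup_before(backups: list[datetime], error_time: datetime) -> Optional[datetime]:
    """Return the newest backup strictly older than error_time, or None."""
    candidates = [b for b in backups if b < error_time]
    return max(candidates) if candidates else None


# Hypothetical nightly backups at 01:00 and a logical error found later.
backups = [datetime(2024, 3, day, 1, 0) for day in (10, 11, 12, 13)]
error_time = datetime(2024, 3, 12, 15, 30)

restore_point = latest_backup_before(backups, error_time)
print("Restore from:", restore_point)
```

If no backup predates the error, the function returns None, which is exactly the situation a sensible retention period is meant to prevent.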

Myth 3. A backup is something that is done once a month.

Backup frequency is a configurable parameter that depends primarily on the requirements placed on the backup system. You can certainly find data that almost never changes and is not particularly important, so that losing it would not be critical for the company. Such data can indeed be backed up once a month or even less often. More critical data is saved more frequently, depending on the RPO (recovery point objective), which defines the acceptable amount of data loss. That can mean once a week, once a day, or even several times an hour. For us, that last case is DBMS transaction logs.
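
The link between RPO and frequency can be sketched in a few lines. This is a simplified model that assumes the worst-case loss equals one full interval between backups and ignores how long the backup itself takes; the data classes and intervals are made-up examples.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Check a schedule against an RPO.

    In the worst case a failure hits just before the next backup, so up to
    one full interval of changes is lost.
    """
    return backup_interval <= rpo


# Hypothetical data classes and how often each one is backed up.
schedules = {
    "archive share": timedelta(days=30),
    "file server": timedelta(days=1),
    "DBMS transaction logs": timedelta(minutes=15),
}
rpo = timedelta(hours=1)

for name, interval in schedules.items():
    print(f"{name}: every {interval}, meets a 1 hour RPO: {meets_rpo(interval, rpo)}")
```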

When a system goes into production, its backup documentation must be approved. It covers the main points: the update procedure, the system recovery procedure, the backup storage procedure, and so on.

Myth 4. The volume of copies is constantly growing and takes up any allocated space completely.

Backups have a limited retention period. It makes no sense, for example, to keep all 365 daily backups for a year. As a rule, it is acceptable to keep daily copies for 2 weeks, after which they are replaced by fresh ones, while the first copy made each month goes to long-term storage. That copy, in turn, is also stored only for a certain time: every copy has a lifetime.
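
A retention scheme of this kind might look like the sketch below. The 14-day window comes from the text; the one-year lifetime for monthly copies and the function name are assumptions made only for illustration.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_days=14, monthly_days=365):
    """Return the set of backup dates that survive the retention policy."""
    first_of_month = {}
    for d in sorted(backup_dates):
        # The earliest backup seen in each (year, month) is the monthly copy.
        first_of_month.setdefault((d.year, d.month), d)
    monthly = set(first_of_month.values())

    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if age < daily_days:
            keep.add(d)  # recent daily copy
        elif d in monthly and age < monthly_days:
            keep.add(d)  # long-term monthly copy (assumed one-year lifetime)
    return keep


today = date(2024, 3, 31)
all_backups = [today - timedelta(days=n) for n in range(90)]  # 90 daily copies
kept = backups_to_keep(all_backups, today)
print(len(kept), "of", len(all_backups), "copies kept")
```

The number of retained copies is bounded by the policy, not by how long the system has been running.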

There is also protection against data loss. The rule is that the next backup must be created before an old one is deleted. So data will not be deleted if a backup did not complete, for example because the server was unavailable. Not only the retention period is enforced, but also the number of copies in the set. If the system is configured to keep two full backups, there will always be two of them, and the older one is deleted only after a new, third one has been written successfully. The growth of the backup archive is therefore tied only to the growth of the protected data and does not depend on time.
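
The "write the new copy first, delete the old one afterwards" rule can be expressed as a small function. This is a minimal sketch with hypothetical names, not the logic of any specific backup product.

```python
def prune_full_backups(existing, new_backup_ok, new_backup, copies_to_keep=2):
    """Return (backups kept, backups deleted) after a backup attempt.

    existing is ordered oldest to newest. Nothing is deleted unless the new
    backup was written successfully, so a failed run never shrinks the set.
    """
    if not new_backup_ok:
        return list(existing), []
    combined = list(existing) + [new_backup]
    excess = max(len(combined) - copies_to_keep, 0)
    return combined[excess:], combined[:excess]


# Successful run: the third copy is written, then the oldest one is removed.
kept_ok, deleted_ok = prune_full_backups(["full-01", "full-02"], True, "full-03")
print(kept_ok, deleted_ok)

# Failed run (for example, the server was unavailable): nothing is removed.
kept_fail, deleted_fail = prune_full_backups(["full-01", "full-02"], False, "full-03")
print(kept_fail, deleted_fail)
```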

Myth 5. Backup started - everything hung

It would be more accurate to say: if everything hangs, the administrator has set things up badly. In general, backup performance depends on many factors. One is the speed of the backup storage itself: how fast the disk arrays and tape libraries are. Another is the speed of the backup system's servers: whether they can keep up with processing, compressing and deduplicating the data. A third is the speed of the links between the client and the server.

A backup can run in one or more streams, depending on whether the system being backed up supports multiple streams. The Oracle DBMS, for example, lets you run several streams, up to the number of available processors, until the transfer rate hits the network bandwidth limit.

If you back up with too many streams, you may overload the running system, and it really will start to slow down. So the number of streams is chosen to give sufficient backup speed without that effect. If even the slightest drop in performance is critical, there is an excellent option: take the backup not from the production server but from its clone, a standby in database terminology. This puts no load on the main production system, and the data can be read in more streams, since the standby is not serving users.
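
A back-of-the-envelope way to choose the number of streams is to cap it both by the CPUs you are willing to give to backup and by the point where the network link saturates. The sketch below is only an estimate under assumed numbers; the function, the per-stream throughput and the "CPU share" knob are all hypothetical and would have to be measured on a real system.

```python
def pick_stream_count(cpu_count, link_mbps, per_stream_mbps, cpu_share=1.0):
    """Estimate how many parallel backup streams to run.

    cpu_share is the fraction of processors the backup may occupy:
    lower on a loaded production server, 1.0 on an idle standby.
    """
    cpu_cap = max(1, int(cpu_count * cpu_share))
    network_cap = max(1, int(link_mbps // per_stream_mbps))
    return min(cpu_cap, network_cap)


# Assumed figures: 16 CPUs, 25 Gbit/s link, about 1.5 Gbit/s per stream.
production_streams = pick_stream_count(16, 25_000, 1_500, cpu_share=0.5)
standby_streams = pick_stream_count(16, 25_000, 1_500, cpu_share=1.0)
print("production:", production_streams, "standby:", standby_streams)
```

With these assumed figures the standby can be read in twice as many streams as the production server, which is the effect described above.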

In large organizations, a separate network is built for the backup system so that backups do not affect production. In addition, backup traffic can be sent over the SAN rather than the regular network.

We also try to spread the load over time. Backups mostly run outside working hours: at night and on weekends, and they do not all start at the same time. Virtual machine backups are a special case. The process has practically no effect on the performance of the machine itself, so these backups can be spread across the day instead of being pushed into the night. There are many subtleties, but if you take them all into account, backup will not affect system performance.

Myth 6. Launched a backup system - that's fault tolerance for you

Never forget that the backup system is the last line of defense, which means there should be another five systems in front of it that provide continuity, high availability and disaster resilience for the IT infrastructure and the company's information systems.

You should not count on a backup to restore all the data and quickly bring a failed service back up. Losing the data written between the last backup and the failure is guaranteed, and loading the data onto a new server can take several hours (or days, depending on your luck). So it makes sense to build proper fault-tolerant systems rather than put everything on the backup.
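
Both effects are easy to put into numbers. The sketch below uses invented figures (a 10 TB database, a 300 MB/s restore rate, a nightly backup) and ignores log replay and consistency checks, so real recovery will take longer.

```python
from datetime import datetime


def estimate_restore_hours(data_tb, restore_mb_per_s):
    """Rough time to copy the data back, ignoring log replay and checks."""
    data_mb = data_tb * 1024 * 1024
    return data_mb / restore_mb_per_s / 3600


# Assumed scenario: nightly backup at 01:00, failure in the afternoon.
last_backup = datetime(2024, 3, 12, 1, 0)
failure = datetime(2024, 3, 12, 15, 30)
lost_window = failure - last_backup  # changes made in this window are gone

restore_hours = estimate_restore_hours(data_tb=10, restore_mb_per_s=300)
print(f"Data lost for a window of {lost_window}, restore takes about {restore_hours:.1f} h")
```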

Myth 7. I set up a backup once, checked that it works. It remains only to look at the logs

This is one of the most harmful myths, and you discover that it is false only during an incident. Logs reporting a successful backup do not guarantee that everything really went as it should. It is important to verify in advance that the saved copy can actually be deployed, that is, to run the recovery process in a test environment and look at the result.
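
For plain files, one simple way to "look at the result" of a test restore is to compare checksums recorded at backup time with those of the restored copy. The sketch below assumes such a manifest exists and uses temporary directories as stand-ins for the source and the test environment; a database additionally needs to be started and checked for consistency, which this example does not cover.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum_tree(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[str(path.relative_to(root))] = digest
    return result


def verify_restore(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Return the files that are missing or differ after a test restore."""
    restored = checksum_tree(restored_root)
    return [name for name, digest in manifest.items() if restored.get(name) != digest]


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    Path(src, "a.txt").write_text("payroll")
    Path(src, "b.txt").write_text("ledger")
    manifest = checksum_tree(Path(src))       # recorded at backup time

    Path(dst, "a.txt").write_text("payroll")  # restored correctly
    Path(dst, "b.txt").write_text("ledgre")   # restored with damage
    problems = verify_restore(manifest, Path(dst))
    print("Files that failed verification:", problems)
```

A backup job could have logged success for both files here, yet only the test restore reveals that one of them is unusable.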

And a little about the work of the system administrator

Nobody has copied data by hand for a long time. Modern backup systems can back up almost anything; you just need to configure them properly. When a new server is added, you set up its policies: select the content to be backed up, specify the storage parameters and apply a schedule.

At the same time, there is still plenty of work because of the large fleet of servers, including databases, mail systems, virtual machine clusters, and file shares on both Windows and Linux/Unix. The people who keep the backup system running are never idle.

In honor of the holiday, I wish all admins strong nerves, steady hands and endless space for storing backups!

Source: habr.com
