What has changed in Capacity Tier when Veeam became v10

Capacity Tier (or "captir", as we call it inside Veeam) appeared back in the days of Veeam Backup and Replication 9.5 Update 4 under the name Archive Tier. The idea behind it is to move backups that have fallen out of the so-called operational restore window to object storage. This helped free up disk space for users who had little of it. This option was called Move Mode.

This seemingly simple action required two conditions. First, all points in the backup being moved must be outside the operational restore window mentioned above, which is set explicitly in the UI. Second, the chain must be in the so-called sealed state (a sealed backup chain, or Inactive Backup Chain), that is, a chain that no longer changes over time.

But in VBR v10 the concept was extended with new features: Copy Mode and Sealed Mode appeared, along with a thing with the hard-to-pronounce name Immutability.

These are the fascinating things we'll talk about today. First about how it worked in VBR9.5u4, and then about the changes in the tenth version.

And forgive me, champions of a pure language, but there are too many terms that cannot be translated.
So there will be a lot of anglicisms here.
And a lot of gifs.
And pictures.

  • Without the slightest regret. The author of the article.

How it was

Let's start by looking at the operational restore window and the sealed backup chain (or the Inactive Backup Chain, as the documentation calls it). Without understanding them, the rest of the explanation will not work.

As we can see in the picture, we have a backup chain with data blocks located on the Performance Tier of a SOBR (Scale-Out Backup Repository) that has a Capacity Tier attached. Our operational restore window is three days.

Accordingly, the .vbk created on Monday seals the previous chain, whose window is set to three days. Therefore, everything older than these three days can safely be taken to the capacity tier.

But what exactly counted as a sealed chain, and what could be sent to the capacity tier in Update 4?

For Forward Incremental, the sign that a chain is sealed is the creation of a new full backup. It does not matter how this full is produced: both synthetic full and active full backups count.

For Reverse Incremental, these are all files that fall outside the operational restore window.

For Forward Incremental with rollbacks, these are all the rollbacks and the .vbk, provided there is another .vbk on the performance extent.

Now let's consider Backup Copy chains. Only points that fall under GFS retention were moved here, because everything stored in the more recent part of a backup copy chain can still change in one way or another.

Now let's look under the hood. There, a process called dehydration takes place: empty backup files are left on the extent, and the blocks from these files are moved to the capacity tier. To optimize this process, the so-called dehydration index is used, which avoids copying blocks that have already been copied to the capacity tier.

Let's look at an example. Suppose we have a .vbk that has left the operational restore window and belongs to a sealed chain. This means we have every right to move it to the capacity tier. At the time of the move, a metadata file and the blocks of the moved file are created in the capacity tier. The metadata file describes, as a set of links, which blocks the file consists of. In the case in the picture, our first file consists of blocks a, b and c, and links to these blocks are placed in its metadata. When a second .vbk file consisting of blocks a, b and d is ready to be moved, we consult the dehydration index and see that only block d needs to be moved. Its metadata file will contain links to the two existing blocks and one new one.
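The a, b, c / a, b, d example can be replayed as a tiny model. This is only a sketch under assumptions: the function name `dehydrate`, the use of a plain set as the dehydration index and a dict as the object storage are all made up for illustration and say nothing about how VBR stores these structures internally.

```python
def dehydrate(file_blocks, dehydration_index, object_store):
    """Move one backup file: upload only the blocks the capacity tier lacks.

    file_blocks       -- dict block_id -> payload for the file being moved
    dehydration_index -- set of block ids already in the capacity tier (hypothetical)
    object_store      -- dict standing in for the object storage (hypothetical)
    """
    uploaded = []
    for block_id, payload in file_blocks.items():
        if block_id not in dehydration_index:
            object_store[block_id] = payload
            dehydration_index.add(block_id)
            uploaded.append(block_id)
    # The metadata file links to every block of the file, new or old.
    metadata = list(file_blocks)
    return metadata, uploaded


index, store = set(), {}
meta1, up1 = dehydrate({"a": b"A", "b": b"B", "c": b"C"}, index, store)
meta2, up2 = dehydrate({"a": b"A", "b": b"B", "d": b"D"}, index, store)
print(up1, up2, meta2)
```

For the second file only block d is uploaded, while its metadata still links to a, b and d.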

Accordingly, the process of filling these empty files back with data is called rehydration. It uses its own rehydration index, built from the oldest .vbk file on the local performance extent. That is, if the user wants to bring a file back from the capacity tier, we first build a block index of the oldest full backup and transfer only the missing blocks from the capacity tier. In the case shown in the picture, to rehydrate FullBackup1.vbk we lack only block c according to the rehydration index, so that is the only block we take from the capacity tier. If the capacity tier is a cloud object storage, this saves a lot of money.
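The reverse direction can be sketched the same way. Again, this is an illustration under assumptions: `rehydrate`, the dict-based index and the sample block contents are hypothetical names chosen for the example, not VBR internals.

```python
def rehydrate(metadata, oldest_local_full, object_store):
    """Rebuild a dehydrated file, downloading only blocks missing locally.

    metadata          -- list of block ids the file consists of
    oldest_local_full -- dict block_id -> payload of the oldest local .vbk
    object_store      -- dict standing in for the capacity tier (hypothetical)
    """
    # The rehydration index is built from the oldest local full backup.
    rehydration_index = dict(oldest_local_full)
    restored, downloaded = {}, []
    for block_id in metadata:
        if block_id in rehydration_index:
            restored[block_id] = rehydration_index[block_id]
        else:
            restored[block_id] = object_store[block_id]
            downloaded.append(block_id)
    return restored, downloaded


capacity_tier = {"a": b"A", "b": b"B", "c": b"C", "d": b"D"}
local_full = {"a": b"A", "b": b"B", "d": b"D"}
restored, downloaded = rehydrate(["a", "b", "c"], local_full, capacity_tier)
print(downloaded)
```

Only block c travels from the object storage; a and b are taken from the local full backup.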

It may seem that this technology is identical to the one used in WAN Accelerators, but that is only how it looks. In the accelerators deduplication is global, while here local deduplication is used within each file at a certain offset. This comes from the difference in the tasks being solved: here we need to copy large full backup files, and according to our research, this deduplication algorithm gives the best result even if a long time passes between them.

But more indexes for the god of indexes! There is also an index for data recovery. When we start restoring a machine located in the capacity tier, we read from it only the unique data blocks that are not in the performance tier.

How it became

That's all with the introductory part. It is quite detailed, but as mentioned above, without these details it will not be possible to explain how the new features work. Therefore, without further ado, let's move on to the first one.

Copy Mode

It is largely based on existing technologies, but it carries a completely different usage logic. 

The purpose of this mode is to ensure that all data located on the local extent has a copy in the capacity tier.

If we compare the Move and Copy modes head-on, it will turn out like this:

  • Only a sealed chain can be moved. In copy mode, absolutely everything is copied, regardless of what is happening in the backup job.
  • Moving is triggered when files go beyond the operational restore window, while copying is triggered as soon as a backup file appears.
  • New data to copy is tracked constantly, while for moving the check ran once every 4 hours.
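The difference between the two triggers fits into one small decision function. This is a minimal sketch, assuming a hypothetical `BackupFile` record and a simple "older than the window" comparison; the real scheduler and its data model are not described in this article.

```python
from dataclasses import dataclass


@dataclass
class BackupFile:
    age_days: int         # how long ago the file was created
    sealed: bool          # True if it belongs to a sealed (inactive) chain
    copied: bool = False  # True if its blocks are already in the capacity tier


def next_action(f, window_days, copy_enabled, move_enabled):
    """Return what the capacity tier would do with a file (illustrative only)."""
    # Copy mode: fires as soon as the file exists, sealed or not.
    if copy_enabled and not f.copied:
        return "copy"
    # Move mode: only sealed chains outside the operational restore window.
    if move_enabled and f.sealed and f.age_days > window_days:
        return "move"
    return "none"


print(next_action(BackupFile(0, sealed=False), 3, True, False))
print(next_action(BackupFile(5, sealed=True), 3, False, True))
print(next_action(BackupFile(1, sealed=True), 3, False, True))
```

A fresh file in an active chain is copied right away, while move mode touches a file only once it is both sealed and older than the window.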

In considering the new mode, I propose to go from simple examples to complex ones.

In the most trivial case, we simply have new files with increments, and we simply copy them to the capacity tier. It does not matter which mode the backup job uses, whether the file belongs to the sealed part of the chain, or whether our operational window has expired. We just take it and copy it.

The process behind this is still dehydration as described above. In copy mode, it also makes sure that we do not copy blocks that are already on our storage. The only difference is that if in the move mode we replaced real files with dummy files, here we do not touch them in any way and leave everything as it is. Otherwise, this is exactly the same dehydration index, which carefully tries to save you money and time.

A question arises: the UI lets you select both options at the same time. How would such a combined mode work?

Let's figure it out.

The beginning is standard: a backup file is created and immediately copied. An increment is created for it and copied as well. This continues until the files have left our operational window and a sealed chain has appeared. At this point, we perform dehydration and replace these files with dummy files. Of course, nothing is copied to the capacity tier a second time.

Only one checkbox in the interface is responsible for all this fascinating logic: Copy backups to object storage as soon as they are created.

Why do we need this Copy mode?

It is even better to rephrase the question like this - what risks do we protect against with its help? What problem does it help us solve?

The answer is obvious: data recovery, of course. If we have a complete copy of the local data in object storage, then no matter what happens to our production, we can always recover data from the files located in, say, Amazon.

So let's go through the possible scenarios, from the simplest to the more complex.

The simplest misfortune that can fall on our heads is the inaccessibility of one of the files in the backup chain.

A sadder story - we broke one of the extents of our SOBR repository.

It gets even worse when the entire SOBR becomes unavailable, but the capacity tier still works.
And things are really bad when the backup server dies and your first impulse is to try to reach the Canadian border in ten minutes.

Now let's look at each situation separately.

When we have lost one backup file (or even several), it is enough to start a rescan of the repository, and the lost file will be replaced by a dummy file. Then, with the help of rehydration (discussed at the beginning of the article), the user can download the data from the capacity tier to local storage.

Now a more complicated situation. Let's assume that our SOBR consists of two extents running in Performance mode, which means that our .vbk and .vib files are spread across them rather unevenly. At some point one of the extents becomes inaccessible, and the user urgently needs to restore a machine, part of whose data lies precisely on this extent.

The user launches the recovery wizard and selects the restore point. In the process, the wizard realizes that not all the data needed for recovery is available locally, so the missing part must be downloaded from the capacity tier. The blocks that remain in local storage are not downloaded from the cloud. Glory to the restore index (yes, it was also mentioned at the beginning of the article).
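What the restore index buys us here can be shown with a rough read plan. The function `plan_restore`, the extent names and the block layout are hypothetical; the sketch only illustrates the rule "read locally what is still reachable, download the rest".

```python
def plan_restore(needed_blocks, extents, offline):
    """Split the blocks of a restore point into local reads and cloud downloads.

    needed_blocks -- block ids the restore point consists of
    extents       -- dict extent name -> set of block ids stored on it
    offline       -- set of extent names that are currently unavailable
    """
    reachable = set()
    for name, blocks in extents.items():
        if name not in offline:
            reachable |= blocks
    from_local = [b for b in needed_blocks if b in reachable]
    from_capacity = [b for b in needed_blocks if b not in reachable]
    return from_local, from_capacity


extents = {"ext1": {"a", "b"}, "ext2": {"c", "d"}}
needed = ["a", "b", "c", "d"]
one_down = plan_restore(needed, extents, offline={"ext2"})
all_down = plan_restore(needed, extents, offline={"ext1", "ext2"})
print(one_down, all_down)
```

With one extent down, only its blocks come from the cloud; with the whole SOBR down, every block does.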

A variation of this case is when the entire SOBR has become unavailable. Then there is nothing to read from local storage, and all blocks are downloaded from the cloud.

And the most interesting situation is when the backup server has died. There are two options here: either the admin did a good job and made configuration backups, or the admin is his own worst enemy and did not.

In the first case, it will be enough for him to simply deploy a clean installation of VBR somewhere and restore its database from a backup using regular means. At the end of this process, everything will return to normal. Or it will be restored according to one of the scenarios above.

But if the admin is his own worst enemy, or the configuration backup also suffered an epic failure, we will not abandon him to his fate even then. For this case, we have introduced a new procedure called Import Object Storage. It lets you skip manually recreating the SOBR, attaching the capacity tier to it and rescanning: you simply add the object storage to the Veeam interface and run the import procedure. The only thing that can stand between you and your backups is a password prompt, if your backups were encrypted.

That is probably all about Copy Mode, so we move on to

Sealed Mode

The main idea is that new backups cannot appear on the selected SOBR extent. Before v10, we only had Maintenance Mode, in which any work with the repository was completely prohibited. It is a sort of hardcore storage decommissioning mode, where only the Evacuate button is available, which moves the backups to another extent in one go.

Sealed Mode is a kind of "soft" option: we prohibit creating new backups and gradually delete old ones according to the chosen retention, but we do not lose the ability to restore from the stored points. This is very useful when a piece of hardware is reaching the end of its life and will need to be replaced, or when it simply needs to be freed up for something more important, and there is nowhere to move everything at once, or the data simply cannot be deleted.

Accordingly, the principle of operation is quite simple: all write operations (the appearance of new data) are prohibited, while read (restores) and delete (retention) remain allowed.

Both modes can be used simultaneously, but keep in mind that Maintenance has a higher priority.
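The interplay of the two modes boils down to a small permission table. This is only a sketch with hypothetical names (`allowed_operations` and the operation labels are invented for the example); it encodes nothing beyond the rules stated above.

```python
def allowed_operations(sealed, maintenance):
    """Backup operations permitted on an extent (illustrative model only)."""
    if maintenance:
        # Maintenance has the higher priority: no backup work at all,
        # only evacuation of the extent remains available.
        return set()
    if sealed:
        # Sealed: no new data, but restores and retention still work.
        return {"read", "delete"}
    return {"write", "read", "delete"}


print(allowed_operations(sealed=True, maintenance=False))
print(allowed_operations(sealed=True, maintenance=True))
```

With both flags set, the result is the same as for Maintenance alone, which is what "higher priority" means here.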

As an example, consider a SOBR consisting of two extents. Let's assume that for the first four days we created backups in Forever Forward Incremental mode, and then we seal the extent. This triggers the creation of a new full backup on the second, available extent. If our retention is four points, then once the entire chain on the sealed extent falls outside it, the chain is removed with a clear conscience.

There are situations where deletion occurs earlier, for example with Forward Incremental with periodic fulls. If we created full backups on the first two days and decide to seal the repository on Thursday, then on Friday, when a new full backup is created, the Monday file will be deleted: nothing depends on this point, and the point itself depends on nothing. After that, we wait until four points are created on the available extent and delete the remaining three, which cannot be deleted independently of each other.

Things are easier with Reverse Incremental. There, the oldest points do not depend on anything and can be safely deleted. Therefore, as soon as a new .vbk is created on the new extent, the old .vrb files are deleted one by one.

By the way, here is why we create a new .vbk each time: if we continued the old chain of increments instead, the old .vbk would hang around indefinitely in any mode and could never be deleted. Therefore, it was decided that as soon as an extent is sealed, we create a full backup on a free extent.

Things are more complicated with the capacity tier.

Let's look at copy mode first. Suppose we actively created backups for four days, and then the capacity tier was sealed. We do not delete anything right away but humbly wait out the retention, after which we delete the data from the capacity tier.

Approximately the same thing happens in move mode: we wait out the retention, delete the old data in local storage, and delete what is stored in object storage.

An interesting example is Forever Forward Incremental. We set a three-point retention and start making backups on Monday, which are regularly copied to the cloud. After the storage is sealed, backups continue to be created, keeping three points, but the data stored in the capacity tier remains dependent and cannot be deleted. Therefore, we wait until Thursday, when our .vbk falls outside the retention, and only then calmly delete the entire stored chain.

And a small caveat: all examples here are shown with one machine. If you have several of them in your backup, their retention will differ depending on whether an Active Full was made or not.

That is basically all. So let's move on to the most hardcore feature:

Immutability

As with the previous features, we start with the problem this one solves. As soon as we send our backups somewhere for storage, there is an acute desire to guarantee their safety, that is, to physically prohibit their deletion and any modification during a given retention period. This applies to admins too, even under their root accounts. It protects the backups from accidental or deliberate damage. Those who work with AWS may have come across a similar feature called Object Lock.

Now let's look at the mode in general terms and then delve into the details. In our example, Immutability is enabled for our capacity tier with a four-day retention, and copy mode is enabled for the backup.

Immutability has nothing to do with general retention. For example, it doesn't add extra points or anything like that. It's just that for four days a person cannot delete backup files. If you make a backup on Monday, then you can delete its file only on Friday.
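The Monday-to-Friday arithmetic is easy to check. A minimal sketch: the helper `deletable_from` and the concrete calendar date are assumptions picked for the example, the article itself only names the weekdays.

```python
from datetime import date, timedelta


def deletable_from(created, immutability_days):
    """First day on which a locked backup file may be deleted."""
    return created + timedelta(days=immutability_days)


monday = date(2020, 3, 2)  # an arbitrary Monday, chosen only for illustration
unlock_day = deletable_from(monday, 4)
print(unlock_day.isoformat(), unlock_day.weekday())  # weekday() == 4 is Friday
```

The lock only blocks deletion for those four days; it does not add restore points or change the job's retention.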

All the previously explained concepts of dehydration, indexes and metadata continue to work in exactly the same way, with one condition: the lock is set not only on data but also on metadata. This is done in case an insidious attacker decides to wipe our metadata, so that the data blocks do not turn into a useless binary mess.

And now is a great time to explain our Block Generation technology. To do this, consider the situation that led to its appearance.

Let's take a six-day timeline and mark along the bottom when immutability is expected to expire. On the first day we create File1, consisting of data block a and its metadata. If immutability is set to three days, it is logical to assume that on the fourth day the data will be unlocked and can be deleted. On the second day we add a new File2 consisting of block b with the same settings. Block a is still due for removal on the fourth day. But on the third day a terrible thing happens: File3 is created, consisting of a new block d and a link to the old block a. This means the immutability flag of block a must be set again for a new period, which shifts to the sixth day. And here a problem arises: real backups contain a huge number of such blocks, and extending their immutability period takes a huge number of requests every time. In fact, this becomes a nearly endless daily process, since with high probability every copy will contain hefty packs of deduplicated blocks. And what does a large number of requests mean to object storage providers? Right! A huge bill at the end of the month.

And so as not to land our favorite customers with a serious bill out of the blue, the block generation mechanism was invented. This is an additional period that we add to the configured immutability period. In the example below, this period is two days, but that is just for the example. In reality, a dedicated formula is used, which gives approximately ten additional days for a monthly lock.

Let's continue with the same situation, but with block generation. On the first day we create File1 from block a and metadata. We add the generation period to the immutability period, which means the file can be deleted on the sixth day. If on the second day we create File2, consisting of block b and a link to block a, nothing happens to the expected deletion date: it was the sixth day and it stays the sixth day. This is how we save on the number of requests. The only situation in which the date can shift is when the generation period has expired. That is, if on the third day the new File3 contains a link to block a, generation 2 is assigned, since Gen1 has already expired, and the expected deletion date of block a shifts to the eighth day. This drastically reduces the number of requests needed to extend the lifetime of deduplicated blocks, which saves customers a ton of money.
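The saving can be replayed with a toy simulation. It rests on loud assumptions: the way a day is mapped to a generation here is simply the easiest rule that reproduces the numbers in this example (three days of immutability, two days of generation), not the real formula, and `simulate` is a hypothetical name.

```python
def simulate(references, immutability_days, generation_days=0):
    """Count lock requests for a sequence of (day, block_id) references.

    With generation_days == 0 every reference pushes the lock to
    day + immutability. Otherwise all references made within one
    generation share a single expiry date (assumed rule, see above).
    """
    expiry, requests = {}, 0
    for day, block in references:
        if generation_days:
            gen_start = day - (day - 1) % generation_days
            wanted = gen_start + generation_days + immutability_days
        else:
            wanted = day + immutability_days
        if expiry.get(block, 0) < wanted:
            expiry[block] = wanted  # one set/extend request to the storage
            requests += 1
    return expiry, requests


# File1 = a; File2 = b plus a link to a; File3 = d plus a link to a
refs = [(1, "a"), (2, "b"), (2, "a"), (3, "d"), (3, "a")]
plain = simulate(refs, immutability_days=3)
with_generations = simulate(refs, immutability_days=3, generation_days=2)
print(plain, with_generations)
```

Without generations, block a is re-locked on every reference. With them, its date stays on day 6 through day 2 and moves to day 8 only when generation 2 starts. On three files the difference is a single request, but it grows with every deduplicated block that gets referenced again.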

The technology is available to users of S3 and of S3-compatible storage whose vendors guarantee that their implementation does not differ from Amazon's. Hence the answer to the legitimate question of why Azure is not supported: it has a similar feature, but it works at the container level, not on individual objects. By the way, Amazon itself has two Object Lock modes: compliance and governance. In governance mode, it remains possible for the greatest admin over admins and root over roots to delete the data despite the object lock. In compliance mode, everything is nailed down tightly and backups cannot be deleted by anyone, not even Amazon admins (according to their official statements). This is the mode we support.

Source: habr.com
