FAST VP on Unity storage: how it works

Today we'll talk about an interesting technology implemented in the Unity / Unity XT storage systems: FAST VP. If this is the first time you've heard of Unity, you can use the link at the end of the article to get acquainted with the system's characteristics. I worked on FAST VP for over a year as part of the Dell EMC project team, and today I want to discuss this technology in more detail and reveal some details of its implementation (only those that are allowed to be revealed, of course). If you are interested in efficient data storage, or simply haven't fully digested the documentation, this article should prove useful and interesting.


I'll say right away what this material will not contain. There will be no hunting for competitors or comparisons with them. I also don't plan to cover similar open-source technologies, since the curious reader already knows about them. And, of course, I'm not going to advertise anything.

Storage tiering. Goals and objectives of FAST VP

FAST VP stands for Fully Automated Storage Tiering for Virtual Pool. Sounds complicated? Don't worry, we'll figure it out. Tiering is a way of organizing data storage in which there are several levels (tiers) where the data is kept. Each tier has its own characteristics, the most important being performance, capacity and the price of storing a unit of information. Naturally, there is a relationship between them.

An important feature of tiering is that access to data is provided uniformly regardless of which storage tier the data currently resides on, and the size of the pool is equal to the sum of the sizes of the resources included in it. This is where it differs from a cache: a cache's size is not added to the total capacity of the resource (the pool in this case), and cache data duplicates some piece of data on the main medium (or will duplicate it once the data in the cache is written back). Also, the distribution of data across tiers is hidden from the user: you don't see exactly which data sits on which tier, although you can influence it indirectly by setting policies (more on those later).

Now let's look at how storage tiering is implemented in Unity. Unity has three levels, or tiers:

  • Extreme performance (SSDs)
  • Performance (SAS HDD 10k/15k RPM)
  • Capacity (NL-SAS HDD 7200RPM)

They are listed in descending order of performance and price. The Extreme Performance tier includes only solid-state drives (SSDs). The other two tiers consist of magnetic disk drives that differ in rotation speed and, accordingly, in performance.

Storage media of the same tier and the same size are combined into RAID arrays, forming RAID groups (abbreviated as RG); the available and recommended RAID levels are described in the official documentation. Storage pools are built from RAID groups of one or more tiers, and free space is then allocated from those pools. Finally, space for file systems and LUNs is carved out of the pool.

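To make this hierarchy concrete, here is a minimal Python sketch of the relationship between tiers, RAID groups and a pool. The class and field names are mine, not Unity internals; the point is simply that, unlike a cache, every RAID group contributes its full size to the pool's capacity.

    from dataclasses import dataclass, field
    from enum import Enum

    class Tier(Enum):
        EXTREME_PERFORMANCE = "SSD"
        PERFORMANCE = "SAS HDD 10k/15k RPM"
        CAPACITY = "NL-SAS HDD 7200 RPM"

    @dataclass
    class RaidGroup:
        tier: Tier
        capacity_gb: int              # usable capacity after RAID overhead

    @dataclass
    class Pool:
        raid_groups: list = field(default_factory=list)

        @property
        def capacity_gb(self) -> int:
            # The pool size is the sum of its RAID groups, not a cache in front of them
            return sum(rg.capacity_gb for rg in self.raid_groups)

    pool = Pool([
        RaidGroup(Tier.EXTREME_PERFORMANCE, 1_600),
        RaidGroup(Tier.PERFORMANCE, 7_200),
        RaidGroup(Tier.CAPACITY, 24_000),
    ])
    print(pool.capacity_gb)           # 32800, space for LUNs and file systems comes from here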

Why is tiering needed?

In short and abstractly: to achieve a better result with fewer resources. More specifically, the result is usually understood as a set of storage system characteristics: access speed and latency, the cost of storage, and others. The minimum of resources means the lowest costs: money, energy, and so on. FAST VP implements the mechanisms for redistributing data across the different tiers of a Unity / Unity XT storage system. If you take my word for it, you can skip the next paragraph. For everyone else, I'll explain a little more.

By tiering data properly, you can save on the overall cost of storage by sacrificing access speed for rarely used information, and improve performance by moving frequently accessed data to faster media. Someone may object here that even without tiering a competent admin knows where to put which data and what storage characteristics are desirable for a given task. That's true, but distributing data "manually" has its drawbacks:

  • it requires the administrator's time and attention;
  • it is not always possible to "reshape" storage resources as conditions change;
  • an important advantage disappears: unified access to resources located on different storage tiers.

To keep storage admins from worrying about job security, I'll add that competent resource planning is still necessary here. Now that the goals of tiering have been briefly outlined, let's see what to expect from FAST VP. It's time to return to the definition. The first two words, Fully Automated, mean exactly that: the distribution of data across tiers happens automatically. Virtual Pool is a data pool that includes resources from different storage tiers. Here's what it looks like:

[Figure: a pool built from resources of different storage tiers]

Looking ahead, I'll note that FAST VP moves data only within a single pool, not between pools.

Tasks solved by FAST VP

Let's talk abstractly first. We have a pool and some mechanism that can redistribute data within it. Keeping in mind that our goal is maximum performance, let's ask: how can it be achieved? There may be several ways, and FAST VP has something to offer the user here, since the technology is more than just storage tiering. Here are the ways FAST VP can increase pool performance:

  • Distribution of data across different types of disks (tiers)
  • Distribution of data among disks of the same type
  • Distribution of data when expanding the pool

Before looking at how these tasks are accomplished, we need a few essential facts about how FAST VP works. FAST VP operates on blocks of a fixed size: 256 megabytes. This is the smallest contiguous "chunk" of data that can be moved; in the documentation it is called a slice. From FAST VP's point of view, every RAID group consists of a set of such slices, and all I/O statistics are accumulated per slice. Why was this block size chosen, and will it ever be reduced? The block is quite large, but it is a compromise between data granularity (a smaller block means more precise placement) and the available computing resources: given the tight limits on RAM and the large number of blocks, the statistics could take up too much memory, and the number of calculations would grow proportionally.
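To illustrate the slice granularity, here is a small Python sketch that maps byte offsets onto 256 MB slices and accumulates per-slice read and write counters. The structure is hypothetical; the statistics FAST VP actually keeps are certainly richer.

    SLICE_SIZE = 256 * 1024 * 1024  # 256 MB, the granularity FAST VP operates with

    class SliceStats:
        """Hypothetical per-slice I/O counters, the raw material for 'temperature'."""
        def __init__(self):
            self.reads = 0
            self.writes = 0

        def record_io(self, is_write: bool):
            if is_write:
                self.writes += 1
            else:
                self.reads += 1

    def slice_index(offset_bytes: int) -> int:
        """Map a byte offset within a pool resource to its 256 MB slice number."""
        return offset_bytes // SLICE_SIZE

    # Example: three I/Os that land in two different slices
    stats = {}
    for offset, is_write in [(0, False), (100 * 2**20, True), (300 * 2**20, False)]:
        stats.setdefault(slice_index(offset), SliceStats()).record_io(is_write)

    print({idx: (s.reads, s.writes) for idx, s in stats.items()})  # {0: (1, 1), 1: (1, 0)}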

How FAST VP places data in the pool. Policies

To control the placement of data in a pool with FAST VP enabled, there are the following policies:

  • Highest Available Tier
  • Auto Tier
  • Start High then Auto-Tier (default)
  • Lowest Available Tier

The policy affects both the initial allocation of a block (when the data is first written) and its subsequent relocation. Once the data is already on disk, relocation is started on a schedule or manually.

Highest Available Tier tries to place new blocks on the highest-performing tier. If there is not enough space there, the next tier down is used, but the data can later be moved up to a more performant tier (when space frees up or by displacing other data). Auto-Tier places new data on different tiers depending on the available space and redistributes it based on demand and free space. Start High then Auto-Tier is the default and recommended policy: it initially works like Highest Available Tier and then moves data based on usage statistics. The Lowest Available Tier policy seeks to place data on the least performant tier.
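To make the difference between the policies more tangible, here is a hedged Python sketch of the initial placement of a new slice. The tier names, the free-slice counters and the "most free space first" reading of Auto Tier are my simplifications, not the actual Unity algorithm.

    from enum import Enum

    class Policy(Enum):
        HIGHEST_AVAILABLE = "Highest Available Tier"
        AUTO = "Auto Tier"
        START_HIGH_THEN_AUTO = "Start High then Auto-Tier"   # the default
        LOWEST_AVAILABLE = "Lowest Available Tier"

    # Tiers ordered from fastest to slowest, with free slices in each (invented numbers)
    ORDER = ["extreme_performance", "performance", "capacity"]
    free_slices = {"extreme_performance": 10, "performance": 200, "capacity": 1000}

    def place_new_slice(policy: Policy) -> str:
        """Pick a tier for a freshly written slice (initial allocation only)."""
        if policy in (Policy.HIGHEST_AVAILABLE, Policy.START_HIGH_THEN_AUTO):
            order = ORDER                       # try the fastest tier first
        elif policy is Policy.LOWEST_AVAILABLE:
            order = list(reversed(ORDER))       # try the slowest tier first
        else:                                   # Auto Tier: simplified to "most free space first"
            order = sorted(ORDER, key=lambda t: free_slices[t], reverse=True)
        for tier in order:
            if free_slices[tier] > 0:
                free_slices[tier] -= 1
                return tier
        raise RuntimeError("the pool is full")

    print(place_new_slice(Policy.START_HIGH_THEN_AUTO))  # 'extreme_performance'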

Data relocation runs at low priority so as not to interfere with the useful work of the storage system; there is, however, a "Data relocation rate" setting that changes this priority. There is one peculiarity: not all data blocks are relocated in the same order. For example, blocks marked as metadata are moved to a faster tier first. Metadata is, so to speak, "data about data": additional information that is not user data but describes it, for example the file-system information about which blocks a particular file occupies. This means that the speed of access to data depends on the speed of access to metadata. Since metadata usually takes up far less space, the benefit of moving it to faster disks is expected to be greater.
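A tiny sketch of what such an ordering might look like: metadata slices are promoted ahead of user data, and within user data the hottest slices go first. This is only an illustration of the idea; the real relocation queue is not described publicly in this much detail.

    from dataclasses import dataclass

    @dataclass
    class RelocationCandidate:
        slice_id: int
        temperature: float
        is_metadata: bool       # e.g. file-system mapping info rather than user data

    def relocation_order(candidates):
        """Metadata slices first, then user data from hottest to coldest."""
        return sorted(candidates, key=lambda c: (not c.is_metadata, -c.temperature))

    queue = [
        RelocationCandidate(1, temperature=80.0, is_metadata=False),
        RelocationCandidate(2, temperature=15.0, is_metadata=True),
        RelocationCandidate(3, temperature=55.0, is_metadata=False),
    ]
    print([c.slice_id for c in relocation_order(queue)])  # [2, 1, 3]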

The criteria FAST VP uses in its work

Very roughly, the main criterion for each block is how much the data is "in demand", which depends on the number of read and write operations on that fragment. This characteristic is called "temperature": frequently accessed data is hot, unclaimed data is cold. It is recalculated periodically, by default once an hour.

The temperature calculation function has the following properties:

  • In the absence of I/O, data "cools down" over time.
  • Under a more or less constant load, the temperature first rises and then stabilizes within a certain range (a minimal sketch of an update rule with these properties follows this list).
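One simple function with both of these properties is an exponentially weighted moving average of the I/O count per period. To be clear, this is my own minimal sketch, not the actual formula FAST VP uses; the decay factor and the scale of the values are arbitrary.

    DECAY = 0.5  # how much weight the previous temperature keeps each period (arbitrary)

    def update_temperature(prev_temp: float, ios_this_period: int) -> float:
        """Exponentially weighted update: cools with no I/O, stabilizes under a steady load."""
        return DECAY * prev_temp + (1 - DECAY) * ios_this_period

    # Without I/O the slice cools toward zero
    t = 100.0
    for _ in range(5):
        t = update_temperature(t, 0)
    print(round(t, 2))  # 3.12

    # Under a steady load of 80 I/Os per period the temperature converges to ~80
    t = 0.0
    for _ in range(20):
        t = update_temperature(t, 80)
    print(round(t, 2))  # 80.0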

Beyond temperature, the policies described above and the free space on each tier are taken into account. For clarity, here is a picture from the documentation, where red, yellow and blue indicate blocks with high, medium and low temperature, respectively.

[Figure: blocks of high (red), medium (yellow) and low (blue) temperature distributed across the tiers]

Now back to the tasks, and to how FAST VP goes about solving each of them.

A. Distribution of data across different types of disks (tiers)

This is, in fact, the main task of FAST VP; the others are, in a sense, derived from it. Depending on the selected policy, data is distributed across the storage tiers. The placement policy is considered first, then the temperature of the blocks and the size and speed of the RAID groups.

For the Highest/Lowest Available Tier policies everything is quite simple. For the other two it works like this: data is distributed across the tiers taking into account the size and performance of the RAID groups, so that the ratio of the total "temperature" of the blocks to the "conditional maximum performance" of each RAID group is approximately the same. This spreads the load more or less evenly: data that is more in demand moves to faster media, less frequently used data to slower media. Ideally, the distribution should look something like this:

[Figure: ideal distribution of hot and cold blocks across the tiers]
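As a rough illustration of the "equal temperature-to-performance ratio" idea, here is a Python sketch: each RAID group gets a target share of the total temperature proportional to its performance weight, and the hottest slices are assigned to the fastest groups until that target is reached. The performance weights are made up, and capacity limits and free space are ignored for brevity; this is not the actual FAST VP algorithm.

    # Hypothetical RAID groups with relative performance weights (not real Unity numbers)
    raid_groups = {"ssd_rg": 10.0, "sas_rg": 3.0, "nlsas_rg": 1.0}

    def plan_relocation(slice_temps: dict) -> dict:
        """Assign hot slices to fast groups so that each group's share of the total
        temperature is roughly proportional to its performance weight."""
        total_temp = sum(slice_temps.values())
        total_perf = sum(raid_groups.values())
        # How much "temperature" each group should carry
        targets = {rg: total_temp * perf / total_perf for rg, perf in raid_groups.items()}

        plan = {rg: [] for rg in raid_groups}
        carried = {rg: 0.0 for rg in raid_groups}
        order = sorted(raid_groups, key=raid_groups.get, reverse=True)  # fastest first
        g = 0
        for name, temp in sorted(slice_temps.items(), key=lambda s: s[1], reverse=True):
            # Move on to the next (slower) group once this one has reached its target
            while g < len(order) - 1 and carried[order[g]] >= targets[order[g]]:
                g += 1
            plan[order[g]].append(name)
            carried[order[g]] += temp
        return plan

    temps = {f"slice{i}": t for i, t in enumerate([95, 90, 80, 40, 30, 20, 8, 5, 2, 1])}
    print(plan_relocation(temps))
    # {'ssd_rg': ['slice0', 'slice1', 'slice2'], 'sas_rg': ['slice3', 'slice4', 'slice5'],
    #  'nlsas_rg': ['slice6', 'slice7', 'slice8', 'slice9']}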

B. Distribution of data among disks of the same type

Remember how at the beginning I wrote that storage media from one or more tiers are combined into a single pool? Even with a single tier, FAST VP has work to do. To get the most out of any tier, it is desirable to spread data evenly across the disks; in theory, this yields the maximum number of IOPS. Data within a RAID group can be considered evenly distributed across its disks, but between RAID groups this is not always the case. When an imbalance appears, FAST VP moves data between RAID groups in proportion to their size and "conditional performance" (expressed numerically). For clarity, here is the rebalancing scheme for three RAID groups:

[Figure: rebalancing data across three RAID groups of the same tier]
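Here is a minimal sketch of that rebalancing, assuming for simplicity that the groups have equal "conditional performance", so occupancy should simply be proportional to capacity. The numbers are invented; a positive value means a group should donate slices, a negative one that it should receive them.

    # Slices currently sitting on three RAID groups of the same tier,
    # and each group's capacity in slices (invented numbers)
    current = {"rg1": 900, "rg2": 300, "rg3": 0}       # rg3 is empty, e.g. just added
    capacity = {"rg1": 1000, "rg2": 1000, "rg3": 1000}

    def rebalance_moves(current: dict, capacity: dict) -> dict:
        """How many slices each group should give up (+) or receive (-)
        so that occupancy becomes proportional to capacity."""
        total = sum(current.values())
        total_cap = sum(capacity.values())
        targets = {rg: round(total * capacity[rg] / total_cap) for rg in capacity}
        return {rg: current[rg] - targets[rg] for rg in current}

    print(rebalance_moves(current, capacity))
    # {'rg1': 500, 'rg2': -100, 'rg3': -400}: rg1 donates, rg2 and rg3 receive

Since rg3 starts out empty, the same calculation also covers the pool-expansion case described in the next section.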

C. Distribution of data when expanding the pool

This task is a special case of the previous one and is performed when a RAID group is added to the pool. So that the newly added RAID group does not sit idle, some of the data is moved onto it, which redistributes the load across all the RAID groups (in the sketch above, rg3 plays the role of such a newly added group).

SSD Wear Leveling

FAST VP can also extend the life of SSDs through wear leveling, although this feature is not directly related to storage tiering as such. Since temperature data is already collected, the number of write operations is tracked as well, and the mechanism for moving data blocks is already in place, it is only logical for FAST VP to take on this task too.

If the number of writes to one RAID group significantly exceeds the number of writes to another, FAST VP redistributes the data according to the write counts. On the one hand, this relieves the load and preserves the endurance of some disks; on the other, it gives the less loaded disks more "work", increasing overall performance.
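As an illustration, here is a small sketch of how such an imbalance could be detected and turned into a relocation plan. The threshold of twice the average and the simple "busiest to calmest" rule are my own assumptions, not documented FAST VP behavior.

    # Writes accumulated per SSD RAID group over the statistics window (invented numbers)
    writes = {"ssd_rg1": 900_000, "ssd_rg2": 150_000, "ssd_rg3": 150_000}

    def wear_leveling_plan(writes: dict):
        """If one group takes far more writes than the rest, shift write-heavy
        slices toward the least-written group until the load is comparable."""
        busiest = max(writes, key=writes.get)
        calmest = min(writes, key=writes.get)
        avg = sum(writes.values()) / len(writes)
        if writes[busiest] < 2 * avg:       # imbalance not severe enough to act on
            return None
        surplus = writes[busiest] - avg     # write traffic worth relocating
        return {"from": busiest, "to": calmest, "writes_to_offload": int(surplus)}

    print(wear_leveling_plan(writes))
    # {'from': 'ssd_rg1', 'to': 'ssd_rg2', 'writes_to_offload': 500000}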

Thus, FAST VP takes on the traditional tasks of Storage Tiering and does a little more than that. All this allows you to efficiently store data in the Unity storage system.

A few tips

  1. Don't neglect the documentation. It contains best practices, and they work quite well. If you follow them, serious problems generally do not arise. The remaining tips mostly repeat or supplement them.
  2. If you have configured and enabled FAST VP, leave it enabled. It is better to let it relocate data little by little within the allotted window than to run it once a year and seriously affect the performance of other tasks; in that case relocation can take a very long time.
  3. Choose the relocation window carefully. Obvious as it is, try to pick a time when the load on Unity is lowest and allocate enough of it.
  4. Plan your storage expansion and do it on time. This is a general recommendation, but it matters for FAST VP too: if free space runs very low, data movement slows down or becomes impossible, especially if you neglected tip 2.
  5. When expanding a pool with FAST VP enabled, don't start with the slowest drives. Either add all the planned RAID groups at once, or add the fastest disks first; then redistributing data onto the new "fast" disks will increase the overall speed of the pool. Otherwise, starting with "slow" disks, you can end up in a very unpleasant situation: data is first moved to the new, relatively slow disks and then, when faster ones are added, back in the opposite direction. There are nuances related to the different FAST VP policies, but in general this situation is possible.

If you're considering this product, you can try Unity in action for free by downloading the Unity VSA virtual appliance.


At the end of the article I'll share a few useful links.

Conclusion

There is a lot more I'd like to write about, but I realize not every detail will interest the reader. For example, I could go deeper into the criteria FAST VP uses to decide when to move data, or into how I/O statistics are analyzed. There is also the topic of interaction with Dynamic Pools, which deserves a separate article. One could even speculate about where this technology is headed. I hope it wasn't boring and I didn't tire you out. See you soon!

Source: habr.com
