Why OceanStor Dorado V6 is the fastest and most reliable storage solution

Please don't jump to conclusions because of the title! We have solid arguments to back it up, and we have packed them in as compactly as we could. This post covers the concept and operating principles of our new storage system, which was released in January 2020.

In our opinion, the main competitive advantage of the Dorado V6 storage family is exactly the performance and reliability mentioned in the title. Yes, it really is that simple. Today we will talk about the tricky and not-so-tricky decisions that made this “simple” result possible.

To better show what the new generation is capable of, we will focus on the high-end models of the range (the 8000 and 18000). Unless otherwise stated, these are the models we mean.

A few words about the market

To better understand where Huawei solutions stand in the market, let's turn to a proven yardstick: Gartner's Magic Quadrants. Two years ago, in the general-purpose disk array sector, our company confidently entered the group of leaders, second only to NetApp and Hewlett Packard Enterprise. In the SSD storage market, Huawei held the status of a "challenger" in 2018: something was still missing for a leadership position.

In 2019, Gartner combined both of these sectors into a single study, "Primary Storage". As a result, Huawei was again in the leaders quadrant, next to vendors such as IBM, Hitachi Vantara and Infinidat.

To complete the picture, note that Gartner collects 80% of its data for analysis in the US market, which creates a significant bias in favor of companies that are well represented in the US. Suppliers oriented to European and Asian markets are at an obvious disadvantage. Despite this, last year Huawei products took their rightful place in the upper right quadrant and, according to Gartner's verdict, "may be recommended for use."

What's new in Dorado V6

The Dorado V6 product line starts with the entry-level 3000 series systems. Initially equipped with two controllers, they can be scaled out to 16 controllers, 1200 drives and 192 GB of cache. The system is also equipped with external Fibre Channel (8/16/32 Gb/s) and Ethernet (1/10/25/40/100 Gb/s) ports.

Note that protocols without commercial success are now being phased out, so we decided not to support Fibre Channel over Ethernet (FCoE) and InfiniBand (IB) at launch. They will be added in later firmware versions. Support for NVMe over Fabrics (NVMe-oF) is available out of the box on top of Fibre Channel. The next firmware, scheduled for release in June, is planned to support NVMe over Ethernet. In our opinion, this set will more than cover the needs of most Huawei customers.

File access is not available in the current firmware version and will appear in one of the next updates, towards the end of the year. It is to be implemented natively, by the controllers themselves through their Ethernet ports, without additional equipment.

The main difference between the Dorado V6 3000 series and the higher-end models is that it supports only one back-end protocol, SAS 3.0. Accordingly, only drives with that interface can be used in it. From our point of view, the performance this provides is quite sufficient for a device of this class.

The Dorado V6 5000 and 6000 series systems are mid-range solutions. They also come in a 2U form factor and are equipped with two controllers. They differ from each other in performance, number of processors, maximum number of drives and cache size. In architectural and engineering terms, however, the Dorado V6 5000 and 6000 are identical and look the same.

The high-end class includes the Dorado V6 8000 and 18000 series systems. Built in a 4U form factor, they use a disaggregated architecture by default, in which controllers and drives are housed separately. They can also ship with as few as two controllers, although customers typically ask for four or more.

Dorado V6 8000 scales out to 16 controllers, and Dorado V6 18000 to 32. These systems have different processors with different numbers of cores and cache sizes. The engineering design, however, is identical, as with the mid-range models.

The 2U storage shelves are connected via RDMA with a bandwidth of 100 Gb/s. The back end of the high-end Dorado V6 also supports SAS 3.0, but mostly in case SSDs with this interface drop sharply in price. Using them would then make economic sense even with their lower performance. At the moment, the price difference between SAS and NVMe SSDs is so small that we are not ready to recommend such a solution.

Inside the controller

Dorado V6 controllers are built on our own component base: no processors from Intel, no ASICs from Broadcom. Every component on the motherboard, as well as the motherboard itself, is therefore insulated from the risks associated with US sanctions pressure. Those who have seen any of our equipment in person have probably noticed nameplates with a red stripe under the logo. The stripe means that the product contains no American components. This is Huawei's official course: a transition to components of its own production or, at least, components produced in countries that do not follow US policy.

Here's what you can see on the controller board itself.

  • Universal network interface (Hisilicon 1822 chip), responsible for Fibre Channel or Ethernet connectivity.
  • BMC chip (Hisilicon 1710), which provides full-featured remote management and monitoring of the system. Similar chips are used in our servers and other solutions.
  • Central processing unit: the Kunpeng 920, an ARM-architecture chip manufactured by Huawei. This is the chip shown in the diagram above, although other controllers may use different models with a different number of cores, a different clock speed, and so on. The number of processors per controller also varies from model to model. For example, in the high-end Dorado V6 series there are four on one board.
  • SSD controller (Hisilicon 1812e chip), which supports both SAS and NVMe drives. Huawei also produces its own SSDs but does not manufacture the NAND cells themselves, preferring to buy them from the world's four largest manufacturers as uncut silicon wafers. Huawei does the cutting, testing and packaging into chips itself and then releases them under its own brand.
  • Artificial intelligence chip: the Ascend 310. By default it is not on the controller board; it is installed on a separate card that occupies one of the slots reserved for network adapters. The chip is used for intelligent cache behavior, performance management, and deduplication and compression. All these tasks can be handled by the central processor, but the AI chip does them much more efficiently.

Separately about Kunpeng processors

The Kunpeng processor is a system on a chip (SoC) where, in addition to the computing unit, there are hardware modules that accelerate various processes, such as calculating checksums or executing erasure coding. It also implements hardware support for SAS, Ethernet, DDR4 (from six to eight channels), etc. All this allows Huawei to create storage controllers that are not inferior in performance to classic Intel solutions.

In addition, proprietary solutions based on the ARM architecture enable Huawei to create complete server solutions and offer them to its customers as an alternative to x86.
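To make the erasure coding mentioned above more concrete, here is a minimal sketch of its simplest form, a single XOR parity block. It is only an illustration of the principle: the function names are hypothetical, and the actual coding scheme accelerated by the Kunpeng hardware modules is not described in this post.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data_blocks):
    """Return the data blocks plus one XOR parity block (tolerates one lost block)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stripe, lost_index):
    """Rebuild the block at lost_index from all surviving blocks of the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Hypothetical 4-byte data blocks, for illustration only.
data = [b"DORA", b"DOV6", b"NVMe"]
stripe = encode(data)
rebuilt = recover(stripe, 1)   # pretend the second block was lost
print(rebuilt == data[1])
```

Real arrays use codes that tolerate more than one lost block, which is exactly the kind of arithmetic that benefits from a hardware accelerator.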

New Dorado V6 Architecture…

The internal architecture of the high-end Dorado V6 systems consists of four main subdomains (fabrics).

The first fabric is a shared front end: the network interfaces responsible for communicating with the SAN fabric or with hosts.

The second is a set of controllers, each of which can reach, over RDMA, both any front-end network card and the neighboring “engine”. An engine is an enclosure with four controllers that share power and cooling units. High-end Dorado V6 models can currently be equipped with two such "engines" (eight controllers in total).

The third fabric is responsible for the back end and consists of 100G RDMA network cards.

Finally, the fourth fabric is represented "in hardware" by plug-in intelligent storage shelves.

This symmetrical structure unleashes the full potential of NVMe technology and ensures high performance and reliability. I/O processing is parallelized as much as possible across processors and cores, so reads and writes run simultaneously in many threads.

…and what it gave us

The maximum performance of Dorado V6 solutions is approximately three times higher than that of previous generation systems (of the same class) and can reach 20 million IOPS.

The reason is that in the previous generation of devices, NVMe support extended only to the drive shelves. Now it is present at every stage, from the host to the SSD. The back-end network has also changed: SAS/PCIe has given way to RoCEv2 with a throughput of 100 Gb/s.

The SSD form factor has also changed. Where a 2U shelf used to hold 25 drives, it now holds 36 palm-sized physical drives. In addition, the shelves have become smarter: each now has a fault-tolerant system of two controllers based on ARM chips, similar to those installed in the central controllers.

So far, the shelf controllers only handle data reorganization, but with the release of new firmware, compression and erasure coding will be added, which will reduce the load on the main controllers from 15% to 5%. Moving some tasks to the shelf also frees up bandwidth on the internal network. All this significantly increases the system's scalability potential.

In the previous-generation storage system, compression and deduplication worked with fixed-length blocks. Now a variable-length block mode has been added, which for the time being has to be enabled manually. Subsequent updates may change this.
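The practical difference between fixed-length and variable-length blocks shows up when data shifts. Below is a minimal sketch under stated assumptions: it uses generic content-defined chunking with a windowed hash, and all names, sizes and the hash choice are hypothetical. The actual Dorado V6 algorithm is not described in this post.

```python
import hashlib
import random

def fixed_chunks(data, size=64):
    """Cut data into fixed-length blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data, window=16, mask=0x3F, min_size=16):
    """Cut where a hash of the last `window` bytes matches a bit pattern.

    min_size must be at least `window` so the hashed slice never starts
    before the current chunk.
    """
    chunks, start = [], 0
    for end in range(1, len(data) + 1):
        if end - start < min_size:
            continue
        digest = hashlib.blake2b(data[end - window:end], digest_size=4).digest()
        if int.from_bytes(digest, "big") & mask == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def dedup_ratio(old_chunks, new_chunks):
    """Share of new chunks whose fingerprint already exists among the old ones."""
    known = {hashlib.sha256(c).digest() for c in old_chunks}
    hits = sum(hashlib.sha256(c).digest() in known for c in new_chunks)
    return hits / len(new_chunks)

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
shifted = b"X" + original  # the same data with one byte inserted at the front

fixed_ratio = dedup_ratio(fixed_chunks(original), fixed_chunks(shifted))
variable_ratio = dedup_ratio(variable_chunks(original), variable_chunks(shifted))
print(f"fixed: {fixed_ratio:.2f}, variable: {variable_ratio:.2f}")
```

With fixed blocks, a single inserted byte moves every block boundary, so nothing deduplicates against the old copy. With content-defined boundaries, the cut points follow the data, and most chunks are recognized again.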

A few words about fault tolerance. Dorado V3 remained operational if one of its two controllers failed. Dorado V6 keeps data available even if seven out of eight controllers fail one after another, or if all four controllers in one engine fail simultaneously.

Reliability in terms of economics

Recently, we surveyed Huawei customers about how much downtime they consider acceptable for individual elements of the IT infrastructure. For the most part, respondents were tolerant of a hypothetical situation in which an application does not respond for a few hundred seconds. For the operating system or a host bus adapter, tens of seconds (essentially the reboot time) were already critical. Customers place even higher demands on the network: its bandwidth should not disappear for more than 10–20 seconds. As you might guess, respondents considered storage system failures the most critical of all. From the business point of view, storage downtime should not exceed ... a few seconds a year!

In other words, if a bank's client application does not respond for 100 seconds, this most likely will not have catastrophic consequences. But if the storage system is down for the same length of time, a business stoppage and significant financial losses are likely.
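To put "a few seconds a year" into perspective, the arithmetic can be sketched as follows. The downtime budgets and the hourly cost in the example are hypothetical values chosen for illustration, not figures from the survey.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds

def availability_percent(downtime_seconds_per_year):
    """Availability implied by a yearly downtime budget, in percent."""
    return 100 * (1 - downtime_seconds_per_year / SECONDS_PER_YEAR)

def downtime_cost(cost_per_hour, downtime_seconds):
    """Money lost during an outage at a given hourly cost of downtime."""
    return cost_per_hour * downtime_seconds / 3600

# Hypothetical budgets: 3 s/year for storage, 100 s/year for an application.
for budget in (3, 100):
    print(f"{budget:>4} s/year -> {availability_percent(budget):.6f}% availability")

# Hypothetical hourly cost of downtime, in any currency.
print(downtime_cost(cost_per_hour=1_000_000, downtime_seconds=100))
```

A budget of about three seconds a year corresponds to roughly "seven nines" of availability, which is why failover time matters so much in the sections that follow.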

The chart above shows the cost of one hour of operation for the ten largest banks (Forbes data for 2017). You will agree that if your company is approaching the size of the Chinese banks, it is not hard to justify buying storage systems worth several million dollars. The converse is also true: a business that does not incur significant losses during downtime is unlikely to buy high-end storage. In any case, it helps to have an idea of how big a hole threatens to form in your wallet while the system administrator deals with a storage system that has stopped working.

One second for failover

In the illustration above, you can recognize Solution A as our previous-generation Dorado V3 system. Its four controllers work in pairs, and only two controllers hold copies of the cache. Controllers within a pair can redistribute the load between them. As you can see, there are no front-end and back-end fabrics here, so each storage shelf is connected to a specific controller pair.

The Solution B diagram shows a system currently on the market from another vendor (recognize it?). It already has front-end and back-end fabrics, and the drives are connected to four controllers at once. However, the system's internal algorithms have nuances that are not obvious at first glance.

On the right is our current Dorado V6 storage architecture with the full set of internals. Consider how these systems survive a typical situation - the failure of one controller.

In classic systems, which include Dorado V3, redistributing the load after a failure takes up to four seconds. During this time, I/O stops completely. Solution B from our colleagues, despite its more modern architecture, has an even longer failover downtime: six seconds.

Dorado V6 resumes operation just one second after a failure. This is achieved thanks to a homogeneous internal RDMA environment that lets a controller access "foreign" memory. The second important factor is the front-end fabric, thanks to which the path for the host does not change. The port stays the same, and the multipathing drivers simply send the load to the healthy controllers.

Dorado V6 handles the failure of a second controller in one second by the same scheme. Dorado V3 takes about six seconds, and the other vendor's solution takes nine. For many DBMSs, such intervals are no longer acceptable, because during this time the system switches to standby mode and stops working. This applies above all to DBMSs consisting of many partitions.

Solution A cannot survive the failure of a third controller, simply because access to part of the data disks is lost. Solution B does recover in this situation, which takes nine seconds, as in the previous case.

And Dorado V6? One second.
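The failover figures quoted in this section can be collected into a small table and summed. The sketch below uses only the numbers given above; the dictionary layout and function name are our own illustration.

```python
# Failover time in seconds for the 1st, 2nd and 3rd controller failure.
# None means the system does not survive that failure.
FAILOVER_SECONDS = {
    "Solution A (Dorado V3)": [4, 6, None],
    "Solution B": [6, 9, 9],
    "Dorado V6": [1, 1, 1],
}

def cumulative_stall(times):
    """Total I/O stall over successive failures, or None if one is fatal."""
    total = 0
    for seconds in times:
        if seconds is None:
            return None
        total += seconds
    return total

for name, times in FAILOVER_SECONDS.items():
    stall = cumulative_stall(times)
    result = "does not survive" if stall is None else f"{stall} s of stalled I/O"
    print(f"{name}: {result}")
```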

What can be done in a second

Almost nothing, but we don't need to. Once again: in the high-end Dorado V6, the front-end fabric is decoupled from the controller fabric. This means there are no hard-wired ports belonging to a specific controller. Failover does not involve searching for alternative paths or reinitializing multipathing. The system simply keeps working as before.

Multiple failure tolerance

The high-end Dorado V6 models can easily survive the simultaneous failure of any two (!) controllers in any of the “engines”. This is possible because the solution now keeps three copies of the cache. So even with a double failure, one complete copy always remains.

A simultaneous failure of all four controllers in one of the "engines" will not be fatal either, since at any given time the three copies of the cache are distributed across the "engines". The system itself enforces this placement rule.

Finally, a very unlikely scenario is the sequential failure of seven out of eight controllers. For the system to stay operational, the individual failures must be at least 15 minutes apart. During this time, the storage system manages to perform the operations needed to migrate the cache.

The last surviving controller will keep the storage running and maintain the cache for five days (the default value, which can easily be changed in the settings). After that, the cache will be disabled, but the storage system will continue to work.
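The double-failure and whole-engine cases described in this section can be checked by brute force. The sketch below assumes a simplified model: eight controllers in two engines, and three cache copies on distinct controllers that never all sit in the same engine. The placement rule is our reading of the text, and all names are hypothetical.

```python
from itertools import combinations

CONTROLLERS_PER_ENGINE = 4
CONTROLLERS = range(8)  # two engines with four controllers each

def engine_of(controller):
    """Index of the engine that houses a controller."""
    return controller // CONTROLLERS_PER_ENGINE

def is_valid_placement(copies):
    """Three copies on distinct controllers, not all in the same engine."""
    return len(set(copies)) == 3 and len({engine_of(c) for c in copies}) > 1

def cache_survives(copies, failed):
    """True if at least one cache copy sits on a healthy controller."""
    return any(c not in failed for c in copies)

placements = [p for p in combinations(CONTROLLERS, 3) if is_valid_placement(p)]
double_failures = [set(pair) for pair in combinations(CONTROLLERS, 2)]
engine_failures = [{c for c in CONTROLLERS if engine_of(c) == e} for e in (0, 1)]

all_ok = all(
    cache_survives(p, failed)
    for p in placements
    for failed in double_failures + engine_failures
)
print(len(placements), all_ok)
```

Dropping the cross-engine rule breaks the guarantee: a placement such as (0, 1, 2) would lose every copy if the first engine failed as a whole.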

Non-disruptive updates

The new Dorado V6 OS lets you update the storage firmware without rebooting the controllers.

The operating system, as in previous solutions, is based on Linux, but many operating processes have been moved from the kernel to user mode. Most functions, such as those responsible for deduplication and compression, are now ordinary daemons running in the background. As a result, updating individual modules does not require replacing the entire operating system. To add support for a new protocol, for example, it is enough to stop the corresponding software module and start a new one.

Of course, the problem of updating the system as a whole remains, because the kernel may contain elements that need updating. But by our observations, those account for less than 6% of the total. This makes it possible to reboot controllers ten times less often than before.

Disaster-tolerant and high-availability (HA/DR) solutions

Out of the box, Dorado V6 is ready for integration into geo-distributed solutions, metro (city-level) clusters and three-data-center configurations.

On the left in the illustration above is a metro cluster, already familiar to many. Two storage systems operate in active/active mode at a distance of up to 100 km from each other. Such an infrastructure, with one or more quorum servers, can be supported by solutions from different companies, including our FusionSphere cloud operating system. The characteristics of the link between the sites are especially important in such projects. All other tasks in our case are handled by the HyperMetro function, which is also available out of the box. Integration is possible over Fibre Channel, as well as over iSCSI in IP networks if the need arises. Dedicated dark fiber is no longer mandatory, since the system can communicate over existing links.

When building such systems, the only hardware requirement for the storage is that ports be allocated for replication. It is enough to purchase a license, start the quorum servers (physical or virtual) and provide IP connectivity to the controllers (10 Mbps, 50 ms).

This architecture can easily be extended to a system with three data centers (see the right side of the illustration). For example, two data centers operate in metro-cluster mode, and a third site, located more than 100 km away, uses asynchronous replication.

Technically, the system supports various business scenarios to be carried out in the event of a large-scale incident.

Survival of a metro cluster with multiple failures

The illustrations above and below also show a classic metro cluster consisting of two storage systems and a quorum server. As you can see, in six out of nine possible multiple-failure scenarios, our infrastructure remains operational.

For example, in the second scenario, where the quorum server fails and synchronization between the sites is lost, the system remains operational because the second site stops working. This behavior is built into the system's algorithms.

Even after three failures, access to information can be maintained if the interval between them is at least 15 seconds.

The usual ace up the sleeve

Recall that Huawei produces not only storage systems but also a full range of network equipment. Whichever storage vendor you choose, if a WDM network is used between the sites, in 90% of cases it will be built on our company's solutions. A logical question arises: why assemble a zoo of systems when all the hardware, guaranteed to be mutually compatible, can be obtained from one vendor?

On the question of performance

Probably no one needs convincing that moving to All-Flash storage can significantly reduce infrastructure maintenance costs, since all routine operations run many times faster. Every supplier of such equipment says so. Meanwhile, many vendors become evasive when it comes to performance degradation with various storage modes enabled.

In our industry, it is common practice to hand over storage systems for trial operation for one or two days. The vendor runs a 20-minute test on an empty system and gets sky-high performance figures. In real operation, hidden pitfalls quickly surface. After a day, the beautiful IOPS values drop by a factor of two or three, and once the storage system is 80% full, they turn out to be even lower. When RAID 5 is enabled instead of RAID 10, another 10-15% is lost, and in metro cluster mode, performance is additionally halved.

None of the above applies to Dorado V6. Our customers can run a performance test over a weekend or at least overnight. Over that time garbage collection makes itself felt, and it also becomes clear how enabling various options, such as snapshots and replication, affects the IOPS achieved.

In Dorado V6, snapshots and parity RAID have almost no effect on performance (3-5% instead of 10-15%). Garbage collection (filling the drive cells with zeros), compression and deduplication on a storage system that is 80% full will always affect the overall speed of request processing. What makes Dorado V6 interesting is that whatever combination of functions and protection mechanisms you enable, the resulting performance will not fall below 80% of the figure obtained without load.
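As a quick sanity check during a trial, the figures above translate into simple bounds. The sketch below takes a hypothetical unloaded measurement of one million IOPS; the function names and the sample value are assumptions for illustration only.

```python
def performance_floor(unloaded_iops, floor_percent=80):
    """Lowest IOPS expected with all functions enabled (integer math)."""
    return unloaded_iops * floor_percent // 100

def overhead_range(unloaded_iops, low_percent=3, high_percent=5):
    """IOPS range expected with snapshots or parity RAID enabled."""
    worst = unloaded_iops * (100 - high_percent) // 100
    best = unloaded_iops * (100 - low_percent) // 100
    return worst, best

measured = 1_000_000  # hypothetical IOPS measured on an unloaded system
print(performance_floor(measured))
print(overhead_range(measured))
```

If an overnight test with every option enabled lands below the first number, the result is worth raising with the vendor.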

Load balancing

The high performance of Dorado V6 is achieved by balancing at every stage, namely:

  • multipathing;
  • using multiple connections from one host;
  • the presence of a front-end fabric;
  • parallel operation of the storage controllers;
  • load distribution across all drives at the RAID 2.0+ level.

Basically, this is common practice. These days, few people keep all their data on one LUN: everyone tries to have eight, forty, or even more. This is an obvious and correct approach, and we share it. But if your task calls for just one LUN, which is easier to maintain, our architecture lets it reach 80% of the performance available with multiple LUNs.

Dynamic CPU scheduling

When a single LUN is used, the load is distributed across processors as follows: tasks at the LUN level are divided into small “shards”, each of which is rigidly assigned to a specific controller in the “engine”. This is done so that the system does not lose performance by bouncing a piece of data between different controllers.
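The idea of pinning shards to controllers can be sketched as a fixed mapping from a logical block address to a controller. The shard size, the modulo rule and all names below are hypothetical; the real mapping used by Dorado V6 is not described in this post.

```python
from collections import Counter

SHARD_SIZE_BLOCKS = 2048      # hypothetical shard size, in logical blocks
CONTROLLERS_PER_ENGINE = 4    # one "engine" holds four controllers

def shard_of(lba):
    """Shard that a logical block address (LBA) falls into."""
    return lba // SHARD_SIZE_BLOCKS

def controller_of(lba):
    """Controller that owns the shard; the same LBA always maps to the same one."""
    return shard_of(lba) % CONTROLLERS_PER_ENGINE

# A sequential scan over eight shards of a single LUN, one request per 512 blocks.
requests = range(0, 8 * SHARD_SIZE_BLOCKS, 512)
load = Counter(controller_of(lba) for lba in requests)
print(sorted(load.items()))
```

Because the mapping is deterministic, a given piece of data is always served by the same controller, while a single LUN still spreads its work over the whole engine.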

Another mechanism for maintaining high performance is dynamic scheduling, in which processor cores can be reassigned between pools of tasks. For example, if the system currently has little deduplication and compression work, some of those cores can be put to work servicing I/O, or vice versa. All this happens automatically and transparently to the user.
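A minimal sketch of such scheduling is a proportional split of cores between task pools according to current demand. The pool names, the demand metric and the rounding rule are assumptions made for illustration, not the scheduler used in Dorado V6.

```python
def allocate_cores(total_cores, demand, min_per_pool=1):
    """Split cores between pools in proportion to demand.

    Each pool keeps at least `min_per_pool` cores; cores left over after
    rounding down go to the pools with the largest fractional share.
    """
    pools = list(demand)
    spare = total_cores - min_per_pool * len(pools)
    if spare < 0:
        raise ValueError("not enough cores for the minimum allocation")
    total_demand = sum(demand.values())
    if total_demand == 0:
        shares = {p: spare / len(pools) for p in pools}
    else:
        shares = {p: spare * demand[p] / total_demand for p in pools}
    alloc = {p: min_per_pool + int(shares[p]) for p in pools}
    leftover = total_cores - sum(alloc.values())
    by_fraction = sorted(pools, key=lambda p: shares[p] - int(shares[p]), reverse=True)
    for pool in by_fraction[:leftover]:
        alloc[pool] += 1
    return alloc

# Hypothetical demand figures: I/O is busy, data reduction is nearly idle.
print(allocate_cores(48, {"io": 90, "data_reduction": 10}))
# The opposite situation: cores flow back to deduplication and compression.
print(allocate_cores(48, {"io": 10, "data_reduction": 90}))
```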

The current load of each Dorado V6 core is not shown in the graphical interface, but from the command line you can access the controller OS and use the usual Linux top command.

NVMe and RoCE support

As already mentioned, Dorado V6 currently supports NVMe over Fibre Channel out of the box, with no licenses required. Support for NVMe over Ethernet will appear in the middle of the year. To use it fully, RDMA over Converged Ethernet (RoCE) v2 must be supported by the storage system itself as well as by the switches and network adapters, for example Mellanox ConnectX-4 or ConnectX-5. You can also use network cards based on our chips. RoCE support must also be implemented at the operating system level.

Overall, we consider the Dorado V6 an NVMe-centric system. Although Fibre Channel and iSCSI are supported today, the plan for the future is to move to high-speed Ethernet with RDMA.

A pinch of marketing

Because the Dorado V6 system is highly fault-tolerant, scales well, supports various migration technologies and so on, the economic effect of buying it becomes apparent once the storage is in intensive use. We will keep trying to make ownership of the system as profitable as possible, even if this is not evident at the first stage.

In particular, we have created the FLASH EVER program, which extends the life cycle of storage systems and is designed to take as much of the burden as possible off the customer during upgrades.

This program includes a number of measures:

  • the ability to gradually replace controllers and disk shelves with new versions without replacing all of the equipment (for Dorado V6 high-end systems);
  • the possibility of federated storage (combining different versions of Dorado as part of one hybrid storage cluster);
  • smart virtualization (the ability to use third party hardware as part of the Dorado solution).

It remains to note that the difficult situation in the world has had little effect on the commercial prospects of the new system. Although the official release of Dorado V6 took place only in January, we see significant demand for it in China, as well as great interest from Russian and international partners in the financial and government sectors.

Among other things, with the pandemic, however long it lasts, providing remote employees with virtual desktops has become an especially pressing issue. Dorado V6 could resolve many questions here as well. To that end, we are doing everything necessary; among other things, the inclusion of the new system in the VMware compatibility list is almost agreed.

***

By the way, don't forget about our numerous webinars held not only in the Russian-speaking segment, but also at the global level. The list of webinars for April is available at link.

Source: habr.com
