RAID arrays on NVMe

In this article, we will talk about different ways to organize RAID arrays, and also show one of the first hardware RAID controllers with NVMe support.

Most applications of RAID technology are found in the server segment. In the client segment, it is almost always a software RAID0 or RAID1 array on two disks.

This article will provide a brief overview of RAID technology, a short tutorial on how to create RAID arrays using three different tools, and a comparison of the performance of virtual disks using each method.

What is RAID?

Wikipedia gives a comprehensive definition of RAID technology:

RAID (Redundant Array of Independent Disks) is a data virtualization technology that combines multiple physical disk devices into a logical unit to improve fault tolerance and performance.

The configuration of a disk array and the technologies it uses depend on the selected RAID level. RAID levels are standardized in the Common RAID Disk Data Format specification. It describes many RAID levels, but the most common are RAID0, RAID1, RAID5, and RAID6.

RAID0, or Stripes, is a RAID level that combines two or more physical disks into one logical disk. The volume of the logical disk is equal to the sum of the volumes of the physical disks included in the array. There is no redundancy in this RAID level, and the failure of one disk can result in the loss of all data in the virtual disk.

RAID1, or Mirror, creates identical copies of data on two or more disks. The volume of the virtual disk does not exceed the volume of the smallest physical disk. Data on a RAID1 virtual disk remains available as long as at least one physical disk in the array is operational. RAID1 adds redundancy, but it is quite expensive: in an array of two or more disks, only the capacity of a single disk is usable.

RAID5 addresses the cost problem. A minimum of three drives is required to create a RAID5 array, and the array tolerates the failure of a single drive. Data in RAID5 is stored in blocks together with checksums. There is no strict division into data disks and checksum disks. Checksums in RAID5 are the result of an XOR operation applied to N-1 blocks, each taken from a different disk.
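
As a toy illustration (not the on-disk format), the parity arithmetic can be shown in a few lines of shell; the byte values below are arbitrary:

d1=$(( 0xA5 )); d2=$(( 0x3C )); d3=$(( 0x5F ))        # three data blocks, one byte each
p=$(( d1 ^ d2 ^ d3 ))                                  # parity block = XOR of the data blocks
printf 'parity: 0x%02X\n' "$p"
printf 'recovered d2: 0x%02X\n' "$(( d1 ^ d3 ^ p ))"   # if the disk holding d2 fails, XOR of the remaining blocks restores it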

Although RAID arrays provide redundancy and increase fault tolerance, they are not suitable for storing backups.

After this brief tour of RAID array types, we can move on to the devices and programs that allow you to assemble and use disk arrays.

Types of RAID controllers

There are two ways to create and use RAID arrays: hardware and software. We will consider the following solutions:

  • Linux Software RAID.
  • Intel® Virtual RAID On CPU.
  • LSI MegaRAID 9460-8i.

Note that the Intel® solution runs on the chipset, which raises the question of whether it is a hardware or a software solution. For example, the VMware ESXi hypervisor considers VROC to be software RAID and does not officially support it.

Linux Software RAID

Software RAID arrays in the Linux OS family are a fairly common solution in both the client and server segments. All you need to create an array is the mdadm utility and a few block devices. The only requirement Linux Software RAID imposes on the drives is that they be block devices accessible to the system.

The absence of hardware and software costs is an obvious advantage of this method. Linux Software RAID organizes disk arrays at the cost of CPU time. The list of supported RAID levels and the status of current disk arrays can be viewed in the mdstat file, which is located in the procfs root:

root@grindelwald:~# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid10] 
unused devices: <none>

Support for RAID levels is added by connecting the appropriate kernel module, for example:

root@grindelwald:~# modprobe raid456
root@grindelwald:~# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
unused devices: <none>

All operations with disk arrays are performed through the mdadm command line utility. The disk array is assembled in one command:

mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1

After executing this command, the block device /dev/md0 will appear in the system, representing the virtual disk assembled from the physical drives.
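
A few follow-up steps are usually worth doing; a minimal sketch, assuming the Debian/Ubuntu location of mdadm.conf and an arbitrary mount point:

mdadm --detail /dev/md0                            # inspect the new array and its sync status
cat /proc/mdstat
mkfs.ext4 /dev/md0                                 # create a filesystem on the virtual disk and mount it
mount /dev/md0 /mnt
mdadm --detail --scan >> /etc/mdadm/mdadm.conf     # persist the array so it assembles on boot
update-initramfs -u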

Intel® Virtual RAID On CPU

Intel® VROC Standard Hardware Key
Intel® Virtual RAID On CPU (VROC) is a hardware/software technology for creating RAID arrays based on Intel® chipsets. This technology is available primarily on motherboards that support Intel® Xeon® Scalable processors. VROC is not available by default. To activate it, you need to install a VROC hardware license key.

The standard VROC license allows you to create disk arrays with 0, 1 and 10 RAID levels. The premium version expands this list with support for RAID5.

Intel® VROC technology in modern motherboards works in conjunction with the Intel® Volume Management Device (VMD), which provides hot-swap capability for NVMe drives.
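
If you want to check from the OS whether VMD is enabled, one way (a sketch; the exact device description can vary by platform and kernel) is to look for the VMD domains in lspci:

lspci | grep -i "Volume Management Device"   # VMD domains, when enabled in the BIOS, appear as PCI devices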

Intel® VROC Standard License

Arrays are configured through the Setup Utility when the server boots. An Intel® Virtual RAID on CPU item appears on the Advanced tab, allowing you to configure disk arrays.

Creating a RAID1 array on two drives
Intel® VROC Technology has a trick up its sleeve. Disk arrays built with VROC are compatible with Linux Software RAID. This means that arrays can be monitored in /proc/mdstat and administered through mdadm. This "feature" is officially supported by Intel. After assembling RAID1 in the Setup Utility, you can observe the synchronization of drives in the OS:

root@grindelwald:~# cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md126 : active raid1 nvme2n1[1] nvme1n1[0]
      1855832064 blocks super external:/md127/0 [2/2] [UU]
      [>....................]  resync =  1.3% (24207232/1855832064) finish=148.2min speed=205933K/sec
      
md127 : inactive nvme1n1[1](S) nvme2n1[0](S)
      10402 blocks super external:imsm
       
unused devices: <none>

Note that you can't assemble arrays on VROC via mdadm (the assembled arrays will be Linux SW RAID), but you can change disks in them and disassemble arrays.
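
For example, the array built in the Setup Utility can be inspected and its members swapped with the usual mdadm commands; a sketch using the device names from the listing above (the replacement drive /dev/nvme3n1 is hypothetical):

mdadm --detail-platform                                               # IMSM/VROC capabilities of the platform
mdadm --detail /dev/md126                                             # state of the RAID1 volume
mdadm --manage /dev/md126 --fail /dev/nvme1n1 --remove /dev/nvme1n1   # retire a member disk
mdadm --manage /dev/md127 --add /dev/nvme3n1                          # with IMSM metadata, spares go into the container
mdadm --stop /dev/md126                                               # disassemble: stop the volume, then the container
mdadm --stop /dev/md127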

LSI MegaRAID 9460-8i

External view of the LSI MegaRAID 9460-8i controller
The RAID controller is a standalone hardware solution that works only with drives connected directly to it. This controller supports up to 24 NVMe drives, and it is NVMe support that distinguishes it from many others.

Main menu of the hardware controller
When using UEFI mode, controller settings are integrated into the Setup Utility. Compared to VROC, the hardware controller menu looks much more complicated.

Creating a RAID1 on two drives
Explaining how to configure disk arrays on a hardware controller is a fairly deep topic and could be the subject of a full-fledged article. Here we will limit ourselves to creating RAID0 and RAID1 with default settings.

Drives connected to a hardware controller are not visible to the operating system. Instead, the controller "masks" all RAID arrays as SAS drives. Drives connected to the controller, but not included in the disk array, will not be available to the OS.

root@grindelwald:~# smartctl -i /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-48-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               AVAGO
Product:              MR9460-8i
Revision:             5.14
Compliance:           SPC-3
User Capacity:        1,999,844,147,200 bytes [1.99 TB]
Logical block size:   512 bytes
Rotation Rate:        Solid State Device
Logical Unit id:      0x000000000000000000000000000000
Serial number:        00000000000000000000000000000000
Device type:          disk
Local Time is:        Sun Oct 11 16:27:59 2020 MSK
SMART support is:     Unavailable - device lacks SMART capability.

Despite masquerading as SAS drives, NVMe arrays run at PCIe speeds. On the other hand, this masquerade allows booting from NVMe in Legacy mode.
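
This masquerade is easy to observe from the OS; a quick check might look like this (a sketch: the storcli utility and the controller index /c0 are assumptions about the setup, not something shown in the article):

lsblk -d -o NAME,SIZE,MODEL   # the virtual disk shows up as a SAS-like /dev/sdX with the controller model
nvme list                     # only the NVMe drives connected to the motherboard are listed here
storcli /c0 show              # drives behind the controller are managed through the vendor utility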

Test stand

Each method of organizing disk arrays has its own pros and cons. But is there a difference in performance when working with the resulting disk arrays?

To achieve maximum fairness, all tests will be conducted on the same server. Its configuration:

  • 2x Intel® Xeon® 6240;
  • 12x DDR4-2666 16GB;
  • LSI MegaRAID 9460-8i;
  • Intel® VROC Standard Hardware Key;
  • 4x Intel® SSD DC P4510 U.2 2TB;
  • 1x Samsung 970 EVO Plus M.2 500GB.

The drives under test are the P4510s, half of which are connected to the motherboard and the other half to the RAID controller. Ubuntu 20.04 is installed on the M.2 drive, and the tests are run with fio version 3.16.

Testing

First of all, let's check the latency when working with the disk. The test is executed in a single thread with a block size of 4 KB. Each test lasts 5 minutes. Before starting, the I/O scheduler for the corresponding block device is set to none. The fio command looks like this:

fio --name=test --blocksize=4k --direct=1 --buffered=0 --ioengine=libaio  --iodepth=1 --loops=1000 --runtime=300  --rw=<mode> --filename=<blkdev>
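
Switching the scheduler mentioned above is done through sysfs; a sketch, with the device name taken as an example from this test rig:

cat /sys/block/nvme1n1/queue/scheduler           # show the current scheduler, e.g. [mq-deadline] none
echo none > /sys/block/nvme1n1/queue/scheduler   # disable the I/O scheduler for the test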

From the fio results we take clat 99.00%. The results are shown in the table below.

                        Random read, µs    Random write, µs
Disk                    112                78
Linux SW RAID, RAID0    113                45
VROC, RAID0             112                46
LSI, RAID0              122                63
Linux SW RAID, RAID1    113                48
VROC, RAID1             113                45
LSI, RAID1              128                89

In addition to access latency, we want to see the performance of the virtual drives and compare it with that of a physical disk. The fio command:

fio --name=test --blocksize=4k --direct=1 --buffered=0 --ioengine=libaio  --loops=1000 --runtime=300  --iodepth=<threads> --rw=<mode> --filename=<blkdev>
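
The IOPS figures can also be taken straight from fio's JSON output instead of the human-readable report; a sketch (the target /dev/md0 and the use of jq are illustrative assumptions):

fio --name=test --blocksize=4k --direct=1 --buffered=0 --ioengine=libaio --loops=1000 --runtime=300 --iodepth=128 --rw=randread --filename=/dev/md0 --output-format=json --output=result.json
jq '.jobs[0].read.iops' result.json   # reported read IOPS for the job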

Performance is measured in I/O operations per second (IOPS). The results are presented in the table below.

                        Random read       Random write      Random read         Random write
                        1 thread, IOPS    1 thread, IOPS    128 threads, IOPS   128 threads, IOPS
Disk                    11300             40700             453000              105000
Linux SW RAID, RAID0    11200             52000             429000              232000
VROC, RAID0             11200             52300             441000              162000
LSI, RAID0              10900             44200             311000              160000
Linux SW RAID, RAID1    10000             48600             395000              147000
VROC, RAID1             10000             54400             378000              244000
LSI, RAID1              11000             34300             229000              248000

It is easy to see that using the hardware controller increases latency and reduces performance compared with the software solutions.

Conclusion

Using hardware solutions to create disk arrays from two disks looks irrational. However, there are tasks where the use of RAID controllers is justified. With the advent of NVMe-enabled controllers, users have the opportunity to use faster SSDs in their projects.


Source: habr.com
