How HPE SimpliVity 380 for VDI will work: hard load tests


The customer wanted VDI and was looking closely at the SimpliVity plus Citrix Virtual Desktops bundle, for all operators, office employees in different cities and so on. The first migration wave alone covers five thousand users, so the customer insisted on load testing. VDI can start to slow down, or it can quietly go down altogether, and the cause is not always the network link. We bought a very powerful testing package built specifically for VDI and loaded the infrastructure until it gave out on the disks and on the CPU.

So, we will need a plastic bottle and LoginVSI, software for sophisticated VDI tests. We have it with licenses for 300 users. Then we took HPE SimpliVity 380 hardware in a configuration suited to maximum user density per server, carved out virtual machines with generous oversubscription, installed office software on them under Windows 10 and started testing.

Let's go!

System

Two HPE SimpliVity 380 Gen10 nodes (servers), each with:

  • CPU: 2 x Intel Xeon Platinum 8170, 26 cores, 2.1 GHz.
  • RAM: 768 GB, 12 x 64 GB LRDIMM DDR4 2666 MHz.
  • Primary disk controller: HPE Smart Array P816i-a SR Gen10.
  • Drives: 9 x 1.92 TB SATA 6 Gb/s SSD (in a RAID6 7+2 configuration, i.e. the Medium model in HPE SimpliVity terms).
  • NICs: 4 x 1Gb Ethernet (user data), 2 x 10Gb Ethernet (SimpliVity backend and vMotion).
  • A dedicated built-in FPGA card in each node for deduplication and compression.

The nodes are connected directly to each other over 10Gb Ethernet, without an external switch. This interconnect serves as the SimpliVity backend and carries virtual machine data over NFS. Virtual machine data in the cluster is always mirrored between the two nodes.

The nodes are combined into a VMware vSphere cluster managed by vCenter.

A domain controller and a Citrix Connection Broker were deployed for testing. The domain controller, the broker and vCenter are placed on a separate cluster.

For the test infrastructure, 300 virtual desktops were deployed in the Dedicated - Full Copy configuration, i.e. each desktop is a complete copy of the original virtual machine image and keeps all changes made by its user.

Each virtual machine has 2 vCPU and 4 GB RAM.

The following software required for testing was installed on the virtual machines:

  • Windows 10 (64-bit), version 1809.
  • Adobe Reader XI.
  • Citrix Virtual Delivery Agent 1811.1.
  • Doro PDF 1.82.
  • Java 7 Update 13.
  • Microsoft Office Professional Plus 2016.

Replication between the nodes is synchronous. Every block of data in the cluster has two copies, so in this two-node setup each node holds a complete set of the data. In a cluster of three or more nodes, the two copies of a block are kept in two different places. When a new VM is created, an additional copy of it is created on one of the cluster nodes. When a node fails, all VMs that were running on it are automatically restarted on the other nodes that hold their replicas. If a node stays out of service for a long time, redundancy is gradually rebuilt and the cluster returns to N+1 redundancy.

Data balancing and storage are handled by SimpliVity's own software-defined storage layer.

The virtual machines are run by the virtualization cluster, which also places them on the software-defined storage. The desktops themselves were built from standard templates: the desktops of finance staff and of operators were brought in for the test (these are two different templates).

The test

Testing used the LoginVSI 4.1 test suite. The LoginVSI setup, consisting of a control server and 12 machines for test connections, was deployed on a separate physical host.

Testing covered three load options in two modes:

  • Benchmark mode: 300 Knowledge workers and 300 Storage workers.
  • Standard mode: 300 Power workers.

To run the Power workers workload and make the load more varied, the additional Power Library file set was added to the LoginVSI setup. To keep the results repeatable, all test bench settings were left at their defaults.

The Knowledge and Power workers tests simulate the real workload of users working on virtual workstations.

The Storage workers test was created specifically for testing storage systems. It is far from real-world load and consists mostly of a user working with a large number of files of different sizes.

During testing, users log on to the workstations over the course of 48 minutes, approximately one user every 10 seconds.
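The "one user every 10 seconds" figure follows directly from the ramp-up window and the session count. A minimal sketch of that arithmetic, assuming logons are spread evenly over the window (the function name is made up for illustration):

```python
def logon_interval_seconds(ramp_minutes: float, sessions: int) -> float:
    """Average gap between consecutive logons for an evenly spread ramp-up."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return ramp_minutes * 60 / sessions


# 48-minute ramp-up and 300 sessions, as in the test described above
interval = logon_interval_seconds(ramp_minutes=48, sessions=300)
print(f"{interval:.1f} s between logons")  # 9.6 s between logons
```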

The results

The main result of a LoginVSI test is the VSImax metric, which is built from the execution times of various tasks launched by the simulated user, for example the time to open a file in Notepad or to compress a file in 7-Zip.

A detailed description of how the metrics are calculated is available in the official documentation.

In other words, LoginVSI replays a typical load pattern, simulating user actions in an office suite, reading a PDF and so on, and measures the resulting delays. There is a critical delay level ("everything slows down, it's impossible to work"), and until it is reached the maximum number of users is considered not to have been reached. If the response time is even 1 ms below this "everything slows down" level, the system is considered to be working normally and more users can be added.

Here are the main metrics:

| Metric | Action measured | Detailed description | Components loaded |
|---|---|---|---|
| NSLD | Time to open a 1 KB text file | Starts Notepad and opens a random 1 KB document copied from the resource pool | CPU and I/O |
| NFO | Time to open the file dialog in Notepad | Opening a file in VSI-Notepad [Ctrl+O] | CPU, RAM and I/O |
| ZHC* | Time to create a Zip file with high compression | Local compression of a random 5 MB .pst file copied from the resource pool | CPU and I/O |
| ZLC* | Time to create a Zip file with low compression | Local compression of a random 5 MB .pst file copied from the resource pool | I/O |
| CPU | Time to calculate a large array of random data | Creates a large array of random data to be used in the I/O timer | CPU |

During a test, the baseline metric VSIbase is calculated first. It shows how fast tasks execute with no load on the system. The VSImax Threshold is then derived from it as VSIbase + 1000 ms.

Conclusions about system performance are made on the basis of two metrics: VSIbase, which determines the speed of the system, and VSImax threshold, which determines the maximum number of users that the system can withstand without significant degradation.
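The relationship between VSIbase and the threshold can be sketched in a few lines. This is only an illustration: the real VSImax calculation in Login VSI averages and weights response times in its own way, and the function names here are made up. The margin is a parameter, and its default of 1000 ms is an assumption based on the Login VSI 4.1 rule of baseline plus 1000 ms. The sample series is invented and is not data from these tests.

```python
from typing import Optional, Sequence


def vsimax_threshold(vsibase_ms: float, margin_ms: float = 1000.0) -> float:
    """Threshold = baseline response time plus a fixed margin (assumed 1000 ms)."""
    return vsibase_ms + margin_ms


def sessions_at_threshold(
    response_ms: Sequence[float],
    vsibase_ms: float,
    margin_ms: float = 1000.0,
) -> Optional[int]:
    """Return the first session count whose response time exceeds the threshold.

    response_ms[i] is the response time observed with i + 1 active sessions.
    None means the threshold was never reached.
    """
    limit = vsimax_threshold(vsibase_ms, margin_ms)
    for active_sessions, value in enumerate(response_ms, start=1):
        if value > limit:
            return active_sessions
    return None


# Invented response times for six session counts, NOT measurements from the article
samples = [1000, 1200, 1500, 1900, 2100, 2400]
print(vsimax_threshold(986))                           # 1986.0
print(sessions_at_threshold(samples, vsibase_ms=986))  # 5
```

A result of None corresponds to the "VSI Threshold was not reached" outcome reported for most runs below.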

300 knowledge workers benchmark

Knowledge workers are users who regularly load memory, CPU and I/O with various small peaks. The software emulates demanding office users who are constantly poking at something (PDF, Java, office suite, photo viewer, 7-Zip). As users are added from zero to 300, the delay for each of them grows gradually.

VSImax statistics data:

VSIbase = 986 ms, VSI Threshold was not reached.

Storage load statistics from SimpliVity monitoring:

With this type of load, the system handles the growing load with little or no performance degradation. User task execution time grows smoothly, while storage response time does not change during the test and stays at up to 3 ms for writes and up to 1 ms for reads.

Conclusion: 300 Knowledge workers run on the current cluster without problems and without interfering with each other, at a pCPU to vCPU oversubscription of 1 to 6. Total delays grow evenly as the load increases, but the set limit was not reached.

300 storage workers benchmark

These are users who constantly write and read in a 30 to 70 ratio, respectively. This test was run mostly as an experiment.

VSImax statistics data:

VSIbase = 1673 ms, VSI Threshold reached at 240 users.

Storage load statistics from SimpliVity monitoring:

This type of load is essentially a stress test of the storage system. While it runs, each user writes many random files of different sizes to disk. In this case, once a certain load threshold is exceeded, the time to complete file-write tasks increases for some users. At the same time, the load on the storage system and on the hosts' CPU and memory does not change significantly, so at this point it is impossible to say exactly what causes the delays.

Conclusions about system performance from this test can only be drawn by comparing it with results from other systems, since the load is synthetic and unrealistic. Overall, though, the test went well: everything was fine up to 210 sessions, after which unexplained response times appeared that were not visible anywhere except in Login VSI.

300 Power workers

These are users who are heavy on CPU, memory and I/O. Such "power users" routinely run complex tasks with long load peaks, such as installing new software or unpacking large archives.

VSImax statistics data:

VSIbase = 970 ms, VSI Threshold was not reached.

Storage load statistics from SimpliVity monitoring:

During testing, the CPU load threshold was reached on one of the nodes, but this did not significantly affect its operation.

In this case too, the system handles the growing load without significant performance degradation. User task execution time grows smoothly, while storage response time does not change during the test and stays at up to 3 ms for writes and up to 1 ms for reads.

The standard tests were not enough for the customer, so we went further: we increased the VM specs (more vCPUs, to see the effect of higher oversubscription, and a larger disk) and added extra load.

The additional tests used the following bench configuration: 300 virtual desktops, each with 4 vCPU, 4 GB RAM and an 80 GB disk.
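The oversubscription figures quoted in this article (about 1 to 6 for the 2 vCPU desktops and about 1 to 12 for the 4 vCPU ones) can be reproduced from the hardware list. The sketch below is a simplified estimate under stated assumptions: it counts physical cores only (no hyperthreads) and ignores whatever CPU and RAM the hypervisor and SimpliVity itself consume. The function names are made up for illustration.

```python
def physical_cores(nodes: int, sockets_per_node: int, cores_per_socket: int) -> int:
    """Total physical cores in the cluster."""
    return nodes * sockets_per_node * cores_per_socket


def vcpus_per_core(desktops: int, vcpus_per_desktop: int, cores: int) -> float:
    """How many virtual CPUs are provisioned per physical core."""
    return desktops * vcpus_per_desktop / cores


# Two nodes, two 26-core Xeon Platinum 8170 CPUs each
cores = physical_cores(nodes=2, sockets_per_node=2, cores_per_socket=26)
print(cores)  # 104

for vcpus in (2, 4):
    ratio = vcpus_per_core(desktops=300, vcpus_per_desktop=vcpus, cores=cores)
    print(f"{vcpus} vCPU per desktop: {ratio:.1f} vCPU per physical core")
# 2 vCPU per desktop: 5.8 vCPU per physical core
# 4 vCPU per desktop: 11.5 vCPU per physical core

# RAM is not oversubscribed: 300 x 4 GB against 2 x 768 GB
ram_share = 300 * 4 / (2 * 768)
print(f"{ram_share:.0%} of cluster RAM provisioned")  # 78% of cluster RAM provisioned
```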

The machines are deployed in the Dedicated - Full Copy variant.

300 Knowledge workers benchmark, 1:12 oversubscription

VSImax statistics data:

VSIbase = 921ms, VSI Threshold was not reached.

Storage load statistics from SimpliVity monitoring:

The results obtained are similar to testing the previous VM configuration.

300 Power workers, 1:12 oversubscription

VSImax statistics data:

VSIbase = 933 ms, VSI Threshold was not reached.

Storage load statistics from SimpliVity monitoring:

In this test, the CPU load threshold was also reached, but this had no significant impact on performance.

The results obtained are similar to testing the previous configuration.

What happens if you run the load for 10 hours?

Next we check whether there is an "accumulation effect" by running the tests for 10 hours in a row. The point of the long tests was to see whether the farm develops any problems under prolonged load.

300 Knowledge workers benchmark + 10 hours

Additionally, we tested the 300 Knowledge workers load option followed by 10 hours of continued user activity.

VSImax statistics data:

VSIbase = 919ms, VSI Threshold was not reached.

VSImax Detailed statistics data:

The graph shows that during the entire test there is no performance degradation.

Storage load statistics from SimpliVity monitoring:

The performance of the storage system remains at the same level throughout the test.

Additional testing with the addition of a synthetic load

The customer asked us to add a heavy extra load on the disks. To do this, a task was added to each user virtual machine that starts a synthetic disk load when the user logs on. The load was generated with the fio utility, which can cap disk load by number of IOPS. Each machine ran an additional load of 22 IOPS, 70% random read and 30% random write.
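To get a feel for what 22 IOPS per desktop means for the cluster as a whole, the per-session figure can be multiplied out. This is a rough sketch of the intended synthetic load only. It says nothing about what the storage actually served, and the function name is made up for illustration.

```python
def synthetic_load(sessions: int, iops_per_session: int = 22, read_percent: int = 70) -> dict:
    """Aggregate synthetic IOPS across all sessions, split into reads and writes."""
    total = sessions * iops_per_session
    read = total * read_percent // 100  # integer math avoids float rounding noise
    return {"total": total, "read": read, "write": total - read}


# All 300 desktops running the extra load at once
print(synthetic_load(300))  # {'total': 6600, 'read': 4620, 'write': 1980}
```

The same function can be called with a smaller session count to estimate the extra load for a reduced test.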

300 knowledge workers benchmark + 22 IOPS per user

In initial testing, fio turned out to add significant CPU overhead inside the virtual machines. This quickly overloaded the host CPUs and badly affected the system as a whole. Storage latency rose accordingly, and the shortage of compute became critical at about 240 users.

Given these results, it was decided to run a less CPU-intensive test.

230 Office workers benchmark + 22 IOPS per user

To reduce CPU load, the Office workers load type was chosen, again with 22 IOPS of synthetic load added for each session.

The test was limited to 230 sessions so as not to exceed the maximum CPU load.

After logon, the users kept working for 10 hours to check system stability under prolonged operation at close to maximum load.

VSImax statistics data:

VSIbase = 918ms, VSI Threshold was not reached.

VSImax Detailed statistics data:

The graph shows that during the entire test there is no performance degradation.

CPU load statistics: during this test, the host CPUs were loaded almost to the maximum.

Storage load statistics from SimpliVity monitoring:

The performance of the storage system remains at the same level throughout the test.

The load on the storage system during the test was approximately 6,500 IOPS in a 60/40 ratio (about 3,900 IOPS read and, by the same ratio, about 2,600 IOPS write), which is approximately 28 IOPS per workstation across the 230 sessions.

The response time averaged 3 ms for writing and up to 1 ms for reading.

Conclusion

Simulating real-world workloads on the HPE SimpliVity infrastructure confirmed that the system can run at least 300 Full Clone virtual desktops on a pair of SimpliVity nodes. Storage response time stayed at an optimal level throughout the testing.

We are very impressed with the approach of running long tests and comparing solutions before implementation. We can test performance for your workloads too, if you like, including on other hyperconverged solutions. The customer mentioned here is now finishing tests on another solution in parallel. Its current infrastructure is just a fleet of PCs, a domain, and software installed at every workplace. Moving to VDI without tests is, of course, quite difficult. In particular, it is hard to understand the real capabilities of a VDI farm without migrating real users to it. These tests make it possible to quickly assess the real capabilities of a particular system without involving ordinary users. That is how this study came about.

The second important point is that the customer planned for proper scaling from the start. Here you can buy a server and extend the farm by, say, 100 users, and the cost per user stays predictable. For example, when they need to add another 300 users, they will know that this takes two servers in an already defined configuration, rather than a rethink of the whole infrastructure upgrade.
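The "300 more users means two more servers" rule can be turned into a quick sizing estimate. This is a back-of-the-envelope sketch, not a sizing method from HPE or from the author: it assumes the tested density of 300 desktops per pair of nodes scales linearly, that the workload matches the tested profiles, and it adds no headroom for node failures. The function name is made up for illustration.

```python
def nodes_needed(users: int, users_per_pair: int = 300) -> int:
    """Number of nodes, added in pairs, at the tested density of 300 desktops per pair."""
    if users < 0:
        raise ValueError("users must not be negative")
    pairs = -(-users // users_per_pair)  # ceiling division on integers
    return pairs * 2


print(nodes_needed(300))   # 2
print(nodes_needed(5000))  # 34
```

For the five thousand users of the first migration wave this gives 34 nodes, which is only as good as the linear-scaling assumption behind it.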

The capabilities of HPE SimpliVity federation are also interesting. The business is geographically distributed, so it makes sense to put a dedicated VDI box in a remote office. In a SimpliVity federation, each virtual machine is replicated on a schedule, and replication between geographically distant clusters is very fast and puts almost no load on the link. In effect this is a very good built-in backup. Because VM replication between sites uses the link as sparingly as possible, it becomes possible to build very interesting DR architectures with a single control center and a number of decentralized storage sites.

Figure: Federation

Taken together, all this makes it possible to evaluate the financial side in great detail, map VDI costs onto the company's growth plans, and understand how quickly the solution will pay off and how it will behave. That matters because any VDI is a solution that ultimately saves a lot of resources but, most likely, cannot be replaced cost-effectively within 5-7 years of use.

If you have questions that do not fit in the comments, write to me by email: [email protected].

Source: habr.com
