Review: Accelerator card by OCZ for ESX server
Turbocharger for VMs
Manufacturer OCZ advertises VXL, its storage acceleration software, with crowd-pulling arguments: It runs without special agents in applications on any operating system, reduces traffic to and from the SAN by up to 90 percent, and allows up to 10 times as many virtual machines per ESX host as without a cache. The ADMIN test team decided to take a closer look.
The product consists of an OCZ Z-Drive R4 solid state storage card for a PCI Express slot in the virtualization host and the associated software. The software takes the form of a virtual machine that runs on the ESX server you want to benefit from the solution: This component handles the actual caching.
Additionally, a Windows application handles the cache control settings. The virtual machine runs Linux and starts a software tool called VXL. The Windows application (which, incidentally, can also run in a virtual machine) and the VXL virtual machine need access to a separate iSCSI network that connects the storage to be accelerated with the virtualization host (Figure 1). The administrator communicates with the ESX server, its VMs, and the VXL components over a second management network.
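On the ESX side, the dedicated iSCSI network essentially amounts to an extra vSwitch with a port group to which the VXL appliance and the iSCSI uplink connect. Purely as a sketch (the vSwitch, port group, and NIC names here are placeholders of mine, not taken from OCZ's guide), the corresponding esxcli calls on the host would look something like this:

# Sketch only: a dedicated vSwitch and port group for the iSCSI traffic
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic1
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name=iSCSI-VXL

The VXL virtual machine and the Windows control application then each receive a network adapter in this port group, in addition to their connection on the management network.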
Knowing How
I'll start by saying that the software configuration is anything but trivial. The somewhat complex architecture offers many settings for calibrating the caching mechanism, and with this complexity it is very easy to end up with a working, but not optimal, setup that sacrifices a large chunk of the possible acceleration. Letting the vendor help you, at least during the initial setup, is certainly not a bad idea, especially considering that the "VXL and StoragePro Installation and Configuration Guide" contains no explanation of the functionality and no hints about the meaning and purpose – the whys and wherefores – of the configuration steps. It does not explain how the components fit together, provides neither a roadmap nor a target, and does not even fully document the options. The entire guide consists of nothing more than a series of more or less sparsely annotated screenshots. In other words, if you run into a problem en route, you have no orientation: You do not know what you want to achieve, how the components should cooperate, or what features they should have.
Unsuspecting admins are likely to be caught out here, not least because the descriptions are ambiguous – at least in places – and because some of the software components react in unusual and unexpected ways. For example, when you set up the required VXL virtual machine, the configuration script does not apply the default values it suggests when you press Enter, as you would assume. At the same time, it silently filters the user input without giving you any feedback. Unless you hit upon the really wacky idea of writing down exactly what you see on screen, you will probably think that the script is hanging; after all, it apparently does not respond to keystrokes. All of this detracts from the usefulness of the documentation and the user friendliness of the software, at least in the setup phase.
Just Rewards
Once the software is properly configured and customized, which involves some major effort, and you apply a matching workload, the achievable data rates are unquestionably impressive. Thanks to the cache, at least five clients can each read around 100MBps from the iSCSI targets, resulting in an aggregate read rate of 500MBps; without the accelerator card, no more than 100MBps crosses the iSCSI network in any situation (Figure 2).
At 500MBps, the going would start to get tough, even for SATA 3.0 (and even older versions running at 150 and 300MBps would have long since given up), but iSCSI with Gigabit Ethernet (GbE), which you could possibly extend to 10GbE, still has enough room to grow.
For an initial overview, I employed the well-known file copy utility dd:

dd if=./testfile1 of=/dev/null bs=64k iflag=direct

This call sets a block size of 64KB and bypasses the filesystem buffer cache if possible. I then launched the command simultaneously on one, two, three, four, and five virtual machines, once backed by the VXL cache and once without it. In each constellation, the dd copy completed 10 rounds. From the results, I eliminated the smallest and the largest value and computed the arithmetic mean of the remaining eight measurements.
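To keep the rounds and the averaging reproducible, the procedure can be scripted. The following is only a minimal sketch of the method just described – the file name, round count, and the parsing of dd's summary line are my own assumptions, and GNU dd's output format can vary between versions:

#!/bin/bash
# Repeat the dd read test, collect the MB/s figure from each round,
# drop the smallest and largest value, and average the remaining eight.
ROUNDS=10
for i in $(seq 1 "$ROUNDS"); do
  dd if=./testfile1 of=/dev/null bs=64k iflag=direct 2>&1 |
    awk '/copied/ { print $(NF-1) }'
done |
sort -n |
sed '1d;$d' |
awk '{ sum += $1 } END { printf "trimmed mean: %.2f MB/s\n", sum/NR }'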
At Speed
From the outset, the read performance of the unaccelerated volumes (46.6MBps) was less than half that of the accelerated version (97.66MBps) with one VM running. As the number of VMs competing for I/O increased, the performance of the unaccelerated setup dropped off much faster; with five VMs working in parallel, it was less than a third of the rate achieved with the cache. The results are far clearer in a benchmark with IOzone:
iozone -i1 -I -c -e -w -x -s2g -r128k -l1 -u1 -t1 -F testfile
Here, I told IOzone to read a 2GB test file exclusively and using direct I/O. I set the record size to 128KB, with IOzone running in throughput mode using exactly one process. The mean values were computed as described above. Figure 3 shows the corresponding results.
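For reference, this is my reading of the individual switches, based on the standard IOzone documentation rather than anything OCZ ships:

# -i1          run the read/re-read test only
# -I           use direct I/O (O_DIRECT), bypassing the buffer cache
# -c           include close() in the timing
# -e           include flush (fsync/fflush) in the timing
# -w           do not delete the test file afterward
# -x           turn off stone-walling in throughput mode
# -s2g -r128k  2GB file size, 128KB record size
# -l1 -u1 -t1  throughput mode with exactly one process
# -F testfile  name of the test file
iozone -i1 -I -c -e -w -x -s2g -r128k -l1 -u1 -t1 -F testfile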