Data storage benchmarking can be quite esoteric in that vast complexity awaits anyone attempting to get to the heart of a particular benchmark.
Case in point: The Storage Networking Industry Association (SNIA) has developed the Emerald benchmark to measure power consumption. This invaluable benchmark has a vast amount of supporting literature. That so much could be written about one benchmark test tells you just how technical a subject this is. And in SNIA’s defense, it is creating a Quick Reference Guide for Emerald (coming soon).
But rather than getting into the nitty-gritty nuances of the tests, the purpose of this article is to provide a high-level overview of a few basic storage benchmarks, what value they might have and where you can find out more.
Storage vendors are always talking about I/O performance. Iometer is a simple I/O measurement tool that can help evaluate vendor claims. It was developed by Intel in the 1990s and is now maintained by a group of individual contributors organized around SourceForge.net.
“Iometer is popular and easy to use,” said Greg Schulz, an analyst with StorageIO Group.
He laid out how to get it set up and running. Once you have Iometer installed, use the GUI (or a text configuration script file) to indicate which storage device to test and how much space to use (e.g. the number of 512-byte sectors/blocks), then specify the benchmark workload on the Access Specification tab (e.g. reads, writes, random, sequential, I/O size). Next, on the Test Specification tab, indicate how long to run each benchmark workload step (the test sequence) and click the green start button.
“From there you can do additional things such as add more workload steps, change the test duration, specify where results go, increase the number of concurrent IOs, and save your configuration for later use among other things,” said Schulz.
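To make those workload parameters concrete, here is a minimal Python sketch of a synthetic workload driven by the same kinds of settings: I/O size, read/write mix, and random versus sequential access. It is not Iometer and does not use Iometer's configuration format. The names `AccessSpec`, `run_workload` and `demo` are hypothetical, an I/O count stands in for test duration, and because it goes through the operating system's file cache it says little about the underlying device.

```python
import os
import random
import tempfile
import time
from dataclasses import dataclass


@dataclass
class AccessSpec:
    """Hypothetical stand-in for a benchmark workload definition."""
    block_size: int = 4096       # bytes per I/O
    read_pct: int = 70           # percentage of I/Os that are reads
    random_access: bool = True   # False means sequential access
    io_count: int = 1000         # number of I/Os (stand-in for test duration)


def run_workload(path, file_size, spec, seed=0):
    """Issue reads and writes against an existing file and report basic counts."""
    rng = random.Random(seed)
    blocks = file_size // spec.block_size
    buf = os.urandom(spec.block_size)
    reads = writes = 0
    start = time.perf_counter()
    with open(path, "r+b") as f:
        for i in range(spec.io_count):
            # Random access picks any block; sequential walks the file in order.
            block = rng.randrange(blocks) if spec.random_access else i % blocks
            f.seek(block * spec.block_size)
            if rng.randrange(100) < spec.read_pct:
                f.read(spec.block_size)
                reads += 1
            else:
                f.write(buf)
                writes += 1
    elapsed = time.perf_counter() - start
    return {"reads": reads, "writes": writes, "iops": spec.io_count / elapsed}


def demo():
    """Run the default workload against a 1 MiB scratch file in a temp directory."""
    spec = AccessSpec()
    file_size = spec.block_size * 256
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "testfile.bin")
        with open(path, "wb") as f:
            f.write(b"\0" * file_size)
        return run_workload(path, file_size, spec)


if __name__ == "__main__":
    print(demo())
```

Changing `read_pct`, `block_size` or `random_access` and rerunning is the toy equivalent of adding another workload step in a real tool.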
Vdbench is a command line utility that was developed by StorageTek, inherited by Sun and is currently being looked after by Oracle. The idea is to assist engineers in the generation of disk and tape I/O workloads to validate storage performance and data integrity. It can be run on UNIX, Windows, Linux, and OS X. It allows control over parameters such as read vs. write mix, random vs. sequential I/O, data transfer size, and more.
Microsoft Diskspd is a workload generation tool that runs on various Windows systems as an alternative to benchmarks like Iometer and vdbench. It is a command line tool that can be scripted to perform reads and writes of various I/O sizes, including random as well as sequential activity.
“Storage can be a buffered file system as well as non-buffered across different types of storage and interfaces,” said Schulz. “Various performance and CPU usage information is provided to gauge the impact on a system when doing a given number of IOPS, amount of bandwidth along with response time latency.”
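IOPS, bandwidth and latency are related, which helps when sanity-checking the numbers any of these tools report. The short Python sketch below shows two standard relationships: bandwidth is IOPS multiplied by I/O size, and (by Little's Law) average latency is the number of outstanding I/Os divided by IOPS. The function names and the example figures are illustrative assumptions, not output from any real test.

```python
def bandwidth_mb_per_s(iops, io_size_bytes):
    """Bandwidth in decimal megabytes per second for a given IOPS rate and I/O size."""
    return iops * io_size_bytes / 1_000_000


def implied_latency_ms(iops, outstanding_ios):
    """Average latency in milliseconds implied by Little's Law."""
    return outstanding_ios / iops * 1000


if __name__ == "__main__":
    # Hypothetical result: 20,000 IOPS at 8 KiB with 16 I/Os in flight.
    print(bandwidth_mb_per_s(20_000, 8192))
    print(implied_latency_ms(20_000, 16))
```

If a reported bandwidth figure is far from IOPS times I/O size, the I/O size or the measurement interval is probably not what you think it is.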
SNIA Emerald Power Measurement
According to Herbert Tanzer of HP, who is part of the SNIA Emerald Working Group, Emerald basically benchmarks performance per watt of electricity consumed. The uses of Emerald are many and varied. It can be used to support submissions for EPA Energy Star Testing and certification, to profile storage workloads, or by anyone who is curious about energy usage of storage systems along with their corresponding performance activity.
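The idea of performance per watt can be shown with a few lines of Python. This is only a sketch of the basic ratio, assuming you already have a performance figure and a set of power meter readings taken during the same interval. It is not the Emerald test procedure, and the function names and sample values are hypothetical.

```python
def average_power(samples_watts):
    """Mean of power meter readings (in watts) taken during the test interval."""
    if not samples_watts:
        raise ValueError("at least one power sample is required")
    return sum(samples_watts) / len(samples_watts)


def performance_per_watt(performance, avg_watts):
    """Performance (e.g. IOPS or MB/s) divided by average power drawn."""
    if avg_watts <= 0:
        raise ValueError("average power must be positive")
    return performance / avg_watts


if __name__ == "__main__":
    # Hypothetical readings: three power samples and a measured 21,000 IOPS.
    watts = average_power([410.0, 420.0, 430.0])
    print(performance_per_watt(21_000, watts))
```

The performance figure and the power samples need to cover the same time window; mixing a peak IOPS number with idle power readings produces a flattering but meaningless ratio.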
Tanzer recommended that anyone running Emerald on a particular type of storage workload should run some practice and tuning tests, and review results before running the full test.
“The first step is to organize your test environment including applicable servers and software for running the workload scripts,” said Tanzer. “In addition to servers and software, you will need applicable cabling along with monitoring equipment for electrical power and environment temperatures.”
IPERF helps you measure the bandwidth and quality of a network link. Tests can be run for such things as latency, packet loss and jitter on Linux, UNIX and Windows systems. It is good for troubleshooting network-related issues when storage performance is sluggish and you think the network might be to blame. In particular, it tests network throughput using Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) data streams. You can use it to measure throughput from one device to another or bi-directionally.
“If I am benchmarking a NAS technology, I will spend time up front testing the network with IPERF,” said Tom Becchetti, Minneapolis/St. Paul Computer Measurement Group Chair. “IPERF will run an IP network performance test from client memory to the NAS storage device.”
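The principle behind a memory-to-memory TCP throughput test can be sketched in Python using only the standard library. The example below is not IPERF and does not reproduce its options or output. It simply pushes a fixed amount of data from a sender to a receiver over the loopback interface and divides bits received by elapsed time. The function names, payload size and chunk size are assumptions chosen for illustration.

```python
import socket
import threading
import time


def _sink(server_sock, result):
    """Accept one connection and count bytes until the sender closes it."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_tcp_throughput(payload_bytes=4 * 1024 * 1024, chunk_size=65536):
    """Send payload_bytes over loopback TCP; return (bytes received, Mbit/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    chunk = b"\0" * chunk_size
    sent = 0
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        while sent < payload_bytes:
            client.sendall(chunk)
            sent += chunk_size
    receiver.join()                    # wait until the receiver has drained the data
    elapsed = time.perf_counter() - start
    server.close()

    mbit_per_s = result["bytes"] * 8 / elapsed / 1_000_000
    return result["bytes"], mbit_per_s


if __name__ == "__main__":
    received, rate = measure_tcp_throughput()
    print(f"received {received} bytes at {rate:.1f} Mbit/s")
```

On loopback this mostly measures the host's own network stack. To say anything about a real link, the sender and receiver would have to run on different machines, which is the job the real tool is built for.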
Another set of benchmarks highly relevant to storage comes from the Storage Performance Council (SPC), said Tanzer. SPC-1 and SPC-2 are storage performance benchmarks, and each has an energy extension that uses the same workload but also measures power. These are known as SPC-1/E and SPC-2/E.
The impetus behind the SPC was to create a vendor-neutral storage standards body that would be able to give an apples-to-apples comparison in terms of performance in storage products, thereby serving to cut through exaggerated vendor claims. SPC benchmarks utilize I/O workloads that attempt to represent real world application behavior to address online transaction processing as well as sequential applications.
Other Data Storage Benchmarks
We are only really scratching the surface with those benchmarks covered above, as there are plenty of other useful benchmarks out there.
The Transaction Processing Performance Council (TPC) oversees a large number of benchmarks related to database performance and transaction processing.
VMmark by VMware can be used to measure the performance as well as the scalability of applications running in VMware virtualized environments. For example, you can measure virtual data center performance as well as view the performance of different hardware and virtualization platforms.
Microsoft Exchange Solution Reviewed Program-Storage (ESRP) addresses Microsoft Exchange Server third-party storage. The goal is to ensure that proposed storage performs well when used with Exchange.
Login VSI is a testing tool for virtual desktop environments. Login VSI can be used to test the performance and scalability of virtual desktops such as Citrix XenApp/XenDesktop, VMware Horizon View and Microsoft Remote Desktop Services.
Whatever benchmark you use, the important thing is to set up workloads or simulations that are relevant to what you are doing in support of your applications. The best benchmark is your particular application and workload running as close to a production level as possible. The next best would be your application or a workload that reflects what you are doing in production and how the entire solution behaves along with the various pieces.
“You have to know your application and environment workload in terms of reads, writes, random, sequential, mixed workload and IO sizes to set those tools up to create metrics that matter, also looking not only at the storage, also at the server I/O data path and CPU usage,” said Schulz.
Becchetti added that the settings you choose play a big part in the results obtained.
“The key to running these tools is to spend the time to find the optimal settings and use these settings going forward on all subsequent testing,” said Becchetti.