Benchmarking Storage Systems, Part 1
Benchmarking storage systems has become very complex over the years, given all of the hardware and software components used in both NAS and SAN systems. To conduct an effective storage benchmark, you need a solid understanding of the components involved, how to design the benchmark around your applications, and how to put it all together. I'll begin this three-part series on benchmarking storage with an examination of each of the components that might be included in a benchmark.
The emergence of so many different types of computer and storage systems from a variety of vendors has led customers to develop benchmarks that characterize the performance of their applications on each of the systems they're considering purchasing. Over the years, though, the benchmark process has become a game of "cat and mouse, dog eat dog, high-stakes poker" – pick your cliché – with computer vendors and customers each trying to get the upper hand.
Customers would typically write rules and create emulations of their workload or use their real workloads in an attempt to get an accurate representation of the vendor’s performance as well as to prevent the vendors from taking advantage of a tactic I call SBT (Slimy Benchmarking Tricks). On the other side of the equation, the vendor’s customary goal of attempting to win the benchmark at all costs would often work against the customer’s goal of meeting requirements for performance, reliability, and cost. On the customer side, it is almost always a balancing act among these three points.
What to Benchmark and Why
There are the obvious pieces of hardware that you need to include in the benchmark, such as NAS, RAIDs, tape drives, and servers/hosts. There are also a large number of not-so-obvious hardware and software components, such as file systems, OS system tunables, tape libraries, HBAs and HBA tunables, Fibre Channel switches, NIC and TOE cards, RAID tunables and cache sizes, NAS tunables and cache sizes, and failover and failback, just to name a few.
Here are a couple of questions to start with when planning a benchmark:
- Are all of the not-so-obvious items important to the benchmark?
- Do I need to create benchmarks in order to measure and understand each component?
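One way to work through these two questions is to build a simple inventory of components and decide, for each one, whether it belongs in the benchmark and whether it merits a separate measurement. A minimal sketch of that planning step follows; the component names, priorities, and structure here are illustrative assumptions, not a vendor list or a recommended plan.

```python
# Hypothetical benchmark-planning inventory. Every entry and flag below is
# an illustrative assumption; a real plan would come from your own workload.
components = {
    "RAID cache size":      {"include": True,  "measure_separately": False},
    "HBA tunables":         {"include": True,  "measure_separately": False},
    "Fibre Channel switch": {"include": True,  "measure_separately": False},
    "File system tunables": {"include": True,  "measure_separately": True},
}

# Components worth understanding, but not worth benchmarking in isolation.
understand_only = [name for name, plan in components.items()
                   if plan["include"] and not plan["measure_separately"]]
print(understand_only)
# → ['RAID cache size', 'HBA tunables', 'Fibre Channel switch']
```

The point of the exercise is the split itself: most not-so-obvious components land in the "understand only" bucket, which matches the advice later in this section.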
I believe that some of the not-so-obvious hardware and software parts can be quite important in some cases, and at a minimum you need to understand the issues surrounding them. You might remember the example I used previously of the $2K HBA that significantly reduces the performance of a $1M RAID system.
As for the measurement question, that is generally your responsibility. The storage provider will typically tell you what hardware and software they will provide and what is being used in the benchmark. If the benchmark does not reflect your real workload, then you could have the $2K HBA problem when the system is installed – but again, that would be your problem.
On the other hand, measuring the performance of even a simple HBA is a non-trivial effort for all but the most experienced storage performance analysts. Add the HBA tunables and failover and failback to the mix, and the problem becomes insurmountable for most organizations. That is why it is a good idea to understand the issues, but not necessarily a good idea to attempt to benchmark each of the component parts.
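To see why even one number is hard to pin down, here is a minimal sketch of measuring raw sequential-write throughput through the file system. The function name, file sizes, and block size are all illustrative assumptions; note how many things the sketch does not control for – page cache behavior, HBA settings, RAID cache, competing I/O – which is exactly what makes component-level benchmarking so difficult.

```python
import os
import tempfile
import time


def sequential_write_mbps(path, total_mb=16, block_kb=1024):
    """Write total_mb megabytes in block_kb-sized chunks and return MB/s.

    A rough, illustrative measure only: file system journaling, page-cache
    effects, and controller caches all influence the result.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    start = time.perf_counter()
    try:
        for _ in range(blocks):
            os.write(fd, block)
        os.fsync(fd)  # force data toward the device so the timing is honest
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


with tempfile.TemporaryDirectory() as d:
    rate = sequential_write_mbps(os.path.join(d, "bench.dat"))
    print(f"sequential write: {rate:.1f} MB/s")
```

Even this toy example will report very different numbers run-to-run and machine-to-machine, and it says nothing about the HBA, the switch, or failover behavior on its own.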