Benchmarking Storage Systems, Part 2

Written By

Henry Newman

Dec 10, 2003

This is the second in a three-part series on benchmarking. Part 1 examined each of the components that might be included in a typical benchmark. Today we’ll look at developing representations of your workload as well as the pros and cons of using your applications and real data in the benchmark as opposed to developing emulations of both.

The most important part in the development of a set of storage benchmarks is ensuring that the benchmarks represent your current workload and how you run that workload on the system(s). With an understanding of your current workload, you can begin to predict how the current workload relates to the future workload.

There are many ways to create workloads that represent your real work. Some are more difficult for you to create, some are much harder for the storage vendors to run, and of course some are halfway in between (I know, it sounds a bit like Goldilocks and the Three Bears). Whatever you choose, you need to completely understand the tradeoffs and ensure that your organization understands the advantages and disadvantages of the decisions to be made.

Characterization

Going into the development of a benchmark, you have some hard choices to make. The first is whether the benchmark will use your real applications and real data, or whether an emulation of the workload(s) will need to be developed.

Each of these two paths has pros and cons for both you and the vendor. You might wonder why I seem to be so concerned for the vendors. Besides once being a vendor and a benchmarker, I have come to realize that there is no free lunch when doing benchmarks. In other words, if you develop an expensive benchmark, the cost of running it will more than likely be rolled into the bid price of the hardware, depending on the size of your procurement and how much the vendor(s) want your business.

Here’s how I see the tradeoffs:

Characterization	Advantage	Disadvantage
Use actual application code	This is the best measure of your real work if you structure the benchmark correctly. For database benchmarks, using the actual data will result in the most realistic benchmark, although it may raise security issues for your company	These types of benchmark generally have more setup time and are more difficult for the vendors to run, as a great deal of application tuning, file system tuning, system tuning, and RAID tuning may be required, all of which can affect final pricing
Develop a set of representative storage benchmarks	Far easier to run and to scale to larger workloads. Running this type of workload is also far easier for the storage vendors	Sometimes difficult to develop characterization of the workload given system tunables, file systems, volume managers, and the actual storage hardware

Using Your Workload

If you are going to use your actual workload in a benchmark, the first step is end-to-end hardware and software characterization. You need to document and understand:

What applications are being run
The number, location, and sizes of the data sets being used
The server(s) hardware configuration, including CPUs, memory, NICs, and HBAs
The server(s) software configuration
Application requirements, such as redo logs for databases
File system and volume manager settings
HBA tunables
Storage configuration, including LUN sizes, RAID type, and RAID cache sizes and settings

All of this may seem obvious, but if you’re going to give a storage vendor your benchmark, the more documentation you provide, the more likely the results will meet your requirements and the fewer questions you will have to answer. And if you’ve gone so far as to document the above, then creating the operational procedures for things such as remote mirroring, tape backup, and other operational requirements will not be difficult.

When using your own workload in a benchmark, there are several additional areas that need to be clearly understood and documented for the storage vendors, including:

Server memory size and tunables settings – Many file systems use memory for the file system cache or the cache for the database based on system tunables or auto-configuration. If a vendor does not have the same amount of memory and use exactly the settings that you are using or the other vendors are using, that vendor’s results could be skewed
File system and volume manager settings – These settings can have a significant effect (positive or negative) on the performance of your system, so they should be set the same for all the vendors
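
One way to check that every vendor ran with the settings you documented is to compare each vendor's reported configuration against your reference. The following is only a minimal sketch: the setting names and values are hypothetical placeholders, not real tunables from any particular operating system or file system.

```python
def diff_settings(reference, vendor):
    """Return {setting: (reference_value, vendor_value)} for every mismatch."""
    mismatches = {}
    for key in sorted(set(reference) | set(vendor)):
        ref_val = reference.get(key)   # None means the setting was not reported
        ven_val = vendor.get(key)
        if ref_val != ven_val:
            mismatches[key] = (ref_val, ven_val)
    return mismatches


# Hypothetical setting names and values, for illustration only.
reference = {"memory_gb": 64, "fs_block_size_kb": 64, "stripe_width": 8}
vendor_a = {"memory_gb": 128, "fs_block_size_kb": 64, "stripe_width": 8}

print(diff_settings(reference, vendor_a))
```

Any non-empty result is a reason to question whether that vendor's numbers are comparable with the others.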

Emulating Your Workload

If you have a staff that can program in C, then writing the code to emulate your workload will not be that difficult. I believe that if you have done a good job with the emulation, then you’ll have a great deal more control of the benchmark in terms of scaling, and you’ll have a far better understanding of what your workload does to the actual hardware and software.

It also allows you to test the storage vendors’ hardware without the file system, as you can write/read directly to the raw devices. This allows a better understanding of the hardware that might otherwise be masked by the file system’s effect on I/O performance.

The steps for developing an emulation are relatively simple:

1. Use the system tools to get a system call listing of the application(s) doing the I/O. These tools are available from most OS vendors. For example, on Solaris the tool is called truss, and on Linux it’s strace.
2. After collecting this data, develop a statistical analysis of:
   a. Read and write ratio
   b. Read and write sizes
   c. File sizes
   d. Seeks and seek distance
   e. The amount of concurrent I/O
   f. Number of open files
   g. System call type (asynchronous or synchronous I/O)
3. Develop a program that reads and writes using the information developed in step 2, writing to and reading from the raw devices.
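
The statistical analysis of a system call trace can be sketched as follows. This is a rough illustration, not a complete tool: it assumes strace-style lines of the form `call(fd, ...) = bytes`, the sample trace is made up, and it covers only the read and write ratio, the transfer sizes, and the number of distinct file descriptors seen. Seek distance, concurrency, and asynchronous I/O would need further parsing.

```python
import re
from collections import Counter

# Matches successful calls such as: read(3, "data"..., 65536) = 65536
# Failed calls (return value -1) and other system calls are skipped.
IO_CALL = re.compile(r"\b(read|write|pread64|pwrite64)\((\d+),.*\)\s+=\s+(\d+)")


def summarize_io(lines):
    """Summarize read/write counts and transfer sizes from trace lines."""
    calls = Counter()
    sizes = {"read": Counter(), "write": Counter()}
    descriptors = set()
    for line in lines:
        match = IO_CALL.search(line)
        if match is None:
            continue
        name = match.group(1)
        fd = int(match.group(2))
        transferred = int(match.group(3))  # bytes actually moved
        kind = "read" if "read" in name else "write"
        calls[kind] += 1
        sizes[kind][transferred] += 1
        descriptors.add(fd)
    total = calls["read"] + calls["write"]
    return {
        "reads": calls["read"],
        "writes": calls["write"],
        "read_ratio": calls["read"] / total if total else 0.0,
        "read_sizes": dict(sizes["read"]),
        "write_sizes": dict(sizes["write"]),
        "descriptors_seen": len(descriptors),
    }


# Made-up trace lines for illustration.
sample = [
    'open("/data/f1", O_RDONLY) = 3',
    'read(3, "data"..., 65536) = 65536',
    'read(3, "data"..., 65536) = 65536',
    'pread64(3, "data"..., 8192, 131072) = 8192',
    'write(4, "data"..., 65536) = 65536',
    'close(3) = 0',
]
summary = summarize_io(sample)
print(summary)
```

A real trace would be read from a file captured over a representative period of the workload.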

This seems fairly difficult, and it can be, but once you have completed the process, you’ll be able to easily scale your workload up and down. Another advantage is that the results the vendor returns to you will be a true benchmark of the actual storage hardware, not of the file system and volume manager tunables.
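
An emulation program of the kind described above might look like the sketch below. It is written in Python rather than C for brevity, and it makes several assumptions: a single fixed I/O size, uniformly random offsets, synchronous I/O from one process, and an ordinary temporary file standing in for the raw device. The function name and parameters are hypothetical. It relies on `os.pread` and `os.pwrite`, which are available on Unix-like systems. Pointing it at a real device would overwrite data and normally requires elevated privileges.

```python
import os
import random
import tempfile


def run_emulation(path, span_bytes, ops, read_ratio, io_size, seed=0):
    """Issue `ops` random reads/writes of `io_size` bytes within `span_bytes`."""
    rng = random.Random(seed)          # fixed seed makes runs repeatable
    payload = b"\0" * io_size
    slots = span_bytes // io_size      # number of aligned positions
    stats = {"reads": 0, "writes": 0, "bytes_read": 0, "bytes_written": 0}
    fd = os.open(path, os.O_RDWR)
    try:
        for _ in range(ops):
            offset = rng.randrange(slots) * io_size
            if rng.random() < read_ratio:
                stats["bytes_read"] += len(os.pread(fd, io_size, offset))
                stats["reads"] += 1
            else:
                stats["bytes_written"] += os.pwrite(fd, payload, offset)
                stats["writes"] += 1
        os.fsync(fd)                   # flush writes before returning
    finally:
        os.close(fd)
    return stats


# Demonstration against a 16 MB temporary file instead of a raw device.
SPAN = 16 * 1024 * 1024
with tempfile.NamedTemporaryFile() as tmp:
    tmp.truncate(SPAN)
    tmp.flush()
    stats = run_emulation(tmp.name, SPAN, ops=1000, read_ratio=0.75, io_size=4096)
print(stats)
```

Scaling the workload then becomes a matter of changing `ops`, `span_bytes`, and `io_size`, or of running several copies at once to raise the amount of concurrent I/O.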

What About Software?

Even if you’re going to benchmark a file system or shared file system, much of what was recommended for the analysis of the hardware should be done for the software. One big difference is obviously you cannot write to the raw device if you are testing a file system.

Benchmarking a file system is likely to be the most difficult benchmarking task because there are so many variables, and doing it correctly is very time consuming, both for you and for the vendors.

Here are some items that must be characterized as part of the process for benchmarking file systems, building upon the characterizations already done for the hardware:

File system size – current and future
Total number of directories – current and future
Total number of files – current and future
Largest number of files per directory – current and future
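
The current values for the items above can be collected with a simple walk of the existing file system. This is a minimal sketch that assumes the tree is readable from one host; the function name and the small demonstration tree are made up for illustration, and the future values still have to come from your own growth projections.

```python
import os
import tempfile


def characterize_tree(root):
    """Count directories, files, bytes, and the largest files-per-directory."""
    result = {"directories": 0, "files": 0, "bytes": 0, "max_files_per_dir": 0}
    for dirpath, dirnames, filenames in os.walk(root):
        result["directories"] += 1     # includes the root directory itself
        result["files"] += len(filenames)
        result["max_files_per_dir"] = max(result["max_files_per_dir"],
                                          len(filenames))
        for name in filenames:
            try:
                result["bytes"] += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass                   # file vanished or is unreadable; skip it
    return result


# Demonstration on a small throwaway tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "a"))
    os.mkdir(os.path.join(root, "b"))
    for i in range(3):
        with open(os.path.join(root, "a", f"f{i}"), "wb") as fh:
            fh.write(b"x" * 10)
    with open(os.path.join(root, "b", "g"), "wb") as fh:
        fh.write(b"y" * 5)
    tree_stats = characterize_tree(root)
print(tree_stats)
```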

For shared file systems, add:

Amount of I/O from each client and the master machine
Amount of metadata I/O from each client and the master machine
Number of clients
Types of clients

Along with this you have the hardware topology, including HBAs, switches, TCP/IP network for metadata, and possibly tapes, as most shared file systems have an HSM (Hierarchical Storage Management) system built into them.

Developing the scripts, code, and methodology to do this type of benchmarking is hard work on your end, but for the storage vendor it will be virtually impossible, as most have limited relationships with shared file system vendors, limited server resources, and limited staff who know shared file systems.

Often this type of benchmark really becomes a benchmarking of the benchmarker, not a benchmarking of the software and hardware. The vendor that wins is often not the one with the best hardware and software, but the one with the best benchmarkers. Therefore, it’s important to give the storage vendors as much information and guidance as possible.

The most important part of a file system benchmark that is often forgotten is creating fragmentation as part of the benchmark. Most file system benchmarks create a new file system and run the benchmark tests with tools such as iozone and bonnie. Most of the time this is not really valid given that on a real file system users’ files are created and removed many times, and multiple files are often written at the same time.

Some of the areas to look at are:

How many applications are doing reads and writes at the same time?
How many files will be created and deleted within the file system?
How full will the file system be over time?

Each of these issues will have an impact on the benchmark that you create, regardless of which tool you use.
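
One way to account for these issues is to age the file system before the measured tests run, by creating and deleting files over many rounds while several files are written at the same time. The sketch below is an assumption-laden illustration: the function name, parameters, and 4 KB chunk size are hypothetical, and whether the result is actually fragmented depends on the allocator of the file system under test.

```python
import os
import random
import tempfile


def age_directory(root, rounds, files_per_round, delete_fraction, max_size, seed=0):
    """Create files with interleaved writes, then delete a random share each round."""
    rng = random.Random(seed)
    live = []
    created = deleted = 0
    for r in range(rounds):
        # Open a batch of files, unbuffered so each write reaches the OS.
        pending = []
        for i in range(files_per_round):
            path = os.path.join(root, f"r{r}_f{i}.dat")
            pending.append([open(path, "wb", buffering=0), rng.randint(1, max_size)])
            live.append(path)
            created += 1
        # Write the batch round-robin, 4 KB at a time, to interleave allocations.
        while pending:
            for item in list(pending):
                handle, left = item
                n = min(4096, left)
                handle.write(b"x" * n)
                item[1] = left - n
                if item[1] == 0:
                    handle.close()
                    pending.remove(item)
        # Delete a random share of all live files to leave holes behind.
        victims = rng.sample(live, int(len(live) * delete_fraction))
        for path in victims:
            os.remove(path)
        doomed = set(victims)
        live = [p for p in live if p not in doomed]
        deleted += len(victims)
    return {"created": created, "deleted": deleted, "live": len(live)}


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    aging = age_directory(root, rounds=3, files_per_round=10,
                          delete_fraction=0.5, max_size=20000)
    on_disk = len(os.listdir(root))
print(aging, on_disk)
```

For a real test, the number of rounds and the file sizes would be chosen so that the file system reaches the fullness you expect over time.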

Conclusions

Developing a benchmark that mimics your operational environment: those are the key words. Determining the characteristics of your environment is the first step. Workload characterization, though seemingly a difficult process, is not really that difficult once you separate it into its parts: application I/O, file system configuration, system and file system tunables, and hardware configuration requirements. It’s also important to keep in mind how the new system will be used as compared to how the old system was used.

Next time we will review the process of packaging, rules, analysis, and scoring.

Henry Newman

Henry Newman has been a contributor to TechnologyAdvice websites for more than 20 years. His career in high-performance computing, storage and security dates to the early 1980s, when Cray was the name of a supercomputing company rather than an entry in Urban Dictionary. After nearly four decades of architecting IT systems, he recently retired as CTO of a storage company’s Federal group, but he rather quickly lost a bet that he wouldn't be able to stay retired by taking a consulting gig in his first month of retirement.
