This article will explain why the automated design and configuration of SANs is the only viable solution to these issues. To that end, the article will present a brief general discussion and comparison of a range of automated techniques for choosing the Redundant Array of Independent (or Inexpensive) Disks (RAID) level.
The simplest approaches presented in this article are modeled on existing, manual rules of thumb which involve "tagging" data with a RAID level before determining the configuration of the array it is assigned to. The best approach presented here simultaneously and automatically determines the RAID levels for the data, the array configuration, and the layout of data on that array. It can operate as an optimization process with the twin goals of minimizing array cost while ensuring that SAN workload performance requirements will be met.
Disk arrays are an integral part of high-performance SANs, and their importance and scale are growing as continuous access to information becomes critical to the day-to-day operation of modern business. Before a disk array can be used to store data, values for many configuration parameters must be specified. Achieving the right balance between cost, availability, and application performance needs depends on many correct decisions. Unfortunately, the tradeoffs between the choices are surprisingly complicated. The focus here, therefore, is on just one of these choices -- which RAID level, or data redundancy scheme, to use.
The two most common redundancy schemes are RAID 1/0 (striped mirroring), where every byte of data is kept on two separate disk drives striped for greater I/O parallelism, and RAID 5, where a single parity block protects the data in a stripe from disk drive failures. RAID 1/0 provides greater read performance and failure tolerance but requires almost twice as many disk drives to do so.
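The drive-count tradeoff can be made concrete with a little arithmetic. The following is a minimal sketch, not a sizing tool: the function name, the 100 GB drive size, and the five-drive RAID 5 group are assumptions chosen for illustration, and hot spares and array overheads are ignored.

```python
import math

def disks_needed(usable_gb: float, disk_gb: float, raid_level: str,
                 raid5_group_size: int = 5) -> int:
    """Drives required to hold usable_gb of data (illustrative model only).

    raid5_group_size is the assumed total number of drives per RAID 5 group,
    i.e. (group size - 1) drives' worth of data plus one drive's worth of parity.
    """
    data_disks = math.ceil(usable_gb / disk_gb)
    if raid_level == "RAID 1/0":
        # Every byte is stored on two separate drives.
        return 2 * data_disks
    if raid_level == "RAID 5":
        # One parity block per stripe costs one drive's capacity per group.
        groups = math.ceil(data_disks / (raid5_group_size - 1))
        return groups * raid5_group_size
    raise ValueError(f"unsupported RAID level: {raid_level}")
```

Under these assumptions, 800 GB of data on 100 GB drives needs 16 drives as RAID 1/0 but only 10 as RAID 5. The gap widens as the RAID 5 group gets larger.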
Disk arrays organize their data storage into Logical Units (LUs), which appear as linear block spaces to their clients. A small disk array with a few disks might support up to 8 LUs, whereas a large one with hundreds of disk drives can support thousands. Each LU typically has a given RAID level -- a redundancy mapping onto one or more underlying physical disk drives. This decision is made at LU-creation time and is typically irrevocable; once the LU has been formatted, changing its RAID level requires copying all the data onto a new LU.
Furthermore, the workloads run on a SAN can be described as sets of stores and streams. A store is a logically contiguous array of bytes, such as a file system or a database table, with a size typically measured in gigabytes. A stream is a set of access patterns on a store described by attributes, such as request rate, request size, inter-stream phasing information, and sequentiality. A RAID level must be decided for each store in the workload.
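As a rough illustration, a workload description of this kind could be represented with two small record types. This is only a sketch: the class names, field names, units, and the way phasing is encoded (as an overlap fraction per named stream) are assumptions, not a format defined by the research described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Stream:
    """One access pattern on a store (field names are hypothetical)."""
    name: str
    request_rate: float          # requests per second
    request_size_kb: float       # mean request size in KB
    sequential_fraction: float   # 0.0 = fully random, 1.0 = fully sequential
    # Inter-stream phasing, encoded here as the fraction of time this stream
    # is active at the same time as another named stream.
    overlap_with: Dict[str, float] = field(default_factory=dict)

    def bandwidth_kb_per_s(self) -> float:
        return self.request_rate * self.request_size_kb

@dataclass
class Store:
    """A logically contiguous array of bytes, e.g. a database table."""
    name: str
    size_gb: float
    streams: List[Stream] = field(default_factory=list)

    def total_bandwidth_kb_per_s(self) -> float:
        # Upper bound: assumes all streams are active simultaneously.
        return sum(s.bandwidth_kb_per_s() for s in self.streams)
```

A store with one small random stream and one large sequential stream would then be written as `Store("orders", 50.0, [Stream("oltp", 200.0, 8.0, 0.1), Stream("scan", 10.0, 256.0, 0.9)])`.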
Nevertheless, host-based logical volume managers (LVMs) complicate matters by allowing multiple stores to be mapped onto a single LU, effectively blending multiple workloads together. In other words, a host-based LVM manages disk space at a logical level. It controls fixed-disk resources by mapping data between logical and physical storage and by allowing data to span multiple disks, which in turn allows it to be discontiguous, replicated, and dynamically expanded.
The SAN design task therefore consists of:
- Selecting the type and number of arrays,
- Selecting the size and RAID level for each LU in each disk array, and
- Placing stores on the resulting LUs.
In other words, the administrator's goals are operational in nature (such as minimum cost or maximum reliability for a given cost), and they must be pursued while satisfying the performance requirements of client applications. This is clearly a very difficult task, so manual approaches apply rules of thumb and gross over-provisioning to simplify the problem (e.g. stripe each database table over as many RAID 1/0 LUs as possible). Unfortunately, the resulting configurations can cost as much as two to three times more than necessary. This is especially relevant when the cost of a large SAN is easily measured in millions of dollars or when the cost of the SAN represents more than half the total system hardware cost. Perhaps even more important is the uncertainty that surrounds a manually-designed system -- how well will it meet its performance and availability goals?
Therefore, with the preceding in mind, it is believed that automated methods for designing and configuring SANs can overcome these limitations, because they can consider a wider range of workload interactions and explore a great deal more of the search space than any manual method. To do so, though, these automated methods need to be able to make RAID-level selection decisions.
Automated Selection Of RAID Levels
The approach to automating SAN design considered here takes two kinds of input -- a workload description and information about the target disk array types and their configuration choices -- and produces one output: a design for a SAN capable of supporting that workload.
Automated RAID Level Selection Approaches
There are two main approaches to automatically selecting a RAID level -- tagging and integrated. Let's briefly discuss tagging approaches first:
Tagging approaches perform a pre-processing step to tag stores with RAID levels. Once tagged with a RAID level, a store cannot change its tag, and it must be assigned to an LU of that type. Tagging decisions consider each store and its streams in isolation. Two types of taggers are considered here: rule-based, which examine the size and type of I/Os, and model-based, which use performance models to make their decisions. The former tend to have many ad hoc parameter settings, while the latter have fewer of these settings but also need performance-related data for a particular disk array type. In some cases, the same performance models can be used.
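To make the rule-based idea concrete, here is a minimal sketch of a tagger that looks only at the size and type of I/Os. The rule itself (small, write-heavy streams go to RAID 1/0 because small RAID 5 writes must also update parity) and both thresholds are assumptions for illustration; they are exactly the kind of ad hoc constants described above, not values taken from the research.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class StreamSummary:
    request_size_kb: float   # mean request size
    write_fraction: float    # share of requests that are writes, 0.0 to 1.0

# Hypothetical ad hoc constants; a real tagger would tune these per array type.
SMALL_WRITE_KB = 64.0
WRITE_HEAVY_FRACTION = 0.3

def tag_store(streams: Iterable[StreamSummary]) -> str:
    """Tag a store with a RAID level by examining its streams in isolation."""
    for s in streams:
        small = s.request_size_kb < SMALL_WRITE_KB
        write_heavy = s.write_fraction >= WRITE_HEAVY_FRACTION
        if small and write_heavy:
            # Small writes on RAID 5 need extra I/Os to update parity.
            return "RAID 1/0"
    # Reads and large writes are served reasonably well by RAID 5,
    # which needs fewer drives for the same capacity.
    return "RAID 5"
```

Note that the tag depends only on the store's own streams; nothing about the other stores that will share the same LU is taken into account.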
Integrated approaches omit the tagging step and defer the automated choice of RAID level until data-placement decisions are made. This allows the RAID level decision to take into account interactions with the other stores and streams that have already been assigned.
Finally, two variants of this approach are a partially adaptive one, in which the RAID level of an LU is automatically chosen when the first store is assigned to it and cannot subsequently be changed, and a fully adaptive variant, in which any assignment pass can revisit the RAID level decision for an LU at any time during its best-fit search. In both cases, the reassignment pass can still change the bindings of stores to LUs and can even move a store to an LU of a different RAID level. Neither variant requires any ad hoc constants, and both can automatically and dynamically select RAID levels.
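The partially adaptive variant can be sketched as a simple first-fit assignment in which an LU's RAID level is fixed by the first store placed on it. Everything below is a toy model under stated assumptions: the class and function names, the per-drive I/O budget, and the write costs (2 disk I/Os per small write on RAID 1/0, 4 on RAID 5) are illustrative stand-ins for the performance models a real solver would use, and the reassignment pass is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

DISK_IOPS = 100.0  # hypothetical per-drive I/O budget

@dataclass
class Store:
    name: str
    size_gb: float
    small_write_iops: float  # hypothetical load measure

@dataclass
class LU:
    disks: int
    disk_gb: float
    raid_level: Optional[str] = None  # unset until the first store arrives
    stores: List[Store] = field(default_factory=list)

    def usable_gb(self, level: str) -> float:
        if level == "RAID 1/0":
            return (self.disks // 2) * self.disk_gb
        return (self.disks - 1) * self.disk_gb  # RAID 5: one drive of parity

    def fits(self, level: str, store: Store) -> bool:
        stores = self.stores + [store]
        penalty = 2 if level == "RAID 1/0" else 4  # disk I/Os per small write
        load = sum(s.small_write_iops for s in stores) * penalty
        size = sum(s.size_gb for s in stores)
        return size <= self.usable_gb(level) and load <= self.disks * DISK_IOPS

def assign(stores: List[Store], lus: List[LU]) -> Dict[str, Tuple[int, str]]:
    """First-fit placement; returns store name -> (LU index, RAID level)."""
    placement: Dict[str, Tuple[int, str]] = {}
    for store in stores:
        for index, lu in enumerate(lus):
            # A level chosen earlier is final; otherwise try the cheaper one first.
            levels = [lu.raid_level] if lu.raid_level else ["RAID 5", "RAID 1/0"]
            level = next((lv for lv in levels if lu.fits(lv, store)), None)
            if level is not None:
                lu.raid_level = level
                lu.stores.append(store)
                placement[store.name] = (index, level)
                break
        else:
            raise ValueError(f"no LU can hold store {store.name}")
    return placement
```

With two empty 4-drive LUs of 100 GB drives, a 100 GB store with 150 small writes per second lands on the first LU as RAID 1/0, because RAID 5 would exceed the I/O budget. A following 150 GB store with light write load no longer fits there and cannot change that LU's level, so it goes to the second LU as RAID 5. A fully adaptive variant would be allowed to revisit the first LU's level instead.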
Summary And Conclusions
The focus of this article has been on a general variety of methods for automatically selecting RAID levels, running the gamut from ones that consider each store in isolation and make irrevocable decisions to ones that consider all workload interactions and can undo any decision. The simpler tagging schemes are similar to accepted knowledge and to the back-of-the-envelope calculations that system designers currently rely upon. However, they are highly dependent on particular combinations of devices and workloads and involve hand-picking the right values for many constants, which makes them suitable only for limited combinations of workloads and devices.
Integrating automated RAID level selection into a store-to-device assignment algorithm leads to much better results, and the benefits of a fully-adaptive scheme outweigh its additional costs in computation time and complexity.
Finally, future work should explore the implications of providing reliability guarantees in addition to performance. Fully-adaptive schemes would be suitable for this, albeit at the cost of increased running times. Other possible extensions include the automated selection of components of different cost for each individual LU within the arrays (e.g. deciding between big/slow and small/fast disk drives according to the workload being mapped onto them) and extending the automated decisions to additional parameters, such as LU stripe size and the disks used in an LU.
[The preceding article is based on material provided by ongoing research at the Storage and Content Distribution Department, Hewlett-Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA 94304 on RAID level selections. Additional information for this article was also based on material contained in the following white paper:
"Selecting RAID levels for disk arrays" by Eric Anderson, Ram Swaminathan, Alistair Veitch, Guillermo A. Alvarez and John Wilkes at: http://www.hpl.hp.com/personal/John_Wilkes/papers/FAST2002-raid-level-selection.pdf]
John Vacca is an information technology consultant and internationally known author based in Pomeroy, Ohio. Since 1982, John has authored 39 books and more than 485 articles in the areas of advanced storage, computer security and aerospace technology. John was also a configuration management specialist, computer specialist, and the computer security official for NASA's space station program (Freedom) and the International Space Station Program, from 1988 until his early retirement from NASA in 1995. John was also one of the security consultants for the MGM movie "AntiTrust," which was released on January 12, 2001. John can be reached on the Internet at email@example.com.