Implementing RAID in Enterprise Environments


A great deal has been written about RAID hardware, software, and a myriad of related topics since hardware RAID first appeared on the scene in the early 1990s. This month I'll present RAID from a slightly different perspective than you have likely seen before.

Types of RAID

I separate RAID devices into two types:

  1. Cache-centric RAIDs
  2. Storage-centric RAIDs

Often these two types of technology are called "enterprise RAID" (cache-centric) and "mid-range RAID" (storage-centric). Whatever you choose to call these devices, there are major structural differences between the two types, even though both are called RAID and perform many of the same functions.

Cache-centric RAIDs

I use the term cache-centric because RAID devices in this category depend heavily on data residing in cache to achieve good performance. Cache-centric devices generally share features such as:

  • Very high reliability (dual and hot swap everything, with virtually no downtime)
  • Large caches (64 GB or greater)
  • Design emphasis on RAID-1
  • Software that allows for snapshots, hot upgrades, and a plethora of other features
  • If RAID-5 is supported, generally only a small number of supported stripe sizes (e.g., 3+1, meaning 3 data disks plus 1 parity disk, or 7+1)
  • Cache is always synchronously mirrored
  • Large number of front-end connections
  • Support for many types of remote mirroring (dark fibre, IP)
  • Smaller block sizes are more typical (4 KB, 8 KB)
  • Huge amounts of storage managed in a single box (100 TB)
  • A great deal of per-component reliability testing
  • Error monitoring for hardware
  • Error monitoring for each disk for soft errors such as write and read retry and remapping of sectors
  • Designed for IOPS (I/O operations per second), not streaming I/O
  • Far more bandwidth from cache to the servers than from cache to disk (sometimes up to 4 times greater). These are often called front-end bandwidth (cache to servers) and back-end bandwidth (cache to disk)
  • Much higher cost per MB of storage compared with storage-centric RAID (not surprising given the reliability and size of the cache)
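
To make the RAID-1 versus RAID-5 trade-off in the list above concrete, here is a minimal Python sketch. The function names are hypothetical and not taken from any vendor's software; the sketch only compares how much raw capacity each layout leaves for data and shows the XOR parity idea behind a 3+1 group.

```python
from functools import reduce


def usable_fraction(data_disks, redundancy_disks):
    """Fraction of raw capacity left for user data in one RAID group."""
    return data_disks / (data_disks + redundancy_disks)


def xor_parity(blocks):
    """Byte-wise XOR across equally sized blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Layouts mentioned in the list: mirroring, 3+1, and 7+1.
layouts = {"RAID-1 (1+1)": (1, 1), "RAID-5 (3+1)": (3, 1), "RAID-5 (7+1)": (7, 1)}
for name, (data, redundancy) in layouts.items():
    print(f"{name}: {usable_fraction(data, redundancy):.1%} of raw capacity usable")

# A 3+1 group: three data blocks plus one parity block (made-up contents).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# If the second data block is lost, XOR of the survivors and parity rebuilds it.
rebuilt = xor_parity([data[0], data[2], parity])
```

RAID-1 leaves half of the raw capacity for data, while 3+1 and 7+1 leave 75% and 87.5%, which is part of why an emphasis on RAID-1 contributes to the higher cost per MB.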

Cache-centric RAID vendors and devices include:

  • IBM Shark
  • EMC Symmetrix and new DMX2000
  • Hitachi Data Systems 99xx series

All of these products work with both UNIX servers and IBM mainframes. They are designed for environments where reliability is critical and, in some cases, where customers are willing to trade some performance for reliability. To perform well, these RAID products need a high cache hit rate. They also allow huge amounts of storage to sit behind a single controller, which simplifies storage management.
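
The dependence on cache hits can be illustrated with a simple weighted-average model. This is a rough sketch, not a measurement of any product: the function name is hypothetical, and the 0.1 ms cache and 8 ms disk service times are assumed values chosen only for illustration.

```python
def effective_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Average service time when a fraction of requests is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Assumed service times, for illustration only.
CACHE_MS = 0.1
DISK_MS = 8.0

for hit_ratio in (0.50, 0.80, 0.95, 0.99):
    latency = effective_latency_ms(hit_ratio, CACHE_MS, DISK_MS)
    print(f"hit ratio {hit_ratio:.0%}: about {latency:.2f} ms per I/O")
```

Under these assumptions, every request that misses the cache pays the full disk service time, so average latency climbs quickly as the hit ratio falls. That is why workloads with poor locality see much less benefit from a large cache.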
