Getting Real About RAID Subsystems
RAID (Redundant Array of Independent Disks) subsystems have become popular for providing redundant, scalable shared storage for one or more physically attached servers. RAID subsystems can also speed up applications that transfer large, sequential, multi-megabyte files, such as multimedia, imaging, and Web access, as well as applications that process ongoing bursts of time-sensitive transactions, such as Oracle and Sybase databases.
You'll have no trouble observing how fast a RAID subsystem can perform with your application. On the other hand, you'll have to get a flashlight and poke under the hood to determine whether the RAID subsystem has the muscle to protect your data 24/7. Don't assume that all RAID subsystems meet the same standard of quality. A high-availability storage system should combine RAID techniques with firmware in a way that ensures the highest degree of access to data. Can the device provide complete protection against data access interruptions and loss of data integrity? Significant differences among RAID subsystems in these areas can appear very subtle. So, spend the time to get better acquainted with the device you might trust your data to. For openers, ask your systems integrator or RAID subsystem vendor the following five questions. Insist on no-frills facts and don't settle for fluff.
How does the RAID subsystem handle media errors on the disk?
How well the device handles disk media errors can spell the difference between having a smooth-running storage subsystem with data integrity or a system that's a loser (of data). Techniques to keep track of data that can't be reconstructed, however, involve trade-offs among performance, storage capacity, and completeness. The bottom line is that the device must deal properly with disk media errors.
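To make the idea of tracking data that can't be reconstructed more concrete, here is a minimal Python sketch. It assumes a single-parity (RAID 5 style) stripe, which the article does not specify, and the names xor_blocks, rebuild_block, and bad_blocks are hypothetical, not any vendor's firmware interface. A block is rebuilt by XORing the surviving blocks; if a media error has made a second block in the stripe unreadable, the rebuild is impossible and the location is recorded instead of returning wrong data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(stripe, failed_index, bad_blocks, stripe_no):
    """Rebuild one block of a single-parity stripe (illustrative only).

    stripe       -- list of blocks (bytes); None marks a block that a
                    media error has made unreadable
    failed_index -- position of the block being rebuilt
    bad_blocks   -- set that records (stripe_no, index) pairs that
                    could not be reconstructed
    Returns the rebuilt bytes, or None if reconstruction is impossible.
    """
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    if any(b is None for b in survivors):
        # A second unreadable block in the same stripe: remember the
        # location so later reads can report an error instead of bad data.
        bad_blocks.add((stripe_no, failed_index))
        return None
    return xor_blocks(survivors)
```

The table of unrecoverable locations is where the trade-off shows up: keeping it costs some capacity and some lookup time on reads, while skipping it risks handing stale or invented data back to the host.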
Does the RAID subsystem perform any self-diagnosis or take any corrective action?
You're always better off finding a problem and correcting it before it adversely affects access to the device's data. A key capability is the scheduling of media scans and error management through a task scheduler. One example of self-diagnosis is disk drive S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) support. Available in most disk drives, S.M.A.R.T. support enables drives to perform predictive failure analysis on themselves, so that a problem can be corrected before the drive actually fails.
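As a rough illustration of predictive failure analysis, the Python sketch below applies the usual S.M.A.R.T. convention that an attribute is considered failed when its normalized value drops to or below its threshold. The SmartAttribute class and failing_attributes function are hypothetical names chosen for this example, and the numbers in the usage are made up; a real subsystem would read the values from the drive.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value reported by the drive
    threshold: int  # failure threshold for this attribute


def failing_attributes(attributes):
    """Return the names of attributes at or below their threshold."""
    return [a.name for a in attributes if a.value <= a.threshold]


# Illustrative readings only, not taken from a real drive.
readings = [
    SmartAttribute("Reallocated_Sector_Ct", value=30, threshold=36),
    SmartAttribute("Spin_Up_Time", value=95, threshold=21),
]
at_risk = failing_attributes(readings)
```

A scheduled task could run a check like this periodically and flag the drive for replacement as soon as the list is non-empty.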
Does the RAID subsystem provide an effective method of protecting data in the event of a power failure?
In battery-backed data cache systems, data held in the RAID controller's cache will be lost if the batteries become exhausted before you can restore power to your storage system. You could also lose data on the disk drives because of partially written, and therefore incomplete, records. Look for a subsystem that can immediately move this data out of its volatile on-board memory and write it to a non-volatile medium such as disk drives.
Does the RAID subsystem allow its firmware to be updated without taking the controller offline or restarting it?
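The flush-on-power-failure behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the WriteBackCache class and its on_power_fail method are hypothetical, and a plain dictionary stands in for the non-volatile disk.

```python
class WriteBackCache:
    """Toy write-back cache that flushes dirty blocks on power failure."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # dict standing in for disk
        self.dirty = {}                     # volatile, unwritten blocks

    def write(self, block_no, data):
        # The write is held in volatile memory until it is flushed.
        self.dirty[block_no] = data

    def on_power_fail(self):
        """Copy every dirty block to the backing store, then clear the cache.

        Returns the number of blocks flushed.
        """
        for block_no in sorted(self.dirty):
            self.backing_store[block_no] = self.dirty[block_no]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed
```

The point of the design is that the battery only has to last long enough to complete the flush, rather than for the whole duration of the outage.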
If you can't afford to take your RAID subsystem offline for periodic maintenance, make sure the RAID controller offers non-disruptive firmware updates.
Does the RAID subsystem provide redundant, dedicated failover or cache-mirroring paths between redundant pairs of controllers?
A high-availability RAID subsystem should have redundant active/active or active/passive controller pairs that maintain redundant copies, or mirrors, of unwritten cache and operational checkpoint data. A RAID controller architecture that provides dedicated redundant data paths between the two controllers doesn't burden the disk channels with non-disk-related I/O, and thus provides better overall I/O performance.
Elizabeth M. Ferrarini is a free-lance writer from Boston, Massachusetts.