Making RAID Work into the Future
Many people have said RAID is dead. But just like the Phoenix, RAID may rise from the dead thanks to some new technologies.
One of the more recent and interesting of those technologies is Dynamic Disk Pools (DDP) from NetApp. An examination of DDP illustrates the problem with conventional RAID and what can be done to make it useful into the future.
The Problem with RAID
RAID is one of the most ubiquitous storage technologies. It allows you to combine storage media into groups to give more capacity, increased performance and/or increased resiliency in the case of drive failure. The cost is that for most RAID levels we lose some of the total capacity of the RAID group because parity information has to be stored or a mirror has to be constructed.
However, like many of us, RAID is showing its age.
Using RAID-5 or RAID-6, you can lose a drive and not lose any data. When a drive is lost in a RAID-5 or RAID-6 group, all of the remaining drives in the group are read so that the missing data can be reconstructed from the surviving data and parity information.
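To make the reconstruction step concrete, here is a minimal sketch of single-parity (RAID-5 style) recovery using XOR. The helper names (xor_blocks, make_parity, reconstruct_missing) are hypothetical, the blocks are tiny byte strings rather than real disk stripes, and the second parity used by RAID-6 relies on different math that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity for one stripe is the XOR of all of its data blocks."""
    return xor_blocks(data_blocks)


def reconstruct_missing(surviving_blocks):
    """XOR of every surviving block (data plus parity) yields the lost block."""
    return xor_blocks(surviving_blocks)


# One stripe spread over three data drives plus one parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = make_parity(data)

# Pretend the second drive failed: rebuild its block from everything else.
recovered = reconstruct_missing([data[0], data[2], parity])
print(recovered == data[1])
```

Note that every surviving block in the stripe has to be read to recover the one that was lost, which is why a rebuild touches all of the remaining drives.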
For example, if you have ten 3TB drives in a RAID group (RAID-5 or RAID-6) and you lose a drive, you have to read every single block of the remaining nine drives. That is a total of 27TB of data read to reconstruct 3TB of data. Reading this much data can take a very long time. The fact that everything reconstructed from the nine remaining drives is written to a single target spare drive doesn't help reconstruction performance either. Moreover, while this reconstruction is happening, the RAID group may still be in use, further slowing down the reconstruction. Plus, in the case of RAID-5, if we lose another drive during the rebuild then we can't recover any of the data (with RAID-6 we could tolerate the loss of one more drive).
While we now have 3TB and 4TB drives, 5TB and 6TB drives are on the way. In a RAID-5 or RAID-6 group of ten 6TB drives, recovering from a single drive failure means reading 9*6 or 54TB of data to rebuild a single 6TB drive. Remember that drive speeds aren't really getting any faster, so reading the remaining drives will take much longer than it does today (i.e. the read performance stays the same but the amount of data to be read increases).
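The arithmetic above can be sketched in a few lines. The 150 MB/s sustained throughput figure is an assumption chosen only for illustration (it does not come from any particular drive), capacities are treated as decimal terabytes, and the time shown is a best-case lower bound: a rebuild cannot finish faster than it takes to write the full replacement drive, and real rebuilds on a busy array take longer.

```python
def rebuild_read_tb(num_drives, drive_tb):
    """Data read from the surviving drives to rebuild one failed drive."""
    return (num_drives - 1) * drive_tb


def min_rebuild_hours(drive_tb, mb_per_s):
    """Best-case hours to fill one replacement drive at a sustained rate."""
    seconds = (drive_tb * 1e12) / (mb_per_s * 1e6)
    return seconds / 3600


ASSUMED_MB_PER_S = 150  # hypothetical sustained throughput per drive

for drive_tb in (3, 6):
    read_tb = rebuild_read_tb(10, drive_tb)
    hours = min_rebuild_hours(drive_tb, ASSUMED_MB_PER_S)
    print(f"{drive_tb}TB drives: read {read_tb}TB, at least {hours:.1f} hours")
```

Under these assumptions, doubling the drive size doubles both the data read (27TB to 54TB) and the minimum rebuild time, because the throughput term stays fixed.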
Moreover, the URE (Unrecoverable Read Error) rate for drives has not improved. The URE rate describes how much data can be read, on average, before hitting a read error (i.e. the controller cannot read a particular block). This is particularly important when a RAID rebuild is initiated because the entire capacity of all of the remaining drives in the group has to be read -- even if there is no data in the blocks.
In some cases, this can push the probability of hitting a URE to almost 1 (i.e. you are all but guaranteed to hit a block that can't be read). In the case of RAID-5, this means that the rebuild fails and the data has to be restored from a backup. In the case of RAID-6 you now have 2 failures -- the first failed disk and the disk that gave a URE. The RAID-6 rebuild can continue, but the group has now lost its protection. If there is another error, then the RAID group is lost and the data must be restored from a backup.
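A rough way to see how the probability approaches 1 is to treat each bit read as an independent trial. The sketch below assumes a URE rate of 1 error per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets but not taken from this article, and the independence assumption is a simplification, so the results are illustrative rather than a prediction for any real array.

```python
import math


def prob_ure_during_rebuild(read_tb, bits_per_error=1e14):
    """Probability of at least one URE while reading read_tb terabytes.

    Assumes independent bit errors at a rate of 1 per bits_per_error bits.
    """
    bits_read = read_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))


for read_tb in (27, 54):
    p = prob_ure_during_rebuild(read_tb)
    print(f"Reading {read_tb}TB: P(at least one URE) = {p:.2f}")
```

With these assumptions the probability is roughly 0.88 for the 27TB rebuild and roughly 0.99 for the 54TB rebuild. Drives with a better specified error rate can be modeled by passing a larger bits_per_error value.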
Of course, storage companies have recognized the limitations of conventional RAID, and they have created some techniques for allowing the rebuild process to proceed in the event of a URE. For example, in the case of RAID-6 with a failed drive, you still have a second set of parity information, so a URE won't stop a reconstruction. This allows controllers to continue reconstructing the RAID group. But really this is just an application of RAID-6 (the ability to lose 2 drives).
One other idea that I haven't seen yet but that might exist is the ability to skip the block being reconstructed when a URE is encountered. The array would finish the reconstruction but notify the user which block is bad. Of course, the file associated with the failed block has not been fully reconstructed, but this does allow you to reconstruct all of the other data. This is much better than having to restore 54TB of data from a backup or a copy because ideally you could restore just the file(s) that were affected by the bad block.
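As a purely hypothetical sketch of that skip-and-report idea (not a description of any shipping controller), the loop below rebuilds a single-parity group stripe by stripe. An unreadable block is modeled as None, and the function name rebuild_skipping_ures and the data layout are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_skipping_ures(stripes):
    """Rebuild the lost block of each stripe, skipping stripes that hit a URE.

    Each stripe is the list of blocks read from the surviving drives; a block
    of None stands for an unrecoverable read error. Returns the rebuilt blocks
    (None where reconstruction was impossible) and the bad stripe indexes.
    """
    rebuilt, bad_stripes = [], []
    for index, blocks in enumerate(stripes):
        if any(block is None for block in blocks):
            rebuilt.append(None)       # cannot reconstruct this stripe
            bad_stripes.append(index)  # remember it so the user can be told
            continue
        rebuilt.append(xor_blocks(blocks))
    return rebuilt, bad_stripes


stripes = [
    [b"\x01", b"\x02"],
    [b"\x04", None],      # URE on the second surviving drive
    [b"\x0f", b"\xf0"],
]
rebuilt, bad_stripes = rebuild_skipping_ures(stripes)
print(bad_stripes)
```

The list of bad stripes is what would let an administrator map the damage back to specific files and restore only those, instead of the whole group.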
So the big problems for conventional RAID groups are (1) as the drives get larger, fewer drives are needed in a RAID group before a URE during a rebuild becomes likely (true for both RAID-5 and RAID-6 groups), and (2) the rebuild times are getting extraordinarily long because read and write performance stays the same but the RAID groups hold more data.
The combination of these issues, as well as others, has not painted a rosy picture for RAID. As drives have gotten bigger, the outlook for RAID has gotten gloomier. Many people have predicted its demise for years. But there are technologies that can help improve the situation. In this article I want to examine one such technology, Dynamic Disk Pools (DDP), that shows a great deal of promise in helping RAID remain a viable technology into the next generation.