RAID has been around since David Patterson, Garth Gibson and Randy Katz first described the data storage reliability and performance concept more than 20 years ago, and it will likely be around for decades to come. The biggest issue for the technology is how it will keep pace with magnetic disks that get bigger by about 40 percent each year (see RAID’s Days May Be Numbered).
“The core of the problem is that the time it takes to read a whole disk is getting longer by about 20 percent each year,” Gibson, now the CTO and co-founder of Panasas Inc., told Enterprise Storage Forum. “Disk data rates are growing much slower than disk capacities, so it just takes longer to read each bigger disk than the last.”
RAID vendors and most storage vendors are not addressing the problem of failure density in high-capacity drives, claims Jered Floyd, founder and CEO of Permabit Technology.
“Using anything less than RAID 6 with large drives almost guarantees data loss,” said Floyd. “Even RAID 6 has problems if you have coupled drive failures due to bit error rate problems.”
To get beyond RAID 6, Floyd said vendors must adopt more advanced erasure coding schemes. Specifically, he identified schemes that would “protect against more failure and implement data distribution across larger systems to eliminate the ‘single full set of disks’ rebuild bottleneck.”
Due to bigger disks, RAID systems take longer to recover from a failed disk, Gibson noted. Traditional RAID systems recover a disk by reading all the remaining disks beginning to end and by writing the missing data beginning to end to an online spare disk.
As a result, the RAID system takes longer to return to full data protection, the chance of further failures during the rebuild increases, and the probability of data loss goes up.
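The arithmetic behind Gibson's point can be sketched in a few lines of Python. The two growth rates come from the figures quoted above; the starting capacity (1 TB) and sequential data rate (100 MB/s) are assumed values chosen only for illustration, and the function names are hypothetical.

```python
# Illustrative model only. The starting capacity and data rate used in the
# demo are assumptions, not figures from the article.
CAPACITY_GROWTH = 1.40   # capacity grows about 40 percent per year (quoted above)
READ_TIME_GROWTH = 1.20  # full-disk read time grows about 20 percent per year (quoted above)


def implied_rate_growth(capacity_growth: float, read_time_growth: float) -> float:
    """Read time = capacity / rate, so rate growth = capacity growth / read-time growth."""
    return capacity_growth / read_time_growth


def full_read_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours needed to read a disk end to end at a sustained sequential rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return capacity_mb / rate_mb_s / 3600


def project(capacity_tb: float, rate_mb_s: float, years: int):
    """Return (year, capacity, rate, full-read hours) rows for each year."""
    rate_growth = implied_rate_growth(CAPACITY_GROWTH, READ_TIME_GROWTH)
    rows = []
    for year in range(years + 1):
        rows.append((year, capacity_tb, rate_mb_s,
                     full_read_hours(capacity_tb, rate_mb_s)))
        capacity_tb *= CAPACITY_GROWTH
        rate_mb_s *= rate_growth
    return rows


if __name__ == "__main__":
    # Assumed starting point: a 1 TB disk that streams at 100 MB/s.
    for year, cap, rate, hours in project(1.0, 100.0, 5):
        print(f"year {year}: {cap:6.2f} TB at {rate:6.1f} MB/s -> {hours:5.2f} h")
```

Under these assumptions, data rates improve by roughly 17 percent a year, yet the time to read a full disk still compounds upward, and a traditional rebuild can finish no faster than that full read.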
Smarter RAID Controllers
However, some insiders see the old fear, uncertainty and doubt (FUD) factor at play in the issue of drive densities doubling each year and increasing the chance of a double drive failure during a rebuild.
A lot of this ‘failure’ information runs counter to actual technological advances, said Gary Watson, CTO at Nexsan Technologies.
“First of all, the sequential performance of disk drives is increasing, and though it doesn’t quite keep pace with drive capacity growth, the gap is not unreasonable,” said Watson.
“Plus, hardware-level support for rebuild calculations means that the rebuild performance of RAID controllers from generation to generation is keeping pace nicely with drive capacity increases,” Watson added.
Part of the problem for RAID controllers is that most HDDs are underutilized, but the controllers do not know what is being used and what isn’t, said Luca Bert, director of DAS RAID architecture and strategic planning for LSI (NYSE: LSI).
“Having a better understanding of this will allow a rebuild of only the used areas,” said Bert.
“One solution is thin provisioning, which allows the system to work only on the restricted data set it needs, so if the controller is aware of which set has been provided, it will rebuild only that part,” said Bert.
Another solution Bert mentioned is to have each file system tell the RAID controller when a block has been decommissioned so it can avoid rebuilding the block.
Alternatively, Bert said IT staffs could use RAID levels 1 or 10 (rather than RAID 5 or RAID 6) to minimize the rebuild time and amount of data transferred.
Bert also noted that if advanced data placement algorithms are used, an array can be distributed across a larger number of devices, minimizing the number of elements to rebuild.
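The idea of rebuilding only the used areas can be illustrated with a small Python sketch. It assumes a RAID 5 style XOR parity layout and a hypothetical set of "used" stripe numbers that the controller has learned from thin provisioning or from file system hints. None of the names below come from LSI or any real controller API.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_used_stripes(surviving_disks, used_stripes, stripe_count, block_size):
    """Rebuild a failed disk, touching only the stripes marked as in use.

    surviving_disks: list of disks, each a list of per-stripe byte blocks
    used_stripes: set of stripe indexes known to hold live data (hypothetical
        information the controller would get from the layer above it)
    Returns the rebuilt disk and the number of stripes actually reconstructed.
    """
    rebuilt = [bytes(block_size)] * stripe_count  # unused stripes stay zero-filled
    for stripe in sorted(used_stripes):
        rebuilt[stripe] = xor_blocks([disk[stripe] for disk in surviving_disks])
    return rebuilt, len(used_stripes)


if __name__ == "__main__":
    # Three data disks plus parity, four stripes each; only stripes 0 and 2 are in use.
    d0 = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    d1 = [b"1234", b"5678", b"abcd", b"efgh"]
    d2 = [b"wxyz", b"WXYZ", b"0000", b"9999"]
    parity = [xor_blocks([d0[i], d1[i], d2[i]]) for i in range(4)]

    rebuilt, stripes_read = rebuild_used_stripes([d0, d2, parity], {0, 2}, 4, 4)
    print(rebuilt[0] == d1[0], rebuilt[2] == d1[2], stripes_read)
```

In this toy case the rebuild touches half of the stripes. A real controller would also have to keep parity consistent when a previously unused stripe is first written, which the sketch leaves out.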
Vertical Parity, Declustering
Gibson said some RAID systems drastically slow the rate of recovery from disk failures in order to limit the performance hit on user workloads, which dramatically increases the chance of data loss.
One solution to that, first explored in the 1990s, is parity declustering.
It turns RAID from a local operation of one controller and a few disks into a parallel algorithm using all the controllers and disks in the storage pool, explained Gibson.
With pools of tens to hundreds of individual disk arrays, parity declustering enables recovery to be tens to hundreds of times faster. And, it spreads the work so thinly per disk that concurrent user work sees far less interference from recovery, said Gibson.
Parity declustering is rare in RAID storage products, but it is offered in Panasas parallel file systems. Interestingly, it is also present in the Google File System.
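A rough back-of-the-envelope model shows why spreading the rebuild over a pool helps. This is an idealized sketch, not Panasas's or Google's algorithm: it assumes reads and writes cost the same, that reconstruction I/O is balanced perfectly across every surviving disk in the pool, and that disk bandwidth is the only bottleneck. The function names and example numbers are hypothetical.

```python
def traditional_rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Traditional rebuild: bounded by streaming one full disk to a single spare."""
    return capacity_tb * 1_000_000 / rate_mb_s / 3600


def declustered_rebuild_hours(capacity_tb: float, rate_mb_s: float,
                              group_size: int, pool_disks: int) -> float:
    """Idealized declustered rebuild.

    Rebuilding one disk's worth of data needs (group_size - 1) disk-capacities
    of reads plus one disk-capacity of writes. Here that I/O is assumed to be
    spread evenly over the pool_disks - 1 surviving disks.
    """
    if pool_disks < 2:
        raise ValueError("pool must contain at least two disks")
    total_io_tb = group_size * capacity_tb
    per_disk_tb = total_io_tb / (pool_disks - 1)
    return per_disk_tb * 1_000_000 / rate_mb_s / 3600


if __name__ == "__main__":
    # Assumed example: 1 TB disks at 100 MB/s, parity groups of 10, pool of 101 disks.
    slow = traditional_rebuild_hours(1.0, 100.0)
    fast = declustered_rebuild_hours(1.0, 100.0, group_size=10, pool_disks=101)
    print(f"traditional: {slow:.2f} h, declustered: {fast:.2f} h, "
          f"speedup: {slow / fast:.1f}x")
```

In this model the speedup is roughly the pool size divided by the parity group size, in other words about the number of arrays in the pool, which is consistent with the "tens to hundreds of times faster" range Gibson cites. Each disk also handles only a small slice of the rebuild I/O, which is why user workloads see less interference.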
But there is another problem with the growth of disk capacity: unreadable sectors. Disks are built to specifications. One of the specifications is that unreadable sectors should not happen very often — typically, no more than one unreadable sector every 10 to 100 terabytes read. However, as disks get bigger, more sectors must be read in a recovery, and the chance of seeing at least one unreadable sector grows.
“In typical arrays, if too many sectors are lost during a disk failure recovery, then the recovery fails and the entire volume goes offline and is possibly lost,” said Gibson.
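To put rough numbers on that risk, the sketch below estimates the chance of hitting at least one unreadable sector during a rebuild. It assumes unreadable sectors arrive independently at the specified average rate (a Poisson model), which is a simplification. The array size in the demo is an assumed example, not a figure from Gibson.

```python
import math


def p_unreadable(tb_read: float, tb_per_error: float) -> float:
    """Probability of at least one unreadable sector while reading tb_read terabytes.

    tb_per_error is the specified average volume read per unreadable sector,
    for example 10 to 100 TB as cited above. Assumes independent errors.
    """
    if tb_per_error <= 0:
        raise ValueError("tb_per_error must be positive")
    expected_errors = tb_read / tb_per_error
    return 1.0 - math.exp(-expected_errors)


if __name__ == "__main__":
    # Assumed example: a rebuild that must read seven surviving 2 TB disks.
    tb_read = 7 * 2.0
    for spec in (10.0, 100.0):
        print(f"1 error per {spec:.0f} TB: "
              f"{p_unreadable(tb_read, spec):.1%} chance of an unreadable sector")
```

At the pessimistic end of the specification, a rebuild of this size is more likely than not to hit an unreadable sector. At the optimistic end the risk is far smaller but still not negligible, and it grows as disks get bigger.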
A workable answer is to make the redundancy encoding more powerful and more targeted at specific failures that are becoming more common — unreadable disk sectors, for example.
RAID 6, for example, can cope with two failed disks, or one failed disk and an unreadable disk sector. A recovery from two disk failures, however, is almost certain to hit an unreadable sector as well, at which point RAID 6 has no redundancy left to repair it.
One way to counteract such a disk failure is to add a layer of coding to each disk so that unreadable sectors can be recovered locally, without using the RAID system. Think of it as RAID across different sectors of the same disk.
Gibson said Panasas calls it vertical parity. Employing vertical parity allows RAID 5 to recover a failed disk even if unreadable sectors occur, and RAID 6 to recover two failed disks, should that be necessary.
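The "RAID across different sectors of the same disk" idea can be shown with a minimal Python sketch. It uses plain XOR parity over a small group of sectors, which is an assumption made for illustration. The article does not describe the code Panasas actually uses, and the function names here are hypothetical.

```python
from functools import reduce


def xor_sectors(sectors):
    """XOR equal-length sectors together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))


def add_vertical_parity(data_sectors):
    """Return the sector group plus one parity sector stored on the same disk."""
    return list(data_sectors) + [xor_sectors(data_sectors)]


def recover_locally(group, bad_index):
    """Rebuild one unreadable sector from the other sectors in its group.

    No other disk is read, so the array-level RAID redundancy stays in reserve.
    """
    others = [sector for i, sector in enumerate(group) if i != bad_index]
    return xor_sectors(others)


if __name__ == "__main__":
    group = add_vertical_parity([b"abcd", b"efgh", b"ijkl"])
    # Pretend sector 1 came back unreadable and repair it from its neighbors.
    print(recover_locally(group, 1))
```

With a single XOR parity sector, each group can repair one unreadable sector on its own. A second bad sector in the same group would still have to fall back on the array-level RAID.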
The future has bigger disks, bigger systems, and more emphasis on failure recovery. But RAID can evolve to cope with all that. The future of RAID lies in more coding carefully targeted at specific failure cases, and more parallelism and load balancing in the reconstruction of lost data.