Taller hard drives with multiple actuator arms supplied in multi-drive packages — these are just a few of Google’s suggestions as it calls for a complete rethink of storage disk design.
In a white paper called “Disks for Data Centers,” published last month, the company gives some hints as to how the hard disk drive might evolve in the coming years.
There’s a need for change, the white paper asserts, because the fastest-growing use case for hard drives is for mass storage services housed in cloud data centers. YouTube alone requires 1 million GB of new hard drive capacity every day, and very soon cloud storage services will account for the majority of hard drive storage capacity in use, it says.
That’s significant because hard drives are not — and never have been — designed with these cloud storage services in mind. Far from it: enterprise grade disks have been optimized for use in servers, while standard consumer grade disks are designed for use in business and home PCs.
Here’s why Google believes that’s a problem. In servers and desktop machines it’s important that hard disks don’t lose data, so one of the design goals is that they have a low bit error rate (BER). Achieving this goal has an associated cost, which is reflected in the cost of the disk.
But when data is stored in cloud storage services, it is always stored on multiple disks as a matter of course to ensure availability. That means minimizing the BER is unnecessary. In fact, it’s an undesirable goal: when BER is minimized, cloud providers pay for reliability they don’t need, because the issue is taken care of elsewhere.
“Since data of value is never just on one disk, the bit error rate for a single disk could actually be orders of magnitude higher than the current target,” the report says.
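The reasoning can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the white paper: it assumes that each copy of a block becomes unreadable independently, and the probabilities are made-up figures chosen only to show the shape of the trade-off.

```python
def loss_probability(per_copy_failure: float, replicas: int) -> float:
    """Probability that every copy of a block is unreadable.

    Assumes copies fail independently, which is a simplification:
    real failures can be correlated (same rack, same firmware bug).
    """
    if not 0.0 <= per_copy_failure <= 1.0:
        raise ValueError("per_copy_failure must be a probability")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return per_copy_failure ** replicas


# Hypothetical figures, for illustration only.
one_reliable_copy = loss_probability(1e-6, replicas=1)
three_cheaper_copies = loss_probability(1e-4, replicas=3)

print(f"one copy on a very reliable disk:  {one_reliable_copy:.1e}")
print(f"three copies on less reliable disks: {three_cheaper_copies:.1e}")
```

Under these assumptions, three copies on disks that are each 100 times more error-prone still lose data far less often than a single copy on the more reliable disk, which is the intuition behind relaxing the per-disk target.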
Avoiding paying for disk drive features (such as reliability) that are not needed is important to cloud storage service providers because of the vast numbers of disks that they buy and use. Other features that are important to cloud storage service providers include high performance (in terms of IOPs) and high capacity.
They also have different security requirements, which stem from the fact that in cloud storage many different people’s data may be stored on the same physical disk. Most enterprise drives offer data-at-rest encryption using a single key, but in a shared service there’s a need for more fine-grained control using different keys to access different areas of the disk.
A key to designing better disks in the future is recognizing that data center disks are always part of a large collection of disks. That’s important because any future designs need to optimize various metrics — specifically IOPs, capacity, security requirements, TCO and tail latency, Google believes — but these metrics need to be optimized for the collection of disks, not the individual disks that make up the collection.
That means that target levels of capacity and IOPs can be achieved by using a specific mix of drives, and new disks added to the collection as capacity requirements increase can be tailored to bring the collection closer to these targets.
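As a rough sketch of what tailoring new disks to the collection might look like, the Python below picks whichever candidate drive moves the fleet’s aggregate IOPs-per-TB ratio closest to a target. The drive models, their figures and the `best_addition` helper are hypothetical and are not drawn from Google’s paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveModel:
    name: str
    capacity_tb: float
    iops: float


def fleet_iops_per_tb(fleet: list[DriveModel]) -> float:
    """Aggregate IOPs per TB across the whole collection."""
    total_iops = sum(d.iops for d in fleet)
    total_tb = sum(d.capacity_tb for d in fleet)
    return total_iops / total_tb


def best_addition(fleet: list[DriveModel],
                  candidates: list[DriveModel],
                  target_iops_per_tb: float) -> DriveModel:
    """Pick the candidate that leaves the fleet closest to the target ratio."""
    return min(
        candidates,
        key=lambda c: abs(fleet_iops_per_tb(fleet + [c]) - target_iops_per_tb),
    )


# Hypothetical drive models and fleet.
big = DriveModel("big-slow", capacity_tb=10, iops=100)
fast = DriveModel("small-fast", capacity_tb=4, iops=150)
fleet = [big] * 4  # 400 IOPs over 40 TB, i.e. 10 IOPs/TB

choice = best_addition(fleet, [big, fast], target_iops_per_tb=12)
print(choice.name)
```

A real planner would weigh cost, power and failure domains as well; the point here is only that the metric being optimized belongs to the fleet, not to any one drive.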
So the obvious question is this: How will data center drives, optimized as collections, differ from today’s enterprise hard drives?
One promising avenue to explore is changing the form factor, Google believes.
One way of lowering capacity costs ($/GB) would be to increase the diameter of the platters inside a disk drive from 3.5″ to perhaps 5″. But the problem with that solution is that it decreases the performance (IOPs/GB) as there is a greater area for a head to move over.
Shrinking the platter size would increase $/GB but also increase IOPs/GB, because there is less area for the head to move over, and smaller platters are also more stable and can therefore be spun faster to provide better performance.
The remaining option would be to change the height of a new disk drive. A standard 3.5″ hard drive is up to 1″ tall, but this figure was derived from the form factor of PC floppy disks and could easily be changed.
“We propose increasing the allowable height (of a drive),” Google’s report says. “Taller drives allow for more platters per disk which adds capacity and amortizes the costs of packaging, the printed circuit board and the drive motor/actuator.”
The report adds that it may be that a mix of different platter sizes, in different disks, provides the best aggregate solution.
Once you’re thinking of an aggregate solution, it also makes sense to stop thinking about disks as the individual units that are purchased, and instead think about groups of disks packaged as a single unit with a single power interface and one (or more) SATA interfaces.
Doing so would enable the group to share a larger cache, amortize fixed costs more efficiently and improve power distribution, the report points out.
How many disks should there be in a group? The report says that there’s a balance to be reached because larger groups have better amortization and power savings, but also take out more data in a failure (assuming a single disk failing requires replacing the whole group). It suggests that a package of four disks may be a good starting target.
Another design change could involve introducing parallel access so that the disk can offer two (or more) I/O streams at once. This would make disks cost more, but since capacity growth is outstripping IOPs growth, introducing parallel access is becoming more viable.
Google suggests a variety of ways that parallel access could be enabled, including designing disks with two full actuator arm units; disks with two half-height actuator arms, one on top of the other, which would benefit multi-queue random access or single queue sequential workloads; or disks with one actuator arm with heads that can read two adjacent tracks per platter surface simultaneously, which would double sequential workload throughput.
Flash storage makers like OCZ have already started to introduce storage systems that use host-managed solid state drives. In this case, the housekeeping activities that the SSDs’ controllers are normally responsible for are managed by the storage system itself.
Google proposes something similar for cloud storage disks, with background tasks such as media scanning and adjacent track interference (ATI) mitigation controlled by the host. The advantage of this is that the host can initiate background tasks on a disk when it knows that the disk is unlikely to be busy, so the performance impact of carrying out the background tasks will be minimized.
Another performance gain could be made by allowing the host to control the disk drives’ read retry policy, which would require a new API. This would give hosts the option to tell drives to “give up” and report an error after only a few read attempts, rather than continuing to retry for far longer.
The benefit of this is that once a drive gives up it can get on with other operations, while the data can be retrieved from another disk. (That’s especially useful if the host requests the data from several disks at once, taking the data from whichever one responds first.)
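The pattern described here (ask several disks at once, cap how long each one retries, and take the first good answer) can be sketched with Python’s `asyncio`. Everything in this example is a simulation: `read_block`, its parameters and the timings are hypothetical stand-ins, since the API the paper calls for does not exist yet.

```python
import asyncio


class ReadError(Exception):
    """Raised when a simulated drive gives up on a read."""


async def read_block(disk: str, attempt_s: float,
                     attempts_needed: int, max_retries: int) -> str:
    """Simulated read: each attempt takes attempt_s seconds and the read
    succeeds on attempt number attempts_needed. The host-chosen
    max_retries caps how many attempts the drive makes before failing."""
    for attempt in range(1, max_retries + 1):
        await asyncio.sleep(attempt_s)
        if attempt >= attempts_needed:
            return f"data from {disk}"
    raise ReadError(f"{disk} gave up after {max_retries} attempts")


async def hedged_read(reads) -> str:
    """Start every read at once and return the first successful result."""
    tasks = [asyncio.create_task(r) for r in reads]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except ReadError:
                continue  # this replica gave up; wait for the next one
        raise ReadError("all replicas failed")
    finally:
        for t in tasks:
            t.cancel()  # no effect on tasks that have already finished


async def main() -> str:
    return await hedged_read([
        # Struggling disk: would need 10 attempts, but is capped at 2.
        read_block("disk-a", 0.05, attempts_needed=10, max_retries=2),
        # Healthy disk: succeeds on its first attempt.
        read_block("disk-b", 0.02, attempts_needed=1, max_retries=2),
    ])


result = asyncio.run(main())
print(result)
```

With a low retry cap, the struggling disk fails fast and frees itself for other work instead of holding up the request, which is the tail-latency benefit the report is after.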
An interesting way that the TCO of disk drives could be reduced is by allowing drives to be treated as having flexible capacity. For example, if a single drive head fails, that head could simply be mapped out (and the data it is responsible for could be recovered from other disks). The drive could then continue to be used at a lower capacity, extending the drive’s lifetime and thus decreasing the TCO.
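From the host’s point of view, a flexible-capacity drive might look something like the toy model below. The `Drive` class, its fields and the one-head-per-terabyte figure are all assumptions made for illustration; the paper does not define such an interface.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """Toy model of a drive whose usable capacity shrinks as heads fail."""
    heads: int
    tb_per_head: float  # hypothetical: capacity served by each head
    failed_heads: set = field(default_factory=set)

    def map_out(self, head: int) -> None:
        """Retire a failed head. Its data must be rebuilt from other disks."""
        if not 0 <= head < self.heads:
            raise ValueError(f"no such head: {head}")
        self.failed_heads.add(head)

    @property
    def usable_tb(self) -> float:
        return (self.heads - len(self.failed_heads)) * self.tb_per_head


drive = Drive(heads=8, tb_per_head=1.0)
drive.map_out(3)  # head 3 fails; the drive stays in service
print(drive.usable_tb)
```

The drive keeps earning its keep at seven-eighths of its original capacity rather than being pulled from the rack.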
In a similar vein, disk drives are supplied with reserve sectors that are brought into service to replace damaged sectors that are swapped out of service during the life of the drive. Google suggests that these reserved sectors could be put into use while the drive is young, rather than lying idle, and reclaimed at a slow rate to replace damaged sectors as the drive ages.
What’s clear from all this is that disks deployed in massive collections in cloud storage facilities are used in very different ways to the sorts of disks installed in conventional servers and desktop machines.
And in the near future disks deployed in this way will be the rule, rather than the exception.
Given all that, it’s likely that tomorrow’s cloud storage disks, optimized for the specific needs of cloud storage service providers, will be very different to the disks they use today.
The good news for enterprises is that innovations in cloud storage disk design won’t be restricted to cloud storage facilities; technologies that enable the likes of parallel access, flexible capacity and host control will trickle down to the storage drives in corporate data centers too.