This article on How to Choose a Hard drive was updated on Feb. 27, 2014.
Last month I reviewed a blog from Backblaze on how not to choose a hard drive. This month I will weigh in on how to choose a hard drive for consumers and enterprises. I selected the following vendors and drives:
• Toshiba-Consumer, enterprise 3 TB, enterprise 2.5 inch 15K RPM and enterprise SSD
• HGST-Consumer, enterprise 4 TB, enterprise 2.5 inch 15K RPM and enterprise SSD
• WD-Consumer, enterprise 4 TB and enterprise 2.5 inch 10K RPM (No SSDs were listed)
• Seagate-Consumer, enterprise 4 TB, enterprise 2.5 inch 15K RPM and enterprise SSD
I chose the latest drive for each vendor as of January 31, 2014, and collected all of the information from each vendor’s web site. I spent a fair amount of time searching each site for the most detailed document available.
Documentation from Seagate was the best, followed very closely by HGST, with WD far behind and Toshiba even farther behind. In the first part of this article, I'll cover consumer and 4 TB enterprise drives and, later in this article, I'll look at 2.5 inch 15K RPM drives and SSDs. (And as a reminder, HGST has been purchased by WD.)
Consumer Drives
The first thing you will notice with consumer hard drives is that many vendors leave out documentation details such as MTBF (mean time between failures) or MTTF (mean time to failure). This is especially true when comparing consumer drives to enterprise drives.
*Average rate of <55TB/year. The MTBF specification for the drive assumes the I/O workload does not exceed the average annualized workload rate limit of 55TB/year. Workloads exceeding the annualized rate may degrade the drive MTBF and impact product reliability. The average annualized workload rate limit is in units of TB per year, or TB per 8760 power-on hours. Workload rate limit = TB transferred × (8760/recorded power-on hours).
** Cannot determine as you have to put details about the customer or drive in to get details.
***MegaScale DC addresses low application workloads that operate within 180TB per year.
****Note WD40EFRX (AKA Caviar Red) lists 1M hours but this is a different drive.
So what does all of this information tell us about consumer hard drives? I find it very interesting that both HGST and Seagate list an average rate of I/O for a year. The Seagate number comes to about 1.2% of the year (104 hours running at the full rate of the drive), while the HGST number comes to 4.57% of the year (400 hours).
I am sure that both hard drives have counters which can tell you how much data has been transferred. Using these drives at home is likely fine unless you are doing lots of video editing or similar workflow, in which case it is likely that you will exceed these values. But I would find it pretty difficult for a home user to exceed these values. The only thing I could think of to challenge a home system is video surveillance.
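The workload arithmetic above can be checked with a few lines of Python. This is only a sketch: the 8760-hour formula and the 104 and 400 hour figures come from the vendor footnote and the paragraph above, while the function names and the sample drive (20 TB moved in 2,000 power-on hours) are made up for illustration.

```python
HOURS_PER_YEAR = 8760  # power-on hours in a year, as used in the vendor footnote


def annualized_workload_tb(tb_transferred: float, power_on_hours: float) -> float:
    """Workload rate = TB transferred x (8760 / recorded power-on hours)."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return tb_transferred * (HOURS_PER_YEAR / power_on_hours)


def share_of_year(hours_at_full_rate: float) -> float:
    """Fraction of a year spent transferring at the drive's full rate."""
    return hours_at_full_rate / HOURS_PER_YEAR


# Hypothetical drive: 20 TB moved during its first 2,000 power-on hours.
rate = annualized_workload_tb(20, 2000)
print(f"annualized workload: {rate:.1f} TB/year, over the 55 TB limit: {rate > 55}")

# Hours at full rate quoted above for the two consumer drives.
for vendor, hours in (("Seagate", 104), ("HGST", 400)):
    print(f"{vendor}: {share_of_year(hours):.2%} of the year")
```

The point of annualizing is that a drive can blow through the yearly limit long before a year is up, so the counters are worth checking early.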
I think there are a few other areas to consider.
1. The hard error rate for all of these hard drives is 1 sector per 10^14 bits read. This is another area that will not likely impact the home user but tells me that the drives are not designed for enterprise applications.
2. The upper limit of the operating temperature range is between 55°C (131°F) and 60°C (140°F), which is again fine for home users and is similar to what is found for enterprise drives.
3. There is not much information on seek latency for these hard drives, but there never has been much data. For the home user this does not matter much.
4. Warranties are much less than for enterprise drives for 3 of the 4 vendors, with Toshiba being the exception. I guess being third requires you to try harder.
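To put the hard error rate in item 1 in perspective, here is a rough sketch of what it means for one complete read of a 4 TB drive. It assumes errors are independent and that the drive performs exactly at its specified rate, which is a worst-case reading of the spec rather than a measured behavior. The function names are mine, and the 10^15 figure is the enterprise rate discussed later in this article.

```python
import math


def expected_read_errors(bytes_read: float, bits_per_error: float) -> float:
    """Expected unrecoverable read errors for a given amount of data read."""
    return (bytes_read * 8) / bits_per_error


def prob_at_least_one_error(bytes_read: float, bits_per_error: float) -> float:
    """Poisson approximation: P(at least one error) = 1 - exp(-expected)."""
    return 1.0 - math.exp(-expected_read_errors(bytes_read, bits_per_error))


FULL_DRIVE = 4e12  # one full read of a 4 TB drive, in bytes (decimal TB)

for label, rate in (("consumer, 1 per 10^14 bits", 1e14),
                    ("enterprise, 1 per 10^15 bits", 1e15)):
    expected = expected_read_errors(FULL_DRIVE, rate)
    chance = prob_at_least_one_error(FULL_DRIVE, rate)
    print(f"{label}: {expected:.3f} expected errors, {chance:.1%} chance of at least one")
```

A home user rarely reads a whole drive end to end, but a RAID rebuild does exactly that, which is one reason this number matters more in the enterprise.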
The main issue that I would consider for a home drive is how you will be using it. Are you going to hit the write performance limits? That would be the big question that I would need to answer.
Clearly you are going to be outside the vendor specification for the HGST and Seagate drives using them in a commercial backup application. The performance range for the vendors for MB/sec is about the same except for Toshiba, which did not provide any data. For a home system – which is generally running a lower performance file system and a single application – the data rates listed are all reasonable. Those are the issues I would consider but I would never ever use these drives for a commercial application.
From what I can tell and remember the duty cycle documentation that Seagate and HGST have is reasonably new news. My guess is that these vendors are clearly stating that these drives should not be used in heavy I/O applications and if that is your workload you should be using enterprise drives.
Enterprise Nearline (AKA Business critical)
These are the 4 TB drives that were SATA only a few years ago and now have both SAS and SATA interfaces. Some of the things to look at in these drives are different than for consumer drives. Note that for performance I am using the SAS interface values if provided, and that vendors often have multiple models for each of these drive families.
I am using the self-encrypting drives with secure erase if it is available as part of the table below. Since the encryption is done in hardware the vendors state it does not significantly impact the performance.
*HGST provided lots of details on ECC and error correction in Ultrastar_7K4000_SAS_Spec_V1.7.pdf, page 27
**Seagate provides a detailed discussion of reliability in chapter 5.0 RELIABILITY SPECIFICATIONS of the manual for this drive, which can be found on the website.
***States that the MTBF is based on a value of 40°C
****Seagate and Toshiba provided the rotational latency number, not the seek+latency.
*****Product MTBF and AFR specifications are based upon a 40°C base casting and system workloads of up to 180 TB/year (workload is defined as the amount of user data transferred to or from the hard drive).
******Toshiba states that they have “disk to disk data protection application.” This could be ANSI T10 PI/DIF but it is not clear.
For most users, the drives in this category will be going into RAID controllers, and you will be getting these drives from your storage vendor. In most cases RAID vendors, large and small, qualify disk drives from multiple vendors.
I think the key concern with the duty cycle of the WD drive is that it is really not meant for enterprise environments, and therefore this drive should not be used for enterprise applications. Add to this the lack of SAS support for the WD drive.
Toshiba did not provide a great deal of information on their drive so it is hard to determine how it really compares in many areas. HGST clearly has the highest MTBF by over 600,000 hours. This is pretty amazing as back in 2005 enterprise Fibre Channel drives had about that number of hours as their MTBF, if I remember correctly. That is a big change over the last 9 years.
On the other hand, the hard error rate of 1 bit in 10^15 bits read has not changed much over the same period of time. The HGST drive is a bit louder than the competition and uses a bit more power. So my recommendation would have to be between Seagate and HGST, and I would stay away from the WD drive for enterprise applications given the 180 TB workload limitation.
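For readers who find MTBF hours hard to picture, one common way to read them is as an annualized failure rate (AFR). The sketch below assumes a constant failure rate (the exponential model) and 24x7 operation. The two MTBF values are illustrative round numbers chosen to be 600,000 hours apart, not figures taken from any vendor's data sheet, and the function name is my own.

```python
import math

HOURS_PER_YEAR = 8760


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Annualized failure rate under a constant-failure-rate assumption."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Illustrative MTBF values only; check the vendor documents for real ones.
for mtbf in (1_400_000, 2_000_000):
    afr = afr_from_mtbf(mtbf)
    print(f"MTBF {mtbf:>9,} h: AFR {afr:.2%}, "
          f"about {afr * 1000:.1f} failures per year per 1,000 drives")
```

Seen this way, a gap of several hundred thousand hours in MTBF is a difference of a couple of drive failures per thousand drives per year, which matters at data center scale.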
In Closing
In comparing these two drive types it is very clear to me that consumer drives should not ever be used in an enterprise application. If they are, then shame on the system architect and buyers because the drive vendors are clearly specifying the usage environments and criteria for usage, such as the amount of data that can be written. Though the temperature ranges for both types of drives are about the same, there are some clear differences. Consumer drives use SATA connectivity only, not SAS, which has:
1. Higher reliability in the channel for ECC, which significantly reduces the potential for silent data corruption by many orders of magnitude.
2. Coming support for 12 Gb/sec interfaces, which are available today in some high end drives and will never be available for consumer drives as SATA’s future is 8 Gb/sec.
3. On-drive error recovery that is much more robust than with SATA.
4. SATA does not support the ANSI T10 end-to-end data protection model with validation of a CRC and LBA at the disk drive for high reliability.
Last but certainly not least is the desire in the enterprise for encryption of the whole disk drive. HGST and Seagate have a large number of details on encryption and how it works and that it has no impact on performance. Being in an enterprise environment, especially in a cloud environment or backup environment, means that the data is out of your data center being managed by someone else.
Now many of these services do provide their own encryption, but some do not. From my point of view, full disk encryption is a must have: if a hard drive gets removed from the environment, it is unreadable. Along with reliability, it is another major reason that consumer drives should never be used for enterprise applications.
How to Choose a Hard Drive: The Enterprise
Now, I’ll focus on choosing a hard drive for the enterprise. I’ll look at four enterprise SAS 15K drives (10K for WD, as I wanted to include them) and three enterprise SSDs. The data presented here was collected from each vendor’s web site on January 31, 2014.
As a reminder, as mentioned earlier in this article, Seagate had the best documentation, followed closely by HGST, while Toshiba and WD were very far behind. This was true for both hard drives and SSDs. The following drives were evaluated.
• Toshiba-Enterprise 2.5 inch 15K RPM and enterprise SSD
• HGST-Enterprise 2.5 inch 15K RPM and enterprise SSD
• WD-Enterprise 2.5 inch 10K RPM (No SSDs were listed)
• Seagate-Enterprise 2.5 inch 15K RPM and enterprise SSD
Both of the hard drive types chosen are enterprise drives with the highest reliability and performance provided by the vendor for SAS attached drives. I did not look at SSDs with PCIe connectivity even though some of the vendors listed manufacture them.
Enterprise 2.5 inch drives
Here is the information on the 2.5 inch 15K hard drives, and the enterprise 10K drive for WD. Also note that HGST, Seagate and Toshiba all make enterprise 10K drives with similar density to WD. As you can see, this table is a bit different than the SATA/SAS nearline table from earlier in this article, as there are a number of new things that should be considered when evaluating these types of hard drives.
* Ambient Temperature 5°C to 55ºC, Relative humidity 5 to 90%, non-condensing Maximum wet bulb temperature 29.4ºC, non-condensing Maximum surface temperature gradient 20ºC/hour, Altitude -305 to 3,048 m.
**Only average power is listed.
Here is the location of some of the documentation for each drive:
Hard Drive Key Characteristics
With the exception of HGST, each vendor's enterprise 2.5 inch drive has a higher MTBF than that vendor's 4 TB drive reviewed earlier.
I find the lack of information from both Toshiba and WD annoying. I guess their belief is that this data does not need to be public as the public generally is not buying these types of drives. But this attitude doesn’t help them, and neither Seagate nor HGST thinks this way. Here are my thoughts.
The PC industry is being driven for the home user by the gaming community. If we can all agree on that then consider that SAS now runs at 12 Gbit/sec while SATA is stuck at 6 Gbit/sec, at least for 2014, from what is being said in the industry. Even when SATA gets to 8 Gbit/sec SAS will be nearing the availability of 16 Gbit/sec.
The need for speed in the gaming workload will, I think, drive many from SATA to SAS. What is missing is SAS-based motherboards. I checked a few motherboard vendors and, at least today, none of them has SAS motherboards for the gamer community.
The vendors all seem to have them for their CAD workstations, but 12 Gbit/sec is somewhat new, with mass market release around 4Q13, and I suspect we are going to see movement sometime by mid-summer. The cost of 12 Gbit SAS support is not going to be much in the overall cost of a system. Of course if Intel would support SAS then it becomes a moot point.
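To see what the line rates mean in bytes, recall that both 6 Gbit/sec SATA and 12 Gbit/sec SAS use 8b/10b encoding, so only 8 of every 10 bits on the wire carry data. The sketch below converts line rate to peak payload bandwidth. The function name is mine, and it ignores protocol overhead, so real throughput will be somewhat lower.

```python
def payload_mb_per_sec(line_rate_gbit: float,
                       data_bits: int = 8,
                       coded_bits: int = 10) -> float:
    """Peak payload bandwidth in MB/sec for an encoded serial link."""
    payload_bits_per_sec = line_rate_gbit * 1e9 * data_bits / coded_bits
    return payload_bits_per_sec / 8 / 1e6  # bits -> bytes -> MB


for name, gbit in (("SATA 6 Gbit/sec", 6), ("SAS 12 Gbit/sec", 12)):
    print(f"{name}: about {payload_mb_per_sec(gbit):.0f} MB/sec peak payload")
```

A single hard drive cannot saturate either link, but SSDs can, which is where the doubling starts to matter.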
So you might ask: why talk about SAS at all? The reason is that if SAS is available, these high performance drives might make it into home PCs and storage appliances for people with higher end needs. This would, I would hope, kind of force Toshiba and WD to provide some reasonably complete documentation on their products.
Two areas are important in evaluating the drives. The first is performance, which includes streaming I/O performance, seek and latency time and, of course, reliability. The second is support for coming features such as 4K sectors, encryption and end-to-end data protection.
You can see from the table that, with the publicly available information that I could find, Toshiba and WD lack information in a number of areas that should be evaluated. So that leaves HGST and Seagate as the only vendors that really can be evaluated, and there are of course tradeoffs between these two vendors. But Seagate has high MTBF and better performance for both seek and latency and for streaming I/O.
In looking at enterprise SAS attached SSDs, a few interesting things are apparent when you start to look at the data. All numbers were normalized from the vendor provided data for easy comparison.
1. For all vendors the difference in performance for read and write is significant. Write was only X % of read (HGST 56%, Seagate 67%, and Toshiba 42%).
2. Toshiba had a significant performance reduction for read if using a drive that supported encryption.
3. Hard error rate reliability is the same as enterprise 2.5 inch drives for Seagate, for HGST it was 10x better than their 2.5 inch drive, and Toshiba provided no data.
4. The endurance is listed and is a big issue for the drives. See below for a review of the information provided.
*HGST write value is 231 MB/sec for 5 years
**HGST assumes 4 KiB aligned I/O requests
***Seagate Warranty terms will vary based on type of warranty chosen: “Managed Life” or “Limited Warranty with Media Usage.” Consult the Seagate sales representative for warranty terms and conditions
****Sustainable 4KB Random combined IOPS for 5 year Endurance (65%/35% R/W, 70% Duty Cycle)
*****Note Toshiba numbers are higher (~18%) for the non-encrypted drive read rate.
#Random reads and/or writes for 12.0Gbit/s interface speed by dual port.
##Toshiba warranty is 5 years or the max TBW (terabytes written) per model capacity, whichever occurs first.
##Toshiba values are based on write rate of 92 MB/sec for 5 years. Definition and conditions of TBW (Terabytes Written) are based on JEDEC standard; JESD218A, February 2011, and defined for the service life.
Here is the location of some of the documentation for each drive.
Hard Drive Endurance
The Seagate formula above was confusing to me. So I have translated into something that I found more understandable.
Each of the vendors states that you need to stay within its write budget to meet the support requirements for a 5 year warranty. The table below shows the MiB/sec that each of the vendors supports to meet this requirement. I have included that rate as a percentage of the drive's total write bandwidth.
I believe that these numbers are pretty good. With the HGST drive you can write on average about 220.3 MiB/sec to the drive, 24 hours a day, 365 days a year for 5 years, and meet the support requirement.
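The conversion behind these figures is simple enough to sketch. The 231 MB/sec (HGST) and 92 MB/sec (Toshiba) rates come from the footnotes above. The function names and the 500 MB/sec maximum write bandwidth are hypothetical, used only to show how the percentage is formed, and the code treats a year as 365 days.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
MIB = 1024 ** 2  # bytes in one MiB


def mb_to_mib_per_sec(mb_per_sec: float) -> float:
    """Convert decimal MB/sec to binary MiB/sec."""
    return mb_per_sec * 1e6 / MIB


def total_written_tb(mb_per_sec: float, years: int = 5) -> float:
    """Total decimal TB written at a sustained rate over the warranty period."""
    return mb_per_sec * 1e6 * SECONDS_PER_YEAR * years / 1e12


def budget_share(budget_mb_per_sec: float, max_write_mb_per_sec: float) -> float:
    """Write budget as a fraction of the drive's maximum write bandwidth."""
    return budget_mb_per_sec / max_write_mb_per_sec


for vendor, mb in (("HGST", 231), ("Toshiba", 92)):
    print(f"{vendor}: {mb_to_mib_per_sec(mb):.1f} MiB/sec sustained, "
          f"{total_written_tb(mb):,.0f} TB over 5 years")

# Hypothetical drive with a 500 MB/sec maximum write rate.
print(f"share of write bandwidth: {budget_share(231, 500):.1%}")
```

Run against your own measured write rate, the same arithmetic tells you how many years of budget you actually have.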
As you can see there are a lot more caveats on SSD storage than there are on enterprise storage. The write budget numbers remind me of what I noted earlier in this article, that consumer hard drives have a write budget that – of course – is much lower than these SSDs.
Choosing a Hard Drive: Bottom Line
In my mind there is a great deal more to picking a hard drive than just the price. Vendors write specifications (at least some of them do) to allow us to better understand where the drive will best fit. Of course every vendor has had manufacturing problems where a large part of a line of hard drives has been affected.
The amount of testing that is done on hard drives is pretty darn impressive, while at the same time the range of usage models grows faster than the testing environments can be updated. Even if the vendors could test all the software environments, there is still the hardware or software defect issue and no process is perfect. But for anyone that has been around awhile, the number of problems today as compared to, say, 1994 is far, far less.
Choosing a hard drive requires attention to detail about the drive and about how you are going to use it. How you are going to use it is a big deal, as it is not just the application reads and writes but also includes the file system, the file system layout for things such as logs, the storage controller or appliance framework, the protocols and even something as simple as the number of failures you have.
Think about this: if you have a RAID device with SSDs with a write budget and one drive fails, you have to spend a lot of time and writes rebuilding parity or re-mirroring. This is something you might not have thought about when calculating your write budget. The whole concept of a write budget was unheard of 5 years ago except for a few people serious about flash drives.
Now we are starting to see this concept applied to consumer spinning disk and, with hybrid drives in the future, who knows. All of this means that we are going to need to spend more time thinking about drives for every problem from consumer to enterprise drives, and think about how these boxes will really be used to make sure we get the right drive for the right job.