Hard disk drives have been around forever, if you define “forever” as 1953. In 2018 they are still the backbone of data storage, even with fast-growing SSD sales and the tenacity of tape.
Let’s look behind the curtains at modern HDD technology, capacity, performance, and reliability. We’ll revisit the types of hard drives in use in business today and peek into the future of storage and hard disk drives.
Hard disk drives are magnetic media that store and retrieve digital data. Their architecture consists of rigid, rapidly rotating disks, or platters, mounted on a spindle. The platters themselves are not magnetic but are coated with magnetic material.
Magnetic heads on actuator arms move over the platter surface to read and write binary data bits: reads detect changes in magnetization on the platter, and writes alter that magnetization. This is random access storage that does not require sequential blocks to work. Disks are of course non-volatile and will not lose stored data when the drive powers off.
Storage professionals often refer to an HDD's "spinning disks," because these rotating disks are a key feature of the HDD.
Modern HDDs spin at speeds ranging from a low consumer speed of 4200 revolutions per minute (rpm) to an enterprise-grade 15K rpm. HDDs usually have two motors: a spindle motor that spins the disks and an actuator motor that positions the arms and read/write heads.
High capacity and performance are primary HDD characteristics, with reliability a close second.
The industry measures capacity in powers of 1000, so a 1TB drive stores 1000GB. Not all of this room is available for user data: the file system and computer OS consume some of it, and most disks reserve space for RAID operations or other recovery options. The OS reports the storage that is actually available to the user.
Commercially available HDD capacity varies widely, from a few hundred GB to 12TB for enterprise drives. That ceiling has been inching up for years and will likely go higher as HDD capacity development continues.
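To make the powers-of-1000 convention concrete, here is a minimal Python sketch. The helper names are illustrative, not from any library. It also shows why a drive can look smaller than its label: some operating systems display capacity in binary units (powers of 1024), so a 1TB drive appears as roughly 931 "GB" even before file system overhead is counted.

```python
# Decimal units, the convention drive vendors use on the label.
GB = 1000**3
TB = 1000**4
# Binary gibibyte (1024-based), which some operating systems display as "GB".
GIB = 1024**3


def decimal_gb(capacity_bytes):
    """Capacity in decimal gigabytes (powers of 1000)."""
    return capacity_bytes / GB


def binary_gib(capacity_bytes):
    """Capacity in binary gibibytes (powers of 1024)."""
    return capacity_bytes / GIB


# 1TB and 12TB are the example sizes mentioned in the text.
for label_tb in (1, 12):
    size = label_tb * TB
    print(f"{label_tb}TB = {decimal_gb(size):.0f} GB (decimal) "
          f"= {binary_gib(size):.1f} GiB (binary)")
```

The gap between the two figures is purely a matter of units; space consumed by the file system or reserved for recovery comes on top of it.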
Performance is calculated by three measurements: average access time, average latency, and average data rate.
1. Access time is the time it takes for the disk drive to move the heads to a track to read or write the data. Access time includes the actual seek time (how long it takes the heads to get to the right track), rotational latency, and sufficient time to complete command processing.
2. Rotational latency is the time it takes for the requested sector to move under the head. Average latency is calculated from the rpm of the spinning disk (it is the time for half a revolution) and is measured in milliseconds. Typical average latencies range from 6.25 ms at 4800 rpm to 2 ms at 15K rpm.
3. Transfer rate is how fast the data is transmitted to and from the read/write heads. It’s usually described in megabytes per second (MBps).
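The three measurements above combine with simple arithmetic. The sketch below is a rough model, not a vendor formula: the function names are illustrative, and the seek time, command overhead, and transfer rate in the usage example are hypothetical values chosen only to show the calculation.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def avg_access_time_ms(seek_ms, rpm, overhead_ms=0.0):
    """Access time = seek time + rotational latency + command overhead."""
    return seek_ms + avg_rotational_latency_ms(rpm) + overhead_ms


def transfer_time_ms(size_mb, rate_mb_per_s):
    """Time to move size_mb megabytes at a sustained transfer rate."""
    return size_mb / rate_mb_per_s * 1000


# The latency figures quoted in the text.
print(avg_rotational_latency_ms(4800))    # 6.25
print(avg_rotational_latency_ms(15_000))  # 2.0

# Hypothetical 7200 rpm drive: 8.5 ms seek, 0.1 ms overhead, 150 MBps.
access = avg_access_time_ms(seek_ms=8.5, rpm=7200, overhead_ms=0.1)
total = access + transfer_time_ms(size_mb=1, rate_mb_per_s=150)
print(f"access {access:.2f} ms, 1 MB read {total:.2f} ms")
```

For small random reads the access time dominates, which is why rpm and seek time matter more than transfer rate for that kind of workload.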
HDD reliability does not directly correlate to HDD failures. Many external factors can cause a disk failure, including power loss, wildfire or floods, magnetic interference, malware, dropping a drive (it happens), or environmental contamination that causes a head crash.
HDD reliability is concerned with internal threats to the HDD, including equipment failures, data errors, and head crashes.
· Equipment failure. Sometimes drives fail because of wear and tear or poor manufacturing quality. Drive manufacturers report their drives’ reliability using mean time between failures (MTBF) or annualized failure rates (AFR). These are averages that cannot predict individual disk reliability but can yield average reliability for a specific model or family of disk drives.
Manufacturers measure reliability by running sample drives constantly, analyzing the resulting wear and tear, and extrapolating the results across the expected lifecycle. In practice, if a drive is reliable for just a few months it will generally be reliable throughout its lifetime.
· Head crashes. A head crash is the most common cause of drive failure. It occurs when the read/write heads touch or scrape the platter surface, which understandably causes a disk failure. Several factors can damage the heads causing head crashes: power loss, motor failure, concussive shock, dust particles that make it through the air filter, wear and tear from aging, or subpar manufacturing. Scratch damage and particles from damaged heads or platters create bad sectors, which can severely damage a disk and its data. Customers may or may not be able to recover data.
· Data errors. Each sector on a hard disk contains 512 bytes of user data, or 4096 bits. These bits are subject to errors from many different causes. Some of the errors are identifiable by firmware or OS; others are undetectable until the hard drive fails. Error correcting codes (ECC) are a redundant technology developed to protect against errors on storage devices, including magnetic disk. ECC stores non-user data bits on each sector that contain information about the user data on the sector. When the heads write a sector to hard disk, ECC generates codes and stores them in reserved bits within the sector.
When the heads read back the user data, ECC uses the read and the ECC bits to report any errors to the controller. If ECC can correct the errors, it will. If not, it will send an error notification of an unrepaired error. The ECC functionality may also remap the data from the failing sector into a reserve sector pool. ECC reports these actions including the total number of remappings, so admins can head off an imminent HDD failure.
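The write-then-check cycle described for ECC can be illustrated with a toy code. The sketch below uses a Hamming(7,4) code, which stores 3 check bits alongside 4 data bits and can correct any single flipped bit. This is a stand-in chosen for brevity: production drives use far stronger codes over whole sectors, and the function names here are illustrative only.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword with 3 check bits."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome gives the 1-based bad bit
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


# "Write" 4 bits, corrupt one stored bit, then "read" them back.
stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single-bit media error at position 5
data, fixed_at = hamming74_decode(stored)
print(data, fixed_at)
```

A real drive would also count how often corrections like this happen per sector; a rising count is the kind of signal that leads to remapping a sector into the reserve pool.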
There are multiple interfaces that connect the HDD to external computing components. The most common include SCSI, SAS, ATA, SATA, and Fibre Channel.
1. SCSI is one of the oldest HDD interfaces and is still in wide use. SCSI interfaces with external computing components and is usually backwards-compatible.
2. SAS, or Serial Attached SCSI, evolved from SCSI and uses SCSI commands. SAS is a serial communication protocol that enables significantly higher speed data transfers than SCSI’s parallel protocol. It is SATA-compatible: standard SAS and SATA HDDs use the same type of data and power connectors, and SAS RAID controllers can interface with SATA drives.
3. ATA, also called IDE, is an older protocol that moved the HDD controller from the interface card to the disk drive. This standardized and simplified host/controller operations.
4. SATA or Serial ATA replaced a parallel version of ATA (PATA) with smaller cable sizes, reduced costs, native disk hot swapping, and faster data transfers.
5. Fibre Channel (FC) is a high-speed and highly reliable serial protocol aimed at enterprise-grade storage area networks (SANs).
· External HDDs. External hard disk drives connect to computer systems over USB. They can offer as much capacity as their internal counterparts, and like internal hard disks they come as 2.5” or 3.5” models. The smaller form factor can be powered through the USB port; the larger needs its own power brick.
· Price trends. HDD prices have fallen consistently for decades, from the 1981 equivalent of half a million dollars per GB (a capacity that didn’t exist yet) to an average of about $0.03 (3 cents) per GB in 2017. Top-of-the-line disk is more expensive than the average, but not by much: in 2017, 8TB drives cost about the same per GB as 4TB drives. It’s safe to say that trend will continue. Although a drive of 8TB or more isn’t exactly cheap, its cost per GB is highly economical.
· Competition from solid-state drives (SSDs). SSDs come in a range of sizes just like HDDs do, and they are the highest-capacity drives available today. In February 2018, Samsung introduced a jaw-dropping 30TB SSD. SSD flash memory’s areal density roughly doubles every two years, or around 40% per year. This is considerably higher than HDDs’ average 10-20% annual areal density growth. SSD prices are dropping as well, although they are not yet as inexpensive as HDDs.
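The density figures in the SSD comparison are compound growth rates, and the arithmetic is easy to check. The sketch below assumes steady year-over-year growth, which real products only approximate; the function names are illustrative.

```python
import math


def annual_rate_from_doubling(years_to_double):
    """Annual growth rate implied by doubling every N years."""
    return 2 ** (1 / years_to_double) - 1


def doubling_time_years(annual_rate):
    """Years needed to double at a steady annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


# Doubling every two years works out to about 41% per year,
# consistent with the "around 40%" quoted for flash.
print(f"{annual_rate_from_doubling(2):.1%}")

# At the 10-20% annual growth quoted for HDDs, doubling takes
# roughly 7.3 years (at 10%) down to 3.8 years (at 20%).
for rate in (0.10, 0.20):
    print(f"{rate:.0%} per year -> doubles in {doubling_time_years(rate):.1f} years")
```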
Despite SSD competition, don’t discount HDDs any time soon. Developers regularly increase areal density and pack more heads and platters into a single drive for even more storage capacity.
And HDDs still average a far lower cost than SSDs. Today SSD prices hover around 6.6 times the cost of HDDs, and IDC thinks they are heading to 2.2 times HDD costs by 2021. Since SSDs can have higher capacity than HDDs, that will shrink SSD cost-per-GB even more. Nevertheless, today’s 6.6x multiple is a big price differential. This is why many businesses opt for all-flash storage systems for high performance applications and buy mixed SSD/HDD systems for everything else.
Ultimately HDDs are not going to die any more than tape did; tape is alive and kicking as a strategic storage tier. Storage systems are not going to turn monolithic overnight. They will combine SSDs for high performance and high capacity active data storage, HDDs for Tier 1 and Tier 2 active data storage, and HDDs and tape and/or Blu-ray for long-term cold storage. HDDs, with their high capacity and fast performance at a low cost, will stay a large part of that storage mix.