To understand RAID levels – and RAID storage overall – it helps to know that “always-on” is more than marketing hype for business: it’s a basic expectation of customers. One of the oldest and still active technologies to achieve always-on status is RAID, or “redundant array of independent disks.”
Types of RAID
Storage administrators can deploy RAID as hardware (controller card or chip) or software (software-only or hybrid).
A dedicated hardware controller provides hardware-based RAID services. IT can deploy hardware RAID two ways: an external RAID Controller Card or internal RAID-on-Chip.
- RAID Controller Card: This plug-in expansion card connects to a PCIe or PCI-X motherboard slot. The card contains a RAID processor and I/O processors with drive interfaces. The cards are expensive, but since they are independent of the host, all RAID operations are offloaded from the CPU to the dedicated card.
- RAID-on-Chip: A single chip on the motherboard integrates the host interface, I/O interfaces for HDDs, the RAID processor, and a memory controller.
Software-based RAID delivers RAID services from the host. Software RAID comes in two flavors: software-only RAID that runs from the OS, and hybrid RAID that adds a hardware component to relieve the load on the CPU.
- Software-only. Software RAID is the least expensive of the RAID types, and is often included as a native function of the OS. It is a host-based software application that manages RAID calculations for attached hard disk drives. The drives attach via an HBA or native I/O interface, and the RAID activates when the OS loads the RAID driver.
- Hybrid. This software-based RAID uses a hardware component to deliver RAID BIOS functions from a RAID BIOS on the motherboard or on an HBA. This technology offers a layer of redundant protection from a faulty boot process. Software-only RAID boots from the operating system, and boot errors could affect the entire RAID subsystem. The addition of a RAID BIOS hardware component protects the subsystem from operating system boot errors.
Whether hardware or software, RAID is available in different schemes, or RAID levels. The most commonly used levels are RAID 0, 1, 5, 6, and 10. RAID 0, 1, and 5 work on both HDD and SSD media. (RAID levels 4 and 6 also work on both media, but are seen less often in practice.)
RAID 0: Striping
Requiring a minimum of two disks, RAID 0 splits files and stripes the data across two disks or more, treating the striped disks as a single partition. Because multiple hard drives are reading and writing parts of the same file at the same time, throughput is generally faster.
RAID 0 does not provide redundancy or fault tolerance. Since it treats multiple disks as a single partition, if even one drive fails, the striped file is unreadable. This is not an insurmountable problem in video streaming or computer gaming environments where performance matters the most, and the source file will still exist even if the stream fails. It is a problem in high availability environments.
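As a rough illustration of the striping described above, the sketch below deals fixed-size blocks round-robin across a list of in-memory "disks". The function names, the 4-byte block size, and the use of Python lists to stand in for drives are all assumptions made for the example; a real RAID layer works on device blocks, not byte strings.

```python
def stripe(data: bytes, num_disks: int, block_size: int = 4) -> list:
    """Split data into fixed-size blocks and deal them round-robin across disks."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_number = offset // block_size
        disks[block_number % num_disks].append(data[offset:offset + block_size])
    return disks


def read_back(disks) -> bytes:
    """Reassemble the data; a missing disk (None) makes the whole set unreadable."""
    if any(disk is None for disk in disks):
        raise OSError("a striped disk is missing; the data cannot be reassembled")
    rows = max(len(disk) for disk in disks)
    blocks = []
    for row in range(rows):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


disks = stripe(b"ABCDEFGHIJ", num_disks=2)
restored = read_back(disks)
```

Setting any entry of `disks` to `None` before calling `read_back` raises an error, which mirrors the lack of fault tolerance: no single disk holds a complete copy of the file.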
RAID 1: Mirroring
RAID 1 requires a minimum of two disks to work, and provides data redundancy and failover. It reads and writes the exact same data to each disk. Should a mirrored disk fail, the file exists in its entirety on the functioning disk. Once IT replaces the failed disk, the RAID system will automatically mirror back to the replacement drive. RAID 1 also increases read performance.
Mirroring does cost usable capacity, since a two-disk mirror stores every block twice and leaves only half of the raw capacity available, but it is an economical failover process on application servers.
RAID 5: Striping with Parity
This RAID level distributes striping and parity at a block level. Parity is raw binary data. The RAID system calculates its values to create a parity block, which the system uses to recover striped data from a failed drive. Most RAID systems with parity functions store parity blocks on the disks in the array. (Some RAID systems dedicate a disk to parity calculations, but these are rare.)
RAID 5 stores parity blocks on striped disks. Each stripe has its own dedicated parity block. RAID 5 can withstand the loss of one disk in the array.
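The recovery role of a parity block can be shown with a few lines of XOR arithmetic. This is a minimal sketch, assuming a single stripe of three equal-length data blocks held in memory; `xor_blocks` and `parity_disk` are illustrative names, and the rotation rule in `parity_disk` is just one possible layout, not something any particular RAID product is confirmed to use.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def parity_disk(stripe_number: int, num_disks: int) -> int:
    """Hypothetical rotation: parity shifts one disk to the left on each stripe."""
    return (num_disks - 1 - stripe_number) % num_disks


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second data block.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)  # XOR of everything that is left
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the missing block. Losing a second block from the same stripe leaves too little information, which is why single parity tolerates only one failed disk.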
RAID 5 combines the performance of RAID 0 with the redundancy of RAID 1, but gives up storage space to do it: the equivalent of one disk goes to parity, which is about one third of raw capacity in a three-disk array.
This level increases read performance since all drives in the array simultaneously serve read requests. However, write performance can suffer from write amplification (often called the RAID 5 write penalty), since even a minor change to a stripe requires the system to read the old data and parity, recalculate the parity, and write both back.
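The multiple steps behind a small change to a stripe can be sketched as a read-modify-write cycle. The example assumes single XOR parity and in-memory blocks; `small_write` is a hypothetical helper, and the I/O count it returns simply tallies the two reads and two writes shown in the comments.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe_blocks, parity, index, new_data):
    """Update one data block and its parity; return (blocks, parity, io_count)."""
    old_data = stripe_blocks[index]   # I/O 1: read the old data block
    old_parity = parity               # I/O 2: read the old parity block
    # Remove the old data's contribution from parity, then add the new data's.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    updated = list(stripe_blocks)
    updated[index] = new_data         # I/O 3: write the new data block
    io_count = 4                      # I/O 4: write the new parity block
    return updated, new_parity, io_count


blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])
blocks, parity, ios = small_write(blocks, parity, 1, b"ZZZZ")
```

One logical write turns into four disk operations in this sketch, which is the overhead that battery-backed caches and full-stripe writes try to reduce.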
RAID 6: Striping with Double Parity
This RAID level operates like RAID 5 with distributed parity and striping. The main operational difference in RAID 6 is that there is a minimum of four disks in a RAID 6 array, and the system stores a second parity block for each stripe. This enables a configuration where two disks may fail before the array is unavailable. Its primary use case is application servers and large storage arrays.
RAID 6 offers higher redundancy than RAID 5 and increased read performance. It can suffer from the same performance overhead with intensive write operations. The size of this performance hit depends on the RAID system architecture: whether it is hardware or software, whether it is located in firmware, and whether the system includes processing software for high-performance parity calculations.
RAID 10: Striping and Mirroring
RAID 10 requires a minimum of four disks in the array. It stripes across disks for higher performance, and mirrors for redundancy. In a four-drive array, the system stripes data to two of the disks. The remaining two disks mirror the striped disks, each one storing half of the data.
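The four-drive layout described above can be modeled in a few lines. This is a sketch under stated assumptions: disks are numbered from 0, the first half of the array holds the stripes, and the second half holds their mirrors. The function names are invented for the example.

```python
def mirror_of(disk: int, num_disks: int = 4) -> int:
    """Return the partner disk that holds the same data."""
    half = num_disks // 2
    return disk + half if disk < half else disk - half


def block_targets(block_number: int, num_disks: int = 4) -> tuple:
    """Return the (striped disk, mirror disk) pair that stores a logical block."""
    half = num_disks // 2
    primary = block_number % half
    return primary, mirror_of(primary, num_disks)


def survives(failed_disks, num_disks: int = 4) -> bool:
    """The array stays readable unless a disk and its mirror have both failed."""
    failed = set(failed_disks)
    return not any(mirror_of(disk, num_disks) in failed for disk in failed)


targets = [block_targets(n) for n in range(4)]
```

In this model the array tolerates two failures only when they hit different mirror pairs: `survives({0, 1})` is true, while `survives({0, 2})` is false because disk 2 is the mirror of disk 0.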
This RAID level serves environments that require both high data security and high performance, such as high transactional databases that store sensitive information. It is the most expensive of the RAID levels with lower usable capacity and high system costs.
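To compare the capacity cost of the levels covered so far, the sketch below computes usable capacity from the number of disks and the size of each disk. It assumes equal-size disks, two-way mirrors for RAID 10, and a RAID 1 set in which every disk holds a full copy. `usable_capacity` and `MIN_DISKS` are names chosen for the example, and the three-disk minimum for RAID 5 is the standard figure rather than one stated in this article.

```python
# Minimum disk counts per RAID level.
MIN_DISKS = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level, num_disks: int, disk_size_tb):
    """Return usable capacity in TB for an array of equal-size disks."""
    level = str(level)
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return num_disks * disk_size_tb          # no redundancy overhead
    if level == "1":
        return disk_size_tb                      # every disk holds the same copy
    if level == "5":
        return (num_disks - 1) * disk_size_tb    # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size_tb    # two disks' worth of parity
    if num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return (num_disks // 2) * disk_size_tb       # half the disks are mirrors


summary = {level: usable_capacity(level, 4, 4) for level in MIN_DISKS}
```

With four 4 TB disks, the sketch reports 16 TB for RAID 0, 4 TB for RAID 1, 12 TB for RAID 5, and 8 TB for both RAID 6 and RAID 10.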
SSDs can use traditional RAID systems. However, RAID performance improvements do not accelerate SSDs, which are already considerably faster than HDDs.
In order to add value to RAID functions, some SSD vendors have developed proprietary RAID functions for all-flash arrays, including Pure Storage RAID-3D and Dell XtremIO Data Protection. They not only provide data redundancy in AFAs, but also accelerate SSD RAID performance by cutting the amount of I/O needed to update stripes.
Other RAID Types
- RAID 2 is an original RAID level but is rarely used today. It stripes at the bit level instead of the block level, and uses an error correcting code in place of parity. RAID 2 is generally limited to serving single requests, and its error correcting code is far more complex than parity technology.
- RAID 3 is rarely implemented. It uses byte-level striping and parity, and stores parity calculations on a dedicated disk. Like RAID 2, it typically cannot service multiple requests at the same time. This does not affect the performance of large sequential reads and writes, but does slow down random access workloads.
- RAID 4 stripes block-level data and, unlike RAID 5, dedicates a single disk to parity. The striping provides high performance for random reads. But because RAID 4 needs to write all parity data to one disk, random write performance suffers.
When researching which RAID level to use, remember that even the best RAID solution cannot take the place of backup. RAID protects data availability and redundancy, but does not recognize or remediate file corruption, write errors, or hacking. IT must always back up and store data on a separate system, ideally a remote one.
Having said that, RAID is still useful as long as data centers have hard drives. And since SSDs comprise only 20 to 25% of the modern data center’s media, HDDs are not going anywhere for quite some time. Protect them.