The acronym RAID stands for redundant array of independent disks. A RAID system may be implemented in hardware or software, and virtualizes physical storage drives to improve performance and create data redundancy. Controller-based RAID generally refers to hardware-based RAID, as opposed to server-based RAID, which covers both software-only RAID and software-hardware hybrids.
What is a RAID Controller?
A RAID controller is a card or chip located between the operating system and the storage drives, usually hard disk drives. RAID provides data redundancy and/or improves hard disk drive performance; most RAID levels do both. RAID designed for HDDs still provides redundancy on SSDs, but generally does not improve SSD performance. RAID controllers manufactured specifically for SSDs provide redundancy and improve performance.
RAID controllers work by virtualizing the drives into distinct groups with specific data protection and redundancy characteristics. The front-end interface communicates with the server, usually via a host bus adapter (HBA), and the back-end interface communicates with and manages the underlying storage media, usually over ATA, SCSI, SATA, SAS, or Fibre Channel.
RAID controllers are classified by multiple characteristics, including supported drive types such as SATA or SAS, the number of ports and drives they can support, the RAID levels they support, interface architecture, and the amount of native cache memory. This means, for example, that a controller manufactured for a SATA environment will not work on a SAS array, and that a RAID 1 controller cannot be converted into a RAID 10 controller.
RAID controllers are not storage controllers. Storage controllers present active disks to the OS, while the RAID controller acts as a RAM cache and provides RAID functionality. The number and identity of RAID disks depend on the RAID controller’s configuration.
The Dell EMC PERC H739P RAID controller is an enterprise-grade RAID unit.
Hardware-Based: RAID Controller
Dedicated hardware controllers come in two different architectures: an external RAID Controller Card and an internal RAID-on-Chip:
- RAID Controller Card: A RAID controller card is a plug-in expansion card that connects to a PCIe or PCI-X motherboard slot. It contains a RAID processor and I/O processors with drive interfaces.
- RAID-on-Chip: The less expensive RAID-on-Chip is a single motherboard chip that integrates the host interface, HDD I/O interfaces, the RAID processor, and a memory controller. Firmware starts RAID during bootup, then transfers control to the drivers.
Software-Based: Server-Based RAID
Software RAID delivers RAID services from the host. It comes in two flavors: software-only RAID hosted in the OS, and a hybrid architecture that adds a hardware component to relieve the load on the CPU.
- Software-only RAID: Software-only RAID is usually included as a native function on the OS, which makes it the least expensive of the RAID options. The host-based application manages RAID calculations, and attaches to the storage drives using an HBA or native I/O interface. It starts up when the OS loads the RAID driver.
- Hybrid hardware/software RAID: Hybrid hardware/software RAID uses a hardware component to deliver RAID BIOS functions from the motherboard or HBA. The hybrid technology adds another layer and is more expensive than software-only RAID, but it protects the RAID system from boot errors should something happen to the operating system.
The 3ware 9650SE SATA II RAID controller is a relatively inexpensive RAID unit.
What are the Different RAID Levels?
RAID controllers are specific to RAID levels. The most common levels are RAID 0, 1, 5/6, and 10. Here’s a summary; for more in-depth information, read RAID Levels.
- RAID 0: Striping. RAID 0 is the only one of these RAID levels that provides no redundancy; it only increases hard disk performance. RAID 0 splits files and stripes the data across two or more disks, treating the striped disks as a single volume. Because every file is spread across all of the disks, if even one drive fails, the striped data is unreadable. Use case: HDD performance improvement only; no data redundancy.
- RAID 1: Mirroring. RAID 1 works on two or more disks to provide data redundancy and failover. It writes the exact same data to each disk and can read it back from any of them. Should a mirrored disk fail, the file exists in its entirety on the functioning disk. When the failed disk is repaired or replaced, the RAID system will automatically mirror data back to the replacement drive. RAID 1 also increases read performance. Use case: Data redundancy and faster reads at a low cost.
- RAID 5/6: Striping with Parity/Double Parity. RAID 5/6 combine the performance of RAID 0 with the redundancy of RAID 1, at a cost in usable capacity: parity takes up one drive's worth of space in RAID 5 (one-third of a three-disk array) and two drives' worth in RAID 6. “Parity” refers to redundancy information calculated from the data blocks, typically with an XOR operation. RAID 5 stripes data across three or more disks, and calculates block-level values to create a parity block for each stripe. RAID 5 distributes the parity blocks across the striped HDDs rather than keeping them on a dedicated drive. Should a drive fail, RAID 5 uses the parity blocks and the data on the remaining drives to rebuild the lost data. RAID 6 operates like RAID 5 but requires a minimum of four disks in an array, so it can store a second parity block for each stripe. This results in a highly available configuration that can survive two disk failures at the same time. Use case: Web servers, intensive read environments, application servers, large storage arrays.
- RAID 10: Striping and Mirroring. RAID 10 is the most expensive of the RAID levels for HDDs. It stripes across at least four disks for higher performance, and mirrors for redundancy. In a four-drive array, the system stripes data to two of the disks. The remaining two disks mirror the striped disks, each one storing half of the data. The result is high read and write speeds as well as strong data redundancy. Use case: high security and high-performance environments such as intensive transactional databases storing sensitive information.
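The parity mechanism summarized above can be illustrated in a few lines of Python. This is a minimal sketch, assuming byte-wise XOR parity and a parity position that rotates from stripe to stripe, which is one common RAID 5 layout; the function names (xor_blocks, write_stripe, rebuild_block) are hypothetical and not taken from any controller firmware or library.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks, stripe_index):
    """Lay out one stripe: the data blocks plus one XOR parity block.

    The parity slot rotates with the stripe index so that no single
    drive ends up holding all of the parity (assumed layout).
    Returns (blocks in drive order, index of the parity drive).
    """
    n_drives = len(data_blocks) + 1
    parity_slot = (n_drives - 1 - stripe_index) % n_drives
    stripe = list(data_blocks)
    stripe.insert(parity_slot, xor_blocks(data_blocks))
    return stripe, parity_slot


def rebuild_block(stripe, failed_drive):
    """Recover the block of a failed drive by XOR-ing the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_drive]
    return xor_blocks(survivors)


# Three data blocks on a four-drive array; pretend drive 1 has failed.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe, parity_slot = write_stripe(data, stripe_index=0)
assert rebuild_block(stripe, failed_drive=1) == stripe[1]
```

Because the XOR of all blocks in a stripe is zero, any single missing block, data or parity, equals the XOR of the survivors. A second parity block of the kind RAID 6 uses needs a different calculation and is not shown here.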
Advantages of RAID
- Better reliability. Except for RAID 0, RAID ensures that a single failed drive will not take an array down with it. Applications continue to operate on the remaining drives while the failed drive is repaired or replaced, which maintains data consistency and guards against data loss.
- Data redundancy. RAID mirroring and/or striping with parity spreads data across multiple drives, ensuring that no data is lost should a drive fail.
- Higher HDD performance. Most RAID levels improve throughput by allowing applications to simultaneously read and write data from multiple drives. This is not an automatic improvement: higher RAID levels, especially RAID 10, take up system overhead, making them unsuitable for low- or mid-performance arrays. These arrays benefit most from RAID 0 for performance improvement, or RAID 5/6 for performance plus redundancy. On a high-performance array, RAID 10 increases performance and provides redundancy and high availability.
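The capacity cost of the redundancy described above comes down to simple arithmetic. The following is a rough sketch, assuming identical drives and the textbook layouts only; the function name raid_summary and its parameters are hypothetical, and real controllers may reserve extra space for metadata.

```python
def raid_summary(level, n_drives, drive_tb):
    """Return (usable_tb, drive_failures_always_survived) for a RAID level.

    Assumes identical drives and textbook layouts; RAID 10 is modeled
    as a stripe over two-way mirrors.
    """
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if n_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == 10 and n_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")

    if level == 0:
        return n_drives * drive_tb, 0        # all capacity, no redundancy
    if level == 1:
        return drive_tb, n_drives - 1        # every drive holds a full copy
    if level == 5:
        return (n_drives - 1) * drive_tb, 1  # one drive's worth of parity
    if level == 6:
        return (n_drives - 2) * drive_tb, 2  # two drives' worth of parity
    # RAID 10: half the raw capacity; one failure is always survivable,
    # more only if the failures land in different mirror pairs.
    return (n_drives // 2) * drive_tb, 1


for level in (0, 1, 5, 6, 10):
    usable, tolerated = raid_summary(level, n_drives=4, drive_tb=4)
    print(f"RAID {level}: {usable} TB usable, survives {tolerated} failure(s)")
```

On a three-drive RAID 5 array, raid_summary(5, 3, 4) returns 8 TB usable out of 12 TB raw, which is the one-third parity overhead mentioned in the RAID 5/6 summary.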
Advantages of RAID Controllers
The controller architecture of hardware-based RAID is more expensive than software-based RAID, but increases system performance and is not subject to boot errors.
- Cache memory. Controller-based RAID usually provides additional disk cache memory, which accelerates RAID operations.
- Dedicated processing. Controller-based systems manage RAID configuration independently of the OS. And since the RAID controller does its own processing rather than drawing on the host's processing power, it wins out over software-only RAID on capacity and speed.
- Lack of boot errors. Since software-only RAID is located on the OS, it is subject to boot errors that can compromise an entire array. Boot errors will not affect RAID controllers.
Not every environment makes sense for controller-based RAID. In a tightly budgeted storage environment, software-only RAID 0 will improve HDD performance and software-only RAID 1 will provide acceptable data redundancy.
However, in higher-performance environments with compute-intensive arrays, hardware RAID 5/6 will provide better performance than software-defined RAID. And if you need to scale up to RAID 10, you will probably not find software-based alternatives.