Enterprise storage environments continue to evolve, which means that storage hardware systems are evolving in step. Here's a look at the storage hardware that makes a modern storage environment tick, including flash arrays, NVMe, hyperconvergence and the various other storage technologies that are slinging data across today's data centers.
Conceptually, storage infrastructure management is pretty straightforward. On a fundamental level, IT professionals use a combination of hardware tools and processes to ensure the timely delivery of data to end users, application servers and other IT systems in accordance with an organization's objectives and policies, while making certain that data is stored appropriately along the way.
In this regard, it's essential that IT professionals tasked with managing data storage systems always keep a few things in mind.
- Storage Capacity
Simply put, an enterprise's storage systems must have the sheer space or capacity to accommodate the amount of data generated by its users and applications.
One obvious scenario to avoid is running out of disk space. Budget-minded IT leaders are also keen on avoiding overprovisioning, or acquiring much more storage capacity than necessary. Overprovisioning can lead to cost overruns, both from upfront spending on unused storage and from the ongoing expense of maintaining excess capacity.
A good grasp of an organization's storage requirements and proper planning are key to right-sizing one's storage network.
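As a rough illustration of right-sizing, the sketch below projects how much capacity an organization might need a few years out. It assumes compound annual data growth plus a safety margin; the function name, the growth rate and the 20 percent headroom default are illustrative assumptions, not figures from any vendor or survey.

```python
def projected_capacity_tb(current_tb: float, annual_growth: float,
                          years: int, headroom: float = 0.2) -> float:
    """Estimate the capacity to provision, in TB.

    current_tb    -- data stored today
    annual_growth -- assumed yearly growth rate (0.5 means 50 percent)
    years         -- planning horizon
    headroom      -- assumed safety margin on top of the projection
    """
    if current_tb < 0 or years < 0:
        raise ValueError("current_tb and years must be non-negative")
    # Compound growth over the planning horizon.
    grown = current_tb * (1 + annual_growth) ** years
    # Add headroom so the system does not run at 100 percent utilization.
    return grown * (1 + headroom)


if __name__ == "__main__":
    # Hypothetical example: 100TB today, 50 percent yearly growth, 2 years.
    print(f"{projected_capacity_tb(100, 0.5, 2):.1f} TB")
```

Comparing this projection with what is actually installed gives a quick sense of whether an environment is headed for a shortfall or is already overprovisioned.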
- Storage Availability
Data that's trapped in a storage system is of no use to anyone. Reliable and resilient storage architectures help ensure that data is available exactly when it's needed. This is a major reason why enterprise storage systems fetch premium prices.
- Storage Performance
Speed is another important consideration. Although users may not mind a delay of a second or two while they fetch a file from shared storage, critical business applications and high-volume transaction systems typically don't fare well if they're kept waiting.
Not all data storage media are the same. Different performance and capacity characteristics command different prices.
In general, high-performance storage devices like SSDs (solid-state drives) are faster yet more expensive on a per-GB basis than spinning HDDs (hard-disk drives). The same is true when comparing HDDs to tape.
Hierarchical storage management is a method of moving data between high-cost, high-performance systems and lower-cost, lower-performance systems. This requires a tiered storage environment (more on this later), where data starts out on fast arrays that serve critical applications and, as it ages and loses value, is moved to slower, HDD-packed disk storage systems and, in many instances, eventually to tape.
Let's examine some of the components that comprise modern enterprise storage environments.
- Object storage systems
Popularized by cloud storage providers, object storage is gaining ground in enterprise storage environments.
Data is represented by objects that are stored in a flat structure, unlike file systems that organize files in a hierarchical manner. Unlike block storage, objects can also contain rich metadata, which opens up all sorts of storage management and analytics options and innovative ways to extract more value out of an organization's data.
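To make the flat structure and the role of metadata concrete, here is a minimal in-memory sketch. The class names, method names and metadata fields are hypothetical and do not correspond to any particular vendor's or cloud provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy object store: one flat namespace of keys, no directories."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata) -> None:
        # The key is an opaque string; slashes carry no hierarchy.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def find(self, **criteria) -> list[str]:
        # Query by metadata rather than by walking a directory tree.
        return sorted(
            key for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


if __name__ == "__main__":
    store = FlatObjectStore()
    store.put("invoices/2018/001.pdf", b"...", department="finance")
    store.put("logs/app.log", b"...", department="it")
    print(store.find(department="finance"))
```

The `find` method hints at why metadata matters: management and analytics tools can select objects by their attributes instead of by where they happen to sit in a folder hierarchy.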
- All-flash and hybrid-flash storage systems
Adding SSDs and other types of speedy flash storage devices to arrays is becoming an increasingly popular way of improving storage performance for databases and applications. They typically serve as a higher storage tier, given their expense and the relatively high level of performance they deliver.
As the term suggests, all-flash arrays are completely decked out in flash storage. Hybrid-flash systems employ a mix of flash storage and traditional hard drives, blending the performance of SSDs with the comparatively lower cost and higher capacity of HDDs.
- Predictive Storage Analytics
Another way enterprises are getting the most out of their data storage investments is by performing data analytics on their storage infrastructures. The insights gleaned from this approach help organizations optimize their systems, manage capacity more efficiently and prevent system failures.
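One simple form this can take is capacity forecasting. The sketch below fits a straight line to daily usage samples and estimates how many days remain before a system fills up. It assumes linear growth and evenly spaced samples, and the function name is hypothetical; commercial analytics products use far more sophisticated models.

```python
from typing import Optional, Sequence


def days_until_full(usage_tb: Sequence[float],
                    capacity_tb: float) -> Optional[float]:
    """Estimate days left before usage reaches capacity.

    usage_tb holds one sample per day, oldest first. Returns None
    when usage is flat or shrinking, since no fill date can be projected.
    """
    n = len(usage_tb)
    if n < 2:
        raise ValueError("need at least two samples")
    # Ordinary least-squares fit of usage against the day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(usage_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage_tb))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Day index at which the fitted line crosses the capacity limit.
    full_day = (capacity_tb - intercept) / slope
    # Measure from the most recent sample; never report a negative value.
    return max(0.0, full_day - (n - 1))


if __name__ == "__main__":
    # Hypothetical samples: usage growing 1TB per day on a 20TB system.
    print(days_until_full([10, 11, 12, 13], capacity_tb=20))
```

A forecast like this can feed an alert that fires well before a volume fills up, giving administrators time to add capacity or move data.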
- Non-Volatile Memory Express (NVMe)
As fast as SSDs are, they are still susceptible to bottlenecking, depriving applications of flash storage's full performance potential. Longstanding bus architectures and storage protocols like SAS and SATA, which were designed in the days of slow HDDs, often fall short in efficiently pumping enough data between today's multi-core CPUs, copious amounts of RAM and flash storage devices.
Enter Non-Volatile Memory Express (NVMe). To alleviate those bottlenecks, this storage protocol and host controller interface uses a system's high-performance Peripheral Component Interconnect Express (PCIe) bus to move data between flash storage and multi-core processors at higher speeds and lower latency than SAS and SATA.
- Hard disk and tape systems
Although they may have lost some of their luster with the advent of enterprise-grade flash storage solutions, HDDs still play a critical role in the data center. HDD-based storage systems are still "fast enough" for a variety of database and application workloads at capacities that eclipse SSDs.
Similarly, tape, along with the drives and libraries that handle it, remains a go-to backup and archival solution. Although slower than SSDs and HDDs, tape still wins in price/capacity comparisons. As of this writing, an LTO-8 tape, priced at well under $200, can store 12TB of raw data or 30TB compressed.
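The price/capacity point can be checked with quick arithmetic. The sketch below treats $200 as an upper bound on the cartridge price, since the figure above is "well under $200," so the results are ceilings rather than actual prices. The helper name is hypothetical, and the calculation ignores the cost of drives and libraries.

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Media cost per terabyte, ignoring drives, libraries and labor."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


if __name__ == "__main__":
    price_ceiling = 200.0  # upper bound taken from the text, not a quote
    print(f"raw:        under ${cost_per_tb(price_ceiling, 12):.2f} per TB")
    print(f"compressed: under ${cost_per_tb(price_ceiling, 30):.2f} per TB")
```

Even at that ceiling, the media works out to less than $17 per raw terabyte, and less than $7 per terabyte if the data compresses at the rated ratio.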
- Redundant Array of Independent Disks (RAID)
RAID, an acronym for Redundant Array of Independent Disks (originally Inexpensive Disks), is a technology that distributes or duplicates data across multiple drives, which appear as a single logical drive. This can enable better overall storage performance, increased capacity and, perhaps most importantly, better fault tolerance.
With fault tolerance, data remains accessible if a drive should fail, an important consideration when managing critical information. RAID levels define how data is doled out among disks in a RAID set.
RAID level 0, for example, spreads block-level data across multiple drives to improve performance but does not provide redundancy. RAID level 1, on the other hand, offers disk mirroring, which can deliver up to twice the read performance of a single disk while maintaining copies of data that users can fall back on if a drive fails.
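The difference between the two levels is easy to see in a toy model. The sketch below represents each disk as a Python list of blocks. It only illustrates how data is laid out; it is not how a real RAID controller is implemented, and the function names are hypothetical.

```python
def stripe_raid0(blocks: list, num_disks: int) -> list:
    """RAID 0: deal blocks out round-robin; no copy is kept anywhere."""
    if num_disks < 1:
        raise ValueError("need at least one disk")
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks


def mirror_raid1(blocks: list, num_disks: int = 2) -> list:
    """RAID 1: every disk holds a full copy of every block."""
    if num_disks < 2:
        raise ValueError("mirroring needs at least two disks")
    return [list(blocks) for _ in range(num_disks)]


def usable_capacity_tb(level: int, num_disks: int, disk_tb: float) -> float:
    """Usable space for a set of identical disks at RAID 0 or RAID 1."""
    if level == 0:
        return num_disks * disk_tb  # all space holds unique data
    if level == 1:
        return disk_tb              # the other disks hold copies
    raise ValueError("only RAID 0 and RAID 1 are modeled here")


if __name__ == "__main__":
    data = ["A", "B", "C", "D", "E"]
    print(stripe_raid0(data, 2))
    print(mirror_raid1(data))
```

Losing one disk in the striped layout takes some of the blocks with it, while the mirrored layout still has a complete copy. The price of that protection is capacity: two mirrored 4TB drives yield 4TB of usable space, versus 8TB when striped.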
As noted earlier, each type of data storage system or medium has its own price, capacity and performance characteristics.
Placing rarely-accessed archival files and other low-value data on an expensive and high-performance all-flash array makes as little sense as using cheap-but-slow tape for an organization's day-to-day storage requirements. Simply put, it's not the most efficient way of making the most out of one's IT investments.
This is where storage tiering comes in.
Organizations often rely on hierarchical storage management or automated tiered storage capabilities to automatically place data on an appropriate storage tier according to policies that reflect an enterprise's application performance objectives and other factors.
Typically, as data ages, it is moved from expensive, Tier 1 storage arrays used for mission-critical workloads down to other tiers composed of less expensive and lower-performance storage media such as tape. Often, storage vendors will refer to their flash-enabled arrays as Tier 0, reflecting the performance edge that newer SSD-based arrays have over traditional high-end HDD arrays.
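A tiering policy of this kind can be sketched in a few lines. The example below picks a tier based solely on how long ago data was last accessed. The tier names and the 30-day and 180-day thresholds are illustrative assumptions; real tiering products typically weigh additional factors such as access frequency and application priority.

```python
from datetime import date

# Hypothetical policy: (maximum age in days, tier), fastest tier first.
TIER_POLICY = [
    (30, "tier0-flash"),
    (180, "tier1-hdd"),
]
ARCHIVE_TIER = "archive-tape"  # anything older than the last threshold


def choose_tier(last_accessed: date, today: date) -> str:
    """Return the tier where data of this age belongs under TIER_POLICY."""
    age_days = (today - last_accessed).days
    for max_age_days, tier in TIER_POLICY:
        if age_days <= max_age_days:
            return tier
    return ARCHIVE_TIER


if __name__ == "__main__":
    today = date(2018, 6, 1)
    for accessed in (date(2018, 5, 20), date(2018, 3, 1), date(2017, 1, 1)):
        print(accessed, "->", choose_tier(accessed, today))
```

Running a check like this on a schedule, and migrating whatever has crossed a threshold, is the essence of automated tiering. Keeping the policy in a simple table makes it easy to adjust as performance objectives change.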
In the past, the typical enterprise network had servers on one side and storage systems on the other, figuratively speaking. It's an arrangement that worked, allowing server and storage teams to focus on their respective domains of expertise to deliver IT services and manage application workloads.
Then came converged infrastructure, which bundles computing, networking and storage components into an integrated system that is typically based on commodity hardware and uses server virtualization. If implemented properly, this approach can vastly improve application and database performance at a lower cost. This was followed by hyperconvergence.
Hyperconverged infrastructures (HCI) combine all of the above and use software-defined approaches to provide virtual computing, storage and networking services. HCI allows applications and data to be fully abstracted from physical hardware and can be used to create massive pools of shared capacity at lower capital and operating expenditures.