Servers are growing more and more powerful, but decades-old storage controller technology has not kept pace. This puts storage at an extreme disadvantage just as companies are growing huge infrastructures of virtual and physical systems, applications of all types and very large volumes of primary data.
Advances in solid-state drives have helped by introducing higher performance/lower latency into the computing stack. Now customers do not necessarily have to sacrifice simplicity or budgets for performance improvements. However, these SSD products – which may be at the server, network or storage layers -- can be very expensive. In addition, they're complex to deploy and manage even at the enterprise level, let alone the mid-market.
Customers want and need the performance boost from SSD products, but they are asking if the improvement is worth the expense, management complexity and confusion surrounding SSD offerings.
The answer is: it depends.
Why Flash Matters to Applications
Storage-based SSD solves serious performance issues by providing fast IO processing at the storage controller. That makes it ideal for high performance and low latency storage requirements.
The wrench in the works is that all-flash arrays are quite expensive and small capacity compared to hard disk. That puts them out of reach for most mid-sized businesses and renders them unnecessary for the majority of data center applications.
Some critical Tier 1 application environments can take advantage of dedicated all-flash arrays or even card SSDs at the host to dramatically accelerate applications whose service levels are measured in microseconds. Most mainstream applications do not need the stratospheric IOPS numbers advertised for all-flash storage systems, but there are still crucial performance requirements for mission-critical applications such as databases and OLTP applications, SharePoint, Exchange, virtual servers, and virtual desktops. While businesses of all sizes must maintain high random IO performance and low latency for these applications, they cannot justify paying the premium prices for dedicated all-flash storage.
However, there is a way to apply the high SSD performance with reasonable cost per GB: hybrid storage systems that combine SSDs and spinning disk to provide high performance storage at a reasonable cost. In this environment, the flash-based storage controller provides intelligent storage management. SSDs provide fast performance for more active data, and back-end disk houses the majority of data.
Yet even in this category, systems differ from one another in several important respects. One of the most important differentiations is in fact one of the least obvious: does the array use cache or tiering to deliver active data into the SSD layer? At first glance these two approaches do not look too different, but in fact their differences over time can be substantial.
- Automated Tiering.Data tiering has been around for years. It is the process of matching data value to storage costs. Traditional tiering moves disk-based data from primary down to nearline, archival and tape tiers for economical long-term retention.
Primary tiering using SSD has a very different meaning. SSD tiering moves data back and forth between Tier 0 SSDs and Tier 1 fast spinning disk. Movement occurs in response to application usage, a manual command or a schedule. The data moves, not copies, so it must be highly protected using some form of SSD RAID. These are not granular moves; GB-size blocks and file volumes will move even if the actual data request is very small. Additional RAID calculations may impact performance and add to the cost with additional SSD capacity.
- Caching.Caching has also been around for a long time, but SSD caching changes the equation considerably. Rather than tiering, which requires administrators to understand IO usage and set policies, caching uses predictive algorithms to copy data to the SSD layer for fast IO processing.
The storage array may do write caching as well as read caching; we much prefer to see both. Write caching buffers incoming writes and efficiently aggregates them before doing disk writes. Using SSDs for the read cache enables the array to have a much larger cache, which increases the amount of data the cache can store and accelerate.
Advantages of Caching
Flash caching drives dramatic performance gains by keeping active data copies on very fast flash cache. This saves money since customers do not have to increase spindles to increase performance. Some customers can even replace expensive SAS with SATA because they gain so much performance from the SSD cache.
Customers also save money because flash caching can be effective with a relatively small amount of flash and does not require heavy management overhead because they can leave optimization to the caching algorithms.
Caching’s predictive learning algorithms takes that problem out of administrators’ hands by learning and applying the optimal placement for best application performance. Larger SSD capacity, granularity, simplicity and redundancy also work into the caching equation:
- Larger flash capacity. Caching has benefited from important advances in flash SSD sizes, which have reached TB capacity. Entire large worksets can fit into cache for very fast IO processing.
- Granularity. Cached data copies are highly granular. Flash tiering moves large chunks of data even if a just small block or byte has changed. But caching can copy large or highly granular data. This minimal data copy dramatically increases performance over large data movements between a flash tier and backend disk.
- Simplicity. Management simplicity saves on ongoing expenses, which is key to cost savings over the life of the array. Letting the storage system provide good caching algorithms is preferable to requiring IT to figure it out data placement and matching best performance for an application’s needs.
- Fast reads and writes. Read and write flash caching greatly accelerates performance over spinning disk drives. Reads are considerably faster than writes, but caching vendors will usually employ write acceleration technologies just as writing from DRAM, aggregating writes and employing logs.
- Accelerates overall storage performance. All applications sharing the SSD storage will increase performance and decrease latency. Benefits spread to all end-users.
- Real-time vs. periodic.Most automated tiering solutions do not operate in real-time but offer manual commands, factory-set data movement schedules or policy-controlled data movement between tiers. Flash cache operates in real-time to automate and optimize copy movement between tiers as the workload changes.
Let’s look at an example of a hybrid storage array with SSD-based caching, the Nexsan NST5000. The array combines gigabytes of DRAM, terabytes of high capacity flash drives and SAS or SATA for long term spinning disk capacity.
Nexsan calls its cache technology FastTier. It performs reads and writes in its high-speed DRAM and flash layer for extremely high performance and low latency. Frequently used data can granularly stage to the cache for fast processing. The NST5000 does not need to RAID its SSD read cache because the data always lives on the underlying RAID’ed spinning disks. Performance for sequential files is stellar; performance for random IO files – by far the harder task – hits 100,000 CIFS op/sec at 1.23ms latency.
SSD cache helps to solve the big business problems of poor performance and high latency for mainline applications, without a large per/GB price tag. When utilized within hybrid storage arrays, SSD-based cache consolidates high performance storage onto a shared storage system. These solutions cost a fraction of all-flash and, in many cases, just slightly more than an all disk based system, but they greatly multiply IO performance over traditional storage arrays.
Christine Taylor is an analyst specializing in data storage and information governance at Taneja Group.