As the saying goes, the best I/O is the I/O you don't have to do, but the reality is that even with I/O virtualization and virtualized I/O technologies (see I/O, I/O, It's Off to Virtual Work We Go), there's still a need to perform I/O operations to store or access data in a more effective manner.
The need for more effective I/O performance is linked to the decades-old and still growing server-to-storage I/O performance gap, where the performance of hard disk drive (HDD) storage has not kept pace with the decrease in cost and increase in reliability and capacity of server processing power. You can read more about data center bottlenecks and the server-storage performance I/O gap here.
With the growing awareness of power, cooling, floor space and associated green and eco (ecological and economical) issues affecting IT data centers and storage, solid state devices (SSD) have reemerged as a solution to address multiple woes. SSD is not new technology, having been around for decades, but over the last couple of years there has been renewed interest, with new forms of SSD-class technologies, along with new packaging options and market price bands (consumer, SMB, mid-market, enterprise). Even EMC has gotten in on the act, with a big announcement just last week (see EMC Goes Solid State).
Almost 20 years ago, as an early adopter and launch customer of open systems SSD from the company formerly known as DEC, I recall the price being in the low six figures for a couple hundred megabytes (yes, that's MB) of SSD (two devices mirrored for resiliency) that were very big and bulky. These prices are embarrassing by today's standards; however, they were a bargain compared to the mainframe SSD solutions that came before them.
Traditionally speaking, SSD has been based on dynamic random access memory, known as DRAM, which is what's installed in your computer and commonly known as RAM or memory. DRAM is volatile memory, and it is also found as cache in many storage systems to boost the performance of HDDs. The benefit of using RAM is that it is significantly faster than doing I/O operations (reads or writes, with throughput measured in I/O operations per second, or IOPS) to an HDD, in that there are no moving parts to delay seek and transfer time, as is the case with even the fastest HDDs.
Some SSD vendors will claim there is no latency. However, do your homework and you will find it is more along the lines of nominal to not noticeable rather than non-existent. When you look at a classic storage I/O access pattern, there is the I/O command initiation, seek or positioning, and then data transfer time. Faster media and interfaces improve data transfer time, and eliminating seek time boosts performance further. In the case of SSD, seek time is essentially eliminated and media transfer times are reduced if not eliminated, leaving the bulk of I/O time to transfer time over a particular interface, command or protocol overhead, and I/O pre- and post-processing on the application server.
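To put rough numbers on that breakdown, here is a minimal Python sketch that adds up command overhead, positioning and transfer time for a single I/O. The `Device` class, the function names and every timing figure below are my own illustrative assumptions, not vendor specifications, so treat the results as a way to see the shape of the comparison rather than as measurements.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical per-I/O timing components for one storage device."""
    name: str
    command_overhead_ms: float  # command initiation plus protocol overhead
    positioning_ms: float       # seek plus rotational latency (near zero for SSD)
    transfer_rate_mb_s: float   # sustained media/interface transfer rate


def service_time_ms(dev: Device, io_size_kb: float) -> float:
    """Time for one I/O: command overhead + positioning + data transfer."""
    transfer_ms = (io_size_kb / 1024.0) / dev.transfer_rate_mb_s * 1000.0
    return dev.command_overhead_ms + dev.positioning_ms + transfer_ms


def max_serial_iops(dev: Device, io_size_kb: float) -> float:
    """Upper bound on IOPS when I/Os are issued one at a time."""
    return 1000.0 / service_time_ms(dev, io_size_kb)


# Assumed figures, chosen only for illustration.
hdd = Device("HDD (assumed)", command_overhead_ms=0.1,
             positioning_ms=5.5, transfer_rate_mb_s=100.0)
ssd = Device("DRAM SSD (assumed)", command_overhead_ms=0.1,
             positioning_ms=0.0, transfer_rate_mb_s=400.0)

for dev in (hdd, ssd):
    t = service_time_ms(dev, io_size_kb=8)
    print(f"{dev.name}: {t:.3f} ms per 8 KB I/O, "
          f"about {max_serial_iops(dev, 8):.0f} IOPS")
```

Note that even with positioning time set to zero, the SSD's service time is not zero: command overhead and transfer time remain, which is the point about "no latency" claims.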
Sounds great, so why don't we have more SSDs installed? Simply put, the answer is cost. A myth has been that SSD in general costs too much when compared to HDDs, and when you compare strictly on a cost per GB or TB basis, HDDs are cheaper. However, if you compare on the ability to process I/Os and the number of HDDs, interfaces, controllers and enclosures needed to achieve the same level of IOPS, bandwidth, transactions or useful work, then SSD should be more cost-effective. The other downside to DRAM compared to HDD, besides its cost on a capacity basis, is that electrical power is needed to preserve data.
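The difference between comparing on capacity and comparing on work done can be sketched in a few lines of Python. The `Option` class, the prices, capacities and IOPS figures are all hypothetical values I made up for illustration, and the sketch leaves out the interfaces, controllers and enclosures mentioned above, which would only add to the HDD side of the ledger.

```python
import math
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical device characteristics; not real product data."""
    name: str
    unit_cost: float    # cost per device, in arbitrary currency units
    capacity_gb: float  # usable capacity per device
    iops: float         # sustained IOPS per device


def devices_needed(opt: Option, required_gb: float, required_iops: float) -> int:
    """Devices needed to meet both the capacity and the IOPS requirement."""
    by_capacity = math.ceil(required_gb / opt.capacity_gb)
    by_iops = math.ceil(required_iops / opt.iops)
    return max(by_capacity, by_iops)


def summarize(opt: Option, required_gb: float, required_iops: float) -> dict:
    """Cost of a configuration sized for the workload, plus unit metrics."""
    count = devices_needed(opt, required_gb, required_iops)
    return {
        "devices": count,
        "total_cost": count * opt.unit_cost,
        "cost_per_gb": opt.unit_cost / opt.capacity_gb,
        "cost_per_iops": opt.unit_cost / opt.iops,
    }


# Assumed figures for a small, I/O-intensive workload.
hdd = Option("HDD (assumed)", unit_cost=500.0, capacity_gb=300.0, iops=180.0)
ssd = Option("SSD (assumed)", unit_cost=10000.0, capacity_gb=64.0, iops=20000.0)

for opt in (hdd, ssd):
    print(opt.name, summarize(opt, required_gb=200.0, required_iops=20000.0))
```

With these assumed numbers the HDD wins easily on cost per GB, yet the HDD configuration needs far more devices to reach the IOPS target and ends up costing more in total. Change the workload to one that is capacity-bound and the answer flips, which is why the comparison basis matters.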
DRAM has great performance capabilities for reads or writes, but there is a myth that SSD is only for small random IOPS, which was the case in early generations and for some current generations. There are DRAM SSD solutions that support Fibre Channel and InfiniBand that handle both small random IOPS and large sequential throughput workloads.
DRAM SSDs have over the years addressed data persistence issues with battery-backed cache or in-cabinet UPS devices that maintain power to memory when primary power is turned off. SSDs have also combined battery backup with internal HDDs, where the HDDs are either standalone, mirrored or parity protected and powered by a battery, to enable DRAM to be flushed (de-staged) to the HDDs in the event of a power failure or shutdown. While DRAM-based SSDs can exhibit significant performance advantages over HDD-based systems, SSDs still require electrical power for internal HDDs, DRAM, battery (charger) and controllers. Note that if you are concerned about green and environmental issues, you should be concerned about batteries and their safe disposal (e.g., WEEE and RoHS).
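As a back-of-the-envelope check on the de-staging approach, the battery has to keep the DRAM, controllers and internal HDDs running at least as long as it takes to copy the contents of memory to disk. The Python sketch below estimates that time; the function names, the safety factor and the example figures are assumptions of mine for illustration, not figures from any particular product.

```python
def destage_seconds(dram_gb: float, hdd_write_mb_s: float) -> float:
    """Seconds needed to copy all of DRAM to the internal HDDs.

    hdd_write_mb_s is the aggregate sustained write rate of the de-stage
    target. Mirrored drives write the same data twice, so mirroring does
    not raise this rate.
    """
    if hdd_write_mb_s <= 0:
        raise ValueError("write rate must be positive")
    return dram_gb * 1024.0 / hdd_write_mb_s


def battery_is_sufficient(dram_gb: float, hdd_write_mb_s: float,
                          battery_seconds: float,
                          safety_factor: float = 1.5) -> bool:
    """True if the battery hold-up time covers de-staging with some margin."""
    needed = destage_seconds(dram_gb, hdd_write_mb_s) * safety_factor
    return battery_seconds >= needed


# Assumed example: 64 GB of DRAM flushed at 80 MB/s.
print(f"De-stage time: {destage_seconds(64, 80):.1f} s")
print("25-minute battery OK:", battery_is_sufficient(64, 80, battery_seconds=1500))
```

The takeaway is that as DRAM capacity grows, either the de-stage target has to get faster or the battery has to get bigger, and the battery is the part with the disposal concerns.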
A Flashy Alternative
Unlike DRAM, which does not preserve data when power is turned off, flash-based memory is non-volatile, meaning no power is required to preserve the data on the medium. That, along with its low cost per capacity point, is why flash has risen in popularity. Flash is widely used in low-end USB thumb drives and MP3 players like the iPod, given its lower power, low cost, good capacity points and ability to preserve data when powered off.
The downside to flash-based memory is that its performance, particularly on writes, is not as good as DRAM-based memory, and historically, flash has had a limited duty cycle in terms of how many times the memory cells can be rewritten or updated. With current generations of enterprise-class flash, these duty cycles are much higher than what you would get with consumer or throw-away class flash products.
The best of both worlds (hybrid approach) is to use RAM as a cache in a shared storage system combined with caching algorithms to maximize cache effectiveness, optimize read-ahead and write behind and parity to boost performance. Flash is then used for data persistence as well as lower power consumption and improved performance compared to an all HDD-based storage system.
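The hybrid idea can be illustrated with a toy write-behind cache in Python. This is only a sketch under my own assumptions: the `WriteBehindCache` class is hypothetical, a plain dict stands in for the flash tier, eviction is simple least-recently-used, and real storage systems add read-ahead, parity and power-fail protection that are not modeled here.

```python
from collections import OrderedDict


class WriteBehindCache:
    """Toy RAM cache in front of a slower persistent store (a dict here)."""

    def __init__(self, backing_store: dict, capacity: int):
        self.store = backing_store   # stands in for the flash tier
        self.capacity = capacity     # number of blocks held in RAM
        self.cache = OrderedDict()   # block id -> (data, dirty flag)
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)   # mark as most recently used
            return self.cache[block][0]
        self.misses += 1
        data = self.store[block]            # slow path: fetch from flash
        self._insert(block, data, dirty=False)
        return data

    def write(self, block, data):
        # Write-behind: the write completes once the data is in RAM.
        self._insert(block, data, dirty=True)

    def _insert(self, block, data, dirty):
        self.cache[block] = (data, dirty)
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            old_block, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:
                self.store[old_block] = old_data   # de-stage on eviction

    def flush(self):
        """Persist every dirty block, for example at shutdown."""
        for block, (data, dirty) in list(self.cache.items()):
            if dirty:
                self.store[block] = data
                self.cache[block] = (data, False)


flash = {0: "a", 1: "b", 2: "c"}
cache = WriteBehindCache(flash, capacity=2)
cache.write(1, "B")   # held in RAM only, flash still has the old value
cache.flush()         # now persisted to flash
print(flash[1], cache.hits, cache.misses)
```

The design trade-off is visible in `write`: acknowledging from RAM is what makes writes fast, and it is also why the RAM tier needs protection against power loss until dirty blocks reach flash.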
Consequently, you can expect to pay more for an enterprise-class flash-based storage system, since it will have some DRAM for a cache buffer to speed up reads and writes to flash as well as some form of data integrity guards above and beyond normal memory parity bits. Given the lower general cost of flash even in an enterprise configuration, more data can now be stored on SSD and achieve the benefits of lower power consumption and better performance at a more effective price per transaction or workload.
For example, building on what Texas Memory Systems (TMS) announced in 2007 with its RAM-SAN 500, which has a RAM cache and flash-based storage for persistence, EMC just announced flash-based SSD installed in its DRAM cache-centric DMX series of storage systems. For EMC, SSD, and DRAM-based SSD in particular, is nothing new. The company leveraged the technology back in the late 1980s and in what ultimately became the successful DMX line of large cache-centric storage systems.
These and other announcements bring validation and have placed SSD back in the industry spotlight. At the entry, desktop and prosumer levels, there have been many other instances of SSD appearing, mostly leveraging flash, like Samsung's offerings to replace or supplement HDDs in laptops, or the FusionIO card that installs into a computer I/O slot to provide directly attached flash storage over an internal PCI bus. These are in addition to the traditional DRAM SSD vendors of component and storage solutions who continue to enhance their solutions from a performance, price and packaging standpoint.
With "green" storage and data center issues such as power, cooling and floor space becoming more important (see PACE Yourself to Meet Storage Power and Cooling Needs), we are seeing the discussion shift from power avoidance to an awareness of doing more with what you have for energy efficiency. That includes boosting performance while reducing energy consumption, and intelligent power management, where power consumption can be reduced without compromising application performance or availability.
There is an interesting parallel here between the oil energy crisis of the 1970s and the current buzz around green IT and storage. During the 1970s oil crisis, there was the race to conserve and avoid energy consumption, and we are seeing similar messages around power avoidance for storage, with first-generation MAID generating lots of headlines. The next wave for energy conservation was introducing more energy-efficient vehicles. The analogous story for storage is intelligent power management, along with storage that can do more work (activity, transactions, files accessed, IOPS or bandwidth) per watt of energy than in the past. The subsequent waves will involve adopting best practices, including better data and storage management, archiving, compression, de-duplication and other forms of data footprint reduction, along with other techniques to do more with what resources you have to sustain growth.
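Work per watt is easy to compute once you have activity and power figures for a configuration. The short Python sketch below shows the arithmetic; the `StorageConfig` class and all of the numbers in it are hypothetical values chosen for illustration, not measurements of any product.

```python
from dataclasses import dataclass


@dataclass
class StorageConfig:
    """Hypothetical activity and power figures for one configuration."""
    name: str
    iops: float
    bandwidth_mb_s: float
    capacity_gb: float
    watts: float


def per_watt(cfg: StorageConfig) -> dict:
    """Activity and capacity delivered per watt of power drawn."""
    if cfg.watts <= 0:
        raise ValueError("watts must be positive")
    return {
        "iops_per_watt": cfg.iops / cfg.watts,
        "mb_s_per_watt": cfg.bandwidth_mb_s / cfg.watts,
        "gb_per_watt": cfg.capacity_gb / cfg.watts,
    }


# Assumed figures for an HDD shelf and an SSD unit.
hdd_shelf = StorageConfig("HDD shelf (assumed)", iops=2700.0,
                          bandwidth_mb_s=1200.0, capacity_gb=4500.0, watts=300.0)
ssd_unit = StorageConfig("SSD unit (assumed)", iops=100000.0,
                         bandwidth_mb_s=1500.0, capacity_gb=500.0, watts=250.0)

for cfg in (hdd_shelf, ssd_unit):
    print(cfg.name, per_watt(cfg))
```

With these assumed figures the SSD unit is far ahead on IOPS per watt while the HDD shelf is ahead on capacity per watt, so which one is "greener" depends on whether the workload is active or idle data.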
There is an old myth that SSD is only for databases and that SSD does not work with files. The reality is that in the past, given the cost of DRAM-based solutions, specific database tables or files, indices, log or journal files or other transient performance-intensive data were placed onto SSD. If the database was small enough or your budget large enough, then the entire database may have been put on SSD. Today, given the cost of DRAM and flash, many new applications and usage scenarios are leveraging SSD technologies. For example, NFS filer data access can be boosted using caching I/O accelerators like those from Gear6. Here's a link to an industry trends and perspectives solutions brief about the Business Benefits of FLASH SSD, along with use cases that should help put some things into perspective.
While not NAS-based, traditionally speaking, the way to manually address I/O performance issues with SSD has been to use "Hot File" I/O analysis tools to spot performance bottlenecks or files with excessive I/O activity as candidates for moving to block-based SSD. Given the focus the last several years on driving up storage utilization, in some cases without a focus on performance, it's just a matter of time before we see storage and IT resource management tools put more focus on performance to help leverage higher-performing tiers of storage. In other words, a shift from energy conservation to energy efficiency.
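The kind of ranking a "Hot File" tool performs can be sketched in Python as follows. This is a simplified illustration under my own assumptions: the `FileStats` type, the greedy selection by I/O density, and the file paths and figures are all hypothetical, and a real tool would gather its statistics from the operating system or storage system over time.

```python
from typing import List, NamedTuple


class FileStats(NamedTuple):
    """Hypothetical per-file I/O statistics from a monitoring interval."""
    path: str
    size_gb: float
    ios_per_sec: float


def pick_hot_files(stats: List[FileStats], ssd_capacity_gb: float) -> List[FileStats]:
    """Greedily choose files with the most I/O per GB that fit on the SSD."""
    candidates = [f for f in stats if f.size_gb > 0]
    ranked = sorted(candidates,
                    key=lambda f: f.ios_per_sec / f.size_gb,
                    reverse=True)
    chosen, used_gb = [], 0.0
    for f in ranked:
        if used_gb + f.size_gb <= ssd_capacity_gb:
            chosen.append(f)
            used_gb += f.size_gb
    return chosen


# Made-up sample data for illustration only.
sample = [
    FileStats("/db/data01.dbf", size_gb=80.0, ios_per_sec=1600.0),
    FileStats("/db/redo.log", size_gb=2.0, ios_per_sec=900.0),
    FileStats("/archive/2007.tar", size_gb=200.0, ios_per_sec=5.0),
    FileStats("/db/index01.dbf", size_gb=10.0, ios_per_sec=1500.0),
]

for f in pick_hot_files(sample, ssd_capacity_gb=16.0):
    print(f"{f.path}: {f.ios_per_sec / f.size_gb:.0f} IOPS per GB")
```

Ranking by I/O per GB rather than by raw I/O count is a deliberate choice here: the large data file has the most total activity, but the small log and index files deliver more benefit for each GB of expensive SSD capacity they consume.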
There is a long list of vendors who were at one time in the SSD business, including Imperial, MTI, Quantum via DEC and StorageTek, among others. Vendors currently involved with DRAM and flash-based solid state devices, components or solutions include Bitmicro, Curtis, Dell, EMC, Gear6, Imation, Mtron, STEC, Samsung, SanDisk, Seagate, SolidData, Ridata, TMS and Toshiba, among others.
For those who think the HDD is now dead, well, in actuality, just as disk is helping to keep magnetic tape around, SSD (both DRAM and flash) will in fact help take some performance pressure off HDDs so they can be leveraged in more efficient and economical ways, similar to what disk is to tape today. I expect that we will also see some vendors start to back up their performance and energy claims using standard benchmarks as "A Barometer" of capabilities, including SPC, SPEC, TPC and Microsoft ESRP, among others.
Keep in mind, however, that these benchmarks should be taken with a grain of salt as an indicator of capability, and that the true comparison is how the solution scales with stability to meet your specific needs. I'm not going to hold my breath, but maybe we will finally see EMC take part in the Great Benchmarking Battle (see Hitachi Reopens Benchmark Debate).
To wrap up for now, 2007 was the year for industry to prepare for the revival of SSD and flash-based technologies. 2008 will be the year of awareness, with adoption by vendors and early deployment by customers. 2009 will be the broader adoption phase for more energy-efficient storage technologies that do not sacrifice performance while doing more work with less energy. Learn more at www.storageio.com/xreports.htm and www.greendatastorage.com or drop me a note about what's on your mind regarding SSD and other energy-efficient technologies to sustain IT growth without hurting application service delivery.
Greg Schulz is founder and senior analyst of the StorageIO group and author of "Resilient Storage Networks" (Elsevier).