More and more vendors are making wild claims about their appliances with Flash cache. Most RAID controller vendors and NAS providers are planning to add Flash to their product designs, which on the surface seems like a good idea, as Flash offers significantly more cache capacity than standard DRAM. And since the storage stack latency for NAS is often greater than for SAN, a cache with a bit more latency usually does not hurt performance.
All of this reminds me of something a late friend of mine, Larry Schermer, used to say back in the 1980s when he was working on some of the first Cray solid state disks (SSDs). He said, “Cache is good if you are reusing data or if it is large enough to handle the data being written to reduce latency. Otherwise, cache management will eat your lunch.”
The point is clear: if your data does not fit in cache, you are not doing small writes that can be coalesced, and you are not reusing data, then cache is not going to help much – it might actually hurt. Viewed through Larry’s analysis framework, the issues surrounding caching Web pages as opposed to caching other data are pretty interesting. Over the last year, an increasing number of vendors have told me that their cache-based appliances work for all applications and dramatically increase I/O performance. I know that this is simply not true for all applications.
The latest crop of storage appliances has been designed with more bandwidth between the cache and the hosts than between the cache and the back-end storage. This means that if you are doing a streaming write from an application (or multiple applications) that does not fit in cache, performance will be limited by the bandwidth between the cache and the back-end storage. All of this is pretty obvious to me, but clearly not to the vendors making these claims.
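A quick back-of-the-envelope sketch in Python makes the point. The bandwidth and cache figures below are made up for illustration and are not taken from any particular product:

# Rough sketch: sustained streaming-write throughput when the data
# written greatly exceeds the cache.  All numbers are hypothetical.

host_to_cache_gbs = 8.0     # GB/s between the hosts and the cache (assumed)
cache_to_disk_gbs = 2.0     # GB/s between the cache and back-end storage (assumed)
cache_size_gb     = 512.0   # Flash cache size in GB (assumed)
write_size_gb     = 4096.0  # total streaming write from the application(s)

if write_size_gb <= cache_size_gb:
    # The write fits in cache, so the hosts see the front-end bandwidth.
    sustained_gbs = host_to_cache_gbs
else:
    # Once the cache fills, the appliance can only absorb data as fast
    # as it can drain it to the back-end disks.
    sustained_gbs = min(host_to_cache_gbs, cache_to_disk_gbs)

print("Sustained write throughput: %.1f GB/s" % sustained_gbs)

With these numbers the hosts see 2 GB/s, not 8 GB/s, as soon as the cache fills.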
The purpose of this article is to provide some background on what works and what does not – and, since some of the largest uses of data storage are Web-based, on what works well on the Web. The real issue is data access patterns for both the Web and local storage: what is being accessed, and when.
Web Access
Almost all website data is read, not written. Take the website on which you are reading this article right now, for example. The only writing you do is when you post comments (hopefully all positive) about this and other articles. The only other writes that you incur – and they are not something you can control – pertain to logging. Websites often do lots of logging, but these are very small I/O requests. User-initiated writes on most websites, from CNN and Yahoo! Finance to Enterprise Storage Forum, are small and infrequent. Even for sites that are updated constantly, such as CNN, the ratio of reads to writes is often 1,000 to 1, even with all of the logging. For a website like CNN, caching the data could be very expensive given the sheer amount of audio and video data. You might be able to cache some of the heavily used files, but how much sense does that make? All of this started me thinking about Web latency and data and, of course, disk drives.
While sitting in my house in St. Paul, Minn., I decided to ping a few sites. I have a cable connection from a big-name cable provider with a DOCSIS-3 modem and Wireless N connection. Here are some times:
(ms=milliseconds)
ping www.yahoo.com
Pinging any-fp.wa1.b.yahoo.com [209.191.122.70] with 32 bytes of data:
Reply from 209.191.122.70: bytes=32 time=45ms TTL=51
Reply from 209.191.122.70: bytes=32 time=44ms TTL=51
Reply from 209.191.122.70: bytes=32 time=44ms TTL=51
ping google.com
Pinging google.com [74.125.95.99] with 32 bytes of data:
Reply from 74.125.95.99: bytes=32 time=53ms TTL=49
ping washington.post.com
Pinging washington.post.com [216.246.74.34] with 32 bytes of data:
Reply from 216.246.74.34: bytes=32 time=76ms TTL=51
Reply from 216.246.74.34: bytes=32 time=94ms TTL=51
Reply from 216.246.74.34: bytes=32 time=48ms TTL=51
Reply from 216.246.74.34: bytes=32 time=50ms TTL=51
ping nsf.gov
Pinging nsf.gov [128.150.4.107] with 32 bytes of data:
Reply from 128.150.4.107: bytes=32 time=56ms TTL=238
Reply from 128.150.4.107: bytes=32 time=53ms TTL=238
Reply from 128.150.4.107: bytes=32 time=53ms TTL=238
Since I am near the middle of the country, it is not surprising that the ping times are roughly the same – within about 10 percent whether the site is on the east or the west coast. For the most part, I route through Chicago and then out to somewhere. Faster routers might shave off a few milliseconds, but network latency is almost always going to be higher than disk latency unless you are close to the site. The speed of light and its associated latency are not going to change anytime soon.
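To put rough numbers on the speed-of-light point, here is a small Python sketch. The distances from St. Paul are approximations, and real routes are longer and add router queuing, so actual pings will always be higher than these floors:

# Minimum round-trip time imposed by the speed of light in optical fiber.
# Distances are rough straight-line approximations for illustration.

SPEED_OF_LIGHT_KM_S = 300000.0
FIBER_FRACTION      = 0.66    # light in fiber travels at roughly 2/3 of c

def min_rtt_ms(distance_km):
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION)
    return 2.0 * one_way_s * 1000.0

for city, km in [("Chicago", 650), ("New York", 1900), ("San Francisco", 2550)]:
    print("St. Paul to %s: at least %.0f ms round trip" % (city, min_rtt_ms(km)))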
Next, I pinged some of my cable provider’s nearby routers, all of which were within about 20 miles of my home, to determine the latency. The pings ranged from as low as 7 ms to as high as 74 ms to the same router, with an average of about 15 ms. I also pinged my home cable modem, and the latency was less than 1 ms.
In comparison, the latency of enterprise-class 3.5 inch SATA disk drives (average seek plus average rotational latency) is approximately 14 ms. The average read seek plus rotational latency for the latest 15K RPM 2.5 inch SAS drives is less than 5 ms. Clearly, network latency accounts for a far greater percentage of overall end-to-end latency for most Web access unless the data is very close to your location.
Given these latencies, and given that we all have enough disk bandwidth (the bandwidth of an OC-192 channel is roughly equal to the streaming bandwidth of about eight enterprise SATA drives), I do not understand all of the hype over Flash cache for Web-serving devices: most of the latency is not in the local storage system but in the network between the user and the website. Sure, you can save maybe 10 percent to at most 30 percent of the latency, but at what cost?
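Here is a simple Python sketch of that arithmetic, using the ping times above and the drive latencies just discussed; the Flash read latency is my assumption:

# Rough end-to-end latency sketch for a Web request, in milliseconds.
# Network time comes from the pings above; the Flash figure is assumed.

network_ms   = 45.0   # round trip to a site such as Yahoo
sata_disk_ms = 14.0   # enterprise 3.5 inch SATA: average seek + rotational latency
sas_disk_ms  = 5.0    # 15K RPM 2.5 inch SAS
flash_ms     = 0.2    # assumed Flash cache read latency

for name, disk_ms in [("SATA", sata_disk_ms), ("SAS", sas_disk_ms)]:
    total      = network_ms + disk_ms
    with_flash = network_ms + flash_ms
    saved_pct  = 100.0 * (total - with_flash) / total
    print("%s: %.0f ms total, %.0f ms with a Flash cache hit (%.0f%% saved)"
          % (name, total, with_flash, saved_pct))

Even when every read hits the Flash cache, the end-to-end saving sits in that 10 percent to 30 percent range.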
Other Data Access
Accessing local data using a large Flash cache instead of a small DRAM cache might make some sense, assuming that my friend Larry’s corollary is followed. What works really depends on the size of the data being accessed. Almost all devices cache data based on block address, since in the SAN world they are block storage devices. There are some exceptions in the NAS world, but NAS devices are not mind readers and do not know which parts of a file are going to be accessed.
What it really comes down to is whether the range of data being accessed exceeds the cache, for both reads and writes. Also realize that some vendor offerings have separate read and write caches and might use DRAM for the write cache, given the latency and performance issues with Flash, and Flash for the read cache. But the same issues still hold: if the range of read data exceeds the cache, a Flash cache will not help performance very much. If your range of read data is 5x your cache, you can expect only about a 20 percent reduction in average latency.
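A back-of-the-envelope Python sketch shows where that 20 percent comes from, assuming uniformly random reads across the data range; the cache size, data range, and Flash latency below are illustrative assumptions:

# Expected average read latency when the data range exceeds the cache,
# assuming uniformly random access.  All figures are assumptions.

cache_size_gb = 200.0
data_range_gb = 1000.0    # 5x the cache
hit_rate      = min(1.0, cache_size_gb / data_range_gb)   # roughly 0.20

disk_ms  = 14.0    # enterprise SATA read latency
flash_ms = 0.2     # assumed Flash cache read latency

avg_ms    = hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms
saved_pct = 100.0 * (disk_ms - avg_ms) / disk_ms

print("Hit rate: %.0f%%, average latency: %.1f ms, reduction: %.0f%%"
      % (100.0 * hit_rate, avg_ms, saved_pct))

The hit rate is roughly the ratio of cache size to data range, so most reads still pay the full disk latency.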
The cache hit rate statistics reported by the device will likely look much higher, because reported hits are based on the relationship between physical reads from the disks and the blocks the application requests. For example, if data is allocated sequentially, the controller reads a full 1 MB stripe, and the application reads in 256 KB requests, the controller will report three hits and one miss.
Using the same example, if the full stripe read is 2MB for sequentially allocated data, the result would be seven hits and one miss.
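The arithmetic is simple enough to sketch in a few lines of Python; the stripe and request sizes mirror the example above:

# Reported cache "hits" when a controller reads a full stripe on the first
# miss and the application then reads sequentially in smaller requests.

def reported_hits_and_misses(stripe_kb, request_kb):
    requests = stripe_kb // request_kb   # application requests per stripe
    misses   = 1                         # the first request pulls in the stripe
    hits     = requests - misses         # the rest are counted as hits
    return hits, misses

for stripe_kb in (1024, 2048):           # 1 MB and 2 MB stripes
    hits, misses = reported_hits_and_misses(stripe_kb, 256)
    print("%d KB stripe, 256 KB requests: %d hits, %d miss(es)"
          % (stripe_kb, hits, misses))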
Many vendors calculate cache hit rates this way, and such statistics are of little value when you are trying to understand actual data reuse. The benefit of cache in a local data environment will depend on the factors above. My friend Larry Schermer’s analysis framework from more than 20 years ago still holds true.
Final Thoughts
Data that is accessed locally and is latency-sensitive, such as a database index table, is far more sensitive to the latency of the local devices; access the same database 1,000 miles away over a WAN and the WAN dominates. Since the greatest part of the latency in the remote case is the WAN – not the storage device – does cache matter that much? If you are reading and re-reading the same data over and over for a hot new item, it might help with back-end storage contention, but is it worth a cost of maybe 5x or more per GB for Flash? I do not know, but it must be a consideration.
Henry Newman, CEO and CTO of Instrumental Inc. and a regular Enterprise Storage Forum contributor, is an industry consultant with 29 years experience in high-performance computing and storage.