SSDs, pNFS Will Test RAID Controller Design
RAID controllers designed for hard disks and IOPS aren't going to cut it in the new world of solid state drives (SSDs) and pNFS, which are capable of feeding so much more data into the channel that streaming bandwidth is going to become the problem for RAID vendors.
RAID vendors look at me like I'm nuts when I say this. Data storage is an IOPS world, they say, so why should I care about streaming performance? For those of you who don't know, IOPS is the number of I/O requests per second, while streaming bandwidth is how many gigabytes per second the controller can deliver to a server or servers.
Disk drives can support only a limited number of random IOPS, but flash drives can sustain orders of magnitude more. Add parallel NFS to the picture and the number of I/O requests per second won't matter if the controller has relatively poor streaming I/O.
Common wisdom has held until now that I/O is random. This may have been true for many applications and file system allocation methodologies in the recent past, but with new file system allocation methods, pNFS and most importantly SSDs, the world as we know it is changing fast. RAID storage vendors who say that IOPS are all that matters for their controllers will be wrong within the next 18 months, if they aren't already. Flash-based SSDs, file system design changes and NFSv4.1 (pNFS) will affect everything end-to-end, from high-end arrays to low-end SAS and SATA.
Let's start with what flash SSDs are going to do to RAID controllers. Flash SSDs have extremely low latency and are capable of up to 55,000 random reads per second, as claimed by Bitmicro. As Bitmicro did not provide any information on the request size, let's assume it's small. Since a disk sector is 512 bytes, 55,000 IOPS at that size would translate to 26.9 MB/sec. At the other end of small block random testing are 8192 byte I/O requests, which are likely the largest requests still considered small block I/O. At 55,000 requests per second, that translates into 429.7 MB/sec, which is more than the 400 MB/sec a 4Gb Fibre Channel interface can deliver. (As a side note, I hate when vendors quote numbers but then leave it to you to figure out how they got them.) I doubt that they used random 512 byte I/O; more likely it was something like 1024 byte or 4096 byte requests, but who knows. So I looked on the Intel (NASDAQ: INTC) Web site and found this:
The reason I picked 8 lanes in a PCIe 2.0 bus is that it is the highest performance I generally see in controller designs. I also saw a number of SSDs just below and around the performance range of the Intel SSD. Most RAID controllers use a PCIe bus to interface with either caches or the backend channels. There are actually some RAID controllers that do not even use PCIe 2.0 (500 MB/sec per lane) and still use PCIe 1.1 (250 MB/sec per lane). As you can see, even when doing purely random I/O to the SSD, the I/O quickly becomes a bandwidth issue given the high performance of the SSD. 29 SSDs might seem like a lot, but if the drives were doing larger blocks and sequential I/O at 300 MB/sec, just seven of them (2,100 MB/sec) would saturate a PCIe 1.1 bus with 8 lanes (2,000 MB/sec).
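The arithmetic in the last two paragraphs can be sketched in a few lines of Python. This is a rough illustration, not a benchmark. The function names are made up for this example, a megabyte is taken as 2^20 bytes because that convention reproduces the 26.9 and 429.7 MB/sec figures, and the per-lane rates are the nominal ones quoted above with no allowance for protocol overhead.

```python
import math

# Assumption: 1 MB = 2**20 bytes, which reproduces the 26.9 and 429.7 figures.
BYTES_PER_MB = 1024 * 1024

# Nominal per-lane rates quoted in the article, in MB/sec.
PCIE_LANE_MB_PER_SEC = {"1.1": 250, "2.0": 500}


def iops_to_mb_per_sec(iops: int, request_bytes: int) -> float:
    """Bandwidth generated by `iops` fixed-size requests per second."""
    return iops * request_bytes / BYTES_PER_MB


def bus_mb_per_sec(generation: str, lanes: int) -> int:
    """Nominal bandwidth of a PCIe link: per-lane rate times lane count."""
    return PCIE_LANE_MB_PER_SEC[generation] * lanes


def drives_to_saturate(bus_mb: float, drive_mb: float) -> int:
    """Smallest number of drives whose combined rate reaches the bus rate."""
    return math.ceil(bus_mb / drive_mb)


# Bitmicro's claimed 55,000 IOPS at a range of small request sizes.
for size in (512, 1024, 4096, 8192):
    rate = iops_to_mb_per_sec(55_000, size)
    print(f"55,000 IOPS at {size:>4} bytes = {rate:.1f} MB/sec")

# 8-lane buses, filled by drives streaming at 300 MB/sec each.
for gen in ("1.1", "2.0"):
    bus = bus_mb_per_sec(gen, 8)
    count = drives_to_saturate(bus, 300)
    print(f"PCIe {gen} x8 = {bus} MB/sec, filled by {count} drives")
```

Nominal rates are an upper bound. A real link delivers less than its per-lane rating, so saturation arrives with fewer drives than this sketch suggests.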
My conjecture is that SSDs change random problems into streaming problems, and current RAID designs cannot address the resulting bandwidth issues. That is before even taking into account the command queues required in the controller to support the large number of IOPS. SSDs will require a complete rethink of IOPS and streaming.
File System Changes
Many file system vendors are realizing that more and more environments have a bi-modal distribution of file sizes. Most file systems have many small files, which do not take up much space, and fewer larger files that take most of the space. As file system vendors learned about this trend, they began to increase the allocation sizes for large files. What this means is that today we have a number of file systems that support large allocations of 16 MB or greater, which changes the writing and reading of large blocks of data from an IOPS problem to a streaming I/O problem requiring bandwidth. More and more file systems are beginning to write and read larger blocks, given the overhead of managing many small allocations. Having bigger allocations reduces the overhead of managing the file system allocation maps and therefore improves file system performance and reduces fragmentation.
Many RAID vendors do not understand file systems and file system changes and the effect these changes will have on their hardware. This has been my experience; even if you talk with a RAID vendor who also sells file systems and applications, they still do not understand the combined impact.
Not Your Father's NFS
The old NFS protocol makes very small requests and therefore looks like an IOPS problem even if the data is sequentially allocated in the file system. Most NFS servers are designed to handle hundreds to thousands of connections and address the IOPS issues caused by each of those connections making small requests. pNFS changes all of this, allowing for large transfers if the data is sequentially allocated. Combined with 10Gb Ethernet, I expect that over time, more file systems will be able to stream data (see The Future of NFS Arrives).
The Future is IOPS and Streaming
When someone tells me that the world is an IOPS problem or a streaming problem, I suspect they do not understand I/O, what happens with a request and how file systems work, because both are required. If you use SSDs, and most of us will in the future, and you have enough requests queued to the device and RAID controller, you will be able to stream I/O via the SSDs. Note what I just said: You have to have enough requests (IOPS) and then you will stream. Storage controllers of the future are going to need to manage a large queue of requests on the host side (IOPS) and execute those requests by sending them to SSDs that stream data.
Combine this with the changes in file systems and pNFS and it becomes clear that both IOPS and streaming are required for balanced performance. The world needs storage controllers (we call them RAID today, but who knows what they will be called in the future) that can support huge command queues on the front end and streaming I/O on the back end. Some of the storage controller vendors must understand these requirements, and I am sure that some are working on updates. I am also convinced that others do not get it, and these vendors risk getting left behind by the market.
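The point that enough queued requests turn into a stream can be made concrete with Little's law, which says throughput equals the number of requests in flight divided by the time each request takes. The sketch below is only an illustration under stated assumptions: the function names are invented here, the 100 microsecond per-request latency is a hypothetical round number rather than a measured SSD figure, and a megabyte is 10^6 bytes.

```python
def streaming_mb_per_sec(queue_depth: int, request_bytes: int,
                         latency_us: int) -> float:
    """Little's law: requests in flight / time per request, as MB/sec."""
    requests_per_sec = queue_depth * 1_000_000 / latency_us
    return requests_per_sec * request_bytes / 1_000_000


def queue_depth_for(target_mb_per_sec: int, request_bytes: int,
                    latency_us: int) -> int:
    """Smallest queue depth that sustains the target rate."""
    # MB/sec times microseconds gives the bytes that must be in flight.
    bytes_in_flight = target_mb_per_sec * latency_us
    # Ceiling division without floating point.
    return -(-bytes_in_flight // request_bytes)


# Hypothetical latency, chosen only to make the arithmetic easy to follow.
LATENCY_US = 100

for depth in (1, 4, 16, 64):
    rate = streaming_mb_per_sec(depth, 4096, LATENCY_US)
    print(f"queue depth {depth:>2}: {rate:.1f} MB/sec")

needed = queue_depth_for(400, 4096, LATENCY_US)
print(f"4096 byte requests need a queue depth of {needed} for 400 MB/sec")
```

With a single request outstanding, the device spends most of its time waiting on the host. Bandwidth only approaches the limit of the link once the queue is deep enough, which is why the front-end command queue and the back-end streaming rate have to be sized together.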
You'll need both IOPS and streaming I/O in future products to move data quickly. Ask your vendor for full duplex (write and read at the same time) bandwidth to disk (not cache) for the RAID controller, and ask for a block diagram so you can count the controllers and number of PCIe buses and figure out the upper end of performance. Make sure you're covered for the brave new world we're about to enter.
Henry Newman, a regular Enterprise Storage Forum contributor, is an industry consultant with 28 years of experience in high-performance computing and storage.