Where is the Data Storage Innovation?
No, it is not time for my yearly predictions (that happens next month). But leading up to them, I want to look at innovation in data storage in general. Given today's applications and those to come, does innovation really matter, or is it in fact required for success?
Here are some of the challenges. At the basic technology level we have the interface, which is SATA vs. SAS. But a more fundamental challenge is this: big companies with lots of products and support versus smaller, newer, more flexible companies with innovative products and ideas.
I think the SATA vs. SAS question is a precursor of things to come. What seems to be left out of the vendor and product equation is the application. As usual I will do my best not to make product value judgments (good or bad). Where I need to name vendors, I will do so only to contrast the old guard with the new vendors trying to take over, and to report on what happened in the past.
So back to the question: is there innovation?
Innovation: SAS vs. SATA?
We can look at innovation from a number of places on the stack, but since I really like to start at the device and work my way back to the application, I will start with the disk drive interface. With that in mind let’s start with SAS vs. SATA.
It is not arguable that SAS is a more robust protocol that allows more error management at the drive, that it is a more efficient interface, and that it was historically a more expensive hard drive interface. These are historical facts. But the cost difference has largely disappeared over the last few years with the advent of dual-interface nearline SAS/SATA models among enterprise 3.5 inch drives.
SAS is, and has been for a number of years, the drive interface of choice for the enterprise, replacing fibre channel. In the past, enterprise drives generally cost more and were built for moderate capacity but high performance. SATA was built for high capacity but had lower reliability in many areas of the protocol.
This all changed in about 2009 when vendors came out with drives with a single ASIC that supported both SAS and SATA, which marked the end of the fibre channel drive interface era. SAS currently supports 12 Gbits/sec per lane, while the SATA interface still only supports 6 Gbits/sec per lane. Of course, for a single disk drive it does not matter, nor even for about six 4 TB drives. But for anything beyond that, 12 Gbits/sec makes a difference.
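As a back-of-the-envelope illustration of where the per-lane rate starts to matter, here is a minimal sketch. It assumes 8b/10b encoding on both the 6 Gbits/sec and 12 Gbits/sec links (so 80% of the line rate carries data) and a hypothetical sustained streaming rate of 150 MB/sec per drive. The function names and the per-drive figure are illustrative assumptions, not measurements, and real drive throughput varies with workload and position on the platter.

```python
def usable_mb_per_sec(line_rate_gbit, data_bits=8, total_bits=10):
    """Approximate payload bandwidth of one lane in MB/sec.

    Assumes 8b/10b encoding by default: 8 data bits per 10 bits on the wire.
    """
    line_rate_mbit = line_rate_gbit * 1000
    return line_rate_mbit * data_bits / (total_bits * 8)


def drives_per_lane(line_rate_gbit, drive_mb_per_sec):
    """Whole number of streaming drives one lane can feed at full rate."""
    return int(usable_mb_per_sec(line_rate_gbit) // drive_mb_per_sec)


# Hypothetical sustained streaming rate for one nearline drive (assumption).
DRIVE_MB_PER_SEC = 150

if __name__ == "__main__":
    for rate in (6, 12):
        print(rate, "Gbits/sec lane:",
              usable_mb_per_sec(rate), "MB/sec usable, about",
              drives_per_lane(rate, DRIVE_MB_PER_SEC), "drives")
```

Under these assumptions a 6 Gbits/sec lane is saturated by a handful of streaming drives, and doubling the line rate doubles the number of drives one lane can feed. Plugging in a lower per-drive figure, such as 100 MB/sec, moves the 6 Gbits/sec break-even point to six drives.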
So for your home, 6 Gbits/sec vs. 12 Gbits/sec makes no difference for disk drives and likely the same for SSDs. But it does make a big difference for the enterprise, as it reduces the number of chipsets and the amount of connectivity on the backend of a large storage controller and reduces the complexity of the design, i.e., it reduces cost.
It's not about the speed of the interface, but how much more SAS bandwidth one is able to utilize compared to SATA. The latest 12 Gbits/sec offerings definitely provide more lanes and more bandwidth for their users and less complexity for storage design.
The current plan for SATA is to move to 8 Gbits/sec sometime next year. It will be interesting to see how many disk drive vendors move in this direction, given that the performance of SAS will be 50% greater (12 Gbits/sec vs. 8 Gbits/sec). And with 24 Gbits/sec SAS on the horizon (disk connectors supposedly available in 1Q14), I do not see why anyone would use SATA in the near future.
One of the things that still puzzles me is that Intel CPUs, which at one time were supposed to have SAS built into the CPU, still have SATA. This is going to be a problem in the future, as SAS will dominate the enterprise and SATA is going to be relegated to the low end of the market. Beyond performance and error control, you cannot support things like T10 Protection Information (PI)/Data Integrity Field (DIF) in SATA, and never will.
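For readers who have not looked at PI/DIF, the idea can be sketched in a few lines. Each 512-byte sector carries 8 extra bytes: a 2-byte guard tag (a CRC-16 of the data using polynomial 0x8BB7), a 2-byte application tag, and a 4-byte reference tag. The sketch below is loosely modelled on the case where the reference tag holds the low 32 bits of the logical block address; the function names `protect` and `verify` are hypothetical, and this is an illustration of the concept, not a conforming implementation.

```python
import struct

T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial used for the guard tag
SECTOR_SIZE = 512      # data bytes per sector in this sketch


def crc16_t10dif(data):
    """Bitwise CRC-16, polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect(sector, lba, app_tag=0):
    """Return the sector followed by 8 bytes of protection information."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("sector must be exactly 512 bytes")
    guard = crc16_t10dif(sector)
    ref_tag = lba & 0xFFFFFFFF  # low 32 bits of the logical block address
    return sector + struct.pack(">HHI", guard, app_tag, ref_tag)


def verify(protected, expected_lba):
    """True only if the guard tag and the reference tag both match."""
    sector, pi = protected[:SECTOR_SIZE], protected[SECTOR_SIZE:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    return (guard == crc16_t10dif(sector)
            and ref_tag == (expected_lba & 0xFFFFFFFF))
```

The guard tag catches data corrupted in flight or at rest, and the reference tag catches a block that is intact but was read from or written to the wrong address. The point of the scheme is that every hop from the host adapter to the drive can run the same check.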
Conclusion: SATA is a dying technology for the enterprise and will not be used in the future. Anyone using it or claiming SATA is for the enterprise should be told – clearly – it is not.
Innovation: Old Guard vs. New Guard?
The traditional enterprise storage companies, EMC, HDS, IBM, NetApp and the rest, are being challenged by new players like Caringo, Cleversafe, Crossroads, Kaminario, Nimbus Data, PureStorage, SanDisk, SpectraLogic, Violin Memory and others. These are vendors with both hardware-only and embedded hardware/software solutions.
Most of the new vendors are developing products in one or both of two areas: storage tiering and storage interfaces.
Storage tiering has been around for a long time. I remember that back in 2002 there was a company named Cereva, whose assets were purchased by EMC. They had a product that placed heavily used data on the outer cylinders of the disk and less used data on the inner cylinders. On paper this made sense, but in practice (because of disk errors and remapping of blocks) it did not make much sense.
Fast forward a decade, and with flash drives and flash PCIe cards, vendors are building caching products that move blocks to flash devices based on usage. That sounds like a great idea for applications that are read intensive, but how do these products fit into a world where Seagate has moved the same technology into the disk drive? Who is going to own that business, Seagate or the new caching controller vendors? Does the world need an all-flash storage device for everything, or does the product solve an important problem for some environments without being important to the broad market? Lots of questions. But the answer depends, I think, on the application.
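The usage-based caching idea can be sketched as a toy model. This is not any vendor's algorithm: the class name `HotBlockCache`, the read-count threshold, and the eviction rule are all assumptions chosen to keep the example short, and it models reads only.

```python
from collections import Counter


class HotBlockCache:
    """Toy usage-based tiering: promote frequently read blocks to flash."""

    def __init__(self, flash_capacity, promote_after=3):
        self.flash_capacity = flash_capacity  # blocks the flash tier can hold
        self.promote_after = promote_after    # reads before a block is promoted
        self.read_counts = Counter()
        self.flash = set()

    def read(self, block):
        """Record a read and return the tier that served it."""
        self.read_counts[block] += 1
        if block in self.flash:
            return "flash"
        if self.read_counts[block] >= self.promote_after:
            self._promote(block)
        return "disk"

    def _promote(self, block):
        """Copy a hot block to flash, evicting the coldest resident if full."""
        if self.flash_capacity <= 0:
            return
        if len(self.flash) >= self.flash_capacity:
            coldest = min(self.flash, key=lambda b: self.read_counts[b])
            if self.read_counts[coldest] >= self.read_counts[block]:
                return  # nothing colder than the candidate, so leave flash as is
            self.flash.remove(coldest)
        self.flash.add(block)


if __name__ == "__main__":
    cache = HotBlockCache(flash_capacity=1, promote_after=2)
    print([cache.read(7) for _ in range(3)])
```

Even this toy shows why the answer depends on the application: a workload that rereads a small set of blocks ends up served from flash, while a workload that streams through data once never reaches the promotion threshold and gets nothing from the cache.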
Moving up the stack
Storage interfaces are changing also. The REST interface is getting lots of market traction, and the fact that Seagate has announced a REST-based product confirms this change in the market.
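To show what changes when storage is addressed through REST-style verbs rather than block addresses, here is a minimal in-memory sketch. The `ObjectStore` class and its `handle` method are hypothetical and do not represent the API of any product mentioned here; the status codes follow common HTTP conventions.

```python
class ObjectStore:
    """Toy in-memory object store addressed by REST-style verbs and paths."""

    def __init__(self):
        self._objects = {}

    def handle(self, method, path, body=b""):
        """Dispatch one request and return (status_code, response_body)."""
        if method == "PUT":
            created = path not in self._objects
            self._objects[path] = bytes(body)
            return (201 if created else 200), b""
        if method == "GET":
            if path in self._objects:
                return 200, self._objects[path]
            return 404, b""
        if method == "DELETE":
            if self._objects.pop(path, None) is not None:
                return 204, b""
            return 404, b""
        return 405, b""  # verb not supported by this sketch


if __name__ == "__main__":
    store = ObjectStore()
    print(store.handle("PUT", "/photos/cat.jpg", b"\xff\xd8"))
    print(store.handle("GET", "/photos/cat.jpg"))
```

The application names a whole object by key and lets the device or service decide where the bytes live, instead of reading and writing numbered blocks. That shift in who owns placement is what makes the interface change interesting further up the stack.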