We have all heard that software defined networks are going to solve all of our networking problems. Many vendors are already using software defined storage and I think there might be more vendors coming to the table.
The questions I always ask myself: is this all good and what is going to happen and why? What are the challenges and issues that need to be considered when evaluating software solutions that were once designed using ASICs for both storage and networking?
As we see more storage systems running over standard networks with iSCSI and FCoE, I think we should lump these issues all together.
I think there are a number of challenges on both sides of the equation: ASIC-specific designs and designs using commodity hardware, for both storage and networks.
Before I get started it’s good to remember where we have come from, so we can understand why and how things have progressed over the years. The following references of performance on storage and networking reflect some of the design decisions that companies have made over the years:
The point is that once you get to storage, performance increases are pretty pathetic. Though Ethernet performance and CPU performance have increased somewhat, storage performance – except for SSD performance – has increased far less than two orders of magnitude. And we are just starting to use 40 Gbit Ethernet, so 10 GbE might be a better comparison for networking.
This raises the question: can commodity CPUs address the storage problem, and could they address the networking problem? Let’s look at the two methods for addressing the design of storage and networking hardware.
ASICs (Application-Specific Integrated Circuits)
On the ASIC side, both the hardware design and the verification process take a long time. Once those are complete, the software teams need to confirm that the software written for the ASIC works as expected with the hardware.
Sometimes minor ASIC hardware design issues can be fixed in software and sometimes they cannot. Once completed, ASICs are very fast at the tasks they were designed for. In the RAID storage world starting in the 1990s, ASICs were often designed for parity generation, as computational performance and latency were an issue.
ASICs are also used in network switches and routers, and in everything from SSDs to disk drives. So ASICs are used in many places, but the number of places is dwindling, given the performance increases in commodity CPUs and of course the availability of commodity software. There used to be a lot more fabrication houses making ASICs in the US and around the world than there are today.
The technology for, say, 45 nanometer design is very expensive, requiring billions of dollars of investment. And you need to start preparing to build a new plant at 32 nanometers. If you are a vendor requiring an ASIC for, say, 100,000 storage controllers a year (which means you are a pretty good-size vendor), you have to amortize the cost of the multi-billion dollar fab. Of course it is not just your company using the fab, but trust me, the cost is still pretty high.
On the networking side of the equation, if your company is a large networking vendor then maybe the number is more like 1,000,000 parts. But this is still a high cost even though the cost per unit will be lower.
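The effect of volume on per-part cost can be sketched with simple amortization. The one-time cost below is a made-up placeholder, not a figure from this article; only the two volumes (100,000 and 1,000,000 parts) come from the text. Whatever the real one-time cost is, ten times the volume gives one tenth the amortized cost per part.

```python
# Hypothetical one-time ASIC cost (design, verification, share of the fab).
# This number is an assumption for illustration only.
ASSUMED_ONE_TIME_COST_USD = 50_000_000


def amortized_cost_per_unit(one_time_cost_usd: float, units: int) -> float:
    """Spread a fixed one-time cost evenly across the parts shipped."""
    if units <= 0:
        raise ValueError("units must be positive")
    return one_time_cost_usd / units


# Volumes taken from the text: a good-size storage vendor and a large
# networking vendor.
storage_per_unit = amortized_cost_per_unit(ASSUMED_ONE_TIME_COST_USD, 100_000)
network_per_unit = amortized_cost_per_unit(ASSUMED_ONE_TIME_COST_USD, 1_000_000)

print(f"storage vendor:    ${storage_per_unit:,.2f} per part")
print(f"networking vendor: ${network_per_unit:,.2f} per part")
print(f"ratio: {storage_per_unit / network_per_unit:.0f}x")
```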
Commodity Hardware and Software
Today, lots of vendors are using commodity CPUs for storage controllers, and some are starting to look at doing this on the network side. On the storage side, from the low end RAID cards to the high end storage controllers, this has been going on for a number of years.
On the networking side, with the availability of PCIe 3.0, we are just starting to see products in this area. In some cases vendors are taking standard motherboards, while in other cases they are using design-specific motherboards. The reason, of course, should be no surprise to anyone: the cost of commodity hardware is far lower than development of an ASIC unless you are building a huge number of units.
The cost includes having electrical engineers develop an ASIC, simulate it, send it out for fabrication, and likely fix the problems – and getting the software working is not cheap either. On the other hand, you can buy a Sandy Bridge motherboard (soon Ivy Bridge) and get 40 GiB/sec of PCIe bandwidth per socket, with a free operating system (or one with purchased support) such as Linux or in some cases BSD, all for, say, about $1,000 per unit. If you include memory and the motherboard, it is likely around $5,000 on the high end.
The costs for these types of appliances are in the software development. For storage, one of the computations that will be required is parity generation, so you need some good programmers who will likely need to write in C or maybe even assembly language. As we move to other RAID methods such as declustering, that code will need to be ported, but for the most part storage vendors have been using commodity hardware for over a decade for management and monitoring.
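To make the parity computation concrete, here is a minimal sketch of single-parity XOR, the scheme used by RAID 5-style layouts. The article does not say which parity scheme any particular vendor uses, so treat this as an assumption. It is written in Python for readability, not speed; a shipping controller would do this in C or assembly, as noted above. The function names are illustrative only.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized data blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover one lost block by XOR-ing the survivors with the parity block."""
    return xor_parity(surviving + [parity])


# Three data blocks in one stripe, plus their parity block.
stripe = [b"disk0 data", b"disk1 data", b"disk2 data"]
parity = xor_parity(stripe)

# Pretend the second drive failed and rebuild its block from the rest.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
print(recovered)
```

Because XOR is its own inverse, the same routine generates parity and rebuilds a lost block, which is part of why this work moved so easily from ASICs to general-purpose CPUs.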
On the networking side we have had multithreaded TCP stacks for a long time in BSD and Linux, though BSD seems to have been an early choice for many. Large scale networking seems to be lagging storage, and for a good reason.
Why Networking Lags Storage
Development of storage controllers using commodity hardware is way ahead of large scale networking (many smaller network devices, such as your home router, use commodity hardware). This is true for a number of reasons, in my opinion.
1. Most storage controllers have been using commodity hardware for a number of years for controller and management.
2. Storage latency requirements for the most part – even with SSDs – are far more forgiving than network latencies.
3. Storage bandwidths are far less than network bandwidths on large controllers.
Some vendors used a real-time operating system but others have used commodity UNIX (Linux, BSD and others) operating systems for more than a decade. Though this might be the case for network management, networking bandwidths have required the development of specialized ASICs.
As you can see from the above table, storage performance has not kept pace with CPU performance by at least 10x. On the networking side, you might be able to get away with using commodity hardware for your home router, but it will not work and has never worked for the high speed networking side.
I remember back in 1997/8 that you could get about 94 MiB/sec out of the fastest RAID controller available at the time. Today that number is, say, 32 GiB/sec – an increase of roughly 349 times. Seemingly a significant increase, but it only took 10 disk drives to get the 94 MiB/sec back in 1997/8 and today it likely takes well over 500 drives to get the 32 GiB/sec. More on this later.
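The arithmetic behind this comparison can be checked with a short sketch. All inputs are the figures quoted in the paragraph; 500 drives is used as a floor for "well over 500", so the modern per-drive number is an upper bound.

```python
MIB_PER_GIB = 1024

old_mib_s = 94                  # fastest RAID controller, 1997/8
new_mib_s = 32 * MIB_PER_GIB    # 32 GiB/sec today
old_drives = 10
new_drives = 500                # "well over 500", so treat as a minimum

# Controller-level improvement.
controller_gain = new_mib_s / old_mib_s

# Throughput each drive contributes to the controller total.
per_drive_old = old_mib_s / old_drives
per_drive_new = new_mib_s / new_drives   # upper bound: more drives lowers it

print(f"controller gain: {controller_gain:.1f}x")
print(f"per drive then:  {per_drive_old:.1f} MiB/sec")
print(f"per drive now:   at most {per_drive_new:.1f} MiB/sec")
print(f"per drive gain:  at most {per_drive_new / per_drive_old:.1f}x")
```

Seen per drive, the gain over those years is less than 7x, which is the real point of the comparison.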
Storage vs. Network Latency
Back in the 1990s, the seek and latency time on disk drives was a bit over 12 milliseconds. Today that number has changed to just under 8 milliseconds for 4 TB drives, and for 2.5-inch 15K RPM drives, it’s under 4 milliseconds. Best case: a factor of 3x improvement.
Yes, SSDs have far less latency, but some of the SSD performance is limited by the operating system, as you have to get in and out of the operating system through the kernel, VFS layer and drivers down the channel to the device. Network latencies, on the other hand, are measured in milliseconds also, but are far lower in general than storage latencies.
You might have 1 million people expecting 10 millisecond latency on a Google or Yahoo search, but 1 million people getting 8 millisecond latency on disk requires about 1,000,000 disk drives. This is not going to happen, as that is 4,000 PB of storage or 4 exabytes. By design and expectation, networks are required to have far less latency than storage.
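The capacity figure follows directly from the drive count. This sketch assumes, as the paragraph does, that each simultaneous request needs its own seek and that a drive serves one seek at a time, so one drive is needed per request. It uses decimal units (1 PB = 1,000 TB).

```python
DRIVE_CAPACITY_TB = 4
concurrent_requests = 1_000_000

# Assumption from the text: one drive per simultaneous request, because a
# single drive can only service one seek at a time.
drives_needed = concurrent_requests

total_tb = drives_needed * DRIVE_CAPACITY_TB
total_pb = total_tb / 1000
total_eb = total_pb / 1000

print(f"{drives_needed:,} drives = {total_pb:,.0f} PB = {total_eb:.0f} EB")
```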
Network and Storage Bandwidth
Today we have networking backends with many terabits of bandwidth. In a perfect world and with a single Intel processor, you only get 320 Gbits/sec of bandwidth (40 GiB/sec per socket × 8 bits per byte). For example, the Cisco Catalyst 6807-XL chassis is capable of delivering up to 11.4 terabits per second of networking bandwidth. That would equal the bandwidth of over 35 Intel processors, plus all the communications overhead, so likely double the number of CPUs.
On the storage side, the fastest controller is around 32 GiB/sec. Let’s assume about 50 percent disk performance utilization and 121 MiB/sec maximum disk performance for a 4 TB drive, and you can run about 540 disk drives for about 2 PB of storage.
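Both estimates can be reproduced in a few lines. The inputs are the figures quoted above; the sketch follows the article's rounding in treating 40 GiB/sec as 320 Gbit/sec and in using decimal terabytes for capacity. The doubling for communications overhead is the article's rule of thumb, not a measured value.

```python
import math

# Networking: how many sockets match the switch?
socket_gbit_s = 40 * 8          # 40 GiB/sec per socket, 8 bits per byte
switch_gbit_s = 11_400          # 11.4 Tbit/sec chassis
sockets_ideal = switch_gbit_s / socket_gbit_s
sockets_with_overhead = math.ceil(sockets_ideal * 2)   # "likely double"

# Storage: how many drives saturate the fastest controller?
controller_mib_s = 32 * 1024    # 32 GiB/sec
drive_mib_s = 121 * 0.5         # 121 MiB/sec peak at 50 percent utilization
drives = controller_mib_s / drive_mib_s
capacity_pb = drives * 4 / 1000  # 4 TB drives

print(f"sockets, perfect world: {sockets_ideal:.1f}")
print(f"sockets with overhead:  {sockets_with_overhead}")
print(f"drives per controller:  {drives:.0f}")
print(f"capacity behind it:     {capacity_pb:.1f} PB")
```

The drive count moves by a few units depending on how the per-drive rate is rounded, so it is best read as "a bit over 500 drives and about 2 PB".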
What I Expect Will Happen
I have had a home NAS product for a few years that uses an Intel Atom for the RAID engine and management. It works great and has about the same performance as the high performance RAID I used back in 1997/8, all for about $600 including the storage.
High-end storage performance can easily and cost-effectively fit into today’s commodity hardware footprint, even with SSDs. Storage bandwidth clearly has not grown beyond the commodity bandwidth available with PCIe 3.0 and the CPU performance of Intel Sandy Bridge. Commodity hardware has grown to the point where RAID parity, declustering and, likely in the future, erasure codes (which older readers will know as forward error correction) can be done in the CPU without incurring significant latency and overhead.
This is not the case for high-end networking and will not be the case for the foreseeable future, given that bandwidth is limited by PCIe 3.0. You would need 35 Intel processors at $1,000 apiece, plus the motherboards and so on, plus the overhead. Multiply by two and it’s likely around $700,000 just for the hardware, or over $61K per terabit. And this makes the assumption that you can get 50% of the bandwidth from the CPUs. This costs far more than the equivalent terabits from Cisco’s hardware ASICs.
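The per-terabit figure is a single division. This sketch takes the $700,000 hardware estimate and the 11.4 Tbit/sec chassis bandwidth as given, and also backs out what that estimate implies per processor once the count of 35 is doubled.

```python
hardware_cost_usd = 700_000     # the article's estimate, taken as given
switch_tbit_s = 11.4            # bandwidth being matched
processors = 35 * 2             # 35 processors, doubled for overhead

cost_per_tbit = hardware_cost_usd / switch_tbit_s
implied_cost_per_processor = hardware_cost_usd / processors

print(f"cost per terabit:           ${cost_per_tbit:,.0f}")
print(f"implied cost per processor: ${implied_cost_per_processor:,.0f}")
```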
In a nutshell, what I see happening: storage will use commodity hardware, for the most part even with SSDs, just given the sales volume required to justify developing your own ASIC. But high-end networking will continue to develop its own ASICs, given that commodity hardware will not be able to meet the performance and latency needs of large core switches. Low and mid-range networks, which are not as latency sensitive, will continue to move toward software on commodity hardware.