What are the factors that are causing SAN storage to fall from dominance, and what does the rise of appliances mean for the storage industry?
Recently, I have been thinking about the design of past, current and future storage platforms. It is pretty clear to me and lots of other people that our industry is quickly moving away from SAN storage and local file systems to storage-based appliances.
Thinking about this phenomenon, I have asked a few of my friends, "Are we having to dumb down storage architectures because we do not have the storage talent to manage the complexity we have, or are we moving to the appliance model because it is the natural progression?"
In my mind, this is a classic "which came first, the chicken or egg" sort of question.
Oh, how the world has changed from seven years ago when SAN dominated the storage environment.
The SAN revolution started with the introduction of Fibre Channel around 1997. It continued with little competition until about 2007, when 10 Gbit Ethernet hit the market and took a big bite out of the SAN market.
During this time, there were major changes in the market. Linux became dominant in lots of areas, and there was little progress in file system development. Why was progress so limited? Was it because Linux was free, file systems are very difficult to develop, and no one wanted to pay for a file system?
NAS and other storage appliances allowed for simplified management of large storage environments. The file system, storage and management were all combined into a single framework.
In a number of large organizations, SAN management and file system management were handled by different groups. I frequently saw lots of infighting between these groups, but the big issue was integration. The SAN management people often did not tell the file system group about the underlying storage architecture. So the file system group often created a file system that was not optimized for the storage.
Were they using 7+1 RAID-5 LUNs (often used with some enterprise RAIDs), which, of course, do not match application allocations? Or were they using 4+1 RAID-5? What should the stripe sizes be set to? What should the file system allocation be to match the stripe sizes and the underlying storage architecture? All of this was very confusing for most sites and significantly impacted performance.
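To make the mismatch concrete, here is a minimal sketch of the arithmetic. The 64 KiB per-disk stripe unit and the 1 MiB allocation size are assumed values chosen for illustration, not settings from any particular array or file system, and the function names are hypothetical.

```python
KIB = 1024


def full_stripe_bytes(data_disks: int, chunk_bytes: int) -> int:
    """Bytes of user data in one full RAID-5 stripe (parity excluded)."""
    return data_disks * chunk_bytes


def is_stripe_aligned(io_bytes: int, data_disks: int, chunk_bytes: int) -> bool:
    """True if an I/O of io_bytes covers a whole number of full stripes."""
    return io_bytes % full_stripe_bytes(data_disks, chunk_bytes) == 0


# Assumed values for illustration only.
CHUNK = 64 * KIB          # per-disk stripe unit
ALLOCATION = 1024 * KIB   # file system allocation / application I/O size

for label, data_disks in (("7+1", 7), ("4+1", 4)):
    stripe = full_stripe_bytes(data_disks, CHUNK)
    aligned = is_stripe_aligned(ALLOCATION, data_disks, CHUNK)
    print(f"{label}: full stripe = {stripe // KIB} KiB, "
          f"1 MiB allocation aligned: {aligned}")
```

Under these assumptions, the 7+1 layout has a 448 KiB full stripe, so a 1 MiB allocation never lands on a stripe boundary, while the 4+1 layout has a 256 KiB full stripe and the same allocation covers exactly four stripes. Partial-stripe writes on RAID-5 generally force the controller to read old data or parity before writing, which is one reason this kind of mismatch hurt performance.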
Even if the groups worked together or everyone was in a single group, there was a big learning curve for tuning each RAID device, file system, network and architecture. Since these were often different parts from different vendors, staff had lots of different training classes to go to. That, of course, cost money and time.
In addition, organizations had limited performance tools that could tell them what the problems were and which mistakes were made. It was very difficult to get everything optimally configured without spending a pile of money.
SAN vendors making good margins might put people on-site. Or organizations paid consultants to fix performance problems (I know, as we did a good amount of work doing this). In the end, the costs of SANs were far more than just the cost of the hardware.
At the same time, NAS appliances were getting faster and easier to configure and use. And many studies touted their lower cost of ownership.
Before 10 Gbit Ethernet, there was only 1 Gbit Ethernet. Compared to SAN channels, it was pretty slow. Then came 10 Gbit Ethernet, which was faster than the fastest SAN channel of the time, 8 Gbit Fibre Channel. It remained the fastest option until mid-2012, when 16 Gbit Fibre Channel was released at about the same time as PCIe 3.0 servers hit the market. Of course, 10 Gbit Ethernet with NAS protocols has more overhead than a SAN (SCSI), but it is fast enough and scales well enough compared to SANs.
As I look back on 2012, I see it as a year when lots of things started to change in storage. There was a clear emerging trend toward application-specific appliances. Hadoop appliances, correlation appliances, other big data analysis appliances and large parallel file system appliances all showed significant market growth and attracted a significant number of new vendors.
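For a rough sense of the raw link speeds involved, the sketch below works out the payload rate of each link after line encoding. The line rates and encodings are the commonly quoted nominal figures, stated here as assumptions. The calculation ignores the NAS and SCSI protocol overhead mentioned above, so it only compares the wires, not end-to-end throughput.

```python
def payload_gbps(line_rate_gbaud: float, data_bits: int, coded_bits: int) -> float:
    """Payload rate in Gbit/s after line encoding (e.g. 8b/10b or 64b/66b)."""
    return line_rate_gbaud * data_bits / coded_bits


# (line rate in Gbaud, data bits, coded bits): nominal figures, assumed.
LINKS = {
    "1 Gbit Ethernet": (1.25, 8, 10),
    "8 Gbit Fibre Channel": (8.5, 8, 10),
    "10 Gbit Ethernet": (10.3125, 64, 66),
    "16 Gbit Fibre Channel": (14.025, 64, 66),
}

for name, (rate, data_bits, coded_bits) in LINKS.items():
    gbps = payload_gbps(rate, data_bits, coded_bits)
    # Divide by 8 to convert Gbit/s to GB/s, then scale to MB/s.
    print(f"{name}: {gbps:.1f} Gbit/s payload (~{gbps * 1000 / 8:.0f} MB/s)")
```

On these figures, 8 Gbit Fibre Channel carries less payload per link than 10 Gbit Ethernet, and 16 Gbit Fibre Channel carries more, which is consistent with the ordering described above.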
Most of these appliances with built-in applications have limited tuning parameters, as they are already optimized for the underlying storage infrastructure and application design. This is not to say that there are not some knobs that could be turned to improve performance, but to say the range of knobs is limited. The appliances get good performance out of the box, and the integration is already done for you. This means that people are not required to have as much storage knowledge to operate these new storage appliances.
I think it is a combination of reasons that is causing traditional SAN-based storage environments to become a thing of the past.
First and foremost, the lack of well-trained people who could optimally configure and maintain large complex systems has had a negative impact.
Second, where is the integrated stack for complex systems? Where is a file system that understands the underlying topology and auto-configures itself? Well, that is a dream, as the lack of communication between the layers and the narrowness of the interfaces mean that it will never happen.
Third, the vendors did not see the writing on the wall. They had too much of a good thing, or they thought they did. Why was there never end-to-end integration with file system vendors to make things work easily? This lack of cooperation spurred the development of other technologies.
Fourth and lastly (although I am sure you might have other reasons that I have not considered), Linux stifled growth in high-speed file systems as everyone wanted something for free. Heck, we all want a deal, but Linux does not have scalable file systems and good file system SAN management. As the saying goes, there is no free lunch.
Let's assume that I am correct in my assertion that the storage world is changing and we are moving to an appliance model that requires far less administration, monitoring and interaction with the hardware. What is left for storage analysts and storage administrators to do?
One of the things I have learned in almost 32 years in the market is that every newfangled tool still requires people to run it. Users and applications expand to the budgets that are allocated. If the appliance model requires fewer people, you will likely find over time that you simply buy more appliances. In the short term, there might be some personnel reduction, but in the long term nothing will change. Just as with virtualization, the reduction in staff will not last for long, as far as I can tell.
What it does mean is that we in the storage industry are going to have to change with the times or become dinosaurs. This includes all of us. We are going to have to move from management of SAN and RAID configuration to a better understanding of applications and how those applications are mapped onto the hardware. As I said, all of these appliances have tunable parameters. You are going to need to understand how these parameters need to be used for the application(s) and data types that run on the system. This will be a learning opportunity for all of us.
Everyone is talking about how big data is going to change our world. For the most part, I agree with many of the assessments. The amount of data that is going to be collected and analyzed is growing by leaps and bounds. In my opinion, it is only limited by the algorithms and the imagination of the people who are asking questions and developing these applications.
The key to our success is being able to make the move from the old ways and technology to the new ways and new technology.