Why should you keep your data stored in-house instead of in the cloud? It’s a heated debate, to be sure. This article showcases the many good reasons to keep your storage in-house.
1. Data Frequency
On-premises storage is ideal when you have to access data frequently and quickly, said Michael King, Senior Director of Marketing Operations, DDN.
“The biggest issue with using the public cloud is that the access rate is not fast enough to make it easily accessible and usable for data analysis,” he said.
2. The Law of Gravity
There are also data gravity problems. If you are not careful, a data set can grow so large that it becomes effectively immovable. One example came in 2013, when Nirvanix, a cloud storage services provider, closed down and gave its customers 30 days to move all of their data; some simply weren’t physically able to move it within that time frame.
“At a certain point, it just isn’t possible to sustain an IP connection long enough to move it,” said King.
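To make King's point concrete, here is a rough back-of-the-envelope sketch of how long a bulk migration might take. The 30-day window comes from the Nirvanix example above; the data size, link speed and the 80 percent link efficiency are illustrative assumptions, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(data_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move data_tb (decimal terabytes) over a link.

    efficiency is an assumed fraction of the nominal link rate that is
    actually sustained; 0.8 is a placeholder, not a measured figure.
    """
    total_bits = data_tb * 1e12 * 8
    effective_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / effective_bits_per_second / SECONDS_PER_DAY

def fits_in_window(data_tb: float, link_gbps: float,
                   window_days: float = 30, efficiency: float = 0.8) -> bool:
    """Return True if the transfer would finish within window_days."""
    return transfer_days(data_tb, link_gbps, efficiency) <= window_days

# Hypothetical case: 1 PB (1,000 TB) over a 1 Gbps link versus a 10 Gbps link.
days_1g = transfer_days(1000, 1.0)
days_10g = transfer_days(1000, 10.0)
```

Under these assumptions, a petabyte over a 1 Gbps link would take well over three months, far beyond a 30-day deadline, while a 10 Gbps link would bring it under two weeks. Real transfers also contend with interruptions and retries, which this estimate ignores.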
3. High Performance
High performance computing (HPC) organizations, in particular, tend to hold extremely large amounts of data and to process it at high frequency. That’s why these users tend to deploy hybrid and private, on-premises clouds rather than public clouds for the vast majority of their data. A recent survey of HPC organizations managing data-intensive infrastructures worldwide and representing hundreds of petabytes of storage investment confirmed this trend.
“For big files or high capacity, or where local access speeds are critical (e.g. editing 4K video), on-premises is probably the way to go,” said King. “If the compute layer or the power users are within the building, you will see orders of magnitude higher performance from the on-prem storage equipment (such as a SAN or a high-quality Unified Storage solution).”
4. Cost
The public cloud is supposed to be cheaper – a lot cheaper. And many times that is the case – but not always. There are certain use cases and gotchas that can mean the opposite. Public cloud storage might be cheaper per TB, but if you don’t pay attention to all fees, costs can rise alarmingly. There are typically charges for accessing and restoring data, for example.
“Public cloud storage can be much more expensive than on-premises solutions, especially for large capacity points or long retention times,” said Gary Watson, Vice President of Technical Engagement, Nexsan.
He added that with on-prem, you don’t pay for the I/O. Users can be more free to “process” the data without fear of sticker shock when the cloud vendor’s invoice comes in.
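A simple cost model shows how access charges can change the picture. The sketch below is only illustrative: every rate in it is a made-up placeholder rather than any vendor's actual pricing, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CloudRates:
    """Hypothetical monthly rates in dollars; not real vendor pricing."""
    storage_per_tb_month: float   # charge per TB stored per month
    egress_per_tb: float          # charge per TB read back out of the cloud
    requests_per_million: float   # charge per million I/O requests

def monthly_cloud_cost(rates: CloudRates, stored_tb: float,
                       egress_tb: float, million_requests: float) -> float:
    """Sum the storage, egress and request charges for one month."""
    return (rates.storage_per_tb_month * stored_tb
            + rates.egress_per_tb * egress_tb
            + rates.requests_per_million * million_requests)

# Placeholder rates and workload, chosen only to illustrate the arithmetic.
rates = CloudRates(storage_per_tb_month=20.0, egress_per_tb=90.0,
                   requests_per_million=5.0)
storage_only = monthly_cloud_cost(rates, stored_tb=500, egress_tb=0,
                                  million_requests=0)
with_access = monthly_cloud_cost(rates, stored_tb=500, egress_tb=100,
                                 million_requests=50)
```

With these placeholder numbers, the storage charge alone is $10,000 a month, but reading back a fifth of the data and issuing 50 million requests pushes the bill to $19,250. Plugging in a provider's published rates and your own access pattern gives a more realistic comparison with on-premises costs.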
5. Sensitive Data
There are so many stories about hacks into large companies that result in data breaches that you wonder just how secure your data is. If large telecom providers, dating sites, banks and the NSA can be compromised, just how safe is your data in the public cloud? Certainly, these providers go to great lengths to secure the perimeter, detect intrusions and mitigate malware. But it’s only a matter of time, it seems, before a breach takes place. So avoid placing the crown jewels in there.
“When the data stored is irreplaceable or there are regulatory compliances involved, keep that sensitive data off the public cloud,” said Jason Pan, Senior Director, Business Development and Product Marketing at Promise Technology.
For example, sensitive encrypted public safety surveillance videos are probably held under heavy scrutiny and may be subject to regulatory restrictions. Video of this kind should be stored in-house. But large quantities of low-value surveillance video could quite happily be offloaded to the public cloud until their retention period expires.
6. Meeting SLAs
Pan recommended that those with demanding Service Level Agreements (SLAs) stick to in-house storage. If a certain I/O response time and latency must be met, as in verticals such as financial and commercial high performance computing (HPC), these stringent requirements make it unlikely that public cloud storage can be an option.
“Before storing data in the public cloud, imagine what would happen if an unauthorized person were to gain access to these files,” said Pan. “Also consider the business impact should the service provider be inaccessible.”
7. Avoiding Lock-In
Vendor lock-in has been a sore point for decades. Cloud providers promote how they help you avoid it. Yet in some cases, the cloud may simply bring a different form of lock-in. In fact, some have voiced uncertainty that they can get all their data back if they decide to change providers or move data back in-house.
“From a lock-in perspective, while public cloud storage is more open than a traditional storage appliance, it presents its own flavor of lock-in,” said Irshad Raihan, Senior Principal, Product Marketing at Red Hat. “Also, cloud players often charge a lot to take data out of the cloud (their own form of lock-in).”
For instance, applications deployed on the public cloud need significant rewrites if they are to be transitioned back on-premises or sent to a private cloud. So consider carefully what you move to the cloud, as getting it back may not be easy.
8. Corporate Culture
Corporate culture can also play a part. Some companies are so controlling about data that they don’t let anyone say anything to anyone without 15 levels of corporate vetting. Others are paranoid about information leaving the premises, so they remove USB drives, police employee access and try to ensure their information repositories are fortresses. In such cultures, even suggesting the use of the public cloud might get you fired.
“The IT governance model of the enterprise is usually a good indicator,” said Raihan. “Companies that mandate greater control over their IT should consider on-premise installations, while those willing to forgo some control in return for flexibility and time to value should go cloud.”
Along the same lines, another consideration is how finely the company would like to control its storage knobs. For instance, IOPS per GB, a standard enterprise storage metric, can be much more finely tuned by varying the underlying hardware components in an on-premises installation. Cloud vendors typically provide few options from which to choose.
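As a rough illustration of that tuning, the sketch below computes aggregate IOPS per GB for a mix of drive types. The drive figures are hypothetical round numbers, the function name is an assumption, and the calculation deliberately ignores RAID overhead, caching and controller limits.

```python
def iops_per_gb(drive_groups):
    """Aggregate IOPS per GB for a list of (count, iops_per_drive, gb_per_drive).

    A simplified raw-hardware estimate: no RAID, cache or controller effects.
    """
    total_iops = sum(count * iops for count, iops, _ in drive_groups)
    total_gb = sum(count * gb for count, _, gb in drive_groups)
    return total_iops / total_gb

# Hypothetical configurations; the per-drive numbers are placeholders.
all_disk = [(24, 200, 4000)]                 # 24 hard drives
hybrid = [(20, 200, 4000), (4, 50000, 2000)]  # 20 hard drives plus 4 SSDs

disk_ratio = iops_per_gb(all_disk)
hybrid_ratio = iops_per_gb(hybrid)
```

In this made-up example, swapping four of the 24 hard drives for SSDs lifts the ratio from about 0.05 to roughly 2.3 IOPS per GB. That is the kind of adjustment an on-premises installation allows and a fixed menu of cloud tiers may not.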
9. Tier One Apps
The application world appears to be splitting into two camps. At the top of the line are Tier One applications that run business-critical workloads. Here performance and availability are the predominant factors. After that come Tier Two and Tier Three applications where economics is most important.
“IT Managers use the fastest CPUs and Flash/SSD to support Tier One applications on-premises,” said Bob Spurzem, Director of Field Marketing at Archive360.
10. Still Unsure?
Sometimes the use case in favor of on-prem is obvious – as it can be for public cloud storage. But there are situations where uncertainty remains. Should it stay or should it go?
“For each database or body of content, can you determine if there is any sensitive, proprietary or regulated content?” said Paul LaPorte, Director of Product, Metalogix. “If you can’t, then leave it on-premises.”
11. Media Content
Scott Adametz, Product Solutions Architect at SwiftStack, has found that certain workloads, especially around media, don't lend themselves to public cloud. One example is storing hundreds of hours of daily, active media content in S3. The bill would be astronomical, he said.
12. Data Assembly Line
Adametz suggested a factory assembly line analogy. Raw materials at one end, finished products at the other. Everything works smoothly until someone in management gets the bright idea to save money by storing all the raw materials offsite rather than taking up space at the factory. Production grinds to a complete halt.
People-driven workflows suffer the same fate when their "raw materials" are separated from their creative assembly line. The only difference between running an application in public cloud vs. your own facility is the location of the people who use it. For workloads where your people run apps in the public cloud, storage there works fine. But if you have active, people-driven workloads like content creation, it's probably best to keep that media near the people.
“Regardless of cost, the latency between your people and public cloud, even on a direct connect, prohibits actively accessing data stored in public cloud,” said Adametz. “Content creators generally get mad when they press play and nothing happens.”