When it comes to cloud storage versus on-premises storage, there is no definitive choice.
It all depends on preference, cost analysis, and infrastructure philosophy. But regardless of individual preferences, cloud storage is here to stay.
Here are five of the top trends companies and IT teams are seeing in the cloud storage market:
1. More cloud than on-premises storage
In capacity terms, the cloud is winning. There is now more data stored in the cloud than on-premises.
In response to an explosion in unstructured data, enterprises are turning to the cloud, according to Eric Burgener, an analyst at IDC.
An average of 71 PB of file storage is going to on-premises locations per year, while over 91 PB is being sent to public cloud locations.
On-premises storage will grow at a healthy rate of 46% per year over the next five years, which shows that it remains a viable enterprise option. However, the public cloud rate of storage growth will be 53% per year.
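To see what those rates imply, the compounding can be worked through in a few lines of Python. This is only a sketch: it assumes the 71 PB and 91 PB figures can be treated as year-zero baselines and that the 46% and 53% rates hold steady for all five years, neither of which the IDC figures guarantee. The function name `project_growth` is illustrative.

```python
def project_growth(start_pb, annual_growth, years):
    """Return the projected volume in PB for each year, year 0 first."""
    return [start_pb * (1 + annual_growth) ** year for year in range(years + 1)]


# Assumption: the per-year figures quoted above serve as year-zero baselines.
on_prem = project_growth(71, 0.46, 5)
cloud = project_growth(91, 0.53, 5)

for year, (o, c) in enumerate(zip(on_prem, cloud)):
    print(f"year {year}: on-prem {o:7.1f} PB, cloud {c:7.1f} PB, cloud/on-prem {c / o:.2f}")
```

Under these assumptions, five years of compounding multiplies the on-premises figure by roughly 6.6 and the cloud figure by roughly 8.4, so the gap widens even though both are growing quickly.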
2. Continued growth in nearline storage
Some of this storage will be shunted off to some distant cloud that lacks high performance.
But an increasing amount is going to nearline storage, said Steve Emerson, spokesperson for Broadcom’s Data Center Solutions Group (DCSG).
Emerson cited Seagate’s recent quarterly financial results, which attributed high revenues to “record” nearline product revenue driven by cloud customers.
“End-customer demand for cloud-based storage continues unabated; user-generated video (short video format like TikTok) and other video content is now a significant part of that growth,” Emerson said.
“HDDs [are] still the primary storage medium of choice due to continued innovation on capacity and features focused on the cloud use case.”
Emerson added that little seems likely to slow cloud storage growth.
The social factors driving it, including new forms of short-video sharing and other content creation, will keep it going for many more years.
Meanwhile, hard disk drive (HDD) vendors continue to add and enhance features that help cloud vendors with capacity, performance, OPEX, and overall cost, including shingled magnetic recording (SMR) for capacity, dual actuators for performance, and SATA command duration limits (CDL).
3. Metadata growth
Metadata is a serious problem in cloud storage, according to Speedb co-founder and CEO Adi Gelvan.
Metadata is expanding at an alarming rate, he said, and the issue will only get worse, as the amount of unstructured data continues to explode.
Ten years ago, the typical ratio between data and metadata was 1,000:1. Today, the ratio is often more like 1:10.
Many organizations now find that their metadata exceeds the volume of data being stored. The impact on performance and scalability can be profound, and typical answers like sharding databases or throwing resources at the problem create time and cost drains of their own.
“The metadata explosion is forcing IT teams to look at the deepest part of the software stack that sorts and indexes data — the storage engine, also known as a data engine — to make sure it is suited for modern application demands,” Gelvan said.
4. Cloud repatriation
Yes, there is major growth in cloud storage. But there also is a backlash.
Cloud storage consumption will no longer track cloud compute spending growth as it has done for the past few years, according to DataCore Software’s Augie Gonzalez, director of technical marketing.
“More companies are moving data back on-prem from the cloud, where they can better contain spending and limit external exposure,” Gonzalez said.
“Early indicators suggest the repatriation trend will accelerate in 2022.”
Utility computing in public clouds still makes perfect sense. But financial and location-dependent factors have companies revisiting where the bulk of their data lives. IT teams can readily stop and start cloud computing resources, like AWS EC2 instances, to minimize CPU/memory costs.
Yet, the meter is always running on S3 buckets serving as persistent cloud disks. Even if seasonal workloads are down 40%, data consumption continues to climb because historical records continue accumulating, and monthly storage bills compound year on year, Gonzalez said.
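The billing asymmetry described above, where compute can be dialed down but stored data keeps accruing charges, can be illustrated with a small simulation. Everything in this sketch is hypothetical: the rates, volumes, and function names are made up for illustration and are not real AWS prices. Only the 40% seasonal dip comes from the text.

```python
# All prices and volumes below are hypothetical, chosen only to show the
# shape of the bill; they are not real AWS rates.
COMPUTE_RATE_PER_HOUR = 0.10   # assumed cost of one instance-hour
STORAGE_RATE_PER_TB = 20.0     # assumed cost of storing 1 TB for a month


def monthly_bill(instance_hours, stored_tb):
    """Return (compute_cost, storage_cost) for one month."""
    return instance_hours * COMPUTE_RATE_PER_HOUR, stored_tb * STORAGE_RATE_PER_TB


def simulate(months, peak_hours, start_tb, new_tb_per_month, slow_months, slowdown=0.40):
    """Simulate bills where compute drops by `slowdown` in slow months
    while stored data only ever accumulates."""
    rows = []
    stored_tb = start_tb
    for month in range(1, months + 1):
        hours = peak_hours * (1 - slowdown) if month in slow_months else peak_hours
        compute, storage = monthly_bill(hours, stored_tb)
        rows.append((month, compute, storage))
        stored_tb += new_tb_per_month  # historical records keep piling up
    return rows


bills = simulate(months=12, peak_hours=10_000, start_tb=100,
                 new_tb_per_month=10, slow_months={6, 7, 8})
for month, compute, storage in bills:
    print(f"month {month:2d}: compute ${compute:8.2f}  storage ${storage:8.2f}")
```

In this toy setup the compute line falls during the slow months and recovers afterward, while the storage line rises every month regardless of workload.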
“Data gravity also plays a strong role in repatriation: Much of the new data being generated arrives at the edge, far from any public cloud presence,” Gonzalez said.
“It lands first on outposts of local storage for triage, before transmission to the mothership. With the volume of IoT data piling up quickly, the earlier economic justification for on-prem long-term retention storage becomes all the more pressing.”
5. Dynamic cloud provisioning
Dynamic cloud provisioning has led several industries to move key infrastructure from on-premises to the cloud.
Take the semiconductor industry. The design of advanced semiconductors is a high-performance process that is at the core of every industry. Traditionally, electronic design automation (EDA) pipelines have been built in high-performance computing (HPC) data centers leveraging high-bandwidth, low-latency storage and compute resources.
Supply chain problems are making it harder to provision on-premises data centers with the needed resources. Additionally, an inability to rapidly retool fixed infrastructure for new chip designs has pushed organizations to dynamically provision and extend their on-premises infrastructure to run parallel EDA pipelines in the cloud.
“Cloud compute and storage can be rapidly provisioned to take advantage of the unique requirements needed for a given project,” said Floyd Christofferson, VP of product marketing, Hammerspace.
“This enables multiple designs to be executed in parallel, dramatically decreasing the time to market. And when the project is complete, cloud resources can be equally rapidly decommissioned when no longer needed to avoid unnecessary costs.”
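The provision-then-decommission pattern Christofferson describes can be sketched as a Python context manager that guarantees resources are released when a project ends. This is a minimal illustration, not any vendor's API: `CloudPool`, `project_resources`, the project names, and the resource sizes are all hypothetical, and the pool is an in-memory stand-in rather than a real cloud provider.

```python
from contextlib import contextmanager


class CloudPool:
    """Hypothetical stand-in for a cloud provider; it only tracks what is live."""

    def __init__(self):
        self.active = {}

    def provision(self, project, cpu_cores, storage_tb):
        self.active[project] = {"cpu_cores": cpu_cores, "storage_tb": storage_tb}
        return self.active[project]

    def decommission(self, project):
        self.active.pop(project, None)


@contextmanager
def project_resources(pool, project, cpu_cores, storage_tb):
    """Provision resources for one design project and always release them."""
    resources = pool.provision(project, cpu_cores, storage_tb)
    try:
        yield resources
    finally:
        pool.decommission(project)  # runs even if the pipeline raises


pool = CloudPool()
with project_resources(pool, "chip-a", cpu_cores=512, storage_tb=200) as a, \
     project_resources(pool, "chip-b", cpu_cores=256, storage_tb=80) as b:
    running = sorted(pool.active)   # both designs are provisioned side by side
    total_cores = a["cpu_cores"] + b["cpu_cores"]

print(running, total_cores, pool.active)
```

Tying teardown to the end of the `with` block means nothing stays provisioned, and billable, after the design work finishes, even if a pipeline step fails partway through.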