Edge computing architectures offer huge benefits to organizations in almost every industry, but they also come with challenges. While collecting billions of pieces of data from millions of interconnected devices can allow firms to offer truly adaptable services, this data also presents a major problem when it comes to storage.
Part of the challenge of building storage infrastructure for edge computing is the sheer volume of data that edge systems generate, a volume projected to triple over the next five years.
There are other challenges involved with storing the data generated and used by edge systems. Much of this data is of only transient utility: efficient edge storage systems must be able to retrieve data quickly and then delete it, so that they are not overwhelmed by the constant inflow.
In this article, we’ll look at the challenges involved with storing edge computing data, and at the proposed solutions for overcoming these obstacles.
Also read: Data Storage 2020: Trends in Data Storage
Exponential Growth
First, let’s look at the biggest challenge facing organizations looking to build edge storage infrastructure: the huge amounts of data that such systems generate. If you work with computer networks, you won’t need to be told that the amount of data required to run the average system is now much larger than it was five years ago, and that this volume is still growing exponentially.
It’s also important to recognize, however, that the edge model is not the source of this data avalanche. In fact, the development of the edge model was largely driven by the problems involved in storing and using data generated by contemporary Internet of Things (IoT) systems. As the number of IoT nodes grew over the past decade, it became increasingly apparent that monolithic server architectures would eventually be unable to store the data these devices required.
The edge model was originally designed to overcome this problem. By storing data closer to where it is generated and used, the hope was that latency could be reduced and computational efficiency increased. An early application, for instance, was in self-driving cars, which could perform sophisticated data processing entirely independently of their associated server infrastructure.
Now, however, it seems that the edge model is creating challenges of its own.
Also read: Best Storage Management Software
On-demand Data
Many of these challenges are a consequence of the growing popularity of the edge model. Edge systems have proved extremely useful in a very broad set of industries and applications — not just in self-driving cars, but also in the entertainment industry, and even in investment firms. The speed at which these systems are able to process data has not only made them invaluable, but it has also led to increased expectations when it comes to instant access to data.
For storage engineers, the problem with edge data is not only one of scale, it’s also one of agility. The data generated and used by edge systems is very different from what we work with in traditional server architectures. This is for two main reasons:
- One is that edge data must be easily accessible to edge computing devices and, therefore, needs to be stored near them, not on a distant server with high latency.
- The other is that, in many cases, the data is of only transient utility: it is collected, processed, and then deleted to free space for incoming data (a lifecycle sketched in the example after this list). In some edge systems, this means that the storage management layer is actually more complex than the systems making use of the data collected.
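To make that collect-process-delete lifecycle concrete, here is a minimal sketch of a time-to-live (TTL) buffer of the kind an edge device might use to keep its local storage from filling up. Everything here, from the class name to the TTL value, is a hypothetical illustration rather than a reference to any particular product.

```python
import time

# Hypothetical transient buffer for edge data: records are kept only
# long enough to be processed, then discarded so the device's local
# storage is not overwhelmed by incoming data.
class TransientBuffer:
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._records: dict[str, tuple[float, bytes]] = {}

    def put(self, key: str, payload: bytes) -> None:
        # Timestamp each record on arrival so we know when it expires.
        self._records[key] = (time.monotonic(), payload)

    def get(self, key: str) -> bytes | None:
        entry = self._records.get(key)
        if entry is None:
            return None
        stored_at, payload = entry
        if time.monotonic() - stored_at > self.ttl:
            # The record has outlived its usefulness; drop it.
            del self._records[key]
            return None
        return payload

    def evict_expired(self) -> int:
        # Periodic sweep: delete everything past its TTL and report
        # how many records were freed.
        now = time.monotonic()
        expired = [key for key, (stored_at, _) in self._records.items()
                   if now - stored_at > self.ttl]
        for key in expired:
            del self._records[key]
        return len(expired)
```

In practice, a device would call `evict_expired()` on a timer (or rely on the lazy check in `get()`), but the basic pattern of timestamping on write and deleting on expiry is what distinguishes edge storage management from the archive-everything posture of traditional server architectures.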
It’s also important to note that these issues are only going to get worse. IDC, for instance, projects a fourfold increase between 2019 and 2025 in data generated by the IoT, and much of this data will need to be stored at the edge, close to the devices that use it.
Hybrid Goes Mainstream
For almost a decade, most storage engineers assumed that the public cloud was the best way to meet this challenge. Accordingly, investment in public cloud storage surged over the past few years, and is still rising. Worldwide expenditure on public cloud services is poised to double to almost half a trillion dollars in the four years through 2023, according to research from IDC.
There are also good reasons to believe, however, that the public cloud model may soon lose its preeminent place as the repository of edge data. As cloud computing systems grow in size and complexity, many organizations are finding that third-party public cloud systems do not offer the reliability they require, and that varying network loads mean that the price they pay for this storage is also very unpredictable.
That’s not to say that public clouds will become obsolete, of course. Rather, it’s becoming clear that organizations with significant edge systems will increasingly use hybrid cloud models. These models will combine the control and reliability of on-premises cloud storage with the easy scalability of the public cloud model.
Also read: What Is Hybrid Cloud Storage? Implementing a Strategy
Flexible, Accessible, Secure
Hybrid cloud storage solutions make a lot of sense for edge computing, but they also increase the complexity of an organization’s data storage infrastructure. Any storage solution necessarily involves compromises among flexibility, accessibility, and security.
When it comes to hybrid clouds storing data for edge computing, this compromise manifests chiefly in the decision of which data is stored on-premises and which in public clouds. Many organizations feel, for instance, that data critical to the functioning of their edge systems should always be stored on servers that are directly managed in-house.
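As a rough illustration of how such a placement decision might be encoded, the following sketch sorts data items between on-premises and public cloud tiers based on two simplified attributes. The attributes and the policy itself are assumptions made for this example; real placement decisions weigh many more factors, such as cost, compliance, and network load.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    ON_PREMISES = "on_premises"    # directly managed in-house
    PUBLIC_CLOUD = "public_cloud"  # scalable third-party storage

@dataclass
class DataItem:
    name: str
    critical_to_edge: bool   # needed for the edge system to keep functioning
    latency_sensitive: bool  # must be served to edge devices with low latency

def place(item: DataItem) -> Tier:
    # Keep anything critical to edge operation, or anything that must
    # be read with low latency, on in-house servers; everything else
    # can go to the cheaper, more scalable public cloud.
    if item.critical_to_edge or item.latency_sensitive:
        return Tier.ON_PREMISES
    return Tier.PUBLIC_CLOUD

# For example, raw vehicle telemetry that steers a self-driving car
# stays on-premises, while historical logs can move to the public cloud.
print(place(DataItem("vehicle-telemetry", True, True)))   # Tier.ON_PREMISES
print(place(DataItem("archived-logs", False, False)))     # Tier.PUBLIC_CLOUD
```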
The argument for keeping critical data in-house rests on concerns about the security of the cloud, and in some ways these concerns are well founded. Today, however, it is entirely possible to find a public cloud provider that offers both security and scale, and even one capable of storing data in sub-clouds that remain close to edge computing devices.
As we note in our guide to data security, in fact, some of the received wisdom about the security of public clouds has become obsolete in light of the edge paradigm.
A New Paradigm
The increased demand for data storage at the edge is likely to drive changes in the way data is stored more generally. Just as the advent of the fibre channel edge is driving innovation in edge computing, large-scale hybrid cloud infrastructures will drive progress in data management.
These changes are likely to include a more rigorous approach to discarding transient data, so that edge storage devices are not overwhelmed by the sheer amount of data they generate and use. Alongside this, we are also likely to see better structuring of the data that is retained.
These changes are not going to happen overnight, of course, but in the long term they have the potential to transform the way in which edge computing works, and in turn the way that IoT networks operate.
The Future Edge
In short, organizations will need to devise new ways of handling the large amounts of data that edge computing devices produce, and should embrace hybrid cloud infrastructure as an everyday part of their business.
This means, ultimately, that the edge computing model is likely to become even more popular in the coming years, and drive innovation across the storage ecosystem. And that, as long as we can ensure security at the edge, can only be a good thing.
Read next: Top Business Continuity Management Solutions & Software for 2021