Ask most CTOs what the next decade will bring, and they’ll tell you that AI and ML technologies are going to play a major part in almost every industry. But while the computational requirements and programming tools needed to run AI systems are now well understood within most organizations, the storage infrastructure those systems also depend on is often overlooked.
This is a potentially risky situation, because AI and ML systems typically require (and sometimes create) vast amounts of data. Advanced data storage approaches are, therefore, needed for even relatively “simple” AI and ML systems. In fact, there is now a close relationship between data storage and AI, and object storage is the newest paradigm to stem from this crossover.
Object storage supports AI and ML systems in several key ways. Here’s a closer look at a few of them.
Scalability

The primary reason most AI and ML systems now use object storage is scalability. With “traditional” file and block storage, expanding the storage available to a particular application or system generally means scaling vertically, adding further hierarchical levels to your storage infrastructure. Object storage, in contrast, allows systems to be scaled horizontally, almost without limit.
The importance of this ability becomes even more apparent when you look at the sheer size of the data sets now in use in AI and ML systems. Tesla, for instance, is using three billion miles of driving data to train its autonomous vehicle AIs, and so requires huge amounts of fast, adaptable storage.
It’s huge projects like this that are driving the key trends in AI-driven storage, with object storage chief among them.
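To make the contrast with hierarchical storage concrete, here is a minimal Python sketch of how a flat object namespace can grow horizontally. All names here (`node_for`, the node labels) are hypothetical; real object stores use consistent hashing and placement policies rather than this simple modulo scheme.

```python
import hashlib

def node_for(key: str, nodes: list[str]) -> str:
    """Map an object key to a storage node by hashing its name.
    (Simplified: production systems use consistent hashing so that
    adding a node moves only a fraction of existing objects.)"""
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

# Objects live in one flat namespace, so there is no directory tree to rebalance.
nodes = ["node-a", "node-b", "node-c"]
placement = {key: node_for(key, nodes) for key in ["frame-001.jpg", "frame-002.jpg"]}

# Scaling horizontally is just adding another node to the pool.
nodes.append("node-d")
```

Because every key hashes to some node, capacity grows by widening the pool of machines rather than by adding levels to a storage hierarchy.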
Security

A second major attraction of object storage for AI and ML systems is the extra level of security oversight it brings. Because object storage is generally defined at the software level, rather than relying on monolithic hardware infrastructure, storage architectures can be mapped more closely to security structures. This simplifies security across large AI systems and limits the risks they face.
This is particularly important with public-facing AI systems, because it has long been recognized that while AI can improve cybersecurity, it also creates new risks that can’t be ignored. While, for instance, AI can be used to guard backups against ransomware, it is also being used to develop novel types of ransomware that are more easily able to evade these same systems. Object storage, by providing better visibility on the data used in these systems, helps to make them more secure.
Cost

One of the reasons firms of all sizes need to assess their storage needs before implementing large AI and ML systems is that storage is often the biggest cost associated with these new, advanced tools. AI engines and smart analysis software may be relatively inexpensive in themselves, but if they require petabytes of extra storage, they can easily end up demanding a large budget.
Object storage is one of the solutions to this issue. Most object storage systems make use of relatively low-cost hardware; it is at the software level that most of the advantages of this system are created. By making use of smart management schemes and space-saving data compression, object storage systems can reduce the disk footprint of AI systems by as much as 70% compared to “traditional” enterprise storage.
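The effect of space-saving compression is easy to demonstrate with Python’s standard `zlib` module. The sample data below is hypothetical, and real savings depend entirely on how compressible the stored data is.

```python
import zlib

# Hypothetical, highly repetitive training log; real datasets vary widely.
raw = b"sensor_reading,0.98,ok\n" * 50_000
compressed = zlib.compress(raw, level=9)

savings = 1 - len(compressed) / len(raw)
print(f"disk footprint reduced by {savings:.0%}")
```

Repetitive data like this compresses far better than typical tensors or media files, so headline figures such as 70% should always be validated against your own workload.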
Metadata

Most AI and ML systems rely on extensive metadata to function: they use metadata to locate the data they need, and that metadata must be highly detailed if the correct data is to be identified autonomously. Unfortunately, traditional enterprise storage systems were designed long before AI and ML existed, and were never built to carry large amounts of metadata.
In fact, most file and block storage natively supports only the most basic metadata, sometimes just when data were created, by whom, and where. Object storage, in contrast, offers fully customizable, expandable metadata schemas. This means that, by using object storage, AI and ML systems find it easier to locate, identify, and use the data they need.
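As an illustration, the sketch below models an object store as a Python dictionary whose entries carry arbitrary, user-defined metadata. `StoredObject` and `find` are invented for this example and do not correspond to any particular product’s API.

```python
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)  # fully customizable

store: dict[str, StoredObject] = {}

# Rich, user-defined metadata travels with each object.
store["clip-42"] = StoredObject(
    data=b"...",
    metadata={"sensor": "front-camera", "weather": "rain", "label": "lane-change"},
)

def find(store: dict[str, StoredObject], **criteria: str) -> list[str]:
    """Let a training pipeline locate objects by metadata alone."""
    return [key for key, obj in store.items()
            if all(obj.metadata.get(f) == v for f, v in criteria.items())]

find(store, weather="rain")  # → ["clip-42"]
```

With file or block storage, answering a query like “all rainy-weather lane changes” would require an external index; here the metadata lives with the data itself.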
Performance

A fifth major advantage of object storage is that these systems can be built to be massively parallel, with the same data existing on multiple servers while this complexity is hidden from human operators. This can greatly improve storage performance, because AI systems are able to pull data from the location closest to them, reducing latency.
This is particularly important for AI and ML systems that need to function in real time, the clearest examples being the systems that drive autonomous vehicles and those that underpin speech recognition software. Most AI systems come with a performance cost, and after computational resources, read/write speeds are likely to be the biggest “choke point” in these systems.
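Selecting the closest replica is straightforward once per-node latencies are known; the node names and millisecond figures below are made up purely for illustration.

```python
# Hypothetical round-trip latencies (ms) from an inference server to each node.
latencies = {"us-east": 12, "us-west": 48, "eu": 95}
replicas_holding_object = ["us-west", "eu", "us-east"]

# Serve the read from the lowest-latency replica to keep inference real-time.
best = min(replicas_holding_object, key=latencies.get)
print(best)  # us-east
```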
Durability and Resilience
A final benefit of massively parallel systems is that they don’t just improve performance; they also improve resilience, by automatically creating multiple copies of a particular dataset. Object storage is crucial to building these systems, because the object storage paradigm allows engineers to design storage architectures that autonomously copy data to multiple locations.
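A toy version of such an architecture can be sketched in a few lines of Python. `ReplicatedStore` is an invented name, and real systems use deliberate placement policies rather than random sampling, but the sketch shows how automatic copies keep data available when a node is lost.

```python
import random

class ReplicatedStore:
    """Toy object store that writes every object to several replicas.
    (Assumption: random placement; real systems choose replicas deliberately.)"""

    def __init__(self, nodes: list[str], replicas: int = 3):
        self.nodes = {name: {} for name in nodes}
        self.replicas = replicas

    def put(self, key: str, data: bytes) -> None:
        # Autonomously copy the object to multiple locations on write.
        for node in random.sample(list(self.nodes), self.replicas):
            self.nodes[node][key] = data

    def get(self, key: str) -> bytes:
        # Any surviving replica can serve the read.
        for objects in self.nodes.values():
            if key in objects:
                return objects[key]
        raise KeyError(key)

store = ReplicatedStore(["us-east", "us-west", "eu", "apac"], replicas=3)
store.put("model-weights", b"\x00\x01")

# Even after one node is wiped, the object survives on the other replicas.
store.nodes["us-east"].clear()
assert store.get("model-weights") == b"\x00\x01"
```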
Redundancy creates resilience, and data resilience is set to become the watchword of data security over the next few years. We’ve already seen a dramatic spike in ransomware attacks affecting businesses of all sizes, and ultimately it is only by building resilience that these types of attack can be defeated.
Storage of the Future
Object storage is among the most visible of current AI and ML data storage trends. By providing storage that is scalable, secure, and flexible, it offers businesses the performance and security necessary to run even the largest AI and ML systems. And as these systems become ever more critical to businesses, so will object storage.
Read next: Best Object Storage Solutions