DNA Data Storage: Could Data Files Be Stored as DNA?

Ask the average computer user to name a storage medium, and they’ll give you a list of those they use on a daily basis — hard drives, cloud storage, and perhaps flash memory. Very few, if any, will name DNA as a way to store documents, audio recordings, or even pictures of cats. And yet this technology exists today.

Using DNA to store digital data is, in fact, a fairly well-established idea. It has been around for more than a decade, and many different types of data have been stored this way. Despite that progress, DNA storage remains a very niche field. This hasn't stopped some analysts from claiming that DNA storage is the future of data storage.

Here is how DNA storage works, the challenges that are still holding it back, and emerging research that promises to overcome them.

Why DNA?

At first glance, DNA doesn’t sound like a very good storage medium. It is a messy organic molecule, after all, that looks nothing like the magnetic disk drives or switching arrays that make up the fundamental building blocks of our storage infrastructure.

Viewed another way, though, there can be few better candidates for storing large amounts of data. DNA has evolved to store the entire set of instructions for building a human body, to fit them into the nucleus of almost every cell in that body, and to do all of this without electricity. The four "bases" that make up a DNA sequence (A, C, G, and T) also play much the same role as the 1s and 0s of digital information: with four possible symbols, each base can in principle carry two bits.
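To make the comparison between bases and bits concrete, here is a minimal sketch of how bytes could be mapped onto bases. The two-bits-per-base table (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode_to_dna` and `decode_from_dna` are assumptions chosen for illustration; they are not the scheme used by any particular DNA storage system, and practical encodings add constraints that this sketch ignores.

```python
# Illustrative only: a naive 2-bits-per-base mapping (an assumption, not a real system's scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Turn each byte into 8 bits, then each pair of bits into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_to_dna(b"Hi")
print(strand)                    # four bases per byte
print(decode_from_dna(strand))   # round-trips to the original bytes
```

Each byte becomes four bases under this mapping, so a one-megabyte photo would need roughly four million bases before any redundancy is added.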

This means that DNA has several key advantages over more familiar storage media. One is that DNA can be used to store data without the use of electricity. Given that energy costs are among the highest overheads for most data centers today, that makes DNA storage very attractive from an economic perspective. 

Second, the world is projected to produce about 175 zettabytes of data by 2025. To be more precise, that is 175 followed by no fewer than 21 zeros, counted in bytes. By some estimates, DNA storage could retain all of that data in less volume than a golf ball. With statistics like that, it's no wonder that scientists have long looked to DNA as a potential storage medium.
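The scale of that figure is easy to check with a few lines of arithmetic. The sketch below assumes the decimal definition of a zettabyte (10^21 bytes) and an idealized density of two bits per base; the variable names are illustrative, and real systems would need considerably more bases to cover redundancy and addressing.

```python
# Back-of-the-envelope arithmetic; assumes 1 zettabyte = 10**21 bytes (decimal definition).
ZETTABYTE = 10**21
total_bytes = 175 * ZETTABYTE

# Count the zeros that follow "175" when the number is written out in full.
digits = str(total_bytes)
trailing_zeros = len(digits) - len(digits.rstrip("0"))

# Idealized assumption: every base stores exactly 2 bits, with no redundancy.
BITS_PER_BASE = 2
bases_needed = total_bytes * 8 // BITS_PER_BASE

print(f"{total_bytes:,} bytes")
print(f"zeros after 175: {trailing_zeros}")
print(f"bases needed at 2 bits per base: {bases_needed:.2e}")
```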

The Challenges of DNA Storage

DNA data storage systems already exist, and the techniques needed to encode data in this way are well understood. But, as with any new technology, DNA storage has some daunting technical challenges to overcome before it can go mainstream.

Some of these challenges are social. Some people worry about using a biological molecule, and indeed the molecule that makes us who we are, to store Facebook messages. Doing so not only runs the risk of releasing rogue DNA into the world, but it also compounds some of the fears around the dangers of Big Data.

However, there are more practical challenges as well. At the moment, the sheer cost of DNA storage puts it out of reach for most organizations. This is not because the data costs a lot of money to keep: once it is encoded into a DNA molecule, it can be stored quite easily. Instead, the cost comes at the point of turning data into DNA. The techniques and machinery required to take a digital photograph and turn it into a DNA strand are currently several times more expensive than comparable, completely digital technologies.

Then there are a number of more technical challenges that relate to the unique ways in which data is written to DNA strands. Without getting too deep into these issues, it's enough to say that if even small errors creep in when data is written to a DNA strand, they can easily make the whole data set unreadable.
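One general way to guard against such errors is redundancy. The sketch below is a deliberately simple illustration, not a description of how any DNA storage system actually works: it assumes several reads of the same strand are available and that errors are single-base substitutions, then takes a majority vote at each position. The function name `majority_vote` is hypothetical, and insertions or deletions, which shift every later position, are not handled here.

```python
from collections import Counter


def majority_vote(reads: list[str]) -> str:
    """Rebuild a strand by taking the most common base at each position.

    Assumes all reads have the same length and that errors are
    substitutions only (an illustrative simplification).
    """
    if len({len(read) for read in reads}) != 1:
        raise ValueError("all reads must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*reads)
    )


# Three reads of the same strand; the second has one substituted base.
reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC"]
print(majority_vote(reads))
```

With three reads, a single bad base at any position is outvoted by the two correct copies; the price is storing or sequencing the same strand several times.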

The Future of DNA Storage

Real progress is, however, being made in overcoming these issues. Much of this work focuses on making data stored as DNA easier to work with, either by improving the labeling of files or by letting users rely on tools familiar from digital storage interfaces. New labeling techniques, for instance, promise to make the technology more approachable for smaller organizations.

Similarly, Mark Bathe, a professor of biological engineering at the Massachusetts Institute of Technology, told DTNext recently that his team was looking at ways of storing individuals’ social media data in DNA strands, opening up new possibilities for a medium which is currently mostly used to store “hard” data for scientific purposes.

Another team, from North Carolina State University, has developed a way of generating previews of files stored in this way.

“The advantage to our technique is that it is more efficient in terms of time and money,” Kyle Tomek, lead author of a paper on the work and a Ph.D. student at NC State, told Science Daily. “If you are not sure which file has the data you want, you don’t have to sequence all of the DNA in all of the potential files. Instead, you can sequence much smaller portions of the DNA files to serve as previews.”

This new research is certainly exciting, but it is unlikely to remove the economic hurdles that currently affect DNA data storage. For all its promise, the technology is still too expensive to be used by most companies, most of the time. This means we may still be several decades away from mainstream adoption.

That said, there have always been different types of file storage, each used by a different type of organization and each suitable for a different type of data. Because of this, it’s likely that DNA data storage will soon see niche usage wherever huge amounts of data need to be stored, and cost is not a constraint.

Nahla Davies
Nahla Davies is a software developer and writer. Before devoting her work full time to technical writing, she managed, among other intriguing things, to serve as a lead programmer at an Inc. 5000 experiential branding organization whose clients include Samsung, Time Warner, Netflix, and Sony.
