Microsoft Makes Breakthrough in the Quest to Use DNA as Data Storage

The world has a data problem. Each day, we create another 2.5 million gigabytes of data, and each year, the amount of data produced globally increases exponentially. This puts us on a collision course with a serious problem: the rate at which we produce data is outpacing our ability to store it.

If every YouTube video we watched, every photo we snapped on our phones, and every document we saved were stored on traditional flash memory chips, it would consume 10 to 100 times the expected supply of silicon by 2040, Nature predicts.

What’s clear is that we need another way to store data, and not just any way. The data storage method of the future needs to be robust and dense. That is, the data currently being stored in data centers the size of football fields needs to fit in a much smaller vehicle. And that solution needs to transfer data quickly and store our most prized media for decades without letting it degrade.

Where are we looking to find this holy grail of data storage? In the molecule that houses our genetic information: DNA. Where hard drives use ones and zeros, DNA storage uses four chemical bases: adenine (A), guanine (G), cytosine (C) and thymine (T). Remember elementary school science class? These compounds connect in pairs (A to T; G to C) to create rungs on a double helix ladder. It turns out that you can translate ones and zeros into those four letters and store complex data as a sequence of DNA bases.
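To make the idea concrete, here is a minimal sketch of how ones and zeros could be translated into the four letters. The two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration, not the scheme Microsoft or MISL uses; real DNA storage systems add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical mapping: each pair of bits becomes one base.
# This is an illustrative assumption, not a published encoding scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round-trips to the original bytes
```

Under this mapping every byte becomes four bases, so a two-letter message turns into an eight-base strand.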

Microsoft, one of the pioneers of DNA storage, is making some headway, working with the University of Washington’s Molecular Information Systems Laboratory, or MISL. The company announced in a new research paper the first nanoscale DNA storage writer, which the research group expects to scale for a DNA write density of 25 x 10^6 sequences per square centimeter, or “three orders of magnitude” (1,000x) more tightly than before. What makes this particularly significant is that it’s the first indication of achieving the minimum write speeds required for DNA storage.

Microsoft is one of the biggest players in cloud storage and is looking at DNA data storage to gain an advantage over the competition by using its unparalleled density, sustainability, and shelf life. DNA is said to have a density capable of storing one exabyte, or 1 billion gigabytes, per square inch, an amount many orders of magnitude larger than what our current best storage method, Linear Tape-Open (LTO) magnetic tape, can provide.

What do these advantages mean in real-world terms? Well, the International Data Corporation predicts data storage demands will reach nine zettabytes by 2024. As Microsoft notes, only one zettabyte of storage would be used if Windows 11 were downloaded on 15 billion devices. Using current methods, that data would need to be stored on millions of tape cartridges. Cut the tape and use DNA, and nine zettabytes of information can be stored in an area as small as a refrigerator (some scientists say every movie ever released could fit in the footprint of a sugar cube). But perhaps a freezer would be a better analogy, because data stored on DNA can last for thousands of years, whereas data loss occurs on tape within 30 years and even sooner on SSDs and HDDs.
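The scale of these figures is easier to grasp with a little arithmetic. The sketch below uses the article's numbers (nine zettabytes of demand, one zettabyte across 15 billion devices) plus one outside assumption: a cartridge capacity of 18 terabytes, the native capacity of an LTO-9 tape. The constant and function names are made up for this example.

```python
ZETTABYTE = 10**21  # bytes
TERABYTE = 10**12   # bytes
GIGABYTE = 10**9    # bytes

# Assumption (not from the article): one LTO-9 cartridge, 18 TB native.
CARTRIDGE_BYTES = 18 * TERABYTE


def cartridges_needed(total_bytes: int, per_cartridge_bytes: int) -> int:
    """Number of whole cartridges needed, rounded up."""
    return -(-total_bytes // per_cartridge_bytes)  # ceiling division


def gigabytes_per_device(total_bytes: int, devices: int) -> float:
    """Average share of the total, in gigabytes, for each device."""
    return total_bytes / devices / GIGABYTE


if __name__ == "__main__":
    print(cartridges_needed(9 * ZETTABYTE, CARTRIDGE_BYTES))
    print(round(gigabytes_per_device(1 * ZETTABYTE, 15 * 10**9), 1))
```

Under that cartridge assumption, nine zettabytes works out to hundreds of millions of tapes, and Microsoft's comparison implies roughly 67 gigabytes per device.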

Finding ways to increase write speeds addresses one of the two main problems with DNA storage (the other being cost). With the minimum write speed threshold within grasp, Microsoft is already pushing ahead with the next phase.

“A natural next step is to embed digital logic in the chip to allow individual control of millions of electrode spots to write kilobytes per second of data in DNA, and we foresee the technology reaching arrays containing billions of electrodes capable of storing megabytes per second of data in DNA. This will bring DNA data storage performance and cost significantly closer to tape,” Microsoft told TechRadar.

As promising as this all sounds, we are many years away from storing data on DNA. Ignoring the technical complexities, DNA data storage is simply too expensive: a few megabytes would cost thousands of dollars, according to ETH Zurich’s Robert Grass, and the slow write speeds mean you wouldn’t want to use DNA for frequently accessed data. Still, researchers around the world are working to bring us closer to an era where the data we create is stored on the very molecule that contains our genetic information.