U.S. Intelligence Community Wants To Use DNA For Data Storage
JUNE 13, 2018
By Aaron Kesel
The U.S. intelligence community wants to unlock more efficient ways to store the troves of data humans generate every day, and it believes DNA may be the answer, NextGov reported.
The Intelligence Advanced Research Projects Activity last month issued a broad agency announcement seeking research teams for the agency’s Molecular Information Storage program, which aims to create a system for storing vast quantities of data on sequence-controlled polymers, like human DNA.
Selected teams would have two primary assignments over the four-year initiative: build a table-top device that writes data onto polymers and another that reads the information once it’s stored. Teams must also develop an operating system to index, access and search data within the network.
By the program’s end, the system must be able to write one terabyte and read 10 terabytes per day, and “present a clear and commercially viable path to future deployment at the exabyte scale” within 10 years, according to IARPA.
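To get a feel for the gap between those daily targets and the exabyte scale, here is a minimal back-of-the-envelope sketch. It assumes decimal units (1 exabyte = 1,000,000 terabytes) and a single device running nonstop at the target rates; the function name is purely illustrative and does not come from IARPA.

```python
# Rough scale check, assuming decimal units and one device running nonstop.
TB_PER_EB = 1_000_000  # 1 exabyte = 1,000,000 terabytes

def days_to_process(total_tb: float, tb_per_day: float) -> float:
    """Days needed to move total_tb at a steady tb_per_day rate."""
    return total_tb / tb_per_day

write_days = days_to_process(TB_PER_EB, 1)   # program target: write 1 TB/day
read_days = days_to_process(TB_PER_EB, 10)   # program target: read 10 TB/day

print(f"Writing 1 EB at 1 TB/day: {write_days / 365:,.0f} years")   # 2,740 years
print(f"Reading 1 EB at 10 TB/day: {read_days / 365:,.0f} years")   # 274 years
```

In other words, a single table-top device at those rates would be nowhere near exabyte scale, which is presumably why the announcement also asks for a path to future deployment rather than a finished product.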
Encoding into DNA isn’t a new concept; a group of researchers at the Swiss Federal Institute of Technology previously found a way to encode data onto DNA, the molecule that carries the genetic information of all living things, in a form that could survive for millennia. One gram of DNA can potentially hold up to 455 exabytes of data, according to New Scientist.
Another group of scientists at Harvard University encoded a book onto DNA in 2012, and that research has since evolved further as more and more scientists are attempting to use DNA as a storage medium.
While the hard drives in many desktop computers sold today store about one terabyte of information, other researchers at Harvard created a technique years ago that could store 700 terabytes on a single gram of DNA. Then another team advanced that research and pushed the limit, raising the capacity to 2,200 terabytes.
In 2015, another group of researchers at the University of Illinois, led by Professor Olgica Milenkovic, detailed a new system capable of storing 490 exabytes on a single gram, which is equal to 490 billion gigabytes!
Scientists create data-encoded DNA by taking advantage of its inherent coding language. DNA is made up of four chemicals, commonly known as A, C, G and T, that can stand in for the 1s and 0s we are already accustomed to using for data storage. As Quartz noted, it’s an incredibly efficient system:
One gram of DNA can potentially hold up to 455 exabytes of data, according to the New Scientist. For reference: There are one billion gigabytes in an exabyte, and 1,000 exabytes in a zettabyte. The cloud computing company EMC estimated that there were 1.8 zettabytes of data in the world in 2011, which means we would need only about 4 grams (about a teaspoon) of DNA to hold everything from Plato through the complete works of Shakespeare to Beyonce’s latest album (not to mention every brunch photo ever posted on Instagram).
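The idea of turning 1s and 0s into A, C, G and T can be sketched in a few lines. This is only an illustration: the mapping below (two bits per base, 00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for simplicity, not the scheme used by any of the research teams mentioned here, and real systems add error correction and avoid sequences that are hard to synthesize or read.

```python
# Illustrative mapping only: each base stands for two bits.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11

def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # walk the byte two bits at a time
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(strand: str) -> bytes:
    """Reverse encode(): every four bases become one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

strand = encode(b"Hi")
print(strand)           # CAGACGGC
print(decode(strand))   # b'Hi'
```

At two bits per base, a single byte needs only four bases, which hints at why the per-gram figures quoted above are so large.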
Today, exabyte-scale data centers take up masses of land and cost billions to operate every year, an infrastructure IARPA argues will no longer be feasible in the years to come. Tech firm Domo estimates that by 2020 more than 140 gigabytes of data will be generated daily for every person on Earth, and as the Internet of Things (IoT) expands, that number is only expected to grow. Another firm, EMC Corporation, projects that by 2020 the total amount of digital data produced will reach 40 trillion gigabytes!
“This resource intensive model does not offer a tractable path to scaling beyond the exabyte regime in the future,” IARPA wrote. “Faced with exponential data growth, large data consumers may soon face a choice between investing exponentially more resources in storage or discarding an exponentially increasing fraction of data.”
In February, the agency outlined its vision for creating an exabyte-scale storage unit that could be housed in a single room and cost less than $1 million to run per year. Though scientists have yet to build a system anywhere close to that level, studies have indicated that sequence-controlled polymers are capable of virtually error-free data storage, according to IARPA.
Researchers estimate DNA and similar polymers can store information more than 100,000 times more efficiently than traditional data storage technology, and polymers’ stable molecular structure allows them to last hundreds of years without losing or corrupting information. By comparison, CDs and DVDs have shelf lives of only about 25 years, according to the U.S. National Archives.
Earlier this year, scientists from ETH Zurich managed to encode Massive Attack’s late-1990s album Mezzanine across almost a million short DNA strands, which were encased in five thousand tiny glass beads, Sputnik reported.
As these researchers have shown, beginning with the Harvard team that stored 700 terabytes of data, there are far more efficient mediums for storing data than the ones we use today. Human DNA just might be the next frontier for data. To put that in perspective, 700 terabytes of data is equal to 14,000 50-gigabyte Blu-ray discs… in a single droplet of DNA that would fit on the tip of your pinky. To store the same amount of data on hard drives, you would need roughly 233 3TB drives, weighing a total of about 333 pounds (151 kilograms).
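For readers who want to check the disc and drive counts themselves, here is a quick sketch of the arithmetic. It assumes decimal units (1 terabyte = 1,000 gigabytes); the variable names are just for illustration.

```python
# Sanity check of the 700-terabyte comparison, assuming 1 TB = 1,000 GB.
DATA_TB = 700
GB_PER_TB = 1_000

bluray_discs = DATA_TB * GB_PER_TB // 50  # 50 GB per Blu-ray disc
drives_exact = DATA_TB / 3                # 3 TB per hard drive

print(f"Blu-ray discs: {bluray_discs:,}")      # 14,000
print(f"3TB hard drives: {drives_exact:.1f}")  # 233.3
```

The drive count comes out slightly above 233, so in practice a 234th drive would be needed to hold the last fraction of a terabyte.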