Microsoft: What can DNA do for long (long)-term digital data storage?
Microsoft purchased 10 million long oligonucleotides – DNA or RNA molecules used for genetic testing and research – from San Francisco startup Twist Bioscience, and is using them to encode digital data.
Twist Bioscience uses a proprietary semiconductor-based synthetic DNA manufacturing process to create synthetic biology tools such as genes, oligonucleotide pools and variant libraries.
It's Microsoft's idea that DNA's resilience and unparalleled storage capacity could hold huge opportunity for housing digital data, which is big and getting bigger – more than 5 trillion gigabytes, according to some estimates, and doubling every two years.
The thought is that using DNA as an archival technology addresses two big limitations of digital storage media: limited lifespan and low density.
DNA data storage could last up to 2,000 years without deterioration, according to Twist Bioscience – and a single gram of DNA can store nearly a trillion gigabytes of digital information.
"We need new methods for long-term, secure data storage," said Doug Carmean, a Microsoft partner architect within the company’s Technology and Research organization, in a statement.
The hope is that Twist's scalable, silicon-based DNA synthesis technique offers an opportunity to harness DNA as a viable long-term storage medium.
"Today, the vast majority of digital data is stored on media that has a finite shelf life and periodically needs to be re-encoded," said Emily M. Leproust, CEO of Twist Bioscience.
"DNA is a promising storage media, as it has a known shelf life of several thousand years, offers a permanent storage format and can be read for continuously decreasing costs," she added. "We are thrilled to work with Microsoft, and University of Washington researchers, to address the growing challenge of digital data storage."
After translating a given data set into ACGT genetic code – the acronym refers to the four types of bases in a molecule of DNA, adenine, cytosine, guanine and thymine – Microsoft then asked Twist Bioscience to synthesize 10 million DNA strands using those letters.
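To make the translation step concrete, here is a minimal Python sketch of one naive way to map bytes onto the four bases, two bits per base. The mapping table and the `encode` and `decode` function names are assumptions for illustration only; the article does not describe the encoding Microsoft actually used, and a practical scheme would also need error correction and limits on repeated bases.

```python
# Hypothetical mapping: each pair of bits becomes one base.
# This is an illustrative assumption, not Microsoft's actual scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of A, C, G and T (four bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Recover the original bytes from a strand produced by encode()."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Under this simple mapping every byte becomes four bases, so the strand can be read back into exactly the bytes that went in.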
What kind of data? "We don’t have the decoder key, so I have no idea what it is," Twist CEO Emily Leproust told IEEE Spectrum.
"The initial test phase with Twist demonstrated that we could encode and recover 100 percent of the digital data from synthetic DNA," said Carmean. "We’re still years away from a commercially viable product, but our early tests with Twist demonstrate that in the future we’ll be able to substantially increase the density and durability of data storage."