Harvard stores 70 billion books using DNA

Research team stores 5.5 petabits, or about 5.5 million gigabits, per cubic millimeter in DNA storage medium

By Lucas Mearian, Computerworld | Storage, Harvard

Last year, Keio University Institute for Advanced Biosciences and the Keio University Shonan Fujisawa Campus announced that researchers there used artificial DNA to carry more than 100 bits of data within the genome sequence.

The Japanese universities said they successfully encoded "E=mc2 1905!" -- Einstein's theory of relativity and the year he enunciated it -- in the common soil bacterium Bacillus subtilis.

The Harvard researchers used the four DNA nucleobases - adenine (A), cytosine (C), guanine (G) and thymine (T) - as binary markers. The A and C stand for the digit 0 and the T and G represent the digit 1, according to Kosuri.
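The mapping Kosuri describes can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual software: the article only says that A and C stand for 0 and that T and G stand for 1, so the rule used here to pick between the two candidate bases (avoid repeating the previous base) is an assumption, and the function names are hypothetical.

```python
# Bases that stand for each binary digit, per the article.
ZERO_BASES = "AC"
ONE_BASES = "TG"


def text_to_bits(text: str) -> str:
    """Turn text into a string of 0s and 1s, 8 bits per UTF-8 byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_dna(bits: str) -> str:
    """Encode a bit string as DNA bases.

    Assumption (not from the article): of the two bases available for a
    bit, pick the first one that differs from the previous base, so the
    same base never appears twice in a row.
    """
    strand = []
    prev = ""
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        candidates = ZERO_BASES if bit == "0" else ONE_BASES
        base = candidates[0] if candidates[0] != prev else candidates[1]
        strand.append(base)
        prev = base
    return "".join(strand)


def dna_to_bits(strand: str) -> str:
    """Decode DNA bases back into bits: A or C -> 0, T or G -> 1."""
    bits = []
    for base in strand:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)
```

Because each bit has two possible bases, decoding is unambiguous no matter which base the encoder picks: `dna_to_bits(bits_to_dna(text_to_bits("DNA")))` returns the same bit string that `text_to_bits("DNA")` produced.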

And where some experimental media -- like quantum holography -- require temperatures approaching absolute zero (-273 degrees Celsius) and tremendous energy, DNA is stable at room temperature, the researchers noted. "You can drop it wherever you want, in the desert or your backyard, and it will be there 400,000 years later," Church said.

Unlike earlier researchers, Church said his team was able to use commercial DNA microchips to create standalone DNA.

"We purposefully avoided living cells," Church said. "In an organism, your message is a tiny fraction of the whole cell, so there's a lot of wasted space. But more importantly, almost as soon as a DNA goes into a cell, if that DNA doesn't earn its keep, if it isn't evolutionarily advantageous, the cell will start mutating it, and eventually the cell will completely delete it."

In another departure from earlier research, the team rejected so-called "shotgun sequencing," which reassembles long DNA sequences by identifying overlaps in short strands.

Instead, the Harvard team took its cue from information technology, and encoded the book in 96-bit data blocks, each with a 19-bit address to guide reassembly. Including JPEG images and HTML formatting, the code for the book required 54,898 of these data blocks, each a unique DNA sequence.
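The addressing scheme can be illustrated with a short Python sketch. Only the block sizes (96 data bits, 19 address bits) come from the article. The layout chosen here (address first, address equal to the block's position, zero padding on the last block) and the function names are assumptions made for illustration.

```python
DATA_BITS = 96      # payload size per block, from the article
ADDRESS_BITS = 19   # address size per block, from the article


def make_blocks(bits: str) -> list[str]:
    """Split a bit string into addressed blocks of 19 + 96 bits.

    Assumptions: the address is the block's index written in binary and
    placed before the payload; a short final payload is padded with 0s.
    """
    blocks = []
    for index, start in enumerate(range(0, len(bits), DATA_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("too much data for a 19-bit address space")
        payload = bits[start:start + DATA_BITS].ljust(DATA_BITS, "0")
        address = format(index, f"0{ADDRESS_BITS}b")
        blocks.append(address + payload)
    return blocks


def reassemble(blocks: list[str]) -> str:
    """Rebuild the bit string from blocks received in any order."""
    ordered = sorted(blocks, key=lambda block: int(block[:ADDRESS_BITS], 2))
    return "".join(block[ADDRESS_BITS:] for block in ordered)
```

Because every block carries its own address, the blocks can be read back in any order and sorted into place, with no need to find overlaps between neighboring strands. By the article's figures, 54,898 blocks of 96 bits hold 5,270,208 bits of data, and a 19-bit address can distinguish up to 524,288 blocks.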

"We wanted to illustrate how the modern world is really full of zeroes and ones, not As through Zs alone," Kosuri said.

Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian, or subscribe to Lucas's RSS feed. His e-mail address is lmearian@computerworld.com.



Originally published on Computerworld.