Scientists have discovered how to pack nearly the maximum amount of data into a single nucleotide.
Our data-driven society is churning out more information than traditional storage technology can handle, so scientists are looking for a solution in Nature’s hard drive: DNA. A pair of researchers at Columbia University and the New York Genome Center recently wrote a full computer operating system, an 1895 French film, an Amazon gift card and other files into DNA strands and retrieved them without errors, according to a study published in the latest edition of Science.
There are several advantages to using DNA. It’s far more compact than traditional media: a single gram can hold 215,000 times more data than a one-terabyte hard drive, The Atlantic notes. It’s also incredibly durable. Scientists are using DNA that is thousands of years old in efforts to de-extinct woolly mammoths, for example. But, until now, they’ve only unlocked a fraction of its storage capacity. Study coauthors Yaniv Erlich and Dina Zielinski were able to pack close to the theoretical maximum amount of information per nucleotide using a new method inspired by how movies stream across the internet.
“We mapped the bits of the files to DNA nucleotides. Then, we synthesized these nucleotides and stored the molecules in a test-tube,” Erlich explained in an interview with ResearchGate. “To retrieve the information, we sequenced the molecules. This is the basic process. To pack the information, we devised a strategy—called DNA Fountain—that uses mathematical concepts from coding theory. It was this strategy that allowed us to achieve optimal packing, which was the most challenging aspect of the study.”
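The first step Erlich describes, mapping bits to nucleotides, can be illustrated with a minimal Python sketch. Because DNA has four bases, each base can stand for two bits. The particular table below (00 to A, 01 to C, 10 to G, 11 to T) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made for illustration, not details given in the interview.

```python
# Assumed two-bits-per-base table; the article does not specify the mapping.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as 8 bits, then turn every pair of bits into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # CAGACGGC
print(dna_to_bytes(strand))  # b'Hi'
```

Each byte becomes four bases, so the two-byte message "Hi" turns into an eight-base strand. A real system also has to avoid sequences that are hard to synthesize or read, which this sketch ignores.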
When an online streaming service like Netflix sends information, it uses fountain codes, which partition data into small packets. Even if a few packets are lost, Netflix can reconstruct the entire stream. DNA storage faces a similar problem: scientists can only create and sequence DNA in small pieces, so large amounts of data must be broken up, and some of those pieces can be lost. Another downside to DNA? It breaks down during sequencing, so stored copies are used up the more the data is read. Luckily, DNA is easy to replicate.
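To show how a fountain code survives lost packets, here is a simplified Python sketch in the style of a Luby transform code. It is not the study's DNA Fountain implementation. The helper names (`segment_indices`, `make_droplet`, `decode`), the degree distribution, and the demo data are all assumptions chosen for illustration. Each "droplet" is the XOR of a pseudo-random subset of data segments, and the seed that picked the subset travels with it so the receiver can work out which segments were mixed.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def segment_indices(seed: int, k: int) -> list:
    """Derive from the seed which of the k segments a droplet combines."""
    rng = random.Random(seed)
    # Ideal soliton distribution (an assumption): mostly small subsets.
    weights = [1 / k] + [1 / (d * (d - 1)) for d in range(2, k + 1)]
    degree = rng.choices(range(1, k + 1), weights=weights)[0]
    return rng.sample(range(k), degree)


def make_droplet(segments: list, seed: int) -> tuple:
    """XOR the chosen segments together; ship the seed with the payload."""
    payload = bytes(len(segments[0]))
    for i in segment_indices(seed, len(segments)):
        payload = xor_bytes(payload, segments[i])
    return seed, payload


def decode(droplets: list, k: int):
    """Peeling decoder: returns the k segments, or None if too few droplets."""
    pending = [[set(segment_indices(seed, k)), payload] for seed, payload in droplets]
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        for entry in pending:
            indices = entry[0]
            # Remove segments we already know from this droplet.
            for i in list(indices):
                if i in recovered:
                    indices.discard(i)
                    entry[1] = xor_bytes(entry[1], recovered[i])
            # A droplet with one unknown left reveals that segment.
            if len(indices) == 1:
                recovered[indices.pop()] = entry[1]
                progress = True
    if len(recovered) < k:
        return None
    return [recovered[i] for i in range(k)]


data = b"An 1895 French film, in 32 bytes"
segments = [data[i:i + 4] for i in range(0, len(data), 4)]  # 8 segments
droplets = [make_droplet(segments, seed) for seed in range(200)]
survivors = [d for n, d in enumerate(droplets) if n % 3 != 0]  # lose a third
restored = decode(survivors, len(segments))
print(b"".join(restored) == data)
```

The decoder does not care which droplets arrive, only that enough of them do. That property is what makes the approach attractive when individual DNA fragments can go missing.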
Although companies like Microsoft are currently looking into DNA as a storage option, Erlich estimates it’ll be more than a decade before it goes mainstream. “We are still in early days, but it also took magnetic media years of research and development before it became useful.”