DNA may soon do more than carry our genetic material: it could become the place where we store our personal digital data, as well as huge data archives. Research teams on both sides of the Atlantic have shown that data can be stored in DNA and read back again, but it isn’t quite as easy as plugging in your USB stick…
How does it work?
Here comes the science, so the faint of heart should probably skip this section. It is, like we said, pretty complicated stuff, so don’t say we didn’t warn you if you fall asleep halfway through. :)
DNA is made up of small units called nucleotides. Each one consists of a sugar-phosphate “backbone” and one of four nitrogenous bases, which pair up across the two strands: Adenine (A) with Thymine (T), and Cytosine (C) with Guanine (G). In the cell, DNA is packaged as chromatin (long DNA molecules wound around proteins), which is kept inside the cell nucleus. Here is a diagram of how it all fits together; I think most readers will recognise the double helix structure of DNA.
During their work, the researchers in both the United States and England faced a problem: cells are alive, so sooner or later they die. If our information were stored in the DNA of a living cell, it would die along with that cell. To cope with this problem, they developed a storage and retrieval system that requires no cell, just the DNA molecules. So your data does not depend on any cell staying alive.
This type of storage starts from binary data (a series of 0s and 1s), which is in turn translated into a ternary system (0, 1 and 2), and finally into the four-base system of DNA. The DNA is then synthesized in a laboratory. The following illustration describes the process more explicitly:
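For readers who prefer code to diagrams, here is a rough Python sketch of the binary → base-3 → DNA idea. It is a simplification, not the researchers' actual encoder: the fixed six base-3 digits per byte, the function names and the starting base "A" are all assumptions made for illustration. The one trick it does try to show is that each base-3 digit picks one of the three letters that differ from the previous letter, so the same DNA letter never appears twice in a row.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: fixed width, since 3**6 = 729 covers all 256 byte values


def byte_to_trits(value):
    """Convert one byte (0-255) into six base-3 digits, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        trits.append(value % 3)
        value //= 3
    return trits[::-1]


def trits_to_dna(trits, prev="A"):
    """Map each base-3 digit to a base that differs from the previous base."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]  # always exactly three options
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def dna_to_trits(dna, prev="A"):
    """Invert trits_to_dna; raises ValueError if a base repeats."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def encode(data):
    """bytes -> DNA string (illustrative only)."""
    trits = []
    for value in data:
        trits.extend(byte_to_trits(value))
    return trits_to_dna(trits)


def decode(dna):
    """DNA string -> bytes, reversing encode()."""
    trits = dna_to_trits(dna)
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for t in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    message = b"I have a dream"
    strand = encode(message)
    print(strand)
    print(decode(strand) == message)
```

Six DNA letters per byte is wasteful compared with a smarter code, but it keeps the sketch short and makes the round trip easy to check.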
Have we lost you yet? No? OK, let’s carry on…
How to use the synthesized DNA
Reading a DNA fragment is relatively simple. Writing data to DNA, however, comes with two major limitations. First, with current technology it is only possible to manufacture short chains of DNA. Second, both reading and writing are prone to errors, especially when the same DNA letter is repeated several times in a row.
Working in collaboration, the Californian company Agilent Technologies and the European Bioinformatics Institute (EMBL-EBI) in England have developed a data indexing system based on DNA fragmentation. Information is stored only in numerous short DNA fragments, which means the data is stored redundantly and errors are easier to catch. Think of it like a DNA RAID.
To demonstrate this new method, the team managed to archive Martin Luther King’s famous “I have a dream” speech, a text file of Shakespeare’s sonnets and a PDF of an article about the double helix structure of DNA by Watson and Crick. They managed to archive this 5.27 MB of data into a billionth of a gram of DNA. The sample was then shipped overseas, and researchers decoded the data with an accuracy of 100%, using a standard DNA sequencer.
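To make the "DNA RAID" comparison a little more concrete, here is a small Python sketch of storing a sequence as short, indexed, overlapping fragments and rebuilding it by majority vote. The fragment length, the step size and the function names are made up for the example and are not the values or code used by the researchers; overlapping the fragments is simply one assumed way to get the redundancy described above.

```python
from collections import Counter


def fragment(seq, length=8, step=2):
    """Cut seq into overlapping (offset, piece) pairs; the offset acts as the index."""
    if len(seq) <= length:
        return [(0, seq)]
    frags = [(i, seq[i:i + length]) for i in range(0, len(seq) - length + 1, step)]
    last = len(seq) - length
    if frags[-1][0] != last:  # make sure the tail of the sequence is covered
        frags.append((last, seq[last:]))
    return frags


def reassemble(frags):
    """Rebuild the sequence by majority vote at each position."""
    total = max(offset + len(piece) for offset, piece in frags)
    votes = [Counter() for _ in range(total)]
    for offset, piece in frags:
        for j, base in enumerate(piece):
            votes[offset + j][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)


if __name__ == "__main__":
    original = "ACGTACGTTGCAAGCT"
    pieces = fragment(original)

    # Simulate a read error in the middle of one fragment.
    offset, piece = pieces[2]
    pieces[2] = (offset, piece[:3] + "A" + piece[4:])

    print(reassemble(pieces) == original)
```

In this toy version the letters at the very start and end of the sequence are covered by fewer fragments than the ones in the middle, so they get less protection; a real scheme would have to deal with that.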
To put those numbers into context, that 5.27 MB of storage equates to around 640,000 GB of storage per gram of DNA. That’s a lot of space!
So how can I do it?
Reading DNA requires a sequencer. Once the DNA has been sequenced, a computer processes the result to recover the original information. This process is simple enough (with the right training), but very expensive. According to some research, however, the cost of sequencing could fall by a factor of 20 over the next decade, making DNA storage economically viable for businesses at least, if not for home users as well.
DNA storage requires absolutely no power whatsoever (once the data is stored), and it can pack millions of terabytes of data into a space smaller than a coffee cup. This really could be a viable option for data retention in the future.
About Kevin François Bile Ebelle
Kevin Francois is a medical student in Cameroon. He is a geek, and is passionate about everything related directly or indirectly to open source. You can find him on Google+.









