Storing information in DNA: Improving DNA storage with nanoscale electrode wells

Synthetic DNA is an attractive medium for long-term data storage because of its density, ease of copying, longevity and sustainability. Research in the field has recently advanced with new encoding algorithms, automation, preservation and sequencing. Nevertheless, the most challenging hurdle in deploying DNA storage remains write throughput, which limits how much data can be stored in practice. In a new report, Bichlien H. Nguyen and a team of scientists at Microsoft Research and in computer science and engineering at the University of Washington, Seattle, U.S., developed the first nanoscale DNA storage writer. The team intended to scale the DNA write density to 25 × 10^6 sequences per square centimeter, a higher density than existing DNA synthesis arrays offer. The scientists successfully wrote and decoded a message in DNA to establish a practical DNA data storage system. The results are now published in Science Advances.

Long-term DNA archives

The current pace of data generation exceeds existing storage capacities. DNA is a promising solution to this problem, with an expected practical density of more than 60 petabytes per cubic centimeter. The material is durable under a range of conditions, remains relevant and is easy to copy, with the promise of being more sustainable, or greener, than commercial media. To store data, digital information in the form of sequences of bits is encoded in sequences of the four natural DNA bases (guanine, adenine, thymine and cytosine), although additional bases are also possible. The sequences are then written into molecular form via de novo DNA oligonucleotide synthesis, which creates specific molecules through a set of repeating chemical steps. The resulting oligonucleotides can be preserved and stored after synthesis. To access the data, the stored DNA is amplified using polymerase chain reaction (PCR) and sequenced to return the DNA base sequences to the digital domain. The base sequences are then decoded to recover the original sequence of bits.
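The encode and decode steps described above can be illustrated with a minimal sketch. It assumes the simplest possible mapping of two bits per base (00 to A, 01 to C, 10 to G, 11 to T); this mapping and the function names are illustrative assumptions, not the encoding used in the study. Practical schemes typically add error correction and other constraints on the sequences.

```python
# Assumed two-bits-per-base mapping: index 0..3 -> A, C, G, T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn bytes into a DNA base sequence, four bases per byte."""
    bases = []
    for byte in data:
        # Read the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(sequence: str) -> bytes:
    """Turn a DNA base sequence back into the original bytes."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[i:i + 4]:
            # BASES.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

In this toy scheme a 100-base strand would carry 25 bytes, before any overhead for addressing or error correction.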

A new method for synthetic DNA data storage

In this study, Nguyen et al. produced an electrode array that demonstrated independent, electrode-specific control of DNA synthesis at electrode sizes and pitches that enable a synthesis density of 25 million oligonucleotides per cm². This value is estimated as the electrode density required to achieve the minimum target of kilobytes per second of data storage in DNA. The team pushed the state of the art in electronic-chemical control and provided experimental evidence for the write bandwidth necessary for DNA data storage.
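As a rough sanity check on that density figure, the sketch below converts features per square centimeter into a center-to-center spacing. It assumes the electrodes sit on a uniform square grid, which is an assumption made here for illustration.

```python
import math

NM_PER_CM = 1e7  # 1 cm = 10^7 nm


def pitch_nm(density_per_cm2: float) -> float:
    """Center-to-center spacing in nm, assuming a uniform square grid."""
    area_per_feature_nm2 = NM_PER_CM ** 2 / density_per_cm2
    return math.sqrt(area_per_feature_nm2)


if __name__ == "__main__":
    # 25 million features per cm^2, the density quoted in the article.
    print(f"{pitch_nm(25e6):.0f} nm")  # 2000 nm, i.e. 2 micrometers
```

Under that assumption, 25 million features per cm² corresponds to one feature every 2 micrometers in each direction, so the wells themselves must be smaller than that.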

The team introduced a proof-of-concept molecular controller in the form of a tiny DNA storage writing mechanism on a chip. The chip could pack DNA synthesis spots three orders of magnitude more densely than before to achieve greater DNA writing throughput. Storing information in DNA at the scale necessary for commercial use requires two crucial processes. First, digital bits (ones and zeros) must be translated into strands of synthetic DNA that represent those bits, using encoding software and a DNA synthesizer. Then the information must be read and decoded back into bits, using a DNA sequencer and decoding software, to recover it in digital form.

Developing electrochemical arrays for nanoscale features

During the traditional synthesis of DNA chains, scientists use a multistep method known as phosphoramidite chemistry, in which a DNA chain is grown sequentially by the addition of DNA bases. Each incoming DNA base carries a blocking group that prevents multiple additions to the growing chain. After a base attaches to the chain, acid is delivered to cleave the blocking group and prime the chain for the next base. In electrochemical DNA synthesis, each spot in the array contains an electrode. When a voltage is applied, acid is generated at the working electrode (anode) to deblock the growing DNA chains, while an equivalent amount of chemical base (as opposed to a DNA base) is generated at the counter electrode (cathode). The team prevented acid diffusion by designing an electrode array in which each working electrode, where acid forms during DNA synthesis, was sunk in a well and surrounded by four common counter electrodes, i.e., cathodes that drove base formation, to confine the acid to specific regions. Nguyen et al. verified the effectiveness of the design using finite element analysis. During the experiments, when present in sufficient concentration, the acid deblocked the surface-bound nucleotides to allow the next nucleotide to couple. Using chips with feature spots designed to confine the acid, they developed electrochemical arrays with four individual electrodes to regulate DNA synthesis. The team then performed experiments with two fluorescently labeled bases, in green and red. As proof of concept, they showed the device's capacity to write data by synthesizing four unique DNA strands, each 100 bases long and carrying an encoded message, without errors.
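The logic of electrode-addressed synthesis described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the team's control software: chemistry is treated as error-free, acid is assumed to stay perfectly confined to activated spots, and the function name and the fixed A, C, G, T flow order are hypothetical choices made for illustration.

```python
def synthesize(targets):
    """Grow one strand per electrode, returning (strands, coupling_steps).

    Each coupling step models: apply voltage to selected electrodes
    (acid deblocks only those spots), then flow one DNA base, which
    couples only where the chain was deblocked.
    """
    for target in targets:
        if set(target) - set("ACGT"):
            raise ValueError(f"invalid base in target: {target!r}")

    strands = [""] * len(targets)
    steps = 0
    while any(len(s) < len(t) for s, t in zip(strands, targets)):
        for base in "ACGT":  # assumed flow order of the four bases
            # Activate only electrodes whose next required base is `base`.
            active = [
                i
                for i, (s, t) in enumerate(zip(strands, targets))
                if len(s) < len(t) and t[len(s)] == base
            ]
            if not active:
                continue  # no spot needs this base right now; skip the flow
            for i in active:
                strands[i] += base  # deblocked chains gain exactly one base
            steps += 1
    return strands, steps


if __name__ == "__main__":
    wanted = ["ACGT", "TTAA", "GAC"]
    grown, steps = synthesize(wanted)
    print(grown == wanted, steps)
```

Because every spot shares the same reagent flow, the only thing that makes each strand unique in this model is which electrodes are switched on at each step, which is why confining the acid to a single well matters.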

Outlook: Synthesizing short oligonucleotides on the electrode array for data storage

Using the setup, Nguyen et al. also demonstrated spatially controlled synthesis of short oligonucleotides on the electrode array to assess the maximum length of DNA that could be formed. The scientists created a single DNA sequence of 180 nucleotides and PCR-amplified products of various lengths from the full-length oligonucleotides. As the amplicons got longer, the expected PCR products appeared fainter and less well defined, indicative of higher synthesis errors, while shorter amplicons showed stronger and better-defined bands. Based on the results, the researchers selected a sequence length of 100 bases for ease of purification, to provide a practical demonstration of DNA data storage without further optimization. In this way, the proof-of-concept method demonstrated by Bichlien H. Nguyen and colleagues paves the way to generating large numbers of unique DNA sequences in parallel for data storage. The work outpaced previous reports on dense synthetic DNA sequences and provides a first experimental indication that the write bandwidth required for data storage can be achieved at nanoscale feature sizes. The scientists expect immediate applications of the devices in information technology and foresee practical applications in materials science, synthetic biology and large-scale molecular biology assays.
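One general reason longer products tend to be scarcer is that errors compound with every added base. The sketch below uses the standard simplification that each coupling step succeeds independently with the same probability. The efficiency values are assumed for illustration only and are not measurements from the study.

```python
def full_length_fraction(length: int, step_efficiency: float) -> float:
    """Expected fraction of strands that reach full length.

    Simplification: one coupling per base, each succeeding independently
    with probability `step_efficiency`.
    """
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("step_efficiency must be between 0 and 1")
    return step_efficiency ** length


if __name__ == "__main__":
    # Hypothetical per-step efficiencies, chosen only to show the trend.
    for efficiency in (0.98, 0.99):
        for length in (100, 180):
            fraction = full_length_fraction(length, efficiency)
            print(f"efficiency={efficiency}, length={length}: {fraction:.3f}")
```

Whatever the true per-step efficiency, this model always predicts a smaller full-length fraction at 180 bases than at 100, which is consistent with the fainter bands seen for longer amplicons.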