Hello! The world's first automatic storage of data in DNA molecules

Researchers from Microsoft and the University of Washington have demonstrated the first fully automated system for storing and reading data in synthetic DNA. This is a key step in moving the technology from research labs to commercial data centers.

The developers validated the concept with a simple test: they encoded the word "hello" in fragments of synthetic DNA and converted it back into digital data using a fully automated end-to-end system, which is described in a paper published March 21 in Nature Scientific Reports.


DNA molecules can store digital information at very high density, that is, in a physical space many orders of magnitude smaller than the space modern data centers occupy. It is one of the promising solutions for storing the vast amount of data the world generates every day, from business records and videos of cute animals to medical and space images.

Microsoft is exploring ways to close the potential gap between the amount of data we produce and want to preserve and our capacity to store it. These methods include developing algorithms and molecular computing technologies for encoding data in synthetic DNA. This would allow all the information stored in a large modern data center to fit into a space roughly the size of a few dice.

"Our main goal is to launch a system that, to the end user, looks almost the same as any other cloud storage service: information is sent to the data center and stored there, and then it just appears when the customer needs it," says Microsoft senior researcher Karin Strauss. "To do this, we needed to prove that it makes practical sense from an automation point of view."

The information is stored in synthetic DNA molecules created in a laboratory, not in the DNA of humans or other living beings, and can be encrypted before being sent to the system. While complex machines such as synthesizers and sequencers already perform key parts of the process, many of the intermediate steps have so far required manual labor in the research lab. "It's not suitable for commercial use," said Chris Takahashi, senior research scientist at the University of Washington's Paul G. Allen School of Computer Science & Engineering.

"You can't have people with pipettes running around a data center. That approach is too prone to human error, too expensive and takes up too much space," Takahashi explained.

For this method of data storage to be commercially viable, the cost of both DNA synthesis (creating the fundamental building blocks with meaningful sequences) and the sequencing needed to read the stored information must come down. The researchers say things are moving rapidly in that direction.

Automation is another key piece of the puzzle to enable commercial-scale storage and make it more accessible, according to Microsoft researchers.

Under the right conditions, DNA can last much longer than modern archival storage media, which degrade within decades. Some DNA has survived in less-than-ideal conditions for tens of thousands of years, in mammoth tusks and in the bones of early humans. This means data could be stored this way for as long as humanity exists.

The automated DNA data storage system uses software developed by Microsoft and the University of Washington (UW). It converts the ones and zeros of digital data into sequences of nucleotides (A, T, C, and G), the "building blocks" of DNA. The system then uses inexpensive, mostly off-the-shelf laboratory equipment to feed the necessary fluids and reagents into a synthesizer, which builds the synthetic DNA fragments and places them in a storage vessel.
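
The article does not say which encoding the Microsoft and UW software uses. As a rough illustration of what turning bits into nucleotides can look like, here is a minimal sketch that assumes the simplest possible scheme: two bits per nucleotide. The mapping table and the function names `encode` and `decode` are hypothetical, and a real system would presumably add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical 2-bits-per-nucleotide mapping, for illustration only.
# This is NOT the encoding used by the Microsoft/UW system.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"hello")
print(strand)           # 5 bytes become 20 nucleotides
print(decode(strand))   # round-trips back to b'hello'
```

Under this toy scheme each byte becomes four nucleotides, so the five-letter test word occupies a strand of 20 bases.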

When the system needs to extract information, it adds other chemicals to properly prepare the DNA and uses microfluidic pumps to push fluids into the parts of the system that read the DNA sequences and convert them back into computer-readable information. The researchers say the goal of the project was not to prove the system could be fast or cheap, but simply to show that automation was possible.

One of the most obvious benefits of an automated DNA storage system is that it frees scientists to work on complex problems instead of spending time hunting for bottles of reagents or tediously adding drops of liquid to test tubes.

"Having an automated system do the repetitive work allows labs to focus directly on research and develop new strategies to innovate faster," said Microsoft researcher Bichlien Nguyen.

The team from the Molecular Information Systems Lab (MISL) has already demonstrated that it can store cat pictures, great literary works, videos and archival recordings in DNA and retrieve these files without errors. To date, they have been able to store 1 gigabyte of data in DNA, beating the previous world record of 200 MB.

The researchers have also developed methods for performing meaningful computation, such as finding and retrieving only the images that contain an apple or a green bike, using the molecules themselves, without converting the files back to digital format.

"It is safe to say that we are witnessing the birth of a new type of computer system that uses molecules to store data and electronics for control and processing. This combination opens up very interesting possibilities for the future," said Luis Ceze, a professor at the University of Washington's Allen School.

Unlike silicon-based computing systems, DNA-based storage and computing systems must use fluids to move molecules around. But liquids are inherently different from electrons and require completely new technical solutions.

The University of Washington team, in collaboration with Microsoft, is also developing a programmable system that automates laboratory experiments by using the properties of electricity and water to move droplets across a grid of electrodes. The full stack of software and hardware, called Puddle and PurpleDrop, can mix, separate, heat or cool various liquids and run laboratory protocols.

The goal is to automate laboratory experiments currently performed manually or by expensive fluid-handling robots and reduce costs.

The next steps for the MISL team include integrating the simple end-to-end automated system with technologies such as PurpleDrop, as well as other technologies that enable searching within DNA molecules. The researchers deliberately made their automated system modular so that it can evolve as new technologies for DNA synthesis, sequencing and manipulation emerge.

"One of the advantages of this system is that if we want to replace one of the parts with something new, better or faster, we can just plug in the new part," Nguyen said. "This gives us more flexibility for the future."

Top image: Researchers from Microsoft and the University of Washington wrote and read back the word "hello" using the first fully automated DNA data storage system. This is a key step in moving the technology from research labs to commercial data centers.

Source: habr.com
