Research opens new horizons. Can our digital images and data be stored in DNA?

Scientists have shown that images and pages of text can be encoded as DNA, but an easy way to retrieve a desired file is still needed.

Our planet now holds about 10 trillion gigabytes of digital data, and every day people generate emails, photos, tweets and other digital files that add another 2.5 million gigabytes. Much of this data is stored in large facilities known as “exabyte data centers” (an exabyte is 1 billion gigabytes), which can be the size of a soccer field and cost about $1 billion to build and maintain.

Store all the data in the world in one cup of DNA

Many scientists believe an alternative lies in DNA, the molecule that carries our genetic information, which can be engineered to store large amounts of information at very high density. Mark Bathe, professor of biological engineering at the Massachusetts Institute of Technology (MIT), says that “a cup of coffee filled with DNA could theoretically store all the world’s data,” as technology.org recently reported, citing the institute's own account of the research.

“We need new solutions to store the vast amounts of data the world is accumulating, especially archival data,” explains Bathe. “DNA is a thousand times denser than flash memory, and once it has been made it does not consume any energy. You can write the DNA and then store it forever.”

Scientists have already shown that they can encode images and pages of text as DNA. What has been missing is an easy way to pick out a desired file from a mixture of many pieces of DNA, a problem that has long troubled researchers. Bathe and his colleagues have now demonstrated one way to do it: they encapsulate each data file in a 6-micrometer silica particle, which is labeled with short DNA sequences that reveal its contents.

Using this method, the researchers showed that they could accurately pull out individual images stored as DNA sequences, and the labeling scheme could in principle distinguish as many as 10^20 files.

Much of this data is stored in massive facilities known as “exabyte data centers” (Getty Images)

Stable storage

Digital storage systems encode text, images or any other kind of information as a string of 0s and 1s (bits). The same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G and C. For example, G and C can be used to represent the bit 0, while A and T represent the bit 1.
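The mapping described above can be sketched in a few lines of Python. This is only an illustration of the one-bit-per-nucleotide idea from the article, not the researchers' actual encoding. The function names `encode` and `decode` are hypothetical, and the way the sketch picks between the two bases available for each bit (a seeded random choice) is an assumption; a real scheme might use that freedom differently, for example to avoid long runs of the same base.

```python
import random

ZERO_BASES = "GC"  # either base stands for the bit 0 (mapping from the article)
ONE_BASES = "AT"   # either base stands for the bit 1


def encode(data: bytes, seed: int = 0) -> str:
    """Turn bytes into a DNA string, one nucleotide per bit."""
    rng = random.Random(seed)  # assumption: random pick between the two bases
    bases = []
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            bases.append(rng.choice(ONE_BASES if bit else ZERO_BASES))
    return "".join(bases)


def decode(dna: str) -> bytes:
    """Recover the original bytes from a string produced by encode()."""
    if len(dna) % 8 != 0:
        raise ValueError("length must be a multiple of 8 nucleotides")
    out = bytearray()
    for start in range(0, len(dna), 8):
        byte = 0
        for base in dna[start:start + 8]:
            if base not in "ACGT":
                raise ValueError(f"not a nucleotide: {base!r}")
            byte = (byte << 1) | (1 if base in ONE_BASES else 0)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(len(strand))     # 16 nucleotides for 2 bytes
print(decode(strand))  # b'Hi'
```

Because two bases share each bit value, decoding never needs to know which of the two was chosen; it only checks whether a base belongs to the "0" pair or the "1" pair.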

DNA has other advantages that make it desirable as a storage medium: it is very stable, and it is fairly easy (though expensive) to synthesize and sequence. Its high density also saves a great deal of space. Each nucleotide occupies only about 1 cubic nanometer, so an exabyte of data stored as DNA could fit in the palm of your hand instead of filling a facility the size of a soccer field.

One of the major obstacles to this type of storage is its cost: writing one petabyte (one million gigabytes) of data currently costs about a trillion dollars. To compete with magnetic tape, which is widely used today to store archival data, the cost will have to drop dramatically, and Bathe expects that to happen within a decade or two.
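To put that price in everyday terms, the short calculation below converts the article's figures into a cost per gigabyte and a cost for one day of the world's new data. It uses only the numbers quoted in this article (one trillion dollars per petabyte, one million gigabytes per petabyte, 2.5 million gigabytes of new data per day); the variable names are illustrative.

```python
GIGABYTES_PER_PETABYTE = 1_000_000          # as stated in the article
COST_PER_PETABYTE_USD = 1_000_000_000_000   # one trillion dollars, as stated
NEW_DATA_PER_DAY_GB = 2_500_000             # daily data growth, as stated

# Cost of writing a single gigabyte to DNA at today's quoted price.
cost_per_gigabyte = COST_PER_PETABYTE_USD // GIGABYTES_PER_PETABYTE

# Cost of writing one day of the world's new data at that price.
cost_per_day = cost_per_gigabyte * NEW_DATA_PER_DAY_GB

print(f"${cost_per_gigabyte:,} per gigabyte")  # $1,000,000 per gigabyte
print(f"${cost_per_day:,} per day of data")    # $2,500,000,000,000 per day of data
```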

The biggest obstacle researchers face

Apart from cost, the biggest obstacle to using DNA for data storage is the difficulty of finding the file you want among all the others.

“Suppose the cost becomes reasonable and we can store exabytes or zettabytes of data in DNA. What then? We will have a huge pile of data stored in DNA, and you need to find one specific movie or image,” says Bathe. “It’s like trying to find a needle in a haystack.”

Currently, DNA files are conventionally retrieved using the polymerase chain reaction (PCR). Each DNA data file includes a sequence that binds to a particular PCR primer, and that primer is added to the sample to find the desired sequence. One disadvantage of this approach, however, is possible cross-reactivity between the primer and other DNA sequences, which leads to unwanted files being pulled out.

The biggest obstacle the research team faced in using DNA to store data is the difficulty of finding the file you want among all the others (Getty Images)

What is the solution to this dilemma?

As an alternative, the MIT team developed a new retrieval technique that involves encapsulating each DNA file in a small silica capsule. Each capsule is labeled with single-stranded DNA “barcodes” that correspond to the contents of the file, so the barcodes effectively serve as the file's name.

To test the method, the researchers encoded and labeled 20 different images in pieces of DNA about 3,000 nucleotides long, which is equivalent to about 100 bytes.

The results were striking. To retrieve a file, primers that match the desired barcodes are added to a sample. The primers are labeled with fluorescent or magnetic particles, making it easy to pull out and identify the matching file while the rest of the DNA is left intact and returned to storage. This retrieval process supports searches such as “president, America, eighteenth century”, which return President George Washington, much as typing those words into the Google search engine would.
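In software terms, this kind of search behaves like a lookup in which a file matches only if it carries every requested label. The sketch below is a loose analogy under that assumption, not a model of the laboratory chemistry. The catalogue, its file names, and the `retrieve` function are hypothetical; only the George Washington query comes from the article.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical catalogue: capsule name -> the labels its barcodes stand for.
CAPSULES: Dict[str, FrozenSet[str]] = {
    "george_washington.jpg": frozenset({"president", "america", "18th century"}),
    "abraham_lincoln.jpg": frozenset({"president", "america", "19th century"}),
    "tiger.jpg": frozenset({"cat", "orange", "wild"}),
}


def retrieve(query: Iterable[str],
             capsules: Dict[str, FrozenSet[str]] = CAPSULES) -> List[str]:
    """Return the capsules that carry every label in the query (logical AND)."""
    wanted = set(query)
    return sorted(name for name, labels in capsules.items() if wanted <= labels)


print(retrieve(["president", "america", "18th century"]))
# ['george_washington.jpg']
print(retrieve(["president"]))
# ['abraham_lincoln.jpg', 'george_washington.jpg']
```

Adding a label to the query narrows the result, which is why a handful of well-chosen labels can single out one file in a very large collection.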

For their barcodes, the researchers used single-stranded DNA sequences from a library of 100,000 sequences, each about 25 nucleotides long, developed by Stephen Elledge, professor of genetics and medicine at Harvard Medical School.

If you put two of these labels on each file, you can uniquely name 10^10 (10 billion) different files, and with four labels on each you can uniquely name 10^20 files.
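These totals follow from simple counting: with a library of 100,000 barcodes, each additional label multiplies the number of possible names by 100,000. The sketch below reproduces the arithmetic. It assumes that label order matters and that a barcode may be repeated, which is the counting that matches the article's figures; the function name `nameable_files` is illustrative.

```python
import math

LIBRARY_SIZE = 100_000  # barcode sequences in the library, per the article


def nameable_files(labels_per_file: int, library_size: int = LIBRARY_SIZE) -> int:
    """Count ordered label combinations with repeats: library_size ** k."""
    return library_size ** labels_per_file


print(nameable_files(2) == 10**10)  # True: 10 billion names with two labels
print(nameable_files(4) == 10**20)  # True: 10^20 names with four labels

# Stricter assumption: two distinct labels whose order does not matter.
print(math.comb(LIBRARY_SIZE, 2))   # 4999950000, about half of 10^10
```

Under the stricter counting the capacity is smaller but of a similar order of magnitude, so the conclusion that a few labels per file are enough to name an enormous number of files still holds.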

A giant leap in search technology

George Church, professor of genetics at Harvard Medical School, describes this technology as “a giant leap in knowledge management and search technology.”

“Rapid progress in writing, copying, reading and low-energy archival storage of data in DNA has left largely unexplored the opportunity to accurately retrieve data files from huge, zetta-scale databases of up to 10^21 bytes,” says Church.

Church continues: “The new study addresses this spectacularly, using a completely independent outer layer of DNA and taking advantage of different properties of DNA, all with tools and chemistry that already exist.”

It may take a while for the cost of this method of storing digital data to come down, but the researchers expect it to fall within the coming decade or two.

The research team was led by Professor Mark Bathe and included MIT researcher James Banal, research associate Tyson Shepherd, and graduate student Joseph Berleant.
