When the Human Genome Project and the biotechnology company Celera Genomics announced the sequencing of the human genome two decades ago, the genome was not really complete: about 15% of it was still missing. Given the technological limitations of the time, scientists could not work out how certain stretches of DNA fit together, especially those with a large number of repeating letters (that is, base pairs). Over time, scientists solved some of this puzzle. However, the most recent human genome, which geneticists have used as a reference since 2013, still lacks 8% of the full sequence (see figure: ‘The Human Genome Completeness’).
More recently, researchers in the Telomere-to-Telomere (T2T) Consortium, whose membership includes 30 institutions, have filled in the gaps in the human genome. In a preprint posted on May 27, entitled “The Complete Sequence of a Human Genome,” Karen Miga, a genetics researcher at the University of California, Santa Cruz, and her colleagues reported that they had sequenced the remainder of the genome, in the process discovering about 115 new protein-coding genes, for a total of 19,969 (S. Nurk et al. Preprint at bioRxiv https://doi.org/gj8jk3; 2021).
“It’s exciting to find solutions to some aspects of the problem,” said Kim Pruitt, a bioinformatics specialist at the U.S. National Center for Biotechnology Information in Bethesda, Maryland, adding that the finding was a “huge achievement.”
This newly sequenced genome, called T2T-CHM13, adds approximately 200 million base pairs to the version of the human genome sequence released in 2013.
Instead of taking DNA from a living person, the scientists this time relied on a cell line derived from a complete hydatidiform mole, a type of tissue that forms in the human body when a sperm fertilizes an egg that has no nucleus. The resulting cell contains chromosomes from the father alone, which saves scientists the trouble of distinguishing between two sets of chromosomes from two different people.
Miga says that this achievement would not have been possible without modern DNA sequencing technology developed by Pacific Biosciences of Menlo Park, California. The technique uses lasers to scan long stretches of DNA extracted from cells, about 20,000 base pairs at a time. Conventional sequencing methods read DNA in fragments of only a few hundred base pairs at a time. Researchers compare these fragments to pieces of a puzzle: larger pieces are easier to put together, because they are more likely to contain overlapping sequences.
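The puzzle comparison can be made concrete with a toy example. The sketch below is purely illustrative and is not the consortium's assembly method: the 31-letter `TOY_GENOME`, the two reads, and the helper names `placements` and `merge_reads` are all hypothetical. It shows why a short read that falls entirely inside a repeat could belong in several places, while a longer read that spans the repeat and its flanks fits in only one, and how two reads with overlapping ends can be joined.

```python
from typing import List, Optional

# Hypothetical toy "genome": two identical AT repeats separated by unique sequence.
TOY_GENOME = "TTGCA" + "AT" * 4 + "CGGTA" + "AT" * 4 + "GCCTT"


def placements(genome: str, read: str) -> List[int]:
    """Return every start position at which `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads if a suffix of `left` equals a prefix of `right`.

    Tries the longest possible overlap first; returns None when no overlap
    of at least `min_overlap` letters exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


short_read = "ATAT"            # lies entirely inside a repeat
long_read = TOY_GENOME[3:16]   # spans the first repeat plus unique flanks

print(len(placements(TOY_GENOME, short_read)), "possible placements for the short read")
print(len(placements(TOY_GENOME, long_read)), "possible placement for the long read")
print(merge_reads("GATTACAGG", "ACAGGTTC"))
```

Real assemblers must also handle sequencing errors, both DNA strands, and millions of reads, none of which this sketch attempts.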
However, the T2T-CHM13 genome is not the final word on the human genome. The scientists struggled with some regions, and the team notes that there may be errors in about 0.3% of the genome. The genome is free of gaps, but Miga says that quality control in these regions has been extremely difficult. In addition, the sperm cell that gave rise to the hydatidiform mole carried an X chromosome, which means that the scientists have not yet sequenced the Y chromosome, which is responsible for male development.
The T2T-CHM13 genome also represents the genome of only one person. T2T has therefore teamed up with the Human Pangenome Reference Consortium, which aims to sequence more than 300 genomes from people around the world over the next three years. Miga says that the teams will be able to use the T2T-CHM13 genome as a reference to help them understand which parts of the genome differ from one individual to another. The scientists also plan to sequence an entire genome that contains chromosomes from both parents. In addition, the team led by Miga is working to sequence the Y chromosome, using the same new methods, in an attempt to fill in the remaining gaps.
Miga expects genetics researchers to find out quickly whether the newly sequenced genomic regions, and the genes they may contain, are relevant to human disease. Information about the functions of the newly discovered genes should now arrive much faster than in the past, given the “massive amount of resources we have at our disposal,” Miga said.
Miga hopes that in the future genomic sequencing will cover all parts of the human genome, including the newly sequenced ones, and not just the easily readable parts. With the reference genome complete, this is now easier than it was before. “We need to reach a new level in genetics where whole genome sequencing is not exceptional, but known,” she says.