New Limits to Functional Portion of Human Genome Reported

Evolutionary biologist Dan Graur has published new calculations that indicate no more than 25 percent of the human genome is functional.

An evolutionary biologist at the University of Houston has published new calculations that indicate no more than 25 percent of the human genome is functional. That is in stark contrast to suggestions by scientists with the ENCODE project that as much as 80 percent of the genome is functional.

In work published online in Genome Biology and Evolution, Dan Graur reports the functional portion of the human genome probably falls between 10 percent and 15 percent, with an upper limit of 25 percent. The rest is so-called junk DNA, or useless but harmless DNA.

Graur, John and Rebecca Moores Professor of Biology and Biochemistry at UH, took a deceptively simple approach to determining how much of the genome is functional, using the deleterious mutation rate – that is, the rate at which harmful mutations occur – and the replacement fertility rate.

Both genome size and the rate of deleterious mutations in functional parts of the genome have previously been determined, and historical data documents human population levels. With that information, Graur developed a model to calculate the decrease in reproductive success induced by harmful mutations, known as the “mutational load,” in relation to the portion of the genome that is functional.

The functional portion of the genome is described as that which has a selected-effect function, that is, a function that arose through and is maintained by natural selection. Protein-coding genes, RNA-specifying genes and DNA receptors are examples of selected-effect functions. In his model, only functional portions of the genome can be damaged by deleterious mutations; mutations in nonfunctional portions are neutral since functionless parts can be neither damaged nor improved.

Because of deleterious mutations, each couple in each generation must produce slightly more children than two to maintain a constant population size. Over the past 200,000 years, replacement-level fertility rates have ranged from 2.1 to 3.0 children per couple, he said, noting that global population remained remarkably stable until the beginning of the 19th century, when decreased mortality in newborns resulted in fertility rates exceeding replacement levels.

If 80 percent of the genome were functional, unrealistically high birth rates would be required to sustain the population even if the deleterious mutation rate were at the low end of estimates, Graur found.

“For 80 percent of the human genome to be functional, each couple in the world would have to beget on average 15 children and all but two would have to die or fail to reproduce,” he wrote. “If we use the upper bound for the deleterious mutation rate (2 × 10⁻⁸ mutations per nucleotide per generation), then … the number of children that each couple would have to have to maintain a constant population size would exceed the number of stars in the visible universe by ten orders of magnitude.”
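To make the scale of that argument concrete, here is a minimal Python sketch of the relationship quoted from the paper in the discussion below: mean fitness w = (1 − μdel)^n and replacement-level fertility F = 1/w. It is an illustration under stated assumptions, not a reproduction of Graur's calculation. The genome size of about 3.1 billion nucleotides, the lower mutation rate of 4 × 10⁻¹⁰ and the function name `replacement_fertility` are assumptions introduced here, so the printed numbers need not match the figures in the article.

```python
import math

# Assumption: haploid human genome size in nucleotides (not stated in the article).
GENOME_SIZE = 3.1e9


def replacement_fertility(functional_fraction, mu_del, genome_size=GENOME_SIZE):
    """Return F = 1 / w, where w = (1 - mu_del) ** n and n = functional sites."""
    n = functional_fraction * genome_size
    # Work in log space: log(w) = n * log(1 - mu_del).
    # log1p keeps precision when mu_del is tiny.
    log_w = n * math.log1p(-mu_del)
    return math.exp(-log_w)


# 4e-10 is an illustrative low rate (assumption); 2e-8 is the upper bound
# quoted in the article.
for mu_del in (4e-10, 2e-8):
    for fraction in (0.10, 0.25, 0.80):
        f = replacement_fertility(fraction, mu_del)
        print(f"mu_del={mu_del:.0e}  functional={fraction:.0%}  F={f:.3g}")
```

Because F grows exponentially in the number of functional sites, raising the functional fraction from 25 percent to 80 percent multiplies the exponent by more than three. That is why the required fertility climbs so steeply in the quoted passage.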

In 2012, the Encyclopedia of DNA Elements (ENCODE) announced that 80 percent of the genome had a biochemical function. Graur said this new study not only puts these claims to rest but hopefully will help to refocus the science of human genomics.

“We need to know the functional fraction of the human genome in order to focus biomedical research on the parts that can be used to prevent and cure disease,” he said. “There is no need to sequence everything under the sun. We need only to sequence the sections we know are functional.”

eggn00dles on July 16th, 2017 at 20:29 UTC »

what exactly does 'functional' mean? does this mean i can remove 75% of my genome and still be completely normal?

Ramartin95 on July 16th, 2017 at 20:19 UTC »

His idea of using deleterious mutations to determine function is flawed from the outset when you consider the possibility of multiple areas producing the same product.

Let's say that there is a piece of miRNA that is vital to the regulation of a certain protein and is produced at 3 spots in the genome. A mutation in one location would cause that site to stop producing the miRNA, but the other two sites would continue to be productive and no negative effects would be observed.

Why should we think there would be multiple sites in the genome doing the same thing? That comes down to the existence of transposons: genetic elements that are able to copy themselves and insert the copies into other parts of the genome. Transposons are found all over the genome and dominate the areas previously thought of as 'junk'.

InductorMan on July 16th, 2017 at 18:43 UTC »

Copying my comment from another post about the same research to this sub:

Wait... what am I missing here:

The mean fitness of the population can be defined by two variables, the mean deleterious mutation rate per functional nucleotide site per generation ( μdel ) and the number of functional nucleotide sites (n) in the genome (Kimura 1961; Nei 2013).

w = (1 − μdel)^n

Okay, this is the probability that an individual offspring has zero mutations in any of the functional nucleotide sites. I will grant that this metric is related to fitness, but by what function?
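One way to read the formula, offered as a sketch and not as the paper's own derivation: if each of the n functional sites independently escapes a deleterious mutation with probability 1 − μdel, the product over all sites is the chance of zero new deleterious mutations. For small μdel this is well approximated by an exponential. The symbol U (expected number of new deleterious mutations per genome per generation) is introduced here for illustration only.

```latex
w = (1 - \mu_{del})^{n}
  = e^{\,n \ln(1 - \mu_{del})}
  \approx e^{-n \mu_{del}}
  = e^{-U},
\qquad U = n\,\mu_{del}
```

Under that approximation, w is the zero class of a Poisson distribution with mean U. That matches the "zero mutations" reading above.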

Let us now consider the connection between mutational load and replacement level fertility ( F ). If the mortality rate before reproduction age is 0 and mean fertility is 1, then the population will remain constant in size from generation to generation. In real populations, however, the mortality rate before reproduction is greater than 0 and, hence, mean fertility needs to be larger than 1 to maintain a constant population size. In the general case, for a population to maintain constant size, its replacement level fertility should be

F=1 / w

Uhhhhhhhhh, what? All of a sudden the probability of mortality before reproduction is identically equal to the probability that an individual offspring has more than zero mutations in functional nucleotide sites? I'm pretty sure we know that's not how it works, for a typical definition of "functional". This would imply that there is no genetic drift of functional genomic data. This would imply that there's no speciation, because there can be no change in function. Right?

Ok, so the only way any of this makes sense to me is if the "functional fraction" isn't really the fraction of the genome that has any function, but the fraction that is vitally functional. They're literally equating "functional genome" length with the number of single site deadly mutational targets in the genome.
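The two quoted formulas can also be inverted to ask the question the paper is really posing: given a ceiling on replacement-level fertility, how many functional sites can the genome afford? The Python sketch below does that algebra. It rests on assumptions that are mine, not the paper's: a genome size of 3.1 billion nucleotides, the illustrative mutation rates, the helper name `max_functional_fraction`, and a fertility ceiling expressed per individual, which I take to be half of a per-couple figure.

```python
import math


def max_functional_fraction(max_fertility, mu_del, genome_size=3.1e9):
    """Largest functional fraction keeping F = (1 - mu_del) ** (-n) <= max_fertility.

    max_fertility is per individual (assumption: half of a per-couple value).
    genome_size defaults to an assumed 3.1e9 nucleotides.
    """
    # Solve (1 - mu_del) ** (-n) = F for n:  n = ln(F) / -ln(1 - mu_del).
    n_max = math.log(max_fertility) / -math.log1p(-mu_del)
    # A fraction cannot exceed the whole genome.
    return min(1.0, n_max / genome_size)


# Illustrative inputs only: a ceiling of 1.5 offspring per individual and
# three hypothetical deleterious mutation rates.
for mu_del in (4e-10, 1e-9, 2e-8):
    frac = max_functional_fraction(1.5, mu_del)
    print(f"mu_del={mu_del:.0e}  max functional fraction={frac:.1%}")
```

The bound scales as ln(F) / μdel, so it is far more sensitive to the assumed deleterious mutation rate than to the fertility ceiling. That is one reason the definition of "deleterious" carries so much weight in this argument.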

This strikes me as a nonsensical way to define functionality. Just look at protein tertiary structure. We know that there's structural redundancy in proteins. Many structures are stabilized by multiple bonds between adjacent alpha helices, or by a cloud of buried hydrophobic residues, all of which contribute to the structural stability of the molecule (across temperature, say). We would never call the group of hydrophobic residues that stabilize a globular protein "nonfunctional," although substitution of any given one wouldn't necessarily be deadly.

I don't know, maybe this calculated functional fraction relates in a useful way to the information density or informational redundancy of the genome. But if that's the case then the phrasing is obfuscatory and misleading. The title makes it sound like 75% of the coding regions of the genome don't do anything.

EDIT: I've gotten good criticisms of my criticism. One important misunderstanding that folks have brought to my attention is that the definition of "fitness" here includes the rate of offspring production by an individual's offspring, so the mutation doesn't need to be deadly: the total long-term selective pressure on a mutation is folded into the "w" metric. I've also had my argument that the scenario captured by the model doesn't allow for drift or speciation refuted. So I think the reason my comment is the top comment here is not that it's a great refutation (it isn't) but that it was posted early, and lots of readers share the same sentiment (that this analysis is excessively simplistic).

I still am left with the impression that the simplifications used to make the analysis have caused so much complexity to be folded into the working definition of "functional genome" (even beyond the careful definition that the authors give) that it's just not super useful to talk about the fraction of the genome which is functional by this metric.