Pseudogenes: Great Evidence for Evolution
Updated: Aug 25
"To conclude: our genome contains thousands of disabled genes... When multiple species share one of these mutations, it is only because they have inherited it from the reproductive cell in which the unique mutation occurred. The burgeoning scientific field of pseudogenealogy establishes the concept of common descent in a way that would have been inconceivable before the DNA sequencing revolution. Humans and the other apes have common ancestry". ~ Graeme Finlay
Biological Plagiarisms as a model
I had a friend who used to work for Intel, the giant chip maker. He (B.D.) related to me that the company purposely put some errors in its chips. Why? Because if a competitor copied a chip and then claimed it had not copied Intel but had developed the same approach separately due to common design constraints, the competitor would need to explain why its chips had the exact same errors in the same locations. Claiming “common design”, that they just happened to arrive at the same solution and code, would not explain the same unique errors. Textbook publishers sometimes purposely put in errors for the same reason: to prove that a competitor copied their work and did not arrive at the same “design” independently. Rather, it was “common ancestry” - in this case the competitor’s work was derived and copied from the original. A last example is a teacher teaching an online course involving students from all over the world. If the teacher is grading papers and two papers come in from students who claim they did not communicate with each other, but the papers contain large sections of the exact same sentences and paragraphs, the teacher knows the students derived their papers from the same online source. The two papers were not written independently; they shared the same origin and had a common literary ancestor. Those papers were not original to the students. It was not “common design” but rather “common ancestry or descent” - in this case the Internet. And if those two papers even repeat errors that went uncorrected in the on-line source, then there is no question they used the same source. The use of errors in original work to expose a competitor’s copying has been upheld in court.
When it comes to shared errors, common design utterly fails as an explanation for similar DNA characteristics between species. It must be common ancestry, common descent. If you have read this web site, you will see that this same principle holds true with shared ERVs that insert randomly and are found in the exact same locations between species, or with shared DNA breaks & repairs (because the patches are unique). Since it has been demonstrated that retroviruses insert randomly, when we find 200,000 identical ERVs (mostly as LTRs) in the exact same positions in the DNA of two different species, we can be sure that the retroviruses inserted before the species split. See ERVs, this site. There is no other rational explanation. Likewise, the finding of shared DNA scars between two species that involve random DNA damage and random emergency patching leaves one with only one rational explanation - common descent, or evolution. There is yet a third area of DNA findings that provides solid, robust evidence of evolution, human evolution and macroevolution: pseudogenes.
What are pseudogenes?
The human genome contains about 3 billion base pairs, the ATCG nucleotide "letters". Only about 1.5% of those make up protein-coding genes, which number only about 20,000. Scientists studying our genomes have also discovered about 20,000 genes that are disabled or corrupted and no longer perform their original functions. They have been deactivated by various mutations such as stop mutations (premature stop codons), deletions and insertions, frameshifts, and the loss of regulatory sites. These are called pseudogenes. Some are the original genes but most are copies. Some are functional or partially functional, as they can be partially transcribed. Some have even been able to take on new functions.
There are three main types of pseudogenes, but the vast majority fall into only two categories. One type is called duplicated pseudogenes. They result when large stretches of DNA are duplicated, producing segmental duplications, and a gene within the duplicated stretch is swept up and copied along with it. Large duplications are not uncommon; many of these segmental duplications lead to genes that become cancerous. In some plants the entire genome was duplicated in the past. Since it’s a duplicate, the copied gene is often not needed by the host and is not maintained, but decays to the point where it can no longer produce the product of the original parental gene. Less often, duplicated genes undergo changes and even develop new functions.
The second major type is called processed pseudogenes. These are derived from parental genes by way of an RNA intermediate. Note that this is similar to how transposable elements (TEs) jump around the genome. They arise because TE-encoded enzymes randomly pick up RNA transcripts of genes, copy them into DNA, and insert them back into the genome (1). These RNA copies are at least partially processed (for example by having their introns cut out; exons are the parts left over that go on to code for proteins). Nearly all pseudogenes fall into these two types, in roughly equal numbers.
The third type, called unitary pseudogenes, is not nearly as common. Unlike the other two types, which involve damage to copied genes, this is where a single gene, the original, is damaged.
Confusion in the Literature
Pseudogenes may develop new functions “...but they are defined by their loss of the original parent gene function and not whether they have functions or not currently.” (Finlay, 1). Confusion arises when people assume that pseudogenes can’t have functions (2, 8). Indeed, a few pseudogenes have been noted to exhibit gain of function as noncoding RNAs (3). Counting only pseudogenes with no known function, some will claim there are only about 12,000 pseudogenes, but as Finlay and Moran point out, that is not how pseudogenes are defined. We know some pseudogenes have gained new functions or are partially transcribed because the gene has only been partially disabled. That, however, does not negate the fact that they have lost their original function: they are pseudogenes.
Moran makes these points in his blog: “The idea that most duplicated genes will become pseudogenes is consistent with a ton of data and fits well with our understanding of mutation rates and genome evolution. This is an important point. We don't arbitrarily assign the word "pseudogene" to any old DNA sequence. The designation is based on the fact that the duplicated region is no longer transcribed, or it is no longer correctly spliced, or that it carries mutations rendering the product nonfunctional. (In the case of protein-coding genes it could be that the reading frame is disrupted.) It's also important to understand that the frequency these inactivating mutations and the rate of fixation of the resulting allele is perfectly consistent with everything we know about molecular evolution.
There are some examples of DNA sequences that appear to be pseudogenes but they also have functional regions. The best examples are duplicates that contain small RNA genes within their introns or genes that contain other functional regions like SARs and origins of replication. In those cases, the inactivated gene is still a pseudogene but the other functional regions are best characterized as something else.
There are also quite a few examples of pseudogenes that have secondarily acquired a distinct new function such as producing a small RNA that might have a regulatory function. The review by Cheetham et al. contains several examples of such pseudogenes. They are still pseudogenes but the region may now specify a new lncRNA gene or some other gene such as an siRNA gene.” (4)
Pseudogenes: molecular fossils
If we look around at various species we find all kinds of damaged and dead genes that once produced viable products, or that are suppressed because their regulatory genes are damaged. Chickens still have the genes for making teeth and a tail (5,6). As covered in my Part 1 whale evolution video, baleen whales make tooth buds as fetuses and have pseudogenes for making tooth enamel. Of course adult baleen whales do not have teeth. We know from paleontology, however, that they evolved from toothed whale ancestors, so it is no surprise that they still make teeth during fetal development. Placental mammals that lack teeth or tooth enamel, such as anteaters, tree sloths and armadillos, still have the pseudogenes for making tooth enamel. Under the right conditions, snakes can grow legs and cavefish can grow the eyes their ancestors had (6). Sperm whales grow atavistic hind legs in about 1 in 5,000 births (15). The DNA is there, but the genes or regulatory sequences are damaged or turned off. Sometimes it can be exposed. Unless evolution is true, it makes no sense that species would have the DNA instructions to make ancestral structures (atavisms) that they will never use or supposedly never had in the first place.
Humans have about 850 genes that code for olfactory receptors. Over half are knocked out and disabled (14). Although all primates have several hundred functioning olfactory genes, dolphins and whales have very few. They share a distant ancestor with the hippopotamus and it also has very few functioning genes, consistent again with a shared ancestry with whales and dolphins (1). See whale evolution videos. Humans make a yolk sac that is visible in the normal 5 week embryo. It has important functions presently, but if it was originally for holding yolk due to our ancestors laying eggs we should find decayed genes, pseudogenes, for making egg-yolk which is normally only found in egg laying species. This is why the Theory of Evolution is science; it makes testable predictions. Recall that vestigial can mean without function or without the original function. The yolk sac is vestigial. At first scientists had trouble finding the predicted egg-yolk pseudogenes because they were so degraded. But they did eventually by using the clever trick of looking for preserved genes flanking where the pseudogenes should be. And not surprisingly for evolution, they are at the same homologous chromosomal positions as in chickens (7). See Figure 1. See my blog on Intelligent Design where this example and many more observations in nature point definitively to common ancestry and not intelligent design.
Figure 1. Egg yolk human pseudogenes.
Fair use attribution. VIT 1,2,3 are yolk producing genes.
In 2022 researchers discovered that many species with little hair still had the genes for making hair to cover their bodies. They studied 62 species and compared the hairy ones to several that had little hair. The study covered genes and regulatory sequences in the elephant, rhino, naked mole rat, human, pig, armadillo, walrus, manatee, dolphin and orca, along with 52 hairy species (9,10). Many of the genes involved in hair production were damaged and had become pseudogenes, but more of the disabling mutations were in the non-coding/regulatory DNA. In other words, the DNA regions that control whether a gene is turned on or off were damaged, while the hair genes themselves were largely intact.
Humans have a pseudogene that is not shared by other primates. One of the codons of the MYH16 gene lost two bases (AC): instead of ACC, the deletion left - - C. We know this because the ACC codon is present in all apes and many monkeys but not humans. This deletion resulted in a gene-destroying frameshift mutation (1). A frameshift mutation is devastating to a gene because, like deleting letters from a sentence that is read three letters at a time, it shifts all the following letters over: The big dog ate... > The gdo gat...
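The "big dog" illustration can be reproduced in a few lines of Python. This is only a minimal sketch: the function names are made up for the example, and the DNA string is a hypothetical sequence containing an ACC codon, not the real MYH16 sequence.

```python
def read_codons(seq):
    """Split a sequence into consecutive three-letter words (a trailing fragment is kept)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def delete_bases(seq, start, count):
    """Return seq with `count` letters removed, beginning at index `start`."""
    return seq[:start] + seq[start + count:]


# The sentence analogy from the text: drop two letters ("BI") and re-read in threes.
sentence = "THEBIGDOGATE"
print(read_codons(sentence))                      # ['THE', 'BIG', 'DOG', 'ATE']
print(read_codons(delete_bases(sentence, 3, 2)))  # ['THE', 'GDO', 'GAT', 'E']

# Hypothetical DNA (NOT the real MYH16 sequence): ATG GCA ACC TTG GAT
dna = "ATGGCAACCTTGGAT"
mutant = delete_bases(dna, 6, 2)  # remove the "AC" of the ACC codon
print(read_codons(dna))     # ['ATG', 'GCA', 'ACC', 'TTG', 'GAT']
print(read_codons(mutant))  # ['ATG', 'GCA', 'CTT', 'GGA', 'T']
```

Every codon after the deletion point changes, which is why a two-base loss is so much more destructive than a single substituted letter.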
Yet another example is the ACYL13 gene, where chimps and humans, but not other apes, share the same mutation: a point mutation changed a TGG codon to TGA, which is a stop codon, and thus disabled the gene. (1: pg. 155)
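To see why a single-letter change from TGG to TGA is enough to disable a gene, here is a small sketch using the standard genetic code, in which TGG codes for tryptophan (W) and TGA is a stop signal. The 15-base sequence and the function names are invented for illustration; it is not the actual gene sequence.

```python
# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate codon by codon, halting at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def point_mutation(dna, index, new_base):
    """Return dna with the single base at `index` replaced by `new_base`."""
    return dna[:index] + new_base + dna[index + 1:]


# Hypothetical sequence: ATG GCT TGG AAA CTG
normal = "ATGGCTTGGAAACTG"
mutant = point_mutation(normal, 8, "A")  # TGG -> TGA

print(translate(normal))  # MAWKL
print(translate(mutant))  # MA
```

The mutant product is cut short at the new stop codon, so everything downstream of the mutation is never made.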
The discussion of shared pseudogenes would not be complete without mentioning the GULO pseudogene, since it has generated a significant number of anti-evolution articles. Vitamin C, or ascorbic acid, is made by most mammals, in which case it is not a vitamin for them. It is produced from glucose in a series of four enzyme steps. The final step is catalyzed by the last enzyme, L-gulono-γ-lactone oxidase, or GULO. In humans the gene is non-functional and is located on chromosome 8 at p21 (12).
"Human GULO is a severely degenerated copy of the gene as only 5 of the original 12 exons remain, the locus has been bombarded by with retrotransposons and those parts of the gene that are identifiable are riddled with mutations... The GULO pseudogene contains multiple indel and stop mutations. The oldest appears to be a stop mutation, shared by representatives of all simian primate groups - apes, Old world Monkeys and New World Monkeys... Subsequently, exons 2 and 3 were lost from the genomes of apes and OWMs by a DNA deletion event that eliminated approximately 2,500 bases from the genome. A representative stop mutation is shared by OWMs. A codon specifying the amino acid arginine (possibly CGA) has ended up as a gene-truncating TGA codon." (1). Below is a view of the GULO gene sequence in Figure 2. Notice that there are two large exon deletions shared by all humans, chimps and macaques. It is estimated based on neutral substitution rate analysis that the gene was disabled about 61 mya (12).
Other species such as the guinea pig and some bats have also inherited GULO pseudogenes but their mutations are different from those shared by selected apes and monkeys.
Figure 2. GULO pseudogene showing identical deletions shared by humans, chimps, macaques but not galagos (bush babies).
From Sandwalk, 2017. Fair use and educational use applied. https://sandwalk.blogspot.com/2017/10/creationists-questioning-pseudogenes_28.html
Recall there are an estimated 20,000 human pseudogenes, and as we find them we can compare them in different species. Some are found only in humans, some only in humans and chimps, and still others across several ape species. The ABCC13 pseudogene and the glucocerebrosidase pseudogene show identical mutations and are found only in human, chimp and gorilla because the mutation occurred in a shared ancestor of those three species (1:pg. 158). See Figure 3.
Figure 3. ABCC13 pseudogene (top). Glucocerebrosidase pseudogene (bottom). Bases in bold are identical in species. See text.
From: Finlay, Graeme. 2013. Human Evolution: Genes, Genealogies and Phylogenies.
p 158. Figure 3.8. Cambridge University Press. 2021 ed. Social sharing and Fair dealing applied per publisher's web instructions.
In contrast, the urate oxidase pseudogene was disabled when a C mutated to a T, producing a stop codon, and this unique mutation is found in only four species of apes: human, chimp, gorilla, and orangutan (1: pg. 161). See Figure 4.
Figure 4. The urate oxidase pseudogene shared by the great apes. See text.
From: Finlay, Graeme. 2013. Human Evolution: Genes, Genealogies and Phylogenies.
p 161. Figure 3.10. Cambridge University Press. 2021 ed. Social sharing and Fair dealing applied per publisher's web instructions.
Do you see what is forming? Thousands of pseudogenes can be found in humans and we can look for them in other species. They produce a pattern where some species have them and if they have identical mutations the only rational explanation is shared ancestry. If we group the raw observations an evolutionary tree is produced. And a specific pseudogene tree matches the paleontology trees, the ERV trees, the LTR trees, and the DNA repair trees. It’s how we can be assured that we have the evolutionary story correct because evidence from independent lines of DNA findings confirms macroevolution in apes and monkeys. If all these damaged genes happened at one time, a nested hierarchy of data and observations that shows evolution would not be possible. See Figure 5.
Figure 5. Nested hierarchy of various pseudogenes showing times they appeared during evolution. Specific pseudogenes in boxes. Numbers refer to additional unitary pseudogenes. See text.
From: Finlay, Graeme. 2013. Human Evolution: genes, genealogies and phylogenies.
p 172. Figure 3.18. Cambridge University Press. 2021 ed. Social sharing and Fair dealing applied per publisher's web instructions.
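The nested pattern described above can be checked mechanically. The sketch below is a simplified illustration, not a real phylogenetic method: it assumes each disabling mutation arose once and was never reversed, and it uses only the species lists given in this article for four of the pseudogenes. A set of mutations fits a single family tree only if, for every pair, the groups of species carrying them are either non-overlapping or one sits entirely inside the other.

```python
from itertools import combinations

# Species sharing each disabling mutation, as described in the text above.
shared_by = {
    "MYH16": {"human"},
    "ACYL13": {"human", "chimp"},
    "ABCC13": {"human", "chimp", "gorilla"},
    "urate oxidase": {"human", "chimp", "gorilla", "orangutan"},
}


def is_nested(groups):
    """True if every pair of species sets is disjoint or one contains the other."""
    for a, b in combinations(groups.values(), 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True


print(is_nested(shared_by))  # True

# A made-up counterexample: a mutation shared by human and orangutan only
# would overlap the human+chimp group without either containing the other.
conflicting = dict(shared_by, hypothetical_X={"human", "orangutan"})
print(is_nested(conflicting))  # False
```

The point of the counterexample is that a nested pattern is not guaranteed. Unrelated, independently damaged genomes would be expected to produce conflicting groupings like the hypothetical one.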
We can even show an evolutionary tree with a single pseudogene. How? Because some pseudogenes are very old and have accumulated different mutations of the gene in different species. These can also be nested into an evolutionary tree. One gene that demonstrates this is the ARG pseudogene.
"Multiple mutations are shared by humans, chimps, macaques (representing OW monkeys) and marmosets (a NW monkey). All simian species studied shared one stop, three splice-site, three frameshift and two TE insertion mutations. In addition, apes and OWMs share mutations that are absent in NWMs, and apes share a splice-site mutation that is absent in OWMs and NWMs.” (1: pg. 169). This one pseudogene provides a nested evolutionary tree by itself!
Another example of different mutations occurring over time to a single shared pseudogene that produces an evolutionary nested tree is the TRPC2 pseudogene. See Figure 6.
Figure 6. Mutations in the TRPC2 pseudogene of apes and Old World Monkeys. Mutations (S) are stop mutations, (ind) indels - insertions or deletions, and (Rv) reversions. See text. From: Finlay, Graeme. 2013. Human Evolution: genes, genealogies and phylogenies.
p 174. Figure 3.19. 2021 ed. Cambridge University Press. Social sharing and Fair dealing applied per publisher's web instructions.
Recall that unlike unitary pseudogenes, which were knocked out by mutations and have no copies of themselves, many pseudogenes are disabled copies of an original gene. One type results from a “copy and paste” method. In this case a type of “jumping gene” known as a LINE-1 retrotransposon grabs a gene as it copies, and the copied gene then disengages from the LINE-1 and inserts randomly back into the DNA. Because the copy left its associated regulatory sequences behind, it can no longer be transcribed and is termed DOA - 'dead on arrival'. Such insertions can also cause disease: this has been seen in Duchenne muscular dystrophy, where a fragment of a non-coding RNA from chromosome 11 was inserted into exon 67 of the dystrophin gene located on the X chromosome (1). The human genome contains more than 5,000 processed pseudogenes alone.
One particular gene, NANOG, is a master regulator of gene expression and 11 pseudogenes are known from it, 10 being of the processed type. Nine of these are also present in the chimp genome. One of these, NANOGP4, has developed several gene killing mutations. Humans have four stop mutations and three deletions. Three of the stop mutations are shared by chimps, and chimps also share two of the deletions with humans. The NANOGP8 acts as an oncogene and is probably responsible for our increased tendency to develop cancers compared to other primates (1, 11).
Studies of various ape and monkey genomes have shown 48 processed pseudogenes in humans only, 94 shared by humans and chimps only, and 337 in the genomes of all three of these ape species (humans, chimps, gorillas) but not in macaques. As you should be able to guess by now, this produces an evolutionary phylogenetic tree that matches the other trees (1). Please note that all three types of pseudogenes separately produce independent phylogenetic trees that match those produced by ERVs and DNA repairs discussed elsewhere on this web site. This DNA evidence for evolution, macroevolution and human evolution is corroborating and overwhelming.
About 800 retrotransposed pseudogenes produced from transfer RNA and hY RNA genes have been discovered in the human genome. Our four hY RNA genes have been copied by retrotransposons and inserted into our DNA as 966 pseudogenes; 95% of these are shared with chimps (1). The vast majority thus must have been inserted before the human and chimp lineages split.
1. The most frequent objection is a straw man characterization that pseudogenes can't have functions. As pointed out above, both Finlay and Moran note some can and some can even have gain of function with new functions. Many that show functions are only partially transcribed. But as in the definition of vestigial, pseudogenes are not defined by the presence or absence of function. Again, Finlay writes:
"The progressive changes in base sequence provide the history of a pseudogene, and this history defines evolutionary relationships of those species that share the pseudogene. Current functionality is irrelevant to the value of pseudogenes as evolutionary markers." [author emphasis in the underline only] (1)
Most anti-evolutionary attacks amount to trotting out a few functional pseudogenes as if that were a knock-out blow to evolution. It is not. Recall there are 20,000 pseudogenes, and the vast majority are non-functional. The independent evolutionary trees for the three types of pseudogenes are solid evidence for evolution. They rule out any species origin narrative that does not include macroevolution, and they disprove the appeal to a one-time event that damaged most or all of the DNA and introduced disease, suffering and death into the world, because a single event would not produce nested evolutionary trees.
2. The GULO pseudogene is a very common target for anti-evolutionists. Articles have been written extensively by anti-evolutionists against GULO as a pseudogene and include Tomkins, Truman, Terborg (Borger?), RTB, and others. Their objections have been addressed and countered. See Moran (12), Venema (13) just for a few examples.
3. Most creationists are anti-evolutionists (ICR/AIG/CMI/RTB) and will deny macroevolution at every turn. For example, the denial of transitional fossils persists despite scores of found and predicted transitional fossils, because in their origin narratives and presuppositions there can never be transitional fossils. In whales alone over 200 fossil species have been found, some showing the gradual movement of the blowhole up the skull through time and the gradual shrinking of the hind limbs. Likewise there is no room for pseudogenes in their a priori views - ever. Thousands of pseudogenes have been found and hundreds can be nested in evolutionary trees that anti-evolutionists cannot accommodate in their species origin narratives.
Our genome contains up to 20,000 pseudogenes. Most are copies, made either through gene duplications or through "copy and paste" mechanisms in which retrotransposons insert processed transcripts randomly into a new site in the DNA. The third type, called unitary pseudogenes, results from mutations to genes that have no backup copy. All three types independently produce nested hierarchical evolutionary trees that rise to the level of proof of macroevolution.
It is telling that anti-evolutionists deny that genomes carry the DNA to make structures they claim are impossible. Atavisms abound to refute evolution denial. Chickens carry genes for making teeth and tails, tails in humans have been seen in about 100 cases, whales and dolphins are occasionally born with hind legs that attach to a vestigial pelvis (recall the proper definition of vestigial), and baleen whales carry pseudogenes for making tooth enamel. All of these are explained well by evolution but cause mortal damage to species origin narratives that deny macroevolution, or else they are poorly rationalized by anti-evolutionists, sometimes to absurd lengths. If these species were formed separately none of these findings would be possible or expected.
Thousands of pseudogenes have been found and hundreds can be nested in evolutionary trees that anti-evolutionists cannot accommodate or discount effectively. Pseudogenes and nested pseudogenes are fantastic evidence for macroevolution and join human chromosome 2 fusion, shared ERVs, shared segmental DNA duplications and shared identical DNA repairs as amazing evidence for macroevolution for those without anti-evolution commitments. With these DNA findings constituting a "second fossil record", one can wonder if traditional fossils are still the best evidence for evolution. Well, we need those also, but I assert that the DNA findings are great at showing macroevolution is true, with perhaps fewer interpretations needed.
Literature Cited and References
1. Finlay, Graeme. 2013. Human Evolution: Genes, Genealogies and Phylogenies. Cambridge University Press. 283 pp. not including References and Index. Paperback edition 2021 - ISBN 978-1-009-00525-8
11. Fairbanks, Daniel J. and Maughan, Peter J. 2006. Evolution of the NANOG pseudogene family in the human and chimpanzee genomes. BMC Evolutionary Biology. Feb 9. doi: 10.1186/1471-2148-6-12. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1457002/
15. Limbs in whales and limblessness in other vertebrates