A while ago I wrote about the value of genome sequences, not just for helping us understand the biology of a particular organism, but also for enabling large-scale comparisons across species that can reveal patterns in genome evolution which wouldn’t otherwise be apparent. A recent paper in the Journal of Heredity by Craig Lowe, David Haussler and colleagues at the University of California provides an excellent example of this in action, using sequences from the tuatara genome to identify the evolutionary origin of parts of the human genome.
Lowe and colleagues were looking for functional elements (like parts of genes and their regulatory regions) in the human genome that originated from retrotransposon insertions. Retrotransposons are mobile bits of DNA that tend to copy themselves and insert the copies at new places in the genome. They contain everything needed for this copying, and often also include functional modules like exons of genes or transcription factor binding sites. These functional modules may be co-opted for a new function at the new site, a process known as exaptation. Once a retrotransposon is inserted in a new location it is often inactivated, and then begins to accumulate mutations which eventually render it unrecognisable as a retrotransposon. This makes it difficult to identify exaptation events in any given genome, and hence to trace the origin of many of the functional elements of that genome. However, by comparing the genomes of many different species in different lineages it may be possible to identify ancestral versions of these elements, and so trace their evolutionary history.
Lowe and colleagues found a previously unknown retrotransposon in the small part of the tuatara genome that has been sequenced. This retrotransposon is of a type known as a LINE (Long Interspersed Nuclear Element) and was named EDGR-LINE (endangered-LINE). A search of the human genome against this sequence found 18 elements that are likely to be the result of insertion of this retrotransposon into the genome at some point in evolutionary time. Seventeen of these elements are gene regulatory regions and one is an exon of a gene called ASXL3. ASXL3 is important for regulation of other genes during development, and the additional exon co-opted from EDGR-LINE appears to help control its expression.
These 18 exaptation events likely occurred early in mammalian evolution, but the retrotransposon itself has long since been inactivated in humans, so all other traces of it have been lost. The functional elements it contained can still be identified because they are under strong purifying selection (i.e. have not accumulated many mutations), so they can still be aligned with the tuatara sequence. It is only through this comparison that it is possible to know that these 18 elements originated from the same retrotransposon.
EDGR-LINE was also found in the lizard, frog, and coelacanth, but no traces of the retrotransposon itself remain in mammals, crocodilians or birds. EDGR-LINE appears to be evolving more slowly in tuatara than in lizards, so the tuatara copy is closest to the mammalian ancestral version of EDGR-LINE and hence more informative for identifying elements in the human genome. In fact, 10 of the 18 elements could only be identified by comparison with tuatara and not with these other species.
This is not the only example of genomic information from a rare species shedding light on the evolutionary history of the human genome. The genome of the threatened desert tortoise Gopherus agassizii also harbours an ancient LINE that has enabled functional elements of the human genome to be identified. Lowe and colleagues speculate that this pattern may be due to the very nature of endangered species, and ran simulations to show that, theoretically, mobile elements like LINEs are active for longer and evolve more slowly in small populations. This effect comes about because of the relationship between population size and selection: selection is more efficient in large populations, so it is more likely to remove genetic variants which are mildly harmful (or deleterious) to the organism, and to fix mutations which are beneficial. The smaller the population, the more likely it is that deleterious genetic variants will become fixed in that population and beneficial mutations will be lost. Insertion of mobile elements into new places in the genome is almost always deleterious, as it messes with existing genes and their regulatory regions. Thus small populations will be more likely to accumulate additional copies of the mobile elements, and less likely to accumulate mutations which would remove or inactivate them.
I should point out here that tuatara are not actually classified as endangered (as the paper claims), but they have had a historically low population size, probably with a severe population bottleneck during the Oligocene inundation of the New Zealand land mass. In addition, we now know that even large tuatara populations can have a small effective population size, as few individuals actually contribute to mating at any one time.
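To put rough numbers on the link between population size and selection, here is a minimal sketch using Kimura's diffusion approximation for the chance that a single new mutation eventually becomes fixed. This is my own illustration, not the simulation Lowe and colleagues ran: the function name, the selection coefficient of -0.001 and the three population sizes are assumptions picked purely for the example.

```python
import math


def fixation_probability(n_e, s):
    """Kimura's diffusion approximation for one new mutation in a diploid
    population of effective size n_e (starting frequency 1 / (2 * n_e)).

    s is the selection coefficient: negative for a deleterious variant,
    positive for a beneficial one. Illustrative helper, not from the paper.
    """
    if s == 0:
        # Neutral variant: fixation happens by drift alone.
        return 1.0 / (2 * n_e)
    # (1 - exp(-2s)) / (1 - exp(-4 * n_e * s)); expm1 keeps precision for tiny s.
    return math.expm1(-2 * s) / math.expm1(-4 * n_e * s)


# Assumed value: a mildly deleterious insertion (0.1% fitness cost).
s = -0.001
for n_e in (100, 1_000, 10_000):
    p = fixation_probability(n_e, s)
    neutral = 1.0 / (2 * n_e)
    # Ratio near 1 means selection barely "sees" the variant.
    print(f"Ne = {n_e:>6}: relative to neutral = {p / neutral:.3g}")
```

In the smallest population the mildly deleterious variant fixes almost as often as a neutral one would, while in the largest population it has essentially no chance, which is the sense in which selection is "more efficient" in large populations.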
Lowe and colleagues point out that without the tuatara, we would not have been able to identify these particular functional elements in the human genome, and that we cannot know in advance what additional information about human evolution we might glean from threatened species in the future. This underscores the importance of projects like the Genome10K initiative to sequence 10,000 vertebrate genomes. Of course, I would add that we should preserve these species for their intrinsic worth, not just because of what they can tell us about human evolution, but this paper does highlight the unexpected ways that genomic data from diverse species can help us understand evolution.
Lowe, C., Bejerano, G., Salama, S., & Haussler, D. (2010). Endangered Species Hold Clues to Human Evolution. Journal of Heredity. DOI: 10.1093/jhered/esq016