INTRONS, EXONS, TRANSPOSONS, REGULATORY GENES, RETRO-VIRUSES
Rhawn Joseph, Ph.D.
THE ORIGIN OF EARTHLY LIFE
Life had taken root and repeatedly arrived on this planet between 3.8 and 4.2 BYA, a time period during which Earth was undergoing continual pummeling from the remnants and debris produced by the exploding parent star and its planetary system. Although microfossils resembling yeast cells and fungi were discovered in 3.8 BY old quartz (Pflug 1978), the nature of the first and earliest Earthlings can only be inferred indirectly based on the residue of photosynthesis, oxygen secretion, carbon isotopes, the structure of banded iron formations and high concentrations of carbon 12, or “light carbon,” all of which are typically associated with microbial life (Manning et al. 2006; Mojzsis et al. 1996; Nemchin et al. 2008; O'Neil et al. 2008; Rosing, 1999; Rosing and Frei, 2004; Schoenberg et al. 2002).
However, if simple eukaryotes such as fungi and yeast cells had arrived on Earth by 3.8 BYA, then we can certainly assume that those sojourners from the stars who had arrived hundreds of millions of years earlier included bacteria, archaea, and viruses--and this has been demonstrated by geo-physical and biochemical analysis (Manning et al. 2006; Mojzsis et al. 1996; Nemchin et al. 2008; O'Neil et al. 2008; Rosing, 1999; Rosing and Frei, 2004; Schoenberg et al. 2002).
THE FIRST EUKARYOTES
It is generally assumed, based on genomic analysis, that the first Earthly unicellular eukaryotes were fashioned when genes from archaea and bacteria combined, thereby inducing eukaryogenesis and giving rise to the eukaryote genome. These genes subsequently underwent repeated single gene and whole genome duplications, perhaps in response to regulatory signals or environmental triggers, and unicellular eukaryotes became multicellular and then increasingly complex and intelligent.
However, the possibility that the first eukaryotes also arrived on Earth contained in jettisoned planetary debris and ejecta from the shattered remnants of the parent star's solar system, cannot be ruled out. Many species of bacteria form spores (Marquis and Shin 2006) and some survive in a state of suspended animation for hundreds of millions of years (Satterfield et al. 2005; Vreeland et al. 2000). Simple eukaryotic organisms, including yeast and fungi (Botts et al., 2009), also produce spores, often for reproductive purposes, but also in response to adverse, life threatening conditions. Therefore, it is possible that some simple eukaryotic organisms, and their descendants, along with trillions of other microbes, may have survived the destruction of the parent star system only to be hurled upon the newly forming Earth hundreds of millions of years later. This would account for the presence of microfossils resembling yeast cells and fungi, discovered in 3.8 BY old quartz (Pflug 1978).
Once on Earth, these simplified eukaryotes may have phagocytized archaea and bacteria (Kurland et al., 2006; Poole and Penny, 2007) and incorporated their genes, or were infiltrated by parasitic prokaryotes which donated genes to the eukaryotic genome.
Woese (2004) has proposed that these initial bacteria, archaea and eukaryotes may have lived together and repeatedly swapped and shared genes. "Eventually this collection of eclectic and changeable cells coalesced into the three basic domains known today. These domains become recognisable because much (though by no means all) of the gene transfer that occurs these days goes on within domains" (Woese, 2004).
A second possibility is that the first Earthly eukaryotic cells were created by the genetic fusion of bacteria and archaea, and possibly the injection of viral genes. Thus, hundreds of millions of years after arriving on Earth, archaea, bacteria, and a virus may have joined together, combining their genomes, and in so doing, created the first eukaryotes, which, nearly 4 billion years later, would give rise to humans.
The activity of photosynthesizing organisms and prokaryotic genes altered the environment via the liberation, secretion, and synthesis of a variety of chemicals and enzymes including oxygen (Buick 1992, 2008; Falkowski and Godfrey 2008; Holland 2006; Olson 2006; Williams and Fraústo da Silva 2006). The changed environment acted on gene selection, activating genes contributed by bacteria and archaea, giving rise to new traits and new species perfectly adapted for a world that had been prepared for them.
CONSERVED GENES & GENE EXPRESSION
Thousands of orthologous genes and hundreds of conserved genes can be traced back to the last common ancestor for eukaryotes (Snel et al., 2002; Mirkin et al., 2003; Kunin and Ouzounis 2003; Koonin 2003; Mushegian 2008; Bejerano et al., 2004). Almost all underwent duplication at the onset of eukaryotic evolution (Makarova et al., 2005). These genes then continued to undergo repeated episodes of single gene and whole genome duplication such that the eukaryotic genome increased in size. However, these duplications were often coupled with gene deletions, obscuring their original relationship with prokaryotes.
Almost all of the genes donated by prokaryotes, including those subsequently deleted from the eukaryotic genome, performed crucial functions that would guide the future of evolution. These included regulatory genes and genes controlling core cellular activities and the capacity to make duplicates of individual genes and the entire genome.
Genome sequencing has revealed an extensive conservation of the same repertoire of genes coding for core cellular functions in the genomes of prokaryotes and eukaryotes (Koonin et al., 2004; Koonin and Wolf 2008). A core set of approximately 70 genes contributed by archaea and bacteria has been conserved and passed down, without deletion, for billions of years; these genes make up between 1% and 10% of the genes in the genomes of all multicellular life (Koonin 2003; Koonin and Wolf, 2008; Harris et al., 2003; Charlebois and Doolittle 2004).
These conserved genes, proteins, (Koonin 2002) and gene sequences (Koonin 2009b), include those governing translation, the core transcription systems, and several central metabolic pathways, such as those for purine and pyrimidine nucleotide biosynthesis (Koonin 2003). Moreover, protein sequence conservation extends from mammals to bacteria thus demonstrating their great antiquity (Dayhoff et al., 1974; Eck and Dayhoff 1966; Dayhoff et al., 1983).
Between 2150 and 4137 orthologous gene sets are highly conserved and can be traced back to the last common ancestor for eukaryotes (Makarova et al., 2005). Often these orthologs express or perform the same function regardless of species.
In yet other instances, these conserved genes had not been expressed in ancestral species and were activated only after hundreds of millions of years had passed; activated in response to changing environmental or regulatory conditions. These genes generally have numerous interaction partners indicating they can exert widespread effects across networks of genes.
Consider the evolution of the eye. It has been claimed that the chief components of the eye, such as photoreceptors, must have evolved essentially de novo 40–65 times independently according to Darwinian principles (Salvini-Plawen & Mayr 1977). However, genes do not evolve de novo or ex nihilo; they are transferred from another species, inherited from an ancestral species, or they are produced by exon shuffling, whole gene duplication, and numerous other replicative mechanisms.
Genes involved in eye development, known as Pax, "Pax-6" and opsin in vertebrates and "eyeless" in fruit flies, are homologous between diverse phyla (Quiring et al., 1994; Gehring & Ikeo 1999). Pax genes ("Pax-6") have also been found in the genomes of ancient species such as the sea urchin and Trichoplax, both of which have no eyes and cannot see (Sodergren et al., 2007; Callaerts et al. 1997; Hadrys et al., 2005).
Pax-6 serves as a master regulator of a network of genes that can give rise to a variety of different types of eyes that utilize the same visual pigment genes. That is, Pax-6 appears to act on different genes to produce the different structures on which the pigment cells are mounted in different creatures giving rise to a variety of eyes (Sheng et al., 1997; Gehring and Ikeo, 1999; Davidson, 2001).
Moreover, Pax-6 proteins show a 90% identity between vertebrates and invertebrates (e.g. squid) as well as insects (Drosophila) and marine worms (Tomarev, 1997). These genes also utilize identical homologous Pax-6 proteins during eye development (Gehring & Ikeo 1999). As the common ancestors for vertebrates and invertebrates diverged between 600 mya and 1.6 bya (Ayala et al., 1998; Wray et al., 1996; Gu, 1998; Cutler, 2000), this is an indication of the great antiquity of Pax genes--many of which can be traced to ancestral species who had no eyes and were unable to see. Those ancestors could include prokaryotes.
Consider, for example, the vitamin-A-related chromophores in the visual pigment, which are the single most important prerequisite for vision in the vertebrate or invertebrate genome. Vitamin-A-related chromophores are also found in bacteria as well as algae (Seki and Vogt 1998; von Lintig and Vogt 2004).
These highly conserved genes were then passed down through numerous diverging ancestral species until activated in the period leading up to and including the Cambrian Explosion. Over 1000 genes involved in visual functioning, including ancestral Pax-6 genes, were inherited and are homologous between phyla (Quiring et al., 1994; Gehring and Ikeo, 1999; Tomarev et al. 1997), and have been isolated from several invertebrate and vertebrate species, including squid, flatworm, ribbonworm, ascidian, sea urchin, nematode, and fruit flies (Callaerts et al., 1997; Tomarev, 1997).
Be it vertebrate, flatworm or insect, and in spite of the large differences in eye morphology and mode of development (Gehring 1996), the same genes and same gene products related to the visual system are under the same genetic control (Quiring et al., 1994). Thus, regardless of species some parts of eyes are homologous because they are coded by the same genes and the same proteins.
Between 70% and 80% of these genes are common and evolutionarily conserved in the genomes of mammals, squid, octopus, flatworm, ribbonworm, ascidian, nematode, mosquitos, flies, tunicates, and vertebrates including humans (Ogura et al., 2004). The common ancestors for these species diverged anywhere from 1.2 bya to 830 million years ago (Ma) (e.g., Wray et al., 1996; Peterson et al., 2004; Nei et al., 2001; Gu 1998). As there is no evidence for visual functioning in any creature before 550 mya, these genes were therefore inherited, in silent form, from ancestral species which could not see.
However, regardless of their activity, genes that are highly conserved over the course of eukaryotic evolution not only remain in the same location but accumulate fewer substitutions in their protein sequences. Therefore the conservation of a gene and the fact that it is passed down vertically to subsequent species and is maintained unchanged in the same position, indicates biological importance and the identical roles it plays, almost regardless of species, over the course of evolution.
That importance may also have more to do with the future of evolution rather than the survival of the species possessing that gene. Therefore, some highly conserved genes can be removed from (knocked out of) various genomes without having any noticeable impact on the viability of the organism or its ability to function (Koonin 2000). In fact, hundreds of genes have been knocked out of, or stripped from, various species which remained viable (Glass et al., 2006; Koonin 2000).
Mycoplasma genitalium, for example, has one of the smallest genomes of any organism but remained viable even after 100 of its 482 genes were removed (Glass et al., 2006). However, 28% of the minimal set of genes coded for unknown functions (Glass et al., 2006). Moreover, 80 genes of the original minimal gene set were represented by orthologs in all forms of life and many of these coded for unknown functions (Koonin 2000). Therefore, not all highly conserved genes are related to the viability of the organism, but instead serve the future evolution of new functions, new structures, and new species.
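The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (482 genes, 100 removed, 28% of unknown function). The variable names are illustrative, and treating the "minimal set" as the genes left after removal is an assumption made here for the sake of the example.

```python
# Figures quoted in the text (Glass et al., 2006); names are illustrative only.
TOTAL_GENES = 482
REMOVED_GENES = 100
UNKNOWN_FRACTION = 0.28  # share of the minimal set coding for unknown functions

# Assumption: the "minimal set" is whatever remains after the removals.
minimal_set = TOTAL_GENES - REMOVED_GENES
removed_share = REMOVED_GENES / TOTAL_GENES
unknown_genes = round(minimal_set * UNKNOWN_FRACTION)

print(f"minimal set: {minimal_set} genes")
print(f"share of genome removed: {removed_share:.1%}")
print(f"approx. genes of unknown function: {unknown_genes}")
```

Under these assumptions, roughly a fifth of the genome was dispensable, and on the order of a hundred of the remaining genes have no assigned function.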
Likewise, features of gene architecture that are not necessarily directly relevant to gene function are highly conserved across lengthy periods of evolutionary history. This includes the positions of a large fraction of introns with 25–30% conservation in orthologs from plants and chordates (Fedorov et al., 2002; Rogozin et al., 2003; Roy and Gilbert 2006). In the human genome, these ultraconserved elements often overlap introns or nearby genes involved in the regulation of transcription and development. Highly conserved genes are also located adjacent to exons involved in RNA processing (Bejerano et al., 2004).
In addition, the positions of a large number of introns are conserved between plants and vertebrates (Fedorov et al., 2002; Rogozin et al., 2003; Roy and Gilbert 2006) and between mammals and "living fossils" such as Trichoplax and the sea anemone (Putnam et al., 2007; Srivastava et al., 2008); species which diverged over a billion years ago. Introns play a major role in the regulation of gene expression and transcription and in the creation of new genes from old genes.
Thus, genes involved in transcription regulation and which were donated by prokaryotes to eukaryotes interact with and overlap genes and introns also contributed by prokaryotes to eukaryotes. Moreover, these same genes were repeatedly duplicated and dispersed to a wide range of divergent species, and when activated gave rise to identical or similar evolutionarily advanced characteristics such as the eye and brain.
GENE REPLICATION & WHOLE GENOME DUPLICATION
Some of these highly conserved genes act as a genetic mechanism through which prokaryote genes, gene sequences, and proteins could be repeatedly duplicated within the eukaryotic genome. For example, a variety of regulatory genes and proteins were donated which ensure that specific genes and the functions they code for remained inhibited, while guaranteeing these same genes could be repeatedly duplicated and their functions preserved even as they grew in number and were passed down to subsequent species over hundreds of millions of years. Nevertheless, many of these genes were suppressed and remained silent.
For example, archaea and bacteria contributed three subunits of the core DNA-dependent RNA polymerase (Iwabe et al. 1991; Klenk et al. 1993) and two enzymes of DNA metabolism, RecA and Pol1A, to the eukaryotic genome (Eisen and Hanawalt 1999; Harris et al., 2003). These enzymes and the core RNA polymerase subunits serve many regulatory and replicative functions. For example, both RecA and Pol1A contributed to genetic continuity by gene conversion after recombination. They also ensure the integrity and maintenance of genetic information as the lengths of DNA strands increase and the genome grows larger in size (Eisen and Hanawalt 1999).
The replicative DNA polymerase, DnaN (COG0592), and the gene for the “sliding clamp” were also donated. This gene and its proteins are necessary for the high degree of processivity of DNA polymerase during replication (Kuriyan and O'Donnell 1993; Hingorani and O'Donnell 2000). This enables the accurate replication of linked genes and the preservation of the information they encode.
Many of the proteins that regulate eukaryotic signal transduction networks, including those involved in programmed cell death, are also derived from the prokaryotic genome (Aravind et al., 1999; Koonin and Aravind 2002; Bidle and Falkowski 2004). These signaling molecules are common in bacteria, cyanobacteria, and archaea and include proteases from the AP-ATPase family. These proteases perform catalytic functions, are found in the plant and animal genome (Koonin and Aravind 2002; Bidle and Falkowski 2004), and are utilized by mitochondria.
Replication is a universal feature of cellular organisms, and eukaryotes and prokaryotes share many genes and characteristics involved in replication, including the production of RNA primers, replication bidirectionality, strand synthesis, and the utilization of the same principal proteins involved in transcription and translation. That these genes were transferred from prokaryotes to eukaryotes is demonstrated by their commonality.
Prokaryotic genes which guide replication and duplication contributed to the expanding size of the eukaryotic genome. Indeed, the number of signal transduction and regulatory proteins that are encoded parallels the increasing size of the genome. Thus, the larger the genome, the greater the number of genes dedicated to signal transduction (van Nimwegen 2003; Konstantinidis and Tiedje 2004; Galperin 2005).
Some of these genes that can be traced to a common ancestor also perform functions that involve the transfer of genetic information (Harris et al., 2003). Some interact with ribosomes and those ribosomal RNA genes which play fundamental roles in cellular functioning and DNA translation and transcription (Harris et al., 2003). Ribosome and ribosomal RNA genes were also likely transferred from prokaryotes to eukaryotes (Lake et al. 1984; Lake 1988; 1998; Rivera and Lake 1992; Rivera and Lake 2004; Vishwanath et al. 2004).
Thus, the ability to replicate and duplicate genes, and to transfer genes and to express these genes can be traced backwards in time to prokaryotes and to the direct descendants of the first creatures to arrive on Earth.
Moreover, the donation of these genes and proteins was not random but under extreme regulatory control, performing essential functions related to the metamorphosis and evolution of future eukaryotic species; and this is why they are highly conserved across diverse species. These functions include gene and whole genome duplications (Dehal and Boore 2005; Lynch and Conery 2000; Lynch et al., 2001; McLysaght et al., 2002).
Repeated replication, including whole genome duplication, freed up duplicated genes from regulatory restraint. Thus pre-coded genetic instructions were expressed giving rise to advanced traits which had been suppressed. Gene duplication is a major evolutionary mechanism (Ohno 1970).
However, with each duplication, genes were also deleted, often the original prokaryotic insert. For example, a comparison of the numbers of ancestral gene clusters with those of extant animals such as the nematode, fly, mouse and human, established that extant bilaterian animals have retained more than 3500 gene clusters of the ancestral gene set, and have lost more than 1600 gene clusters (Ogura et al., 2004). Following duplication the originals or the copies were moved to new locations within the eukaryotic genome. Therefore, most of the genes which originated in the prokaryote genome can no longer be traced back to their prokaryotic source.
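The retained and lost cluster counts quoted above imply a rough retention rate, which the following minimal sketch works out. Both counts are given in the text only as lower bounds ("more than"), so the result is an approximation; the function name `retention_summary` is hypothetical and introduced here purely for illustration.

```python
# Lower-bound figures quoted in the text (Ogura et al., 2004).
RETAINED_CLUSTERS = 3500
LOST_CLUSTERS = 1600


def retention_summary(retained: int, lost: int) -> dict:
    """Return the implied ancestral total and the retained/lost shares."""
    ancestral = retained + lost
    return {
        "ancestral": ancestral,
        "retained_fraction": retained / ancestral,
        "lost_fraction": lost / ancestral,
    }


summary = retention_summary(RETAINED_CLUSTERS, LOST_CLUSTERS)
print(f"implied ancestral clusters: {summary['ancestral']}")
print(f"retained: {summary['retained_fraction']:.1%}")
print(f"lost: {summary['lost_fraction']:.1%}")
```

On these figures, a little over two thirds of the ancestral gene clusters survive in extant bilaterian animals and just under one third have been lost.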
After they had been donated and transferred to the eukaryotic genome, many of these genes were simultaneously deleted from the prokaryotic gene pool, thus ensuring they would not affect prokaryote evolution. In prokaryotes, gene loss is one of the two major evolutionary processes, along with horizontal gene transfer (HGT), that contribute to the intensive “gene flux” that seems to have shaped the genomes of these organisms.
Those donated genes included those regulating whole genome duplication (WGD). Thus, it appears that these genes underwent WGD only after they had been acquired by eukaryotes, as there is little evidence of WGD in prokaryotes.
There have been several whole gene duplications during the early evolution of eukaryotes, which date back to the emergence of the first eukaryotic cells or their ancestors (Makarova et al., 2005). Reconstruction of ancestral gene repertoires has identified 4137 orthologous gene sets in the last multicellular eukaryotic common ancestor, and 2150 orthologous sets in the hypothetical first unicellular eukaryotic common ancestor, which is indicative of WGD coupled with deletions (Makarova et al., 2005).
There is evidence to suggest that the genome may be duplicated at least every 100 million years (Lynch et al., 2001; Lynch and Conery 2000). Therefore, the majority of the genes in most genomes of cellular life underwent at least one duplication at some point during evolution (Lynch 2007; Koonin et al., 1996) and many genes belong to large families of paralogs.
The number of ancestral gene sets at the time of the split of plant–animal–fungi and the divergence of bilaterian animals, is estimated to be 2469 and 6577, respectively (Ogura et al., 2004). There is a 2.7-fold increase in the number of gene clusters during the period from the evolutionary split of plant–animal–fungi to the divergence of bilaterian animals (Ogura et al., 2004). This indicates that at least one and possibly two whole genome duplications must have occurred coupled with massive deletions.
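The 2.7-fold figure and the inference of "at least one and possibly two" duplications can be made explicit with a short calculation. The sketch below rests on a deliberately simplified assumption that is not in the source: all growth in gene sets comes from whole genome duplication, each of which doubles the repertoire, with deletions accounting for any shortfall. The function name `surviving_share` is hypothetical.

```python
import math

# Ancestral gene set estimates quoted in the text (Ogura et al., 2004).
PLANT_ANIMAL_FUNGI_SETS = 2469
BILATERIAN_SETS = 6577

fold_increase = BILATERIAN_SETS / PLANT_ANIMAL_FUNGI_SETS
# Number of doublings needed if nothing were ever deleted.
equivalent_doublings = math.log2(fold_increase)


def surviving_share(fold: float, n_wgd: int) -> float:
    """Share of the fully duplicated repertoire that must survive n WGDs
    to give the observed fold increase (simplified model, see lead-in)."""
    return fold / (2 ** n_wgd)


print(f"fold increase: {fold_increase:.1f}")
print(f"equivalent doublings: {equivalent_doublings:.2f}")
for n in (1, 2):
    print(f"{n} WGD -> required surviving share: {surviving_share(fold_increase, n):.2f}")
```

In this toy model a single duplication cannot reach the observed increase (the required share exceeds 1), whereas two duplications followed by the loss of roughly a third of the copies can, which is consistent with the reading given in the text.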
Whole genome duplications have occurred in almost all lineages, including yeast (Wong et al., 2002; Vision et al., 2000; Kellis et al., 2004; Dietrich et al., 2004), fish (Van de Peer et al., 2003; Jaillon et al., 2004; Taylor et al., 2001), frogs (Tymowska et al., 1977; Jeffreys et al., 1980) and plants (Blanc and Wolfe 2004). The relatively large and complex vertebrate genome appears to have been duplicated at least twice (McLysaght et al., 2002; Dehal and Boore 2005).
Whole genome duplication played a central role in the primary radiation of chordates (Dehal and Boore 2005) during the Cambrian explosion over 500 million years ago. There followed additional duplications during chordate evolution, thereby forming many of the gene families of vertebrates (McLysaght et al., 2002).
Dehal and Boore (2005) reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, and then determined when each gene underwent duplication relative to the evolutionary tree of the organism. An analysis of the global physical organization and genomic map positions of paralogous genes indicates these specific genes were duplicated prior to the fish–tetrapod split, some 400 million years ago. This was followed by two distinct genome duplication events early in vertebrate evolution as indicated by clear patterns of four-way paralogous regions covering a large part of the human genome (Dehal and Boore 2005).
Large-scale genomic events marked the transition and divergence between yeast and fungi (Liti and Louis, 2005), chordates and non-chordates (McLysaght et al., 2002), fish and tetrapods (Dehal and Boore 2005), and then once or twice more after vertebrates began to colonize the surface of Earth (Dehal and Boore 2005).
There is evidence to suggest that the genome may have been duplicated dozens of times over the course of evolutionary history (Lynch and Conery 2000; Lynch et al., 2001) thereby triggering the transition and divergence between numerous species, ranging from yeast and fungi (Liti and Louis, 2005) to chordates and non-chordates (Dehal and Boore 2005; McLysaght et al., 2002).
Gene and whole genome duplication are crucial mechanisms of evolutionary innovation and when coupled with regulatory genes contributed by prokaryotes, enabled the genomes of eukaryotes to become increasingly complex as well as larger in size. This also allowed for multiple copies of the same genes to appear in divergent species and to be passed down until a regulatory or environmental signal triggered their activation.
Gene duplication appears to provide the raw material for major evolutionary transitions, triggering the emergence of new species in the absence of obvious intermediaries. The duplication of all genes at the same time could possibly induce rapid and extensive evolutionary change; i.e. the emergence of new species from old in the absence of obvious transitional species. Whole genome duplication also enabled the entire expanded gene repertoire to evolve together and reach a greater level of interaction and complexity as compared to single gene duplications.
Duplication is often followed by accelerated sequence evolution as well as rearrangement of a gene, an evolutionary mode that obliterates detectable connections to the original gene source. Moreover, although numerous genes might be retained, other duplicated genes or the original might be quickly eradicated (Wolfe 2001) thus erasing the genetic footprints that would lead back to the prokaryotic source. This would make it appear that a new gene has emerged because its origins are no longer apparent. In fact, the vast majority of duplicated genes are subsequently deleted (Dehal and Boore 2005); an event which may also lead to freeing the original, or the duplicate, from inhibitory restraint, and which can erase all evidence of genome duplication (Dehal and Boore 2005).
GENE LOSS & GENE EXPRESSION
Lineage-specific gene loss is one of the major evolutionary processes that have been brought to light by comparative analyses of gene sets from completely sequenced genomes (Aravind et al. 2000; Moran 2002). Genome analysis has revealed the extensive loss of genes after WGD, in yeasts (Katinka et al. 2001; Scannell et al., 2007; Wolfe and Shields 1997), plants (Soltis et al., 2008; Tuskan et al., 2006), and chordates (Dehal and Boore 2005; Durand 2003; McLysaght et al., 2002).
Gene loss without replacement is a common phenomenon in many genomes and appears to play an important role in shaping genome content (Snel et al. 2002). The extent of gene loss can be dramatic, and it can occur relatively rapidly under a strong selective pressure (Baumann et al. 1995).
Although genomes of parasites expose the most striking cases of massive gene loss, a possible function of deletion following transfer, the fact is: substantial gene loss has occurred in all phylogenetic lineages (Snel et al. 2002; Mirkin et al. 2003).
The eradication of the original gene may also play a role in the expression of the duplicate. Some of these duplicate genes appear to have been freed from inhibitory restraint and were able to undergo an accelerated rate of sequence change, thereby inducing the rapid evolution of new characteristics and abilities (Seoighe et al., 2003). Thus, after duplication followed by deletion, the duplicate or original genes, now freed of the constraints, could express an already encoded function (“neofunctionalization”) which had been repressed (Conant and Wolf 2008).
In many cases the 'new' function of one gene copy is a secondary property, or subfunction, that was always present, but which may have been suppressed, or which only came to be expressed when other more dominant functional capabilities were inhibited, suppressed or deleted. Therefore, old functions might be fractionated giving rise to new subfunctions (“subfunctionalization”). That is, the new function was not really "new" but had always been a property of a specific gene that could only be expressed following duplication, or duplication coupled with deletion.
Thus, it is not uncommon for the new paralogs to retain or express distinct subsets of the original functions of the ancestral gene whereas the rest of the functions differentially deteriorate (Lynch and Force 2000; Lynch and Katju 2004).
Duplication and the lessening of regulatory restraints might also make the gene more susceptible to environmental triggers.
ARCHAEA VS BACTERIA: GENE TRANSFER
Numerous species of bacteria act as endosymbionts or endoparasites (Dyall et al., 2004; Poole and Penny 2007). Viruses are parasitic by nature. Archaea do not generally serve in this capacity--though there are exceptions. Bacteria, of course, are not uniform and there may be innumerable species (Nakabachi et al., 2006; Ranea et al., 2005; Schulz and Jorgensen 2001; Schneiker et al., 2007).
Considered in the broadest terms, archaea are highly distinct from bacteria, particularly in regard to the size of their genomes and cell membranes. For example, archaeal membranes are made of ether lipids whereas bacterial cell membranes are created from phosphoglycerides with ester bonds (De Rosa et al., 1986). Like bacteria, archaea can live in the most extreme environments (Kimura et al., 2006, 2007; Leininger et al., 2006; Robertson et al., 2005). However, whereas bacteria are usually (but not always, e.g. Leininger et al., 2006) the most common form of life in the soil, archaea are the most common form of life in the ocean, dominating ecosystems below 150 m in depth (Karner et al., 2001; Robertson et al., 2005).
The genomes of archaea are rather uniform and compact in size, ranging from 0.5 Mb in the parasite Nanoarchaeum equitans (Waters et al., 2003) to 5.5 Mb in Methanosarcina barkeri (Maeder et al., 2006).
Bacterial genomes can vary by two orders of magnitude, from 180 kb in an intracellular symbiont, Carsonella ruddii (Nakabachi et al., 2006), to 13 Mb in Sorangium cellulosum, which dwells in soil (Schneiker et al., 2007). Although there are bacterial genomes of intermediate size, the vast majority of bacteria so far sequenced show a clear-cut bimodal distribution of genome sizes, i.e. large vs small, suggesting the existence of two distinct classes of bacteria: those with "small" genomes (Ranea et al., 2005), with the highest peak at 2 Mb, and those with "large" genomes at about 5 Mb (Schulz and Jorgensen 2001).
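The size ranges quoted above for archaeal and bacterial genomes can be compared on a common scale by expressing each as a ratio and as a number of orders of magnitude. The sketch below uses only the four genome sizes given in the text; the function name `size_span` is hypothetical and chosen for illustration.

```python
import math

# Genome sizes in base pairs, as quoted in the text.
ARCHAEA_MIN_BP = 500_000       # 0.5 Mb, Nanoarchaeum equitans
ARCHAEA_MAX_BP = 5_500_000     # 5.5 Mb, Methanosarcina barkeri
BACTERIA_MIN_BP = 180_000      # 180 kb, intracellular symbiont
BACTERIA_MAX_BP = 13_000_000   # 13 Mb, Sorangium cellulosum


def size_span(smallest_bp: int, largest_bp: int) -> tuple:
    """Return (largest/smallest ratio, orders of magnitude spanned)."""
    ratio = largest_bp / smallest_bp
    return ratio, math.log10(ratio)


archaea_ratio, archaea_orders = size_span(ARCHAEA_MIN_BP, ARCHAEA_MAX_BP)
bacteria_ratio, bacteria_orders = size_span(BACTERIA_MIN_BP, BACTERIA_MAX_BP)

print(f"archaea:  {archaea_ratio:.0f}-fold, {archaea_orders:.2f} orders of magnitude")
print(f"bacteria: {bacteria_ratio:.0f}-fold, {bacteria_orders:.2f} orders of magnitude")
```

On these figures the bacterial range is a little under two orders of magnitude, while the archaeal range is closer to one, which matches the contrast the text draws between the two groups.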
By contrast, eukaryotic genomes range wildly in size and are generally several orders of magnitude larger than those of prokaryotes. However, the genomes of some eukaryotic species, such as the microsporidian Encephalitozoon cuniculi (Katinka et al., 2001), are substantially smaller than many bacterial and archaeal genomes. Encephalitozoon cuniculi is also a parasite and may serve as a genetic messenger.
Likewise, those bacteria and archaea with the smallest genomes share a significant behavioral feature with Encephalitozoon cuniculi: they too are parasites and they prey upon other prokaryotes as well as eukaryotes (Waters et al., 2003; Huber et al., 2002). It is these parasitic behaviors which may explain their small genomes, and the presence of prokaryotic genes in the eukaryotic genome. These prokaryotes may have donated their genes to a eukaryotic host billions of years ago. Once donated, many of these genes were not replaced.
For example, prokaryotes with the smallest genomes, i.e. parasitic and symbiotic bacteria and archaeal parasites (e.g., N. equitans), no longer encode or express a variety of protein regulators, indicating the responsible genes have been transferred to the genome of the eukaryotic host. With the donation of these regulatory genes, the genomes of these parasitic and symbiotic prokaryotes decreased in size. However, in addition to donating genes, many species of parasitic bacteria and archaea may have taken up residence inside a eukaryotic host, after which they continued to transfer and donate genes (Dyall et al., 2004; Margulis et al., 1997).
Hundreds of specialized prokaryotic genes have been donated to the genomes of their hosts, possibly by horizontal gene transfer (Yutin et al., 2008), and were then preserved, unchanged, often in the same position even after hundreds of millions and, perhaps, even after billions of years of evolution. Some of these donated genes, or the complete engulfment of a bacterial parasite by eukaryotes, appear to be responsible for the metamorphosis of mitochondria, which also donated genes to the eukaryote genome (Margulis et al., 1997). These prokaryotic genes and bacterial and archaeal symbionts enabled eukaryotes to become increasingly complex and to colonize and conquer new environments which were being genetically engineered by prokaryotic genes.
PHOTOSYNTHESIS & OXYGENATION
Initially Earth was devoid of a significant atmosphere, and lacked free oxygen, as the oceans were anoxic and possibly sulphidic (Barleya et al., 2005; Canfield 2005; Holland 2006; Mentel and Martin 2008). Only anaerobic organisms, and those adapted to breathing hydrogen or methane, or feasting on iron and sulphites and other minerals and metals in the absence of oxygen, were able to thrive (Barleya et al., 2005; Olson 2006; Rosing and Frei 2004; Sleep and Bird 2008)--as is the case with many modern day species of bacteria and archaea (Richardson 2000).
Photosynthesizing cyanobacteria contributed genes to the eukaryotic genome (Howe et al., 2008), possibly at the initial stages of eukaryotic evolution. Gene transfer may have taken place secondary to endosymbiont engulfment by non-photosynthetic eukaryotic hosts (Howe et al., 2008). The genes donated by these cyanobacteria enabled some eukaryotes to develop pigmented plastids which engaged in photosynthesis. Plastids form the major organelles which are now found in plants and algae and are responsible for the synthesis of fatty acids and the storage of starch.
Plastid DNA exists as large protein-DNA complexes, each containing at least 10 copies of the plastid DNA. Plastids also possess numerous internal membrane layers, which raises the possibility that plastids are stripped down photosynthetic prokaryotic endosymbionts (Howe et al., 2008). Thus some eukaryotes began to engage in photosynthesis in an oxygen-free environment, and to secrete oxygen as a waste product (Buick 1992; Holland 2006).
In addition to photosynthesis, some prokaryotes including aerobic photoautotrophic marine plankton were producing oxygen via the photobiologically catalysed oxidation of water (Buick 2008; Falkowski and Godfrey 2008) and were engaging in oxygen metabolism as demonstrated by U–Pb data from metasediments, and the creation of thick kerogenous shales dated to 3.8 bya to 3.2 bya respectively (Buick 2008).
The excretion of "waste" products, such as oxygen, over hundreds of millions of years, directly altered the environment (Barleya et al., 2005; Buick 1992; Canfield 2005; Holland 2006; Rosing and Frei 2004), and the altered environment acted on gene selection, activating genes that had been donated to the eukaryotic genome by prokaryotes and viruses.
The buildup of free molecular oxygen resulted in nitrate being oxidized from ammonium and subsequently denitrified. Increased production of oxygen led to decreased fixed inorganic nitrogen in the oceans--as is evident from isotopic analyses of fixed nitrogen in sedimentary rocks from the Late Archaean (Falkowski and Godfrey 2008). The interaction between the oxygen and nitrogen cycles and the continued buildup of oxygen in Earth's atmosphere allowed nitrification to become dominant over denitrification (Falkowski and Godfrey 2008), such that oxygenic photosynthesis and aerobic respiration became the preferred mode of energy acquisition within eukaryotic host cells-- a function of the activation of gene selection and possibly the horizontal transfer of these activated genes from prokaryotes to eukaryotes (Falkowski and Godfrey 2008).
As certain elements, gasses, and minerals built up as waste, they acted on gene selection (Williams and Fraústo da Silva 1996, 2006), giving rise to metabolic processes that enabled these creatures to biologically catalyse electron transfer (redox) reactions, beginning with H, C, N, and then O and S (Falkowski and Godfrey 2008). This sequence of changing environments acting on gene selection led to the production of oxygen via the photobiologically catalysed oxidation of water and photosynthesis. As atmospheric oxygen levels continued to build up, this resulted in the surface weathering of soil-bound sulphides, which were oxidized to sulphates that drained into the oceans (Mentel and Martin 2008).
Under anaerobic conditions, chemolithotrophic microbes break down and convert ferric iron, which is employed as an oxidant to decompose other minerals, producing sulfate and ferrous iron as waste products (Fernandez-Remolar et al., 2008). Thus, in addition to the weathering of soils by oxygen, innumerable bacteria and archae were also acting on soils, such that ferrous iron and sulphates were being liberated and draining into the oceans.
In consequence, sulphate reducers and anaerobic, hydrogen sulphide-producing prokaryotes, as well as ferrous iron producing bacteria, began to proliferate on a global scale (Mentel and Martin 2008; Sleep and Bird 2008). Sulphide and ferrous iron continued to build up and were eventually incorporated within eukaryotic cells, where they were protein-bound and served as oxygen acceptors (Sleep and Bird 2008). Thus what had been oxygen-independent ATP-generating pathways became oxygen-dependent.
In fact, cyanobacteria may have begun to use ferrous iron as reductant as early as 3.0 bya (Olson 2006). However, as based on an analysis of microfossils, stromatolites, and chemical biomarkers in Australia and South Africa, chlorophyll containing cyanobacteria had switched to oxygenic photosynthesis by 2.8 Ga (Olson 2006).
Thus, a complex genetic-environmental feedback system was established, with genes acting on the environment and the biologically altered environment acting on gene selection which gave rise to species which utilized these "wastes" and rejected those which were no longer or not as useful (Richardson 2000; Williams 2007).
As summarized by Williams (2007), "in essence, organisms at all times had to accumulate certain elements while rejecting others. Central to accumulation were C, N, H, P, S, K, Mg and Fe while, as ions, Na, Cl, Ca and other heavy metals were largely rejected." One step leads to the next, beginning with the use of hydrogen, methane, Fe, sulphur, and nitrates by bacteria and archae (Berks et al., 1995; Bult et al., 1996; Gold, 1992; Lonergan et al., 1996; Lovley, 1991; Richardson 2000; Vargas et al., 1998), followed by oxygenic photosynthesis (Castresana & Saraste, 1995; Castresana & Moreira, 1999; Falkowski and Godfrey 2008; Schafer et al., 1996; Schwartzman et al., 2008; Sleep and Bird 2008). Thus, the evolution of subsequent species utilized new byproducts, such as sulphides, ferrous iron, glucose, pyruvate, and NADH, which provided new and additional sources of energy and nourishment, and which acted on gene selection (Williams and Fraústo da Silva 2006).
The liberation and incorporation of these chemical substances led to increasing cellular complexity, a function of cells acting on the environment which acts on gene selection which acts on the environment, creating a complex feedback system which promotes the evolution of increasingly complex creatures.
For example, "in order to form the vital biopolymers, C and H, from CO2 and H2O, had to be combined generating oxygen. The oxygen then slowly oxidized the environment over long periods of time. These environmental changes were relatively rapid, unconstrained and continuous, and they imposed a necessary sequential adaptation by organisms while increasing the use of energy. Then, evolution has a chemical direction in a combined organism/environment ecosystem. Joint organization of the initial reductive chemistry of cells and the later need to handle oxidative chemistry has also forced the complexity of chemistry of organism in compartments. The complexity increased to take full advantage of the environment from bacteria to humans in a logical, physical, compartmental and chemical sequence of the whole system" (Williams 2007).
OXYGENATION & MITOCHONDRIA
When the environment had become sufficiently oxygenated and enriched with sulphide and ferrous iron which could serve as oxygen acceptors (Sleep and Bird 2008), oxygen-dependent ATP-generating pathways replaced the less efficient oxygen-independent pathways and eukaryotic cells underwent a significant alteration and began breathing oxygen via the metamorphosis of mitochondria (Schafer et al., 1996).
The genomes of all extant eukaryotes contain genes which can be traced to ancestors that possessed the α-proteobacterial endosymbiont that gave rise to the mitochondria (van der Giezen and Tovar 2005; Embley 2006). Presumably, the genes of this α-proteobacterium symbiont underwent transformation in response to the increasing levels of oxygen in the atmosphere, and the symbiont became a mitochondrion.
Mitochondria serve as the powerhouse of the eukaryote cell and are located outside the nucleus. Mitochondria generate most of the cell's supply of adenosine triphosphate (ATP) which is used as a source of chemical energy (Akao et al., 2001; Dahout-Gonzalez et al., 2006; Garlid et al., 2003; Margulis et al., 1997). The production of ATP is accomplished by oxidizing the major products of glucose, pyruvate, and NADH, which are produced in the cytosol (Akao et al., 2001; Dahout-Gonzalez et al., 2006; Garlid et al., 2003; Herrmann and Neupert 2000) and by bacteria and archae (Richardson 2000).
Many cells have only a single mitochondrion, whereas others contain several thousand. Mitochondria have their own independent genomes and their DNA shows substantial similarity to bacterial genomes (Pace 2006; Woese 1994). Mitochondria are enclosed in their own inner and outer membranes and play a significant role in signaling, cellular differentiation, and cell death, as well as the control of the cell cycle and cell growth (Anderson et al., 1981; Chipuk et al., 2006; Mannella 2006; Rappaport et al., 1998). Thus, mitochondria are essential to the functioning of the eukaryote cell (Margulis et al., 1997).
The DNA of multicellular eukaryotes is contained within the nucleus of every cell and mitochondria sit adjacent to the nucleus. It appears that the eukaryotic nucleus was fashioned hundreds of millions of years after phagotrophy and hundreds of millions of years before the metamorphosis of mitochondria (Margulis et al., 1997). The nucleus, which protects the eukaryotic genome, and the other cellular compartments may have originally consisted of stripped down bacteria/archae. The nucleus and compartmentalization made it possible for predatory eukaryotes to ingest and phagocytize other creatures while minimizing the risk of random gene mixing and the unregulated incorporation of foreign DNA.
Some researchers believe that the nucleus, the organelles, as well as mitochondria may have been created from bacterial and archae genes (Pace 2006; Woese 1994; Embley and Martin, 2006; Martin and Koonin, 2006; Martin and Muller 1998). Some believe the nucleus may be a derived endosymbiont, a descendant of an archaeon that invaded a bacterial host, or of a bacterium or archaeon which invaded or was engulfed and phagocytized by the first eukaryotes (Lake and Rivera 1994; Horiike et al. 2004; Hartman and Fedorov 2002).
Many researchers also believe that the mitochondria are directly linked to the engulfment of an anaerobic symbiont α-proteobacterium (reviewed by Gray et al., 1999) or a free-living photosynthesizing bacterium by a methanogenic archaeon. Presumably this bacterium supplied hydrogen to the host (Martin and Muller, 1998) which then engaged in anaerobic respiration to metabolize glycolytic products and turn them into energy; releasing oxygen as waste. Hundreds of millions of years would pass before eukaryotes began breathing oxygen (Schafer et al., 1996).
Once the environment became sufficiently oxygenated, the α-proteobacterium underwent metamorphosis to become a mitochondrion and/or contributed genes which gave rise to aerobic mitochondria (Embley and Martin, 2006; Gray et al., 1999; Martin and Koonin, 2006; Martin and Muller 1998; Rivera and Lake 2004). The activation of these genes, and the metamorphosis of mitochondria, enabled eukaryotes to colonize emerging aerobic environments.
Others have argued that mitochondria arose when the first multi-cellular eukaryotes internalized and formed a symbiogenetic relationship with a free-living proto-mitochondrion or an α-proteobacterium symbiont (Cavalier-Smith, 2009). This could explain why a few groups of unicellular eukaryotes lack mitochondria: the microsporidians, metamonads, and archamoebae. As based on phylogenetic trees constructed using rRNA information, these unicellular eukaryotes appeared before the origin of mitochondria. Thus, the endosymbiont may have been incorporated only after larger, more complex multicellular eukaryotes evolved.
However, unicellular eukaryotes which are without mitochondria nevertheless possess organelles of α-proteobacterial descent (Gray et al., 1999). This has led to the possibility that the genes giving rise to mitochondria, organelles, and the nuclear compartment originated at the same time in the common ancestor of all extant eukaryotes rather than in separate, subsequent events (Gray et al., 1999).
The mitosome, for example, is an organelle found in some unicellular eukaryotic organisms and is related to mitochondria (Bakatselou et al., 2003; Tovar et al., 1999; Williams et al., 2002). Like mitochondria, mitosomes have a double membrane. The mitosome, however, has been detected only in anaerobic or microaerophilic parasitic organisms that do not have mitochondria (Bakatselou et al., 2003; Mentel and Martin 2008; Tovar et al., 1999; Williams et al., 2002). Nevertheless, the organelles of most unicellular eukaryotes have also been shown to be of α-proteobacterial descent (Gray et al., 1999). Mitosomes, therefore, may also be related to the α-proteobacterium which gave rise to mitochondria, or they may be derived from mitochondrial genes (Mentel and Martin 2008). However, unlike mitochondrial genes, mitosome genes are contained in the nuclear genome of the eukaryotic host (Bakatselou et al., 2003; Tovar et al., 1999). Thus mitosomes appear to be mini-mitochondria, albeit stripped of their genes (Williams et al., 2002). The existence of the mitosome does not appear to be compatible with the endosymbiotic theory which postulates that mitochondria arose following the phagocytosis of a mitochondria-like organism by a multi-cellular eukaryote (Mentel and Martin 2008). Unlike mitochondria, mitosomes do not have the capability of gaining energy from oxidative phosphorylation (Mentel and Martin 2008), and this may be due to the anaerobic environments in which they dwell (Tovar et al., 1999).
The existence of the mitosome in anaerobic unicellular eukaryotes, and the link to an α-proteobacterium and mitochondria, suggests that mitosomes and mitochondria are derived from the genes that gave rise to the first eukaryotes, and that the metamorphosis of mitochondria was in response to increased levels of oxygen, sulphur and ferrous iron, and other gasses, ions and minerals; a consequence of the environment acting on gene selection. Mitochondria, as a distinct entity within eukaryotic cells, did not arise until between 2.3 and 1.8 BYA (Mentel and Martin 2008). It was during this time that oxygen, produced by photosynthetic bacteria, had begun to enrich the atmosphere (Barleya et al., 2005; Eigenbrode and Freeman 2006), and during which Earth had become glaciated, fueled by oxygenic photosynthesis (Eigenbrode and Freeman 2006; Evans et al., 1997; Kirschvink, et al. 2000). This rise in oxygen has been referred to as the Paleoproterozoic "Great Oxidation Event" (~2.2 to 2.0 Ga), when atmospheric oxygen may have risen to >1% of modern levels, a byproduct of oxygenic photosynthesis (Buick 2008; Canfield 2005; Holland 2006; Nisbett and Nisbett 2008; Olson 2006).
Initially, those α-proteobacterium genes which contained the DNA instructions for the metamorphosis of mitochondria remained suppressed and were not activated, as the environment and atmosphere of Earth lacked oxygen and other chemicals such as NADH and other oxidases. In the absence of an oxygen rich atmosphere, eukaryotes had no need for mitochondria, and instead used alternate energy sources.
Therefore, the first eukaryotes probably did not possess mitochondria but mitosomes, as is also exemplified by many unicellular eukaryotes (Bakatselou et al., 2003; Tovar et al., 1999; Williams et al., 2002). Thus, mitochondria may have also evolved from prokaryotic genes, around 2.3 - 1.8 bya, when increasing levels of oxygen acted on gene selection.
Using sulfur isotopes to determine the oxygen content of ~2.3 billion year-old rocks, Guo and colleagues (2009) found that "the Archean-Proterozoic transition is characterized by the widespread deposition of organic-rich shale, sedimentary iron formation, glacial diamictite, and marine carbonates recording profound carbon isotope anomalies." This includes the first known anomaly in the carbon cycle indicative of a sudden increase in atmospheric oxygen. "All deposits reflect environmental changes in oceanic and atmospheric redox states, in part associated with Earth's earliest ice ages...a rise in atmospheric oxygen... and the Great Oxidation Event (ca. 2.3 Ga)." Thus not just the metamorphosis of mitochondria but Earth's earliest ice age are linked to the rise of oxygen in Earth's atmosphere.
During this same time period, when oxygen levels increased, organisms with more than 2-3 cell types appeared (Hedges et al., 2004). This increase in energy availability (oxygen) and the ability to extract it (mitochondria) conferred major advantages for the eukaryotic host which became increasingly complex and expanded in size.
By 1.5 BYA, eukaryotes expanded to approximately 10 cell types (Hedges et al., 2004). This increase in size and complexity was made possible, in part, by the energy provided by mitochondria which used oxygen as an energy source.
A billion years later, and by the onset of the Cambrian Explosion, so much oxygen had been released into the atmosphere that ozone was established which blocked out life-neutralizing UV rays. With the establishment of ozone, innumerable creatures could emerge from the sea or from beneath the soil and exploit new environments; environments which acted on gene selection giving rise to new capabilities and new species. Those who breathed oxygen were at a significant advantage, increasing the number of environments they could invade and conquer.
The activity of photosynthesizing organisms and prokaryotic genes altered the environment via the liberation, secretion, and synthesis of a variety of chemicals and enzymes including oxygen (Buick 1992, 2008; Falkowski and Godfrey 2008; Holland 2006; Olson 2006; Williams and Fraústo da Silva 2006). The changed environment acted on gene selection, activating genes contributed by bacteria and archae, giving rise to new traits and new species perfectly adapted for a world that had been prepared for them.
OXYGEN ENVIRONMENT, MITOCHONDRIA & ENDOSYMBIOTIC GENE TRANSFER
Although the genes necessary for creating mitochondria may have been present when Earthly eukaryotes were first fashioned, it was not until the planet became sufficiently oxygenated that the metamorphosis of mitochondria ensued via the transformation and activation of genes provided by an α-proteobacterium, triggered by a significant increase in oxygen levels (Hedges et al., 2004) and in other essential elements necessary for the functioning of mitochondria, such as NADH.
However, the rise of oxygen was also a function of biological activity (Buick 1992, 2008; Castresana and Moreira 1999; Castresana and Saraste 1995; Falkowski and Godfrey 2008; Holland 2006; Olson 2006; Schafer, et al., 1996). Thus once altered by photosynthetic organisms the environment acted on gene selection, and the rise in oxygen resulted in the diversification and increased complexity of the photosynthetic life that produced the oxygen that changed the atmosphere (Guo et al., 2009).
As genes act on the environment which acts on gene selection, additional genes were activated, and new functions, characteristics, and species began to appear. However, not just the eukaryotic genome was impacted, but also the mitochondrial genome. Mitochondria subsequently donated numerous genes which were integrated into the eukaryotic genome (Rogers et al., 2007). These included genes coding for organelles and the endoplasmic reticulum, as well as genes contributing to the nucleus, and the bacterial-type plasma membrane that displaced the original archaeal membrane (Esser et al., 2004; Rivera and Lake 2004); a process Andersson (2005) refers to as "endosymbiotic gene transfer."
Endosymbiotic gene transfers are a common and ongoing process in diverse eukaryotes (Bensasson et al. 2001; Leister 2003; Timmis et al. 2004). Further endosymbiotic gene transfer from mitochondria may have facilitated the invasion of group II introns into host genes (Martin and Koonin, 2006) which served as the precursors of spliceosomal introns (Cavalier-Smith, 2009). This invasion of introns exerted a profound effect on the regulation of gene expression (e.g. Brietbart et al., 1985; Leff et al., 1986; Yoshihama et al., 2007), the expansion and duplication of the eukaryotic genome, and the evolution and metamorphosis of increasingly complex creatures.
Rhawn Joseph, Ph.D. PART 3
INTRONS
DNA includes stretches of nucleotides, called exons, that are encoded and expressed to produce various proteins (De Souza et al., 1996). These strings of nucleotides are punctuated, bracketed, framed, and interspersed with long stretches of non-encoding DNA, called introns (Belfort, 1991, 1993; Breathnach et al., 1978; Buchman and Berg 1988; Witkowski, 1988). In complex multicellular organisms introns are often 10-fold longer than exons (De Souza et al., 1996). They also signal which lengths of exons are to be expressed (Belfort, 1991, 1993; Breathnach et al., 1978; Witkowski, 1988). Introns are typically snipped out as strings of exons are transcribed via RNA intermediaries, into proteins (Breibart et al., 1985; Leff et al., 1986).
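The removal of introns and the joining of exons described above can be illustrated with a minimal Python sketch. It is only a toy model: the sequence, the exon coordinates, and the function name `splice` are hypothetical and chosen for illustration, and the splice-site signals and cellular machinery involved in real splicing are not modeled.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals of a transcript and discard everything else.

    `exons` is a list of (start, end) positions, 0-based and end-exclusive,
    given in the order in which the exons appear in the transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: uppercase letters stand for exons,
# lowercase letters for the intron that separates them.
pre_mrna = "AUGGCC" + "guaaguuucag" + "GAUUAA"
exons = [(0, 6), (17, 23)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCGAUUAA
```

The intron (positions 6 to 16) is dropped simply because it falls outside every listed exon interval, which mirrors the idea that the intron marks what is not carried forward into the protein-coding message.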
Introns are of particular importance in regulating gene expression (Brinster et al., 1988; Buchman and Berg, 1988; Collis et al., 1990; Lai, et al., 1998; Noe et al., 2003). If different "starter" or "stop" introns are activated this results in different segments or sequence lengths becoming expressed, thereby producing a different product (Belfort, 1991, 1993; Breathnach et al., 1978; Breibart et al., 1985; Leff et al., 1986). Hence, variation and diversity can be differentially induced if different "starter" exons or promoter introns are activated.
Introns have been preserved, often in the same places in the genome, over the course of evolution, be it the genes of Drosophila melanogaster (the fruit fly), Caenorhabditis elegans (nematode), mice, or humans (De Souza et al., 1996; Federov et al., 2002). This extreme conservation and preservation of their positions within genes attests to their importance in regulating and coordinating evolution and metamorphosis among numerous species. Many are catalytically active and facilitate chemical reactions, even catalyzing their own synthesis (De Souza et al., 1996).
Some introns are found within or in association with ribosomes (Dürrenberger and Rochaix 1991; Jackson et al., 2002; Toro et al., 2007; Yoshihama et al., 2007). The functional part of the ribosome is fundamentally a ribozyme, the molecular machine that translates the RNA copies of exons into proteins (Cech 2000). Thus introns, in association with ribosomes play a major role in translation, transcription and protein synthesis. Ribozymes are also able to splice themselves and other introns out of the original transcript created by these RNA molecules (Jackson et al., 2002). Ribozymes can also be found in the intron of RNA transcripts, which had been removed from the transcript.
Mitochondrial ribosomes and introns are considered to be of bacterial origin (Kenmochi et al., 2001); a product of endosymbiosis (Dyall et al., 2004; O'Brien 2002). Ribosomal introns and protein sequences which circulate in the cytoplasm appear to have originated in the archae genome, and were later donated to eukaryotes, as there is a specific affinity between eukaryotic genes and their orthologs from archae (Lake et al. 1984; Lake 1988, 1998; Rivera and Lake 1992; Rivera and Lake 2004; Vishwanath et al. 2004). Archae and bacteria were a major source of introns and ribosomes.
Introns are also classified as spliceosomal introns, self-splicing introns, and Group I and II introns (Roy and Gilbert, 2006). Spliceosomes are responsible for splicing out spliceosomal introns and transposable elements, and ensuring that the genetic sequences in introns are not translated into proteins. Thus, they regulate gene expression and help guarantee that only designated exons are transcribed and translated (Roy and Gilbert, 2006).
Spliceosomal introns are found in the nuclear genes of higher eukaryotes including humans (Doolittle 1978; Gilbert 1978; Mattick 1994; Deutsch and Long 1999). Prokaryotes do not possess a nucleus and lack nuclear introns, and some simple eukaryotes (such as certain fungi and protozoa) have comparatively few. Nuclear introns also engage in alternative splicing, and can produce multiple types of messenger RNA from a single gene (Roy and Gilbert, 2006).
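How a single gene can yield multiple messenger RNAs through alternative splicing can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the exon names, the short sequences, and the function `isoforms` are hypothetical, and the only form of alternative splicing modeled is the skipping of optional exons while the remaining exons keep their original order.

```python
from itertools import combinations


def isoforms(exons, optional):
    """Return every transcript obtained by skipping a subset of optional exons.

    `exons` is an ordered list of (name, sequence) pairs; `optional` lists
    the names of exons that may be left out. Exon order is always preserved.
    """
    transcripts = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            transcripts.append(
                "".join(seq for name, seq in exons if name not in skipped)
            )
    return transcripts


# Hypothetical four-exon gene in which the two middle exons are optional.
gene = [("E1", "AUG"), ("E2", "GCU"), ("E3", "GAA"), ("E4", "UAA")]
variants = isoforms(gene, optional=["E2", "E3"])

for transcript in variants:
    print(transcript)
```

With two optional exons the sketch yields four distinct transcripts from one gene; the count doubles with each additional optional exon.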
Via the joining of exons after splicing, introns also trigger the synthesis of novel proteins with new properties (Brietbart et al., 1985; De Souza et al., 1996; Leff et al., 1986). They may also promote the creation of multiple copies of the proteins coded by single genes (Brietbart et al., 1985; Leff et al., 1986). In fact, the presence of an intron can increase transcriptional efficiency 100-fold whereas in the absence of the intron these genes may not be expressed at all (Brinster et al., 1988; Lai et al., 1998).
Hence, introns are involved in transcription, translation, signaling, protein synthesis, and regulating which gene sequences or portions of the gene should be expressed or inhibited (Brinster et al., 1988; Brietbart et al., 1985; Buchman and Berg 1988; Collis et al., 1990; Leff et al., 1986; Lai, et al., 1998; Noe et al., 2003). They also create new genes.
Introns guide or participate in the genetic recombinations between exons, a process called "exon shuffling" (Gilbert, 1978, 1987; Doolittle, 1978; Blake, 1978). Exon shuffling is the process where new full-length genes are created from exon "pieces" by recombination within the introns (De Souza et al., 1996, 1998, 2003; Fedorov 2001, 2003; Long et al., 1995; Roy 2003; Roy et al., 1999, 2001, 2003). Exon shuffling is associated with the formation of new genes from old genes.
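Exon shuffling can likewise be sketched as a crossover that takes place inside introns, so that whole exons are exchanged between genes without being broken. The following Python fragment is a toy model only: genes are represented as ordered lists of exon labels, and the labels, the break positions, and the function name `shuffle_exons` are hypothetical.

```python
def shuffle_exons(gene_a, gene_b, break_a, break_b):
    """Recombine two genes at intron positions.

    Each gene is an ordered list of exon labels. `break_a` and `break_b`
    give the number of exons lying before the intron in which the break
    occurs, so exons themselves are never split.
    """
    new_a = gene_a[:break_a] + gene_b[break_b:]
    new_b = gene_b[:break_b] + gene_a[break_a:]
    return new_a, new_b


# Hypothetical parent genes made of three and two exons.
gene_a = ["A1", "A2", "A3"]
gene_b = ["B1", "B2"]

# Break each gene in the intron that follows its first exon.
new_a, new_b = shuffle_exons(gene_a, gene_b, 1, 1)
print(new_a)  # ['A1', 'B2']
print(new_b)  # ['B1', 'A2', 'A3']
```

Because the break points fall between exons, each product is a new combination of intact exon "pieces," which is the sense in which new genes are assembled from old ones.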
Introns also are implicated in the production of additional genes and even gene clusters which are located deep within the intron (Henikoff et al. 1986; De Souza et al., 1996; Strachan & Read, 1996). Thus, introns may be responsible for producing duplicate genes as well as new genes and clusters of genes, including numerous copies of highly repetitive sequences of nucleotide base pairs (Finnegan, 1989; Henikoff et al. 1986; Peters & Fink, 1982). Indeed, introns, and intronic gene clusters are considered a "hot spot" for homologous recombination (Wahls et al. 1990).
Introns also play a major role in the origin and diversity of proteins by facilitating recombination of sequence coding for small protein/peptide modules (Brietbart et al., 1985; Leff et al., 1986; Koonin 2006). If the length of the code is altered and reframed, or if introns change their positions within the genes, the products produced by the altered code may also undergo subtle or profound changes (Brietbart et al., 1985; Leff et al., 1986). Therefore a variety of tissues and organs can be fashioned.
Introns also contain copies of gene sections that have been silenced and suppressed (De Souza et al., 1996). They maintain the "old code" for genes that were once translated into a protein, as well as the codes for genes that have not yet been expressed. Introns are thus implicated in the release of genetically pre-coded traits (de Jong & Scharloo, 1976; Dykhuizen & Hart, 1980; Gibson & Hogness, 1996; Polaczyk et al., 1998; Rutherford & Lindquist, 1998; Wade et al., 1997).
Hence, introns create genes from old genes, recombine pieces of genes, and thus can combine, fractionate, or reconfigure the structure of a gene, thereby creating new functions from the parsing or assimilation of old functions coded by single or multiple genes. Moreover, they can silence or activate the expression of the genes they create or those they regulate.
Introns, therefore, play a major role in evolution acting to regulate gene expression, maintaining copies of genes, and promoting the assembly of new genes and new gene sequences from old genes, and multiple copies of the same or a new protein product.
Thus, following the donation of introns to eukaryotes, new genes were assembled from old genes (De Souza et al., 1996, 1998, 2003; Fedorov 2001, 2003; Long et al., 1995; Roy 2003; Roy et al., 1999, 2001, 2003). The genome began to increase in size and complexity and genes expressed new, albeit precoded functions, which gave rise to new tissues, organs, and the evolution of new species (Duret 2001; Comeron and Kreitman 2000). In fact, the number of introns per gene varies by more than two orders of magnitude between species (Roy 2004).
Therefore, introns, which may have originally been donated by prokaryotes (Cavalier-Smith 1991; Martin and Koonin 2006; Sharp 1991; Stoltzfus 1999), may play a significant role in regulating, copying, and duplicating genes which had also been transferred to the eukaryotic genome by prokaryotes. Moreover, they appear able to regulate the manufacture of new proteins and thus guide the evolution of new tissues, organs, and species. These are not random events, but are under precise regulatory control.
INTRONS ORIGINATED IN PROKARYOTES
Numerous introns invaded eukaryotic genes at the outset of eukaryogenesis as the first eukaryotes were being fashioned (Martin and Koonin 2006; Rogozin et al., 2005), and thus at the earliest stages of eukaryote evolution (Rogozin et al., 2005). All eukaryotes whose genomes have been sequenced, including parasitic protists, have been shown to possess introns (Doolittle 1978; Gilbert 1978; Mattick 1994; Deutsch and Long 1999; Nixon et al. 2002; Simpson et al. 2002; Vanacova et al. 2005). Even the simplest of eukaryotes contain introns as well as spliceosomal proteins within their genomes (Collins and Penny 2005).
Hence, introns were present when simple eukaryotes took root on this planet, or they originated in the prokaryote genome and were transferred to the first proto-eukaryotic organism (Cavalier-Smith 1991; Martin and Koonin 2006; Sharp 1991; Stoltzfus 1999). Introns then continued to be donated or duplicated as eukaryotes evolved.
Both archae and bacteria appear to have supplied eukaryotes with numerous introns (Martin and Koonin 2006), perhaps flooding the eukaryotic genome with introns and transposable elements at the earliest stages of eukaryogenesis (Cavalier-Smith 1991; Martin and Koonin 2006; Sharp 1991; Stoltzfus 1999). Or these prokaryotes may have supplied introns at the time the archae and bacteria genomes were unified to create the first eukaryotes (Martin and Koonin 2006). A massive influx of introns would also explain why ancient eukaryotes (Roy 2006), including the last common ancestors of eukaryotes, possessed high intron densities comparable even to vertebrates, which possess intron-rich modern genomes (Roy 2006; Carmel et al., 2007; Csuros et al., 2008).
Mitochondria may also be a direct and indirect source for introns including group II self-splicing introns and spliceosomal introns (Dyall et al., 2004; O'Brien 2002; Roy and Gilbert 2006). For example, spliceosomal introns may have evolved from group II self-splicing introns which originated in the genome of the alpha-proteobacterial progenitor of the mitochondria (Koonin 2006). Group II self-splicing introns are present in the genomes of many bacteria (Cavalier-Smith 1991; Koonin 2006; Roy 2006; Stoltzfus 1999). Thus, at least some eukaryotic introns may be linked to the same alpha-proteobacteria genome which gave rise to mitochondria which also donated numerous genes to the eukaryotic genome (Koonin 2006).
Moreover, archae may have contributed introns, including ribosomal introns and protein sequences. Some archael genomes contain genes that are dotted with micro-introns and some archae proteins are also bracketed by introns (Watanabe et al., 2002) as is common in eukaryotes.
Be it archae, bacteria, viruses, or a combination of influences, once these introns were donated to the eurkaryotic genome, they then punctuated and framed numerous protein-coding genes and played crucial roles in recombination, gene creation, coordination of transcription and translation, the emergence of the spliceosome, as well as the nucleus, linear chromosomes, telomerase, the ubiquitin signaling system, inhibition and expression, gene duplication and creation, and the expansion of the genome, (Comeron and Kreitman 2000; De Souza et al., 2003; Duret 2001; Fedorov 2003; Koonin 2006; Gilbert 1978, 1987; Long et al., 1995; Mattick 1994; Prachumwat et al., 2004; Roy and Gilbert 2006; Tonegawa et al., 1978).
Thus, introns which were donated by prokaryotes acted on genes which had been transferred by prokaryotes to the eukaryotic genome, thereby creating new genes from old genes, expressing pre-coded traits, and giving rise to new species. Introns play a major role in the regulation of evolutionary metamorphosis.
The donation of introns by prokaryotes following the metamorphosis of the first eukaryotes also explains the relative absence of introns in the genomes of most modern prokaryotes (Koonin 2006). Introns were donated and were not replaced, thus ensuring that eukaryotes and not prokaryotes would evolve into new species.
That these prokaryotes at one time may have contained an abundance of introns may also account for why the genomes of archaea and bacteria contain split genes (Dassa et al., 2007). Therefore, having contributed their introns to the eukaryotic genome, most archaeal and most bacterial genes lack or have only a few introns, and their genes are encoded as uninterrupted open reading frames. This indicates that the donation of introns was not random, but under precise genetic control, such that their transfer to eukaryotes played a highly regulated role in eukaryotic evolution whereas their deletion from the prokaryotic genome ensured that only eukaryotes would continue to evolve.
INTRONS ARE CONSERVED
The positions of introns and numerous spliceosomal and spliceosome-associated proteins have been highly conserved in the same locations and positions within the genes of numerous species (Anantharaman et al., 2002; Collins and Penny 2005; Fedorov et al., 2002). Thousands of introns are located in the exact same regions of the genome, even when comparing the genes of fungi and humans (Fedorov et al., 2002). This conservation of position and location indicates they exert extremely important influences on the coordination of gene regulation and expression even among different species, possibly even acting to coordinate the evolution of various species in relation to one another.
Studies have shown that highly conserved, shared intron positions are common in animal, plant and fungal genes (Fedorov et al., 2002). In one study it was found that 14% of animal introns match plant positions, and that ≈17–18% of fungal introns match animal or plant positions (Fedorov et al., 2002), even though animals and plants diverged from a common ancestor over a billion years ago (Wang et al., 1999).
Indeed, the three-way split between plants, animals and fungi has been estimated to have occurred around 1.6 bya, whereas the basal animal phyla (Porifera, Cnidaria, Ctenophora) diverged between 1.2 and 1.5 bya (Wang et al., 1999). Introns have an ancient pedigree.
Fedorov et al. (2002) examined 30 nonrelated genes with the highest numbers of common animal–plant introns and found that "60% of the fungal introns have positions common to animal and/or plant introns, and 39% of fungal introns are common simultaneously to both plant and animal introns. This exceptionally high abundance of introns with positions common to all three taxa of animals, plants, and fungi strongly supports the antiquity of these common intron positions."
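The two kinds of percentages quoted above ("common to animal and/or plant" versus "common simultaneously to both") can be illustrated with simple set arithmetic. The following is only a minimal sketch: the gene names, the intron positions, and the `shared_fraction` helper are all hypothetical, invented for illustration, and do not reproduce the data or the method of the cited study.

```python
# Hypothetical intron positions, keyed as (gene, codon index, phase).
# Every value below is invented purely for illustration.
animal = {("geneA", 10, 0), ("geneA", 55, 1), ("geneB", 7, 2), ("geneC", 31, 0)}
plant = {("geneA", 10, 0), ("geneB", 7, 2), ("geneB", 90, 1)}
fungal = {
    ("geneA", 10, 0), ("geneA", 55, 1), ("geneB", 90, 1),
    ("geneC", 2, 0), ("geneC", 44, 2),
}


def shared_fraction(query, *others, require_all=False):
    """Return the fraction of positions in `query` that also occur in `others`.

    require_all=False counts a position found in at least one other taxon;
    require_all=True counts only positions found in every other taxon.
    """
    if not query:
        return 0.0
    if require_all:
        pool = set.intersection(*others)
    else:
        pool = set.union(*others)
    return len(query & pool) / len(query)


# Fungal positions shared with animals and/or plants.
either = shared_fraction(fungal, animal, plant)
# Fungal positions shared with animals and plants simultaneously.
both = shared_fraction(fungal, animal, plant, require_all=True)

print(f"shared with animal and/or plant: {either:.0%}")
print(f"shared with both: {both:.0%}")
```

The "both" figure can never exceed the "and/or" figure, which is why the second percentage reported in such comparisons is always the smaller of the two.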
In yet another genomic study (Rogozin et al., 2003), intron positions were compared in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and a protist parasite. Approximately one-third of the introns in the protist parasite were shared with at least one crown group of eukaryotes, indicating that these introns have been conserved for over 1.2 billion years of evolution.
Between 10% and 20% of intron positions and other genomic features without obvious functions are conserved throughout the evolution of eukaryotes leading up to and including humans (Bejerano et al., 2004; Fedorov et al., 2002). However, the fact that these functions are not obvious is not an indication of a lack of importance. These unknown functions may not be expressed except in future species. "What is conserved is functionally relevant" should be considered a central tenet of biology, even if the functions are not yet obvious.
INTRONS & PUNCTUATED EVOLUTIONARY EQUILIBRIUM
Sequences within introns have changed considerably over the course of evolution, sometimes by orders of magnitude, and at a faster pace than those of the exons (Fedorov et al., 2002). Thus, these highly conserved introns are obviously active and are exerting a variety of influences on the genome and gene expression, as well as the evolution of new species. In fact, bursts of introns appear to have invaded the eukaryotic genome initially and possibly at key points in eukaryotic evolution, such as the origin of animals and prior to the divergence of extant eukaryotic lineages (Carmel et al., 2007). For example, lineages leading to animals seem to have experienced a phase of massive intron invasion early in their evolution (Carmel et al., 2007).
After billions, or hundreds of millions or tens of millions of years of stasis, armies of introns either invade or rapidly duplicate within the eukaryotic genome, and are directly associated with, or may have directly triggered bursts of branching speciation and explosions of evolutionary change in the absence of transitional forms; a phenomenon that Eldredge and Gould (1972; Gould 2002) described as "punctuated equilibrium." Indeed, there is no fossil evidence of gradual change from one species to another or any fossil record of transitional forms acting as an evolutionary bridge between species (Eldredge and Gould 1972; Gould 2002). Evolution occurs in leaps. Thus, the regulation and coordination of these great evolutionary leaps may well be yet another function of introns.
Although the position of an intron in a gene's coding sequence is well conserved, introns can make copies of themselves which can be snipped out and transposed to another region of the genome (Finnegan, 1989; Moran et al., 1999). Introns change position within the genome, acting as transposable elements. Moreover, they can act as a plasmid or transposon and invade and transpose themselves into the genomes of other members of the same species (Dujon, 1989; Dujon et al., 1989; McDonald 1993). In this manner, they can coordinate gene expression among most members of the same species, such that all make the same evolutionary leaps simultaneously.
Also, many drop out of the genome after serving their function, which in turn would affect gene selection and exon transcription. When introns drop out, their deletion may halt any further evolutionary advance, thus leading to another long period of stasis. Intron deletion would also obscure and erase evidence of any genetic footprints leading to prokaryotes, viruses, or a common ancestor.
INTRON GAINS & LOSSES
The donation or duplication and deletion of introns may have occurred throughout eukaryotic evolution, with introns coming and going (Roy and Gilbert 2006). Eukaryotes harbor multiple introns per gene (Logsdon 1998; Mourier and Jeffares 2003; Jeffares et al. 2006), requiring hundreds of thousands, if not millions of individual introns to have been donated or duplicated throughout eukaryotic evolution and even during recent evolutionary history (Cavalier-Smith, 1985; Logsdon 1998; Palmer and Logsdon, 1991). However, gains are often accompanied or followed by losses.
It is inferred that a relatively high intron density was reached early in the metamorphosis of eukaryotes (Carmel et al., 2007; Cavalier-Smith 1991; Csuros et al., 2008; Martin and Koonin 2006; Roy 2006; Sharp 1991; Stoltzfus 1999). It has been estimated that the last common ancestor of eukaryotes contained >2.15 introns/kilobase. The last common ancestor of multicellular life acquired even more, harboring ∼3.4 introns/kilobase, a greater intron density than in modern insects, most extant fungi and some animals (Carmel et al., 2007); indicating a massive intron duplicative event coupled with deletions. Among the top six intron-rich species, five are ancestral forms, indicating that some species have subsequently lost introns, whereas initially the number of introns actually increased during the evolutionary leap from uni-cellular ancestor to the first multi-cellular ancestor.
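Intron density figures of this kind are simple ratios of intron counts to sequence length. The sketch below shows the arithmetic only. It assumes that density is measured per kilobase of coding sequence; the function name, the lineage labels, and the counts are hypothetical, with the numbers chosen so that two of them reproduce the 2.15 and 3.4 introns/kilobase values quoted above. They are not data from the cited studies.

```python
def intron_density_per_kb(intron_count, coding_length_bp):
    """Return introns per kilobase of coding sequence (an assumed definition)."""
    if coding_length_bp <= 0:
        raise ValueError("coding_length_bp must be positive")
    return intron_count / (coding_length_bp / 1000)


# Hypothetical gene sets: (total introns, total coding length in base pairs).
gene_sets = {
    "lineage_x": (430, 200_000),
    "lineage_y": (680, 200_000),
    "lineage_z": (90, 150_000),
}

densities = {
    name: intron_density_per_kb(count, length_bp)
    for name, (count, length_bp) in gene_sets.items()
}

# Rank the hypothetical lineages from most to least intron-dense.
for name, density in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {density:.2f} introns/kb")
```

Normalizing by sequence length rather than by gene count allows genomes with very different gene sizes to be compared on the same scale.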
Just as prokaryotes may have lost introns upon donating them in massive amounts to ancestral eukaryotes, the higher density of introns in ancient vs more recent species also suggests that introns play a major role in evolution and then drop out in those species which will no longer evolve.
The evolution of eukaryotic genes is characterized by numerous gains and losses of introns (Carmel et al., 2007) and different species vary dramatically in their intron density, ranging from a few introns per genome to over eight per gene (Logsdon 1998; Mourier and Jeffares 2003; Jeffares et al. 2006). Introns are prevalent in complex eukaryotes but rare in the simple ones (Cavalier-Smith, 1985; Logsdon 1998; Palmer and Logsdon, 1991), indicating that the acquisition or duplication of introns is associated with species which have evolved. By contrast some introns have been eliminated from the genomes of those in a state of prolonged stasis and evolutionary equilibrium.
Therefore, intron gains and losses may be an indication of the evolutionary status of any particular species, if they are in a state of stasis or if their genome is primed to undergo additional evolutionary leaps. Thus intron gain, retention, or loss, may indicate if a species may continue to evolve.
For example, we see an elevated rate of intron loss in several lineages, such as fungi, nematodes, and insects and other arthropods (Carmel et al., 2007; Rogozin et al., 2003); species which no longer appear to be evolving, and which may have diverged from vertebrates around 1.2 bya (Wang et al., 1999). Thus, in non-vertebrates the rates of intron loss and gain have decreased in the last 1.3 billion years (Carmel et al., 2007). Further, in these lineages and in the last 100 to 300 million years, there has been a dramatic decrease in intron duplicative events, such that gains decreased faster than the decrease in losses, resulting in many lineages with very limited intron gains (Carmel et al., 2007; Rogozin et al., 2003).
Nematodes are characterized by a high number of events, with losses being more plentiful than gains (Cho et al. 2004; Coghlan and Wolfe 2004). Fungi also show more losses than gains (Nielsen et al. 2004). Recent intron losses are also seen in plant genes (Charlesworth et al., 1998).
Whereas many ancestral introns have been lost in fungi and other lower forms, they are retained in the genomes of higher vertebrates (Rogozin et al., 2003) many of which evolved in the last 40 million years. Many "higher" vertebrate species have continued to gain introns, albeit at a rather slowed pace, whereas "lower vertebrates" appear to be losing introns and to be experiencing a rapid reduction in gains (Fedorov et al. 2003; Babenko et al. 2004; Coulombe-Huntington and Majewski 2007). A survey of mammalian genes found six cases of intron losses in rodents relative to human (Roy et al., 2003). In fact, for most extant species, the total number of losses outnumbers the number of gains (Carmel et al., 2007).
The accelerated rate of loss in many species may indicate that these introns have been donated to the genomes of yet other species where they are exerting regulatory and evolutionary influences on gene selection and expression. As introns are quite mobile, they can also jump from location to location like a plasmid, coordinating the expression or suppression of a wide range of genes simultaneously and thus making it appear that introns have been lost, or gained, when they have merely moved to a new location; or, perhaps, jumped to the genome of a different species.
INTRONS & TRANSPOSONS
Introns have been implicated in the creation of new genes, new traits, new species, and thus evolutionary metamorphosis. They have played crucial roles in gene creation, coordination of transcription and translation, the expansion and possibly even the duplication of the genome, the emergence of the spliceosome, the nucleus, linear chromosomes, telomerase, the ubiquitin signaling system, and eukaryotic evolutionary innovation (Koonin 2006; Mattick 1994; Roy and Gilbert 2006).
Introns exert a significant regulatory influence over gene expression and may have played a role in the separation between transcription and translation (Roy and Gilbert 2006). For example, they appear to have provided two types of RNA genes to the eukaryotic genome--mRNA and iRNA. These highly structured eukaryotic RNAs are also linked with group II introns and might have originated from introns in the alphaproteobacterial progenitor of the mitochondria (Blumenthal, 2005; Toro et al., 2007).
Spliceosomal introns interrupt the sequences of protein-coding genes, are snipped out of transcripts by the spliceosome, and are among the defining features of eukaryotes (Doolittle 1978; Gilbert 1978; Mattick 1994; Deutsch and Long 1999). Numerous spliceosomal introns invaded genes of the emerging eukaryote during eukaryogenesis and thus must have originated in prokaryotes.
Splicing mechanisms are directly linked to bacterial group II introns (Toro et al., 2007), to archaea and bacteria (Lake et al. 1984; Lake 1988; 1998; Martin and Koonin 2006; Rivera and Lake 1992; Rivera and Lake 2004; Vishwanath et al. 2004), to mitochondria (Blumenthal, 2005; Dyall et al., 2004; O'Brien 2002), and to bacterial operons (Garrett et al., 1994).
Self-splicing introns can be traced back to the earliest stages of eukaryotic evolution, and are linked to RNA and the basic machinery of gene expression: transcription, splicing, and translation (Blumenthal, 2005).
Likewise, spliceosomal proteins are part of the core cellular machinery that is conserved across eukaryotes, and are sometimes located within operons (Blumenthal and Gleason, 2003; Blumenthal et al., 2002; Garrett et al., 1994; Hill et al., 2000). Operons are sequences of nucleotides which include several structural genes and a promoter, and which produce messenger RNA (mRNA) via transcription by an RNA polymerase (Salgado et al., 2000). Operons are believed to have originated in the prokaryote genome (Che et al., 2006; Ermolaeva et al., 2001) and regulate the expression of various genes, depending on environmental conditions (Salgado et al., 2000). This is accomplished by the binding of a repressor to the operator to prevent transcription, or by an inducer molecule which binds to the repressor, thereby allowing expression (Blumenthal et al., 2002; Salgado et al., 2000). Introns have retained the operon capacity to repress or selectively express gene sequences.
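The repressor and inducer logic described above can be written as a small truth table. This is a deliberately simplified sketch of an inducible operon under negative control; the function name and its two boolean inputs are assumptions made for illustration, and real operons involve binding affinities and concentrations rather than on/off switches.

```python
from itertools import product


def operon_transcribed(repressor_present: bool, inducer_present: bool) -> bool:
    """Return True if the structural genes are transcribed.

    Simplifying assumption: the repressor blocks the operator whenever it is
    present, unless an inducer binds it and releases it from the operator.
    """
    operator_blocked = repressor_present and not inducer_present
    return not operator_blocked


# Print the full truth table for the two inputs.
for repressor, inducer in product([False, True], repeat=2):
    state = "ON" if operon_transcribed(repressor, inducer) else "OFF"
    print(f"repressor={repressor!s:5} inducer={inducer!s:5} -> transcription {state}")
```

Only one of the four combinations, repressor present with no inducer, silences the operon in this model.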
Group II self-splicing introns also evolved in partnership with the spliceosome, both of which may have originated in organelles which transferred group II introns into the nucleus (Cavalier-Smith, 1985; Rogers, 1989). Organelles are linked to the alpha-proteobacterial symbiont whose genes combined with those of archaea to fashion the eukaryotic genome and which gave rise to mitochondria.
Self-splicing Group II introns serve as catalytic RNAs (ribozymes) and mobile retroelements, which reinsert themselves into the genome after they are snipped out (Finnegan, 1989; Moran et al., 1999; Roy and Gilbert 2006). They can change their position within the genome and can influence the expression of different sequences of genes in a step-wise temporal-sequential fashion (Dibb & Newman, 1989; John & Miklos, 1988; Kuhsel, et al. 1990).
Group II introns therefore, have the mobile characteristics of transposons and retrotransposons and also serve as transposable genetic elements (Crick, 1979; Coghlan and Wolfe, 2004; Finnegan, 1989; Hickey 1992; Moran et al., 1999). Likewise, some novel introns appear to arise by transposon insertions (Crick, 1979; Dibb & Newman, 1989; John & Miklos, 1988; Kuhsel, et al. 1990). Conversely, some retrotransposons, which have the ability to reinsert themselves, appear to have evolved from mobile group II introns.
Introns and transposable elements (TEs) are intimately linked and in some instances are indistinguishable. Most eukaryotic genomes are littered with introns and transposable elements, and many TEs are located within introns or have been inserted into exons during evolution (Nekrutenko and Li 2001; Hallick et al., 1993).
Coghlan and Wolfe (2004) have examined intron matches and found that around 70% have nucleotide sequences matching transposable elements. In many cases the new intron is homologous to a transposon and to another intron, indicating the intron acted as a transposon and made a copy of itself which was inserted into another region of the genome. In this manner introns duplicate themselves, jump to different regions of the genome, and can coordinate gene expression in a wide range of gene networks. In some cases what appears to be a new intron is in fact an intron reinsertion, transcript retroposition, intron duplication, or gene conversion. If an apparent gain is due to duplication followed by deletion of the original intron after transposition, then intron gains and losses may be one and the same. However, intron loss may also be a function of transfer to another organism.
The original introns were likely highly mobile, retrotransposable genetic elements which actively invaded the eukaryotic genome at the outset of eukaryotic evolution, relying in part on internally encoded enzyme activities for mobility.
TRANSPOSONS, INTRONS & GENE ACTIVATION VS GENE EXPRESSION
Introns also insert themselves into introns. The genomes of numerous species contain introns-within-introns (twintrons), indicating that introns are also targets of intron insertions (Copertino and Hallick 1991; Doetsch et al., 2001). Thus introns may also regulate introns.
TEs inserted into introns also affect RNA processing, and intronic TEs can render their host genes susceptible to siRNA-mediated transcriptional gene silencing (Doetsch et al., 2001). Therefore, they can turn genes on, or off.
The majority of all introns in the eukaryotic and human genome have Alu insertions (Grover et al. 2004). Alu elements are named for the AluI restriction enzyme of the bacterium Arthrobacter luteus, whose recognition site they contain. Restriction enzymes cut up foreign DNA in a process called "restriction" and are found in bacteria and archaea (Arber and Linn 1969; Krüger and Bickle 1983). Possibly these sequences were donated to the eukaryotic genome by prokaryotes, perhaps as a protection against viruses. "Restriction" may be yet another means by which introns can silence genes, including nearby genes, as well as engage in cutting and splicing.
Moreover, transposons/introns, in association with RNA, can serve as regulators of gene expression and chromosome segregation by inserting and introducing heterochromatin which prevents gene expression by wrapping the gene in a protective protein coat (Hall et al., 2002; Grewal and Moazed 2003; Grewal and Martienssen, 2002; McClintock 1950; Volpe et al., 2002). Indeed, heterochromatin is characterized by a high density of transposons (Volpe et al., 2002). TE insertion therefore, can disrupt the coding sequences of a gene and inhibit the production of viable gene products.
These mechanisms mediating gene silencing and activation have also been adopted to evolve new traits (Liu et al., 2004). TE insertion within promoters, introns, and untranslated regions can directly trigger incredible genetic variation and the full gamut of phenotypes, ranging from subtle epigenetic regulatory perturbations to the complete loss of gene function (Kidwell and Lisch, 1997; Wessler, 1988). That is, by turning genes on and off, different regions of a gene network may be activated and different products can be produced.
TEs that insert into introns are sometimes spliced out during mRNA processing. Even when spliced, however, these TE-inserted introns can affect regulatory sequences and gene regulation in numerous ways, including triggering or suppressing gene expression in certain tissues (Greene et al., 1994). Moreover, intronic transposable elements and transposons can significantly affect the expression of nearby genes (Finnegan, 1989; Dibb & Newman, 1989; John & Miklos, 1988; Kuhsel, et al. 1990; Lippman et al. 2004). Gene silencing is accomplished in a step-wise process involving RNA and the methylation of histones (Grewal and Martienssen, 2002; Hall et al., 2002).
Group II and III intronic retroelements often insert themselves into exons. Once inserted they are quickly integrated within these exonic sequences (Hallick et al., 1993) and can easily suppress these genes. Group III introns are sometimes formed from the domains of two individual group II introns (Hong and Hallick, 1994). The group III introns and group II introns also share a common evolutionary ancestor, which is linked to the alpha-proteobacterial progenitor as well as archaea.
These introns possess the genetic mechanisms which allow them to be efficiently spliced out of transcripts, and to reinsert themselves in another part of the genome. They are able to demarcate coding sequences and to regulate gene expression in different regions of the genome, perhaps simultaneously as well as sequentially. Thus they can guide the activity of a number of gene networks to coordinate gene expression.
Therefore, introns, which may have originated in prokaryotes, can duplicate and give birth to themselves, and possess the genetic machinery which enables them to propagate throughout the genome and to regulate gene expression via silencing and restriction. As is also demonstrated by their highly conserved nature, these are not chance, or random events.
INTRONS & RNA
Some introns may also propagate at the RNA level including within messenger RNA. Messenger RNA (mRNA) is transcribed from a DNA template and contains the codes for creating specific protein products which it transports to ribosomes for protein synthesis. These introns indicate which portions of the code are to be translated and transcribed and are then snipped out and are reinserted (spliced) into another region of the genome which is without an intron.
Presumably, the new intron-containing RNA is reverse-transcribed and undergoes gene conversion leading to a new intron. Therefore, via reverse-splicing an excised intron sometimes reintegrates back into a different site in the same mRNA (Coghlan & Wolfe 2004; Tarrío et al., 1998) thereby exerting multiple coordinated influences on gene expression and protein synthesis.
Introns may have been the original information source for the creation of genes which code for mRNA. Likewise, genes involved in mRNA processing and splicing, and germline-expressed genes, preferentially gain introns (Roy 2004). By contrast, introns/TEs are generally excluded from mRNAs of highly conserved genes (van de Lagemaat et al., 2003).
A gene ontology analysis has demonstrated that novel introns are unusually frequent in genes with mRNA processing functions, relative to germ-line-expressed genes. This suggests that it is the function of these genes, rather than their mode of transcription, that makes them amenable to gaining introns (Coghlan & Wolfe 2004). Thus introns regulate gene expression or suppression and control their own transposition to different regions of the genome. These properties enabled introns to coordinate the expression or suppression of a wide network of genes.
For example, RNA not only serves as a messenger but can interfere with, inhibit, and silence gene expression (Hall et al., 2002). This is accomplished, in association with transposons/introns, via heterochromatin formation whose repressive capacity is mediated by components of the RNA interference (RNAi) machinery. This RNAi machinery acts to nucleate heterochromatin assembly and can initiate and propagate regional heterochromatic inhibition and gene silencing (Hall et al., 2002; Volpe et al., 2002). RNAi in association with introns/transposons can even control chromosome segregation and the expression of large chromosome domains (Grewal and Moazed 2003).
Thus, introns and transposons can exert regulatory control of individual genes, chromosomes, and thus the entire genome.
TE-induced genetic alterations and changes in regulatory sequences are of extreme evolutionary significance to their hosts and to the metamorphosis and evolution of future species (Britten 1996). TEs, especially when inserted into introns, can alter the size and arrangement of whole genomes, induce changes in single nucleotides, and generate new genetic variation on a scale, and with a degree of sophistication, ranging from subtle to dramatic alterations in the development and organization of tissues and organs (de Jong & Scharloo, 1976; Dykhuizen & Hart, 1980; Finnegan, 1989; Dibb & Newman, 1989; Gibson & Hogness, 1996; John & Miklos, 1988; Kuhsel, et al. 1990; Moran et al., 1999; Polaczyk et al., 1998; Rutherford & Lindquist, 1998; Strachan & Read, 1996; Wade et al., 1997). Such changes appear most likely if these insertions occur in coding regions and often confer useful traits on the host, as well as guide, coordinate, and regulate evolution and metamorphosis.
INTRONS INFECT OTHER SPECIES
Between 35% and 50% of the human genome is ultimately derived from transposable elements (International Human Genome Sequencing Consortium 2001; Lander et al., 2001; Smit 1996; Yoder et al., 1997), and there are many examples of human genes derived from single transposon insertions (Nekrutenko and Li, 2001; Sakai et al. 2007). Moreover, large numbers are found in human protein coding genes (Nekrutenko and Li, 2001).
In a study of genome-wide impact of transposable elements on evolution, Nekrutenko and Li (2001) found that almost 89% of these TEs reside within 'introns' and were recruited into coding regions as novel exons, such that it appears that TE insertion might create new genes (Nekrutenko and Li, 2001) and recruit new exons (Sakai et al. 2007), which would in turn, affect and accelerate species divergence. Numerous studies have in fact found that TEs in the mammalian genome promote the variation and diversification of genes, and affect the expression of many genes through the donation of transcriptional regulatory signals (Thornburg et al., 2006; van de Lagemaat et al., 2003; Jordan et al., 2003).
TEs, therefore, contribute to pre-transcriptional gene regulation, especially by moving transcriptional signals within the genome, which in turn leads to new gene expression patterns (Thornburg et al., 2006) and the creation of new genes from old genes (Nekrutenko and Li, 2001; Sakai et al. 2007). Further, TEs are involved in gene duplication and the creation of large numbers of interspersed repetitive sequences (Smit 1996). By contrast, mRNAs of highly conserved genes are generally devoid of TEs (van de Lagemaat et al., 2003).
TEs are more frequent in duplicate than single copy protein coding genes (Sakai et al. 2007) indicating they are involved in gene duplication and diversity (van de Lagemaat et al., 2003) and not gene conservation. Thus TEs serve as recombination hot spots and may express or create specific cellular functions, through the control of protein translation and gene transcription (Thornburg et al., 2006). In fact because many TEs are taxon-specific, their integration into coding regions could accelerate species divergence and contribute to sudden bursts of evolutionary development (Jordan et al., 2003; Morgan 1993; Nekrutenko and Li, 2001; Sakai et al. 2007; van de Lagemaat et al., 2003).
Moreover, gene classes which react to external environmental stimuli, have transcripts enriched with TEs (van de Lagemaat et al., 2003). In addition, TEs are intimately involved in the simultaneous regulation of multiple genes (Jordan et al., 2003). Thus TEs can trigger gene expression in numerous genes simultaneously in response to changing environmental conditions; and this may include whole genome duplication and/or explosive evolutionary leaps after long periods of evolutionary equilibrium.
The life cycle of TEs in any single phylogenetic lineage can apparently last for many thousands or millions of years and can be considered as a succession of six phases: dynamic replication, movement to another region of the genome, transfer to another species, activation, inactivation, degradation (Kidwell, 1993; Miller et al., 1996).
TEs are intrinsically parasitic (Doolittle and Sapienza, 1980; Dujon, 1989; Orgel and Crick, 1980; Hickey 1982; Kiyasu and Kidwell 1984; McDonald 1993; Yoder et al., 1997), and can easily duplicate themselves (Plasterk and Sherratt, 1995) and invade new species (Dujon, 1989; Dujon et al., 1989; McDonald 1993). A proclivity for horizontal transfer is consistent with the role of TEs as genomic parasites. TEs, therefore, also act as plasmids.
Horizontal transfer to another host lineage provides the opportunity for active TEs to begin the cycle over again in yet another species (Dujon, 1989; Dujon et al., 1989; Hurst et al., 1992; Kidwell, 1993; 1994; McDonald 1993) or to ensure that all members of the same species undergo the same genetic and evolutionary changes at the same time (McDonald 1993).
Moreover, this enables these intronic TEs to coordinate gene expression among multiple members of the same or divergent species, such that different species may evolve in tandem or develop complementary traits at the same time.
These TEs can survive over long periods of evolutionary time by spreading throughout numerous genomes belonging to numerous divergent and subsequent species. However, once transferred, transposed, and inserted, these TEs may serve only to inhibit gene expression (Waterland and Jirtle, 2003; Yoder et al., 1997). It may take hundreds of millions or even billions of years, before these genes become active and begin expressing new functions, new characteristics, and even new species; and this may require major changes in the environment and the elimination of suppressive influences.
GENE ACTIVATION & SUPPRESSION
Gene expression can be restricted and inhibited by a variety of mechanisms and proteins, such as by "restriction" via restriction enzymes (Arber and Linn 1969; Krüger and Bickle 1983), or the binding of a repressor molecule or protein to the operator to prevent transcription (Blumenthal et al., 2002; Salgado et al., 2000), or via methylation and/or the generation of heterochromatin (Waterland, 2006; Waterland and Jirtle, 2003; Yoder et al., 1997).
Further, TEs inserted into introns can inhibit mRNA processing, and can render numerous genes susceptible to siRNA-mediated transcriptional gene silencing (Doetsch et al., 2001). Heterochromatin formation and its repressive capacity are also mediated by RNA interference (RNAi) machinery (Grewal and Moazed 2003; Hall et al., 2002; Volpe et al., 2002). Therefore, they can turn genes on, or off.
Transposons, which use the gene replication machinery to reproduce themselves, also utilize methylation to prevent their own replication and to prevent the expression of nearby genes (Yoder et al., 1997; Rakyan et al., 2002). Most transposable elements in the mammalian genome, along with the genes positioned near them, are silenced by methylation (Yoder et al., 1997; Rakyan et al., 2002). DNA methylation involves the methyl group, a cluster of four atoms (one carbon and three hydrogens), which attaches to and coats the gene, thus silencing the gene by preventing its expression. Methylation is commonly employed to inactivate a variety of genes (Wolff et al., 1998; Yoder et al., 1997; Van den Veyver 2002). However, by inactivating a TE, methylation may instead induce gene expression.
Transposable elements, therefore, in conjunction with methylation, "restriction," siRNA-mediated transcriptional gene silencing, and the generation of heterochromatin, commonly silence or activate various genes, and can cause considerable phenotypic variability, making each individual mammal a "compound epigenetic mosaic" (Whitelaw and Martin, 2001).
ENVIRONMENT & GENE EXPRESSION: METHYLATION
Not just transposons and introns, but the environment also activates or silences genes and can affect methylation. In fact, those genes which are most responsive to external environmental stimuli have transcripts enriched with TEs (van de Lagemaat et al., 2003). However, certain environmental triggers can induce or remove methylation, thus silencing or enabling the expression of these genes (Waterland and Jirtle, 2003; Wolff et al., 1998).
[Figure: Red and green boxes represent silenced and active transposons.]
Those environmental influences can include diet and nutrition (Van den Veyver 2002; Waterland and Jirtle, 2003; Wolff et al., 1998). Diet plays a significant role in evolutionary metamorphosis and gene expression via inhibitory mechanisms such as methylation.
For example, it has been demonstrated that nutritional supplementation to the mother can permanently alter gene expression in her offspring by activating or silencing Agouti genes via methylation (Waterland and Jirtle, 2003; Wolff et al., 1998). In one set of experiments, pregnant mice that received dietary supplements of vitamin B12, folic acid, choline and betaine gave birth to offspring with brown coats, whereas the control group gave birth predominantly to mice with yellow coats (Waterland and Jirtle, 2003). These four nutrients supplied methyl groups, which reduced the expression of a specific gene, Agouti, via DNA methylation. Thus, diet altered the color of the coats by acting on gene selection. This effect is referred to as "epigenetic" because it occurs over and above the gene sequence, without altering the four-unit genetic code.
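The logic of the Agouti experiment can be illustrated with a toy model. The Python sketch below is only an illustration: the `Locus` class, the linear link between methylation and expression, the 0.5 coat-color cutoff, the placeholder sequence, and every numeric value are assumptions made for this example and are not taken from Waterland and Jirtle (2003).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """A gene with a fixed DNA sequence and a variable methylation level."""
    name: str
    sequence: str        # placeholder sequence, not the real Agouti gene
    methylation: float   # assumed fraction of methylated sites, 0.0 to 1.0

    def expression(self) -> float:
        # Assumed toy rule: expression falls linearly as methylation rises.
        return 1.0 - self.methylation


def supplement(locus: Locus, methyl_donor_dose: float) -> Locus:
    """Raise methylation by the dose (capped at 1.0); the sequence is untouched."""
    new_level = min(1.0, locus.methylation + methyl_donor_dose)
    return Locus(locus.name, locus.sequence, new_level)


def coat_color(locus: Locus, cutoff: float = 0.5) -> str:
    """Hypothetical cutoff: high Agouti expression gives yellow, low gives brown."""
    return "yellow" if locus.expression() > cutoff else "brown"


# Illustrative values only.
control = Locus("Agouti", sequence="ATGC", methylation=0.2)
treated = supplement(control, methyl_donor_dose=0.6)

print(coat_color(control), coat_color(treated))
print(control.sequence == treated.sequence)
```

In this sketch the supplemented locus changes coat color while its sequence stays identical to the control, which is the sense in which the effect is called epigenetic.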
Likewise, genes passed down from ancestral species can be expressed by varying the environment and through other stresses, including fluctuations in temperature, oxygen levels, and diet (e.g., de Jong & Scharloo, 1976; Dykhuizen & Hart, 1980; Gibson & Hogness, 1996; Polaczyk et al., 1998; Rutherford & Lindquist, 1998; Wade et al., 1997). Change the environment, and gene expression patterns may also be altered, giving rise to slight or major differences in the products produced. For example, increases in the levels of oxygen, calcium, and other elements and gases significantly impacted gene selection around 540 mya, triggering what became the Cambrian Explosion.
GENE EXPRESSION, HSP90 & MOLECULAR SWITCHES
These genetic-environmental interactions on gene expression are mediated through protein products like Hsp90 (Rutherford & Lindquist, 1998). Hsp90 is a highly conserved multifunctional protein that targets multiple signal transducers, which act as "molecular switches" controlling gene expression in eukaryotes ranging from yeast to humans (Feder and Hofmann 1999; Rutherford 2003; Sangster et al., 2004). Hsp90 "normally suppresses the expression of genetic variation affecting many developmental pathways" (Rutherford & Lindquist, 1998).
Hsp90 does not act alone but is part of a network that includes other proteins such as Hsp70 and p23 (Pratt and Toft 2003). As summarized by Cossins (1998, p. 309), these and related regulatory and signaling proteins are sometimes referred to as "chaperones and have been discovered in all organisms studied so far. These signaling proteins form complex webs of molecular switches that allows signals both within and between cells to be transduced into responses." However, the coordination of these responses can be influenced by the environment.
"Hsp90 is one of the more abundant chaperones. At normal temperatures it binds to a specific set of proteins, most of which regulate cellular proliferation and cell development" (Cossins, 1998). At significantly lower or higher temperatures Hsp90 ceases to bind to these proteins, thus allowing for gene expression (Rutherford and Lindquist 1998). Thus chaperones can act for or against genetic variation and can trigger or prevent the expression of silent characteristics (Cossins, 1998; Rutherford and Lindquist 1998).
For example, these proteins may prevent DNA expression by acting as a buffer between silent genes and their nucleotides and the environment. These genes therefore remain inhibited and are only expressed in reaction to changes in the environment, including temperature change.
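The buffering idea described above can be sketched as a simple temperature-gated switch. The Python example below is a toy illustration only: the function names, the 20 to 30 degree "normal" range, and the variant labels are hypothetical and do not come from Cossins (1998) or Rutherford and Lindquist (1998).

```python
def hsp90_is_buffering(temp_c: float, normal_range=(20.0, 30.0)) -> bool:
    """Assume Hsp90 binds its client proteins only within a normal range.

    The range is an arbitrary illustrative choice, not a measured value.
    """
    low, high = normal_range
    return low <= temp_c <= high


def expressed_variants(silent_variants, temp_c: float):
    """Return the variants that are expressed at a given temperature.

    While the buffer holds, silent variants stay hidden; once the
    temperature leaves the normal range, all of them are released.
    """
    if hsp90_is_buffering(temp_c):
        return []
    return list(silent_variants)


# Hypothetical labels for hidden variation.
hidden = ["variant_a", "variant_b"]

for temp in (5.0, 25.0, 40.0):
    print(temp, expressed_variants(hidden, temp))
```

The all-or-nothing release is a deliberate simplification; it is meant only to show that both unusually low and unusually high temperatures lead to the same outcome in this account, namely the expression of previously buffered variation.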
HSP90, GENE EXPRESSION, NUCLEAR RECEPTORS & SNOW BALL EARTH
In response to significantly lowered or increased temperatures, Hsp90 levels are reduced and Hsp90 no longer acts as an effective buffer against the expression of signal-transduction proteins, which leads to the expression of genes that had been inhibited (Rutherford and Lindquist 1998). This allows for the expression of hidden genetic variation, leading to new developmental and evolutionary patterns. As demonstrated by Rutherford and Lindquist (1998, p. 341), Hsp90 acts as an "explicit molecular mechanism that assists the process of evolutionary change in response to the environment" and it accomplishes this through the "conditional release of stores of hidden morphological variation.... perhaps allowing for the rapid morphological radiations that are found in the fossil record."
This has important implications for evolution, as Earth has repeatedly undergone global ice ages followed or preceded by periods of high temperatures secondary to greenhouse warming. As lowered or raised temperatures can eliminate the suppressive influences of chaperones such as Hsp90, dramatic climate change, such as global glaciation or global warming, could affect a wide variety of signal-transduction proteins that are stabilized by Hsp90. This would induce gene expression and the expression of precoded traits, and thus the next stage of evolutionary metamorphosis.
The Hsp90 complex also regulates nuclear receptors (Arbeitman and Hogness 2000; Feder and Hofmann 1999; Mayer and Bukau 1999; Picard 2002; Rutherford 2003; Pratt and Toft 2003). These include receptors for retinoic acid, thyroid hormone, signal-transduction proteins, ligand-dependent transcription factors, tyrosine/serine/threonine kinases, and steroids.
Most nuclear receptors appear to be restricted to metazoans (Laudet 1997; Escriva et al. 2000; Thornton 2001; Baker 2005). However, the metamorphosis of the first metazoans did not take place until during or after the 3rd worldwide glaciation.
As will be detailed, Earth has undergone at least three major worldwide glaciations (Hoffman et al. 1998; Hyde et al., 2000; Runnegar 2000; Lubick 2002). Each was followed by periods of global warming and the diversification and evolution of new species. However, the last glaciation, which began around 635 mya, is also associated with the evolution of the first primitive metazoan, i.e. a "living fossil" known as Trichoplax, around 630 mya (Srivastava et al., 2008). Trichoplax, however, was not a true bilateral animal and lacked muscle, heart, eyes or brain. Thus, although its genome likely possessed all the genes that code for these structures, including nuclear receptors (Srivastava et al., 2008), the preponderance of evidence suggests they had not been expressed.
By the end of the 3rd glaciation, around 580 mya, what may be the first bilaterally symmetrical metazoan had evolved: an echinoderm, Arkarua adami (Gehling 1987). In fact, a wide range of increasingly complex species appeared following the 3rd glaciation and ensuing warming cycle, leading to an explosive burst of evolutionary change and diversification (beginning 540 mya), including the appearance of complex animals and chordates equipped with bilateral bodies, eyes, and brains (Chen et al., 1995, 1999, 2003; Shu et al., 2001; Siveter et al., 2001). It was during this same time period, known as the Cambrian Explosion, that the genome duplicated in size (Holland 1994, 1999; Dehal and Boore 2005), an event associated with the evolution of every phylum in existence today.
It can be assumed that the metamorphosis of the first true metazoans and chordates was paralleled by the functional expression of those nuclear receptors which are regulated by the Hsp90 protein complex and which are associated with metazoans. Thus, the explosion of complex life at the onset of the Cambrian could be related to the effects of worldwide glacial freezing, followed by global warming, on the Hsp90 protein complex. This may have led to the activation of genes that had been suppressed, and even to the duplication of individual genes and the entire genome, thus enabling their expression.
In fact, the genome underwent duplication at this time (Holland 1994, 1999; Dehal and Boore 2005), and nuclear receptors appear to have evolved by a series of gene duplications, followed by functional expression of the duplicated gene (Laudet 1997; Baker 1997, 2003; Thornton 2001). Therefore, it appears that the genes coding for sex steroid, adrenal and other nuclear receptors, which have an important role in development and sexual differentiation, underwent duplications in chordates, possibly during the Cambrian Explosion (Baker 1997, 2003; Laudet 1997; Escriva et al. 2000; Thornton 2001), and were expressed once freed of inhibitory restraints.
Therefore, Hsp90, which can prevent the expression of a variety of genes or enable these genes to express functions which had been suppressed, may have been impacted by extreme changes in global temperatures. These global temperature changes, which may have been induced by biological activity, in turn affected a wide variety of signal-transduction proteins that are stabilized by Hsp90, thereby allowing for their expression and thus the metamorphosis of complex species, including those which appeared during the Cambrian Explosion.
Genes often interact in networks. Change the environment and gene expression patterns may be altered, giving rise to slight or major differences in the products produced and allowing for the expression of pre-determined traits (Rutherford & Lindquist, 1998). As demonstrated by experiments performed by Rutherford and Lindquist (1998), when these suppressive protein-buffering actions are altered by environmental change, including temperature fluctuations, "variants are expressed and selection can lead to the continued expression of these traits, even when" the actions of these repressor proteins are restored.
However, it was also the actions of genes, that is, biological organisms, which were largely responsible for these dramatic changes in the climate and global temperatures. Genes affect the environment and the environment acts on gene selection, creating an interactive feedback loop which significantly impacts the speed and rate of evolutionary metamorphosis. Switching these repressor proteins and other regulatory genetic mechanisms off or on requires contact with, and exposure to, specific environmental agents.
Presumably, these environmental influences directly impacted those genetic mechanisms involved in gene silencing, gene duplication, and gene expression, thereby giving rise to traits, functions, organs, and species which had been precoded into silent genes inherited from ancestral species, and which were donated to the eukaryotic genome by prokaryotes--the ancestors of which arrived on Earth from other planets.
Part 3. Genes, Microbes & Metazoan Metamorphosis: