Intelligent Design, the best explanation of Origins

This is my personal virtual library, where I collect information which, in my view, leads to Intelligent Design as the best explanation of the origin of the physical Universe, life, and biodiversity.

LUCA—The Last Universal Common Ancestor

1 LUCA—The Last Universal Common Ancestor on Sun Aug 30, 2015 6:06 pm


LUCA—The Last Universal Common Ancestor 1

The last universal common ancestor represents the primordial cellular organism from which diversified life was derived

Minimal gene content of the first biological cell: 561 functional annotation descriptions. In my view this means it cannot be reduced further and is therefore irreducibly complex.

A minimal estimate for the gene content of the last universal common ancestor
19 December 2005
A truly minimal estimate of the gene content of the last universal common ancestor, obtained by three different tree construction methods and the inclusion or not of eukaryotes (in total, there are 669 ortholog families distributed in 561 functional annotation descriptions, including 52 which remain uncharacterized)

A fairly complex genome similar to those of free-living prokaryotes, with a variety of functional capabilities including metabolic transformation, information processing, membrane/transport proteins and complex regulation, shared between the three domains of life, emerges as the most likely progenitor of life on Earth, with profound repercussions for planetary exploration and exobiology. The estimate of LUCA's gene content appears to be substantially higher than that proposed previously, with a typical number of over 1000 gene families, of which more than 90% are also functionally characterized.

The Last Universal Common Ancestor: emergence, constitution and genetic legacy of an elusive forerunner
2008 Jul 9
LUCA does not appear to have been a simple, primitive, hyperthermophilic prokaryote but rather a complex community of protoeukaryotes with a RNA genome, adapted to a broad range of moderate temperatures, genetically redundant, morphologically and metabolically diverse.

The proteomic complexity and rise of the primordial ancestor of diversified life
2011 May 25
Life was born complex and the LUCA displayed that heritage. Recent comparative genomic studies support the latter model and propose that the urancestor was similar to modern organisms in terms of gene content

Last Universal Common Ancestor had a complex cellular structure
OCT 5, 2011
New evidence suggests that LUCA was a sophisticated organism after all, with a complex structure recognizable as a cell, researchers report. Their study appears in the journal Biology Direct. The study lends support to a hypothesis that LUCA may have been more complex even than the simplest organisms alive today, said James Whitfield, a professor of entomology at Illinois and a co-author on the study.

Cenancestor, the Last Universal Common Ancestor
02 September 2012
Theoretical estimates of the gene content of the Last Common Ancestor’s genome suggest that it was not a progenote or a protocell, but an entity similar to extant prokaryotes.

Some Assembly Required: The Ingredients of Life
July 1, 2017
From analyses of bacterial microfossils, (some of which may be up to 3.48 billion years old) we know that the most primitive life was nearly as complex as today’s bacteria. Unfortunately, the (micro)fossil record can’t really tell us how we got from the simple chemicals to living, working, bacterial cells.

Koonin, the logic of chance, page 213:
Comparative-genomic reconstruction of the gene repertoire of LUCA
Why do we believe that there was a LUCA? More than one argument supports the LUCA conjecture, but the strongest one seems to be the universal evolutionary conservation of the gene expression system. Indeed, all known cellular life forms use essentially the same genetic code (the same mapping of 64 codons to the set of 20 universal amino acids and the stop signal), with only a few minor deviations in highly degraded genomes of bacterial parasites and organelles. The universal conservation of the code and the expression machinery, and the most coherent evolutionary history of its components leave no reasonable doubt that this system is the heritage of some kind of LUCA.
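As a rough illustration of the mapping described in this passage, the standard genetic code can be written as a lookup table from 64 codons to 20 amino acids plus a stop signal. The sketch below is not taken from the book. It uses the widely published standard code table (one-letter amino acid symbols, with `*` for stop), and the names `BASES`, `AMINO_ACIDS` and `CODON_TABLE` are chosen here purely for illustration.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
# '*' marks a stop codon. All names here are illustrative assumptions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Pair each of the 4**3 = 64 codons with its amino acid (or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codon_count = len(CODON_TABLE)
amino_acid_count = len(set(CODON_TABLE.values()) - {"*"})
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print(codon_count, amino_acid_count, stop_codons)
```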

The inference makes sense based on methodological naturalism. Once design is considered, one can infer common design of all life forms of the aforementioned machinery.

Arguments for a LUCA that would be indistinguishable from a modern prokaryotic cell have been presented, along with scenarios depicting LUCA as a much more primitive entity (Glansdorff, et al., 2008).
The difficulty of the problem cannot be overestimated. Indeed, all known cells are complex and elaborately organized. The simplest known cellular life forms, the bacterial (and the only known archaeal) parasites and symbionts, clearly evolved by degradation of more complex organisms; however, even these possess several hundred genes that encode the components of a fully fledged membrane; the replication, transcription, and translation machineries; a complex cell-division apparatus; and at least some central metabolic pathways. As we have already discussed, the simplest free-living cells are considerably more complex than this, with at least 1,300 genes. 

All the difficulties and uncertainties of evolutionary reconstructions notwithstanding, parsimony analysis combined with less formal efforts on the reconstruction of the deep past of particular functional systems leaves no serious doubts that LUCA already possessed at least several hundred genes. In addition to the aforementioned “golden 100” genes involved in expression, this diverse gene complement consists of numerous metabolic enzymes, including pathways of the central energy metabolism and the biosynthesis of amino acids, nucleotides, and some coenzymes, as well as some crucial membrane proteins, such as the subunits of the signal recognition particle (SRP) and the H+-ATPase.

However, the reconstructed gene repertoire of LUCA also has gaping holes. The two most shocking ones are

(i) the absence of the key components of the DNA replication machinery, namely the polymerases that are responsible for the initiation (primases) and elongation of DNA replication and for gap-filling after primer removal, and the principal DNA helicases, and
(ii) the absence of most enzymes of lipid biosynthesis. These essential proteins fail to make it into the reconstructed gene repertoire of LUCA because the respective processes in bacteria, on one hand, and archaea, on the other hand, are catalyzed by different, unrelated enzymes and, in the case of membrane phospholipids, yield chemically distinct membranes.

Thus, the reconstructed gene set of LUCA seems to be remarkably nonuniform, in that some functional systems appear to have reached complexity that is almost indistinguishable from that in modern organisms, whereas others come across as rudimentary or missing. This strange picture resembles the general concept of asynchronous “crystallization” of different cellular systems at the early stages of evolution that Carl Woese proposed and prompts one to step back and take a more general view of the LUCA problem.

These problems are resolved once one takes another approach and accepts that a creator made each kind of organism distinctly and separately.

More specifically, with regard to membrane biogenesis, it has been proposed that LUCA had a mixed, heterochiral membrane, so that the two versions with opposite chiralities emerged as a result of subsequent specialization in archaea and bacteria, respectively. With regard to DNA replication, a hypothesis has been developed under which one of the modern replication systems is ancestral, whereas the other system evolved in viruses and subsequently displaced the original system in either the archaeal or the bacterial lineage. The other major area of nonhomology between archaea and bacteria, lipid biosynthesis (along with lipid chemistry), prompted the even more radical hypothesis of a noncellular although compartmentalized LUCA. Specifically, it has been proposed that LUCA(S) might have been a diverse population of expressed genetic elements that dwelled in networks of inorganic compartments.

This is a good example of the far-fetched, invented scenarios that are needed to maintain the standard paradigm of philosophical naturalism.

The possibility that LUCA was dramatically different from any known cells has been brought up, originally in the concept of “progenote,” a hypothetical, primitive entity in which the link between the genotype and the phenotype was not yet firmly established. In its original form, the progenote idea involves primitive, imprecise translation, a notion that is not viable, given the extensive pre-LUCA diversification of proteins that the analysis of diverse protein superfamilies has demonstrated beyond doubt.

Even the most conservative models of the composition of LUCA paint it as a quite complex system, a true organism. A system like this would be very difficult to imagine arising directly from purely prebiotic chemical reactions.
ASTROBIOLOGY An Evolutionary Approach page 131


5) Koonin, the logic of chance, page 213
6) Origins of Life: The Primal Self-Organization, page 174

Last edited by Admin on Sun May 06, 2018 2:54 pm; edited 39 times in total

2 Re: LUCA—The Last Universal Common Ancestor on Sun Aug 30, 2015 6:22 pm


The Last Universal Common Ancestor 1

[Dr. Michael Syvanen is a professor studying molecular genetics in the Department of Medical Microbiology at the University of California, Davis, and has been an advocate since the early-80s of an idea that has gained considerable support over the last few years - that much evolution is not tree-shaped, but net-shaped. That is, that genes cross taxonomic lineages. Since many attacks on evolution claim we should “teach the controversy”, we at Panda’s Thumb thought it might be nice to present an *actual* controversy in science. Discussion is welcomed. Here, at least.]

It has been over 30 years since the suggestion that horizontal gene transfer (HGT) may have been a factor in the evolution of life entered the literature. Initially these speculations were based on discoveries made in medical microbiology; namely that genes for resistance to antibiotics were found to move from one bacterial pathogen to another. This discovery was so unexpected and contrary to accepted genetic principles that though announced in Japan in 1959 (1,2) it was not generally recognized in the west for another decade. Speculations that HGT may have been a bigger factor in the evolution of life were inviting because they offered broad explanations for a variety of biological phenomena that have interested and puzzled biologists for over the last century and a half. These were problems that had been raised by botanists that have puzzled over the evolution of green plants (3) as well as by paleontologists that recorded macroevolutionary trends (4) in the fossil record that were often difficult to reconcile with the New Synthesis that merged Darwin’s thinking with Mendelian genetics. However, outside of the field of bacteriology this exercise did not really attract that much attention until the late 1990s at which time there was a major influx of data indicating that HGT had been very pervasive in early life. Namely, complete genome sequences began to appear. Simple examination of these sequences showed beyond any doubt that horizontal gene transfer was indeed a major factor in the evolution of modern bacterial, Archaeal and Eukaryotic genomes.

An example of how profoundly the notion of HGT has changed our thinking concerns the last universal common ancestor (LUCA). This is an idea that was central to the hypothesis that life shared common ancestors. Though the idea of common ancestry remains valid (indeed evidence for common ancestry is everywhere in the sequence of our genes) there is no longer a need to postulate that all life evolved from a single last universal common ancestor. Rather, we can entertain common descent from multiple ancestors.

Figure 1 Universal tree of life and two alternatives.
Bacteria contain many deeply rooted branches; here we include two groups, shown as the gram (-), more accurately known as proteobacteria, and the gram (+), or low GC gram (+) bacteria. A shows the so-called universal tree that is supported by the rRNA sequences. B shows the relationships found between a very large number of genes involved in metabolism and biosynthesis. C simply shows the remaining four-taxa relationship, which very few genes seem to follow.
The notion that all life passed through a single interbreeding bottleneck is still probably believed to be true by most people who think about this problem. The reason is simple. There are many genes involved in information processing (i.e. DNA replication, RNA transcription and protein synthesis) whose molecular homologs are found in all three major domains of life. Furthermore, when the sequences of these genes are submitted to phylogenetic analysis they more or less support the following relationship – the Archaea and Eukaryotes define a branch to the exclusion of a bacterial branch and a single line links both. Figure 1a shows this relationship. The figure shows an unrooted tree with four of life’s major groups. These are the Archaea, the Eukaryotes and two of the major groups of bacteria. The Archaeal/Eukaryote branch, by definition, implies the existence of a common ancestor for these two groups, and further we can infer a point on the line leading to the bacterial branch that represents the last common ancestor of all life. Thus we can say there is empirical support for the existence of the last common ancestor. I mentioned above that this scenario is more or less supported by the informational genes. The striking finding is that other genes common to the three major kingdoms frequently show exceptions to these relationships.

When it comes to the genes for energy metabolism Eukaryotes and gram-negative bacteria are usually more closely related to one another than they are to the Archaea and other bacteria (as in Fig 1B). These genes are thought to have become associated with the Eukaryotic cell through the endosymbiote that eventually gave rise to the mitochondrion (5,6,7). In green plants we can also trace the ancestry of many genes involved in carbon fixation, photosynthesis as well as other metabolic processes to cyanobacteria, the endosymbiote host that gave rise to the chloroplast. For many of the biosynthetic pathways the relevant genes yield even more complex relationships. Thus we have arrived at the current situation that is accepted by most – there remain a few genes (almost all associated with the most basic genetic informational processing) that reflect an evolutionary history that goes back to some very primitive LUCA, but that superimposed over the remnants of that primitive ancestor in modern genomes are numerous examples of subsequent horizontal gene transfer events.

The above is a good model and it requires good reasons to reject it. To begin, not all of the informational molecular homologues support the simple phylogenetic pattern outlined above. Even here there are some exceptions. These exceptions have been dealt with in one of two ways – first in some cases it can be argued that there is insufficient amount of sequence to rigorously support the true branch relationships (i.e. sequence noise or homoplasy is hiding the true pattern) or alternatively, these are informational genes that also have been involved in HGT events. Though some of the cases are still open to debate there are a number of cases where it is simplest to conclude that some of the informational genes have been involved in HGT events; this is especially true for some of the amino acid-tRNA charging enzymes (8); these enzymes are intimately involved in translating the genetic code and hence are central to information processing. Once we reach this point then it is no longer possible to argue that biochemically complex processes such as protein synthesis are too complicated to have their genes being involved in HGT events; a position that was held at least up until 1998. In fact Woese (9) suggested there existed in the very primitive cells a less functionally constrained protein synthesis machinery that permitted some HGT events of these components thereby accounting for the few exceptions. In this formulation a LUCA at least implicitly remains in the model. But evidence for the LUCA is greatly reduced, at least with respect to the number of genes found in modern genomes that can be directly traced back to the LUCA via exclusive vertical evolution. In 1982 it was automatic to assume that because a biochemical process was found in all of modern life, then that process must represent evidence for the one interbreeding population of the LUCA. Now we know that many of the universal biochemical processes have moved horizontally multiple times.
Thus today we have a greatly truncated LUCA from what we believed just a decade ago.

When speculating on the nature of the LUCA it is generally accepted that it must have contained the modern universal genetic code since that is a feature shared by all life. However, even if we accept the existence of this LUCA there are a variety of reasons to believe that the LUCA itself was the product of an evolutionary process that employed horizontal transfer events; this is so especially with respect to the evolution of the genetic code. It is very difficult to see how the modern genetic code could have evolved in a sequential fashion; rather the code must have evolved on separate occasions and become fused into single lineages. This problem is illustrated by considering the case of lysine-tRNA charging enzyme genes found in modern life. All life has two different completely nonhomologous enzymes. If the modern genetic code evolved in a sequential fashion, then we would have to imagine a situation where a lineage that carried one of the two enzymes evolved the second. This raises the question: what selective pressure could possibly account for the emergence of this second enzyme when it already has one? It is much simpler to believe that the lysine enzyme evolved independently in two different lineages, which then fused to give rise to the ancestor of modern life. This is not a radical idea. Of course, if HGT is common to life after the time of LUCA then it seems not unreasonable to assume that it was common to life before the LUCA. At this point we come to the following model for evolution of life if we try to preserve the LUCA. We have multiple lineages of pre-LUCA life that are linked together by HGT events into a netted or reticulate evolutionary pattern. This leads to the LUCA. The LUCA diversifies into its many modern lineages and then these lineages are again reticulated. We then end up with a topological model that looks like an hourglass. Namely a net above that bottlenecks to the LUCA which then diversifies and yields a net below.
At this point the principle of parsimony should kick in. Why encumber our model with this bottleneck? It is not only no longer necessary but is now an exceptional assumption.

There is another reason that we should jettison the LUCA. This has to do with the finding that many of the universal genes including a number that make up the genetic code, are younger than are the major kingdoms of life. That is we can be reasonably sure that life forms resembling Archaea, bacteria and some kind of primitive Eukaryote existed before 1.5 and likely before 2 billion years ago. However, parts of the genetic code are younger than that. (see papers 8 and 9 at ref 11 for the documentation). The simplest explanation is the genetic code continued to evolve after modern life diversified. If so, then the only reasonable explanation for this is that these younger members of the genetic code must have achieved their current modern and universal distribution via HGT events. Once we accept that something as complex as the genetic code can evolve and spread by HGT events, it strongly suggests that a gene encoding any function could also.

There are deep ideological reasons for believing in a LUCA that explain the reluctance of many to abandon it. In fact this reason is built directly into the most basic model of modern biology, i.e. the tree of life. The only figure in Darwin’s “Origin of Species” happens to be a tree that inevitably maps back to a single trunk. Indeed the algorithms used in phylogenetic analysis can only find a single trunk, which, of course, is how they are designed. All practicing biologists are aware of the limitations of phylogenetic modeling with its built in assumptions, but nevertheless these assumptions do cause confusion. For example, let me pose a question and ask how often there was confusion when thinking about mitochondrial eve? Isn’t it a common misperception to think at some point that all of human life could be mapped back to a single woman? When in fact all we can say is that the only surviving remnant of that distant ancestor is her mitochondrial genome, and it is extremely unlikely that any of her other genes survive in any human populations. Because of the phenomena of sexual reproduction and recombination we share genes with multiple ancestors with no need to hypothesize any individual ancestor from whom we have descended. The same reasoning should apply to the evolution of all life; because of the phenomena of horizontal gene transfer we share genes with multiple ancestors with no need to hypothesize individual species from whom we have descended (10).


Last edited by Admin on Sat Dec 02, 2017 2:44 pm; edited 3 times in total


What is the Last Universal Common Ancestor (LUCA)?

In the study of early life on Earth, one name towers above the rest: LUCA. LUCA is not the name of a famous scientist in the field; it is shorthand for Last Universal Common Ancestor, a single cell that lived perhaps 3 or 4 billion years ago, and from which all life has since evolved. Amazingly, every living thing we see around us (and many more that we can only see with the aid of a microscope) is related. As far as we can tell, life on Earth arose only once.

Answers in the genetic code

Life comes in all shapes and sizes, from us humans to bacteria. So how do we know that all life has evolved from a single cell? The answer is written in the language of the genetic code (Image A).

The genetic code spells out DNA.

  • The genetic code is the language in which most genes are written into DNA.
  • Such genes are recipes for making proteins.
  • Proteins are what make the cell tick, doing everything from making DNA to digesting the food we eat and extracting the nutrients.
  • Incredibly, the exact same code is used in humans and bacteria, so a gene from a human being can be put into a bacterium, and the bacterium will make the human protein — this is how insulin is made.

The genetic code is universal for all life.
That the genetic code is universal to all life tells us that everything is related. All life regenerates itself by producing offspring, and over time small changes in the offspring result in small changes to the protein recipes. But because the recipes are written in the same language (the genetic code), it is possible to compare these recipes (and other genes) to build the equivalent of a family tree.

Family trees

The tree of life explains relationships among all living things.

  • Archaea
  • Bacteria
  • Eukaryota

It is astounding that as recently as 25 years ago we were blissfully unaware that we and bacteria shared the planet with a third form of life!

Reconstructing LUCA

Which features of the archaea, bacteria, and eukaryotes can be traced to LUCA?

The tree of life is without doubt one of the great achievements in biology (image F). But for some researchers it is merely a means to an end. These researchers are trying to reconstruct LUCA, the cell from which all life has evolved.3 The question they are asking is, “which features of the archaea, bacteria and eukaryotes can be traced back to their common ancestor, LUCA?” This should be a very simple task: simply compare all three groups and choose the features that are common to all. By rights, LUCA’s reconstruction should be a done deal what with 70 or so complete genomes across the whole tree having been deciphered. (A genome houses all the genes in an organism, and a ‘catalogue’ of these genes is obtained by sequencing the organism’s DNA.) Unfortunately, it’s not that simple, for two reasons:

  • genes get lost
  • genes get swapped

How can we tell if a gene is ancient?
DNA provides clues to the age of a gene.

  • The implication of genes being lost is that when we compare genomes to see which genes are common across all life (that is, which are ‘universal’), we underestimate how many genes were originally in LUCA. Some of the genes that are not universal can be added to LUCA because clues to their origin can be found by looking at what they do. While we can make an educated guess as to whether a non-universal gene was in LUCA, most genes that are not universal are probably ‘new inventions’, specific to one of the three major branches of the tree. In fact, many may only be specific to one small group of, say, the archaea.
  • Another way to check if a gene is ancient is to look at whether it is a recipe for protein or RNA. This is an important clue because some RNAs date back to an even earlier period than the time when LUCA lived. The logic goes thus: if an RNA is older than LUCA, then LUCA had it too, even if that RNA is no longer universal.4,5
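The comparison described above, and the way gene loss skews it, can be sketched with sets. This is a toy illustration only: the gene names and the three gene lists are invented placeholders rather than real genome data, and `universal_genes` is a hypothetical helper.

```python
# Toy gene inventories for the three domains (invented placeholders).
archaea = {"geneU1", "geneU2", "geneLost", "geneA"}
bacteria = {"geneU1", "geneU2", "geneLost", "geneB"}
eukaryotes = {"geneU1", "geneU2", "geneE"}


def universal_genes(*genomes):
    """Return the genes present in every genome (the 'universal' set)."""
    return set.intersection(*genomes)


# Genes shared by all three domains are the candidates for LUCA.
candidates = universal_genes(archaea, bacteria, eukaryotes)

# 'geneLost' is missing from the toy eukaryote list, so it drops out of
# the intersection even though two domains still carry it. This is how
# gene loss turns the universal set into an underestimate.
print(sorted(candidates))
```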

While dealing with gene loss is tricky, it is not an insurmountable hurdle — it just means reconstructing LUCA will be peppered with a lot of educated guesswork, and probably a few gaps. But gene swapping is another matter altogether — it threatens to fell the tree of life, and consign LUCA to the dustbin.6,7,8

Horizontal gene transfer

Horizontal gene transfer is another term for gene swapping.

Gene swapping (or horizontal gene transfer as it’s often called by biologists) has been known about for decades. What biologists are only now beginning to look at is the extent to which genes are transferred between organisms. Comparing two bacteria from the same species reveals major differences.9 For example, Escherichia coli is a common gut bacterium that is part of our natural gut flora. But the O157:H7 strain causes severe gastrointestinal ailments. The genomes of both a harmless variant (K-12) and the O157:H7 strains have been deciphered and compared, and the result is striking.

  • 1387 of the 5416 (26%) genes in O157:H7 are not in K-12.
  • 528 of K-12’s 4405 (12%) genes are not in O157:H7.
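The two percentages above follow directly from the quoted gene counts. A minimal check, with variable and function names chosen here for illustration:

```python
# Gene counts quoted in the text for the two E. coli strains.
o157_total, o157_unique = 5416, 1387
k12_total, k12_unique = 4405, 528


def percent(part, whole):
    """Share of `whole` made up by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


o157_share = percent(o157_unique, o157_total)
k12_share = percent(k12_unique, k12_total)
print(o157_share, k12_share)
```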

Many of the O157:H7 genes are arguably foreign genes that have been borrowed from elsewhere. If we compare two people, or even a person with a chimpanzee, there’s nowhere near this kind of variation — humans all share the same genes, and humans and chimps may well have only a handful of genes that are different between our two species.
On a broader level, a now famous comparison of Escherichia coli K-12 to Salmonella enterica (another species of bacterium often responsible for food poisoning) concluded that:

  • At minimum, 17% of the K-12 genome has been borrowed since these two bacteria split from a common ancestor around 100 million years ago.10
  • LUCA would have roamed the Earth 3-4 billion years ago, so if all genes are so easily swapped, any evidence for LUCA would have effectively been scrambled because genomes are so severely shuffled.6

Not all genes are equally swappable.

So where does this leave LUCA? A pessimist would say that LUCA is out of reach. However, it is far from obvious that all genes are equally swappable. Some, like genes for antibiotic resistance, are the gene equivalent of gypsies:

  • when there is antibiotic present, they provide a bacterium with resistance11
  • once the antibiotic disappears, they too are often lost

Other genes produce proteins that lock together with other proteins into large protein complexes, much like a 3D jigsaw. The ability for one jigsaw piece to be swapped with the equivalent jigsaw piece from another organism will depend on how similar the jigsaws are. Escherichia coli K-12 and O157:H7 could probably exchange such genes with relative ease, but a bacterium and an archaeon probably wouldn’t have a hope of doing so, even though such jigsaws perform the same biological role.12 Is gene swapping as common across other branches of the tree? We animals don’t tend to swap protein recipes like bacteria do, but we have done this in the past. There is now overwhelming evidence that we are part bacterium.13,14

Evidence indicates gene swapping in human DNA.

  • Our bacterial ancestry comes in the form of mitochondria (image G), tiny power plants housed in our cells.
  • The DNA of your mitochondria is minuscule, with only a handful of genes. But mitochondria were once full-blown bacteria that took up residence in and struck up a partnership with one of our distant single-celled ancestors.
  • Since then, much of the DNA from the original bacterium has been thrown away, but a lot of it has ended up in the DNA of our nucleus (image D).

The good news for LUCA biologists is that we seem to be pretty successful at identifying which bits of our nuclear DNA came from the mitochondrion, and which bits were already there. So to some extent, it might be possible to disentangle parts of the tree of life. But is it enough to save LUCA?

One or many LUCAs?

Carl Woese suggests there may be more than one LUCA.

Carl Woese, one of the key players in the bid to reconstruct the tree of life, has added another twist to the LUCA puzzle. He has got researchers fired up by suggesting that:

  • LUCA was also into gene swapping, and on a much larger scale than what we observe in modern bacteria
  • gene swapping was once more important than inheritance from parent to offspring, and that early archaea, bacteria and eukaryotes each emerged independently from a ‘sea’ of gene transfer8

It’s not clear how his claims could be tested, but they are certainly food for thought — if he’s right there never was a single LUCA, but more of a community of genes loosely associated with cells.

Conclusion: LUCA is still a puzzle but science continues to find pieces of the puzzle.

The jury is still out as to how to reconstruct LUCA, and whether horizontal gene transfer will turn this task into a futile one. However, if not all genes are equal in the game of horizontal gene transfer, biologists stand an outside chance. Either way, there are plenty of exciting challenges, and many unknowns for those trying to build the tree of life and reconstruct our origins. For instance, just this year a member of a new group of microscopic archaea has been identified from a deep-sea trench.15 To give you some sense of perspective as to the significance of this discovery, it is roughly equivalent to discovering the first plant! Whether there was one or many LUCAs, these are definitely exciting times.

4 The Genetic Core of the Universal Ancestor on Mon Aug 31, 2015 6:01 pm


The Genetic Core of the Universal Ancestor 1

Group 1: Ribosomal Proteins and Translation Initiation Factors
Group 1 contains genes that recapitulate the three-domain phylogeny and whose products are directly linked to the function of the ribosome. This group includes genes for 29 universally conserved ribosomal proteins (rproteins) and the four universally conserved initiation and elongation factors. (RNAs were not considered in this compilation.) In the case of the 30 small subunit rprotein COGs, 15 were universal, six were found only in Bacteria, and nine were found only in Eucarya and Archaea. The majority of universal COGs for small subunit rproteins showed strong support for a three-domain phylogeny (14 of 15; Table 1).

The genes for large subunit ribosomal proteins were a more complex group, and a smaller fraction of these were universally conserved (17 of 51 COGs, with 15 being three-domain). COGs encoding 11 large subunit proteins appeared to be three-domain by either maximum parsimony or neighbor-joining analysis, but bootstrap support for the three-domain topology was not strong (< 50%). We presume that the lack of statistical support in the calculations resulted from random evolutionary convergence in the relatively small data sets (ranging between 125 and 200 parsimony-informative characters) for these COGs. Because the best and most resolved topology was three-domain, we classified them as such.
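The counts quoted in Group 1 can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the variable names are made up for illustration, and reading the "29 universally conserved ribosomal proteins" as the 14 small subunit plus 15 large subunit three-domain COGs is my assumption, not something stated in the excerpt.

```python
# Counts exactly as quoted in the text above (rprotein COGs).
small_subunit = {"universal": 15, "bacteria_only": 6, "archaea_eucarya_only": 9}
small_total = sum(small_subunit.values())   # should match the "30 small subunit rprotein COGs"
small_three_domain = 14                     # "14 of 15" universal COGs were three-domain

large_total = 51                            # large subunit rprotein COGs
large_universal = 17                        # "17 of 51 COGs"
large_three_domain = 15                     # "with 15 being three-domain"

# Assumption: the 29 Group 1 rproteins are the three-domain ones from both subunits.
three_domain_rproteins = small_three_domain + large_three_domain

print("small subunit COGs:", small_total)
print("three-domain rprotein COGs:", three_domain_rproteins)
print("universal fraction, small subunit:", round(small_subunit["universal"] / small_total, 2))
print("universal fraction, large subunit:", round(large_universal / large_total, 2))
```

Under that reading the numbers are consistent with each other: the three categories of small subunit COGs add up to 30, and 14 + 15 gives the 29 rproteins named at the start of Group 1.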

Group 2: Proteins Associated With the Ribosome or Protein Modification
Group 2 includes universal three-domain genes that encode a diverse set of nonribosomal proteins with known functions that potentially link the genes to ribosome function or to modification of proteins. COG0024, methionine aminopeptidase (map, in E. coli), cleaves the initiator methionine during the process of translation (Lowther and Matthews 2000). COG0006, Xaa-Pro aminopeptidase (pepP in E. coli), also encodes a protease, initially identified by enzymatic activity against dipeptides with proline as the penultimate residue (Yaron and Mlynar 1968). COG0112 encodes the GlyA protein in E. coli. GlyA is required for amino acid catabolism and for donation of methyl groups to S-adenosyl-methionine-dependent methyltransferases and other methylating enzymes. In modern organisms, proteins encoded by three members of Group 2 (COG0201 [secY], COG0552 [ffh, SRP54], and COG0541 [ftsY SRP54 receptor]) are involved in protein export or insertion into membranes, guiding leader peptides to the membrane during translation (Walter and Johnson 1994).

Group 3: Proteins Associated With Transcription and Replication of DNA
Four of the universally conserved three-domain COGs in Group 3 encode proteins involved in transcription, including three subunits of DNA-dependent RNA polymerase (COG0085, COG0086, and COG0202 [RpoB, RpoC, and RpoA, respectively in E. coli]), and the gene for a transcription anti-terminator (COG0250, NusG in E. coli).

The number of universal genes involved in DNA replication and repair was surprisingly small, only four. Of these universal genes, only three were found to be three-domain: COG0592 (DnaN, in E. coli) that encodes the sliding clamp subunit of DNA polymerase III, COG0258 (Pol1-A in E. coli) that encodes the 5′-3′ exonuclease function of the DNA polymerase I, and COG0468 that encodes the recombination enzyme RecA.

Group 4: Uncharacterized Proteins
Two universally conserved genes that displayed three-domain phylogeny (> 95% bootstrap for all domains) but have no known functions were also found. In both cases, at least one property of the COG proteins could be predicted from their sequences. COG0037 (mesJ and ydaO in E. coli) encodes a predicted ATPase, and COG0012 (ychF in E. coli) encodes a predicted GTPase.

Group 5: Universal, Non-Three-Domain Proteins
Twenty-eight universally conserved COGs did not show three-domain phylogeny. Presumably, therefore, these genes encode essential functions and have been subjected to lateral gene transfer at some point in evolution (Doolittle 1999; Glansdorff 2000). For example, 14 of the eucaryal aminoacyl-tRNA synthetase genes did not form a monophyletic group, and rather were always nested within either the bacterial or the archaeal groups (Woese et al. 2000). The non-three-domain universal COGs also include COG0125 (thymidine kinase), COG0550 (topoisomerase 1A), and COG1109 (phosphomannomutase; mrsA in E. coli). COG0533 has been predicted to encode a metal-dependent protease (ygjD in E. coli), but the precise function of the gene product has not been identified in any organism. The remaining COG representing a subunit of DNA polymerase III that was found to be universal was COG0470 (holB in E. coli). In modern organisms this subunit is required to load the modern sliding clamp, but, like the rest of the essential DNA polymerase genes, COG0470 has been transferred between domains. Group 5 also includes a number of COGs that are missing from only one of the 36 genomes included in the survey and do not show three-domain phylogeny (Table 1).




Last universal common ancestor 'more sophisticated than we thought,' say biologists

The chemical soup out of which all life eventually evolved could have been more complex than was first thought, according to a new study.
The last universal common ancestor (LUCA) is the name given to a crude organism that is now traceable in all three domains of life: bacteria, archaea and eukaryotes (plants, animals, fungi, algae, etc.).
Very little is actually known about this great-grandfather of evolution, and some scientists still debate whether it was even a cell.

But a new study has suggested that LUCA was a more sophisticated organism than presumed, with a complex structure that makes it identifiable as a cell.
The research, published in the journal Biology Direct, builds on several years of study of a previously overlooked feature of microbial cells.
This particular area has a high concentration of polyphosphate, a type of energy currency in cells.

The new research argues that this polyphosphate storage site actually represents the first known universal organelle. 

Organelles were not thought to be common to all three branches of the tree of life - bacteria, archaea and eukaryotes - and common scientific consent said they were never found in bacteria.
However, in a 2003 study the same team from the University of Illinois showed that the polyphosphate storage structure in bacteria was physically, chemically and functionally the same as an organelle called an acidocalcisome, which is found in many single-celled eukaryotes.

Acidocalcisomes 3  are rounded electron-dense acidic organelles, rich in calcium and polyphosphate and between 100 nm and 200 nm in diameter.
Acidocalcisomes were originally discovered in trypanosomes (the causative agents of sleeping sickness and Chagas disease) but have since been found in Toxoplasma gondii (causes toxoplasmosis), Plasmodium (malaria), Chlamydomonas reinhardtii (a green alga), Dictyostelium discoideum (a slime mould), bacteria and human platelets. Their membranes are 6 nm thick and contain a number of protein pumps and antiporters, including aquaporins, ATPases and Ca2+/H+ and Na+/H+ antiporters. They may be the only cellular organelle that has been conserved between prokaryotic and eukaryotic organisms.  2

This meant that the acidocalcisomes arose before the bacterial and eukaryotic lineages of the tree of life split, making the organelle even more ancient than realised.
The new study tracked the evolutionary history of a protein enzyme known as V-H+PPase, which is common to all three branches.
By comparing the V-H+PPase genes found in hundreds of organisms representing all three domains of life, the team constructed a 'family tree' that showed how different versions of the enzyme in different organisms were related.
For the enzyme to be present in all three, it had to have originated in the LUCA, before the three branches of the tree of life were formed.

'There are many possible scenarios that could explain this, but the most likely would be that you had already the enzyme even before diversification started on Earth,' said professor Gustavo Caetano-Anollés.
'The protein was there to begin with and was then inherited into all emerging lineages.'
His colleague Manfredo Seufferheld, who led the study, explained: 'This is the only organelle to our knowledge now that is common to eukaryotes, that is common to bacteria and that is most likely common to archaea. It is the only one that is universal.'
The findings suggest that LUCA may have been more complex than the simplest organisms alive today, and it has simplified itself during the evolutionary process instead of growing more complex, as we might expect.  
'You can't assume that the whole story of life is just building and assembling things,' said James Whitfield, a professor of entomology at Illinois and a co-author on the study.
'Some have argued that the reason that bacteria are so simple is because they have to live in extreme environments and they have to reproduce extremely quickly.
'So they may actually be reduced versions of what was there originally.
'According to this view, they've become streamlined genetically and structurally from what they originally were like.
'We may have underestimated how complex this common ancestor actually was.'

Evolution of vacuolar proton pyrophosphatase domains and volutin granules: clues into the early evolutionary origin of the acidocalcisome. 4

Prokaryotes are thought to differ from eukaryotes in that they lack membrane-bounded organelles. However, it has been demonstrated that as in acidocalcisomes, the calcium and polyphosphate-rich intracellular "volutin granules (polyphosphate bodies)" in two bacterial species, Agrobacterium tumefaciens, and Rhodospirillum rubrum, are membrane bound and that the vacuolar proton-translocating pyrophosphatases (V-H+PPases) are present in their surrounding membranes. Volutin granules and acidocalcisomes have been found in organisms as diverse as bacteria and humans.

Here, we show volutin granules also occur in Archaea and are, therefore, present in the three superkingdoms of life (Archaea, Bacteria and Eukarya).

Molecular analyses of V-H+PPase pumps, which acidify the acidocalcisome lumen and are diagnostic proteins of the organelle, also reveal the presence of this enzyme in all three superkingdoms suggesting it is ancient and universal.

Using the Protein family (Pfam) database, we found a domain in the protein, PF03030. The domain is shared by 31 species in Eukarya, 231 in Bacteria, and 17 in Archaea. The universal distribution of the V-H+PPase PF03030 domain, which is associated with the V-H+PPase function, suggests the domain and the enzyme were already present in the Last Universal Common Ancestor (LUCA).
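The species counts quoted for PF03030 can be tallied as follows. This is a minimal sketch using only the three numbers given in the abstract; the dictionary and variable names are my own, and the "universal" test here is just "found in at least one species of every superkingdom", which is an assumed simplification and not the authors' actual analysis.

```python
# Species counts for the PF03030 domain, as quoted in the abstract above.
pf03030_species = {"Eukarya": 31, "Bacteria": 231, "Archaea": 17}

total_species = sum(pf03030_species.values())

# Assumed working definition: present in all three superkingdoms.
is_universal = all(count > 0 for count in pf03030_species.values())

# Share of each superkingdom among the species carrying the domain.
shares = {name: round(100 * count / total_species, 1)
          for name, count in pf03030_species.items()}

print("species with PF03030:", total_species)
print("present in all three superkingdoms:", is_universal)
print("percent per superkingdom:", shares)
```

Note that presence in all three superkingdoms by itself does not tell vertical inheritance from lateral gene transfer. The Group 5 genes listed earlier in this thread are universal as well, but do not show a three-domain phylogeny.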

The importance of the V-H+PPase function and the evolutionary dynamics of these domains support the early origin of the acidocalcisome organelle. In particular, the universality of volutin granules and presence of a functional V-H+PPase domain in the three superkingdoms of life reveals that the acidocalcisomes may have appeared earlier than the divergence of the superkingdoms. This result is remarkable and highlights the possibility that a high degree of cellular compartmentalization could already have been present in the LUCA.

This membrane-enclosed organelle is characterized by its acidic nature, high electron density, and high content of polyphosphates (polyP) including pyrophosphate (PPi), calcium, magnesium, and other elements. In addition, the organelle contains a variety of cation pumps including Ca2+/H+ and H+ pumps. In particular, the vacuolar proton translocating pyrophosphatase (V-H+PPase) proteins have been localized in the acidocalcisomes of bacteria, parasitic protozoans, algae, plants, and recently in cockroaches.

Volutin-polyP bodies occur in organisms spanning an enormous range of phylogenetic complexity from Bacteria and Archaea to unicellular eukaryotes to algae to plants to insects to humans. The volutin granules shown in a number of microorganisms appear to be identical to acidocalcisomes of Agrobacterium and Rhodospirillum and eukaryotes.

The fact that V-H+PPase sequences group many familiar clades monophyletically, and are also ubiquitous among superkingdoms although with poor phylogenetic signal, suggests these proteins are truly ancient.

Crystal structure of a membrane-embedded H+-translocating pyrophosphatase 5

H+-translocating pyrophosphatases (H+-PPases) are active proton transporters that establish a proton gradient across the endomembrane by means of pyrophosphate (PPi) hydrolysis. H+-PPases are found primarily as homodimers in the vacuolar membrane of plants and the plasma membrane of several protozoa and prokaryotes. The three-dimensional structure and detailed mechanisms underlying the enzymatic and proton translocation reactions of H+-PPases are unclear. Each subunit of the Vigna radiata enzyme (VrH+-PPase) consists of an integral membrane domain formed by 16 transmembrane helices. The substrate analogue imidodiphosphate (IDP) is bound in the cytosolic region of each subunit and trapped by numerous charged residues and five Mg2+ ions. A previously undescribed proton translocation pathway is formed by six core transmembrane helices. Proton pumping can be initialized by PPi hydrolysis, and H+ is then transported into the vacuolar lumen through a pathway consisting of Arg 242, Asp 294, Lys 742 and Glu 301. We propose a working model of the mechanism for the coupling between proton pumping and PPi hydrolysis by H+-PPases.

a Ribbon diagram of the overall structure of VrH+-PPase, containing 16 TMs (labelled 1–16). A missing region (residues 42–66) is shown with dotted lines.
b The six inner and ten outer TMs drawn as cylinders and coloured in yellow and green, respectively. This orientation is rotated by 60° from that in a.
c VrH+-PPase dimer shown as a ribbon diagram with height and width dimensions of 75 Å and 85 Å, respectively. The detergent molecules of n-decyl β-D-maltoside are shown as sticks.
d Electrostatic surface potential of the VrH+-PPase dimer (red, blue and white indicate negative, positive and neutral potentials, respectively). In a–c, IDP is shown as sticks and coloured in CPK.

a Six core TMs (yellow ribbon) with IDP-binding residues (sticks).
b The electrostatic surface potential of the IDP binding pocket (red, blue and white indicate negative, positive and neutral potentials, respectively).
c The IDP-binding residues. Mg2+ ions, K+ ions and water molecules are shown as green, purple and red spheres, respectively. Interactions are presented as dashed lines.
d The binding site of VrH+-PPase (in white) superimposed on EcPPase (PDB 2AUU; in pink). The Mg2+ ions of the VrH+-PPase (in green, with numbers) and EcPPase (in grey) are shown as spheres. The F− in EcPPase is shown as a blue sphere, and the Watnu in VrH+-PPase is shown as a labelled red sphere. IDP in VrH+-PPase and PPi in EcPPase are coloured in CPK.

a Resting state (R state).
b Initiated state (I state). 
c Transient state (T state). The VrH+-PPase dimer is shown and coloured in green and blue. M6 and M16 are shown as cylinders. The residues involved in proton transport are shown.

a The proton transport pathway is formed by six core TMs (M5, M6, M11, M12, M15 and M16). The solvent-accessible surface area is coloured in cyan. The arrow indicates the direction of proton translocation. Right: zoomed-in view of the proton transport pathway. The residues involved in proton transport are shown and labelled. The solvent-accessible surface has been removed. Bottom: the hydrophobic gate around Glu 301. Residues forming a hydrophobic gate are displayed and labelled.
b The electron density map (2Fobs − Fcalc) (in blue) around the proton transport pathway drawn at a contour level of 2σ. The IDP is shown as sticks and coloured in CPK. Water molecules Watnu, Wat1 and Wat2 are presented as labelled red spheres. Water-mediated hydrogen bonds are drawn as black dashed lines.

Vacuolar-type proton translocating pyrophosphatase 1, putative 6

» 97.5% identical to GB:AAF80381.1: vacuolar-type proton translocating pyrophosphatase 1 {Trypanosoma cruzi} (PMID:10998372)
» Located in the plasma membrane and acidocalcisomes of amastigotes, epimastigotes, and trypomastigotes (PMID:9705361;IDA)
» The gene encodes a vacuolar proton translocating pyrophosphatase predominantly localized in acidocalcisomes. The protein is also localized in the plasma membrane, Golgi complex and contractile vacuole.

Amino Acids 814 7

Conserved and functional residues in mPPases 8

The amino acid sequences of Na+- and H+ -PPases are highly conserved. The importance of α-helices has been studied showing that TM6 particularly contains many essential residues for the structure and function of mPPases (212). However, it is not trivial to distinguish between direct and indirect functional roles of residues or helices as exemplified by the finding that several residues in TM3, which does not form part of the central ion transport funnel, are required for H+ transport in H+-PPases (213). The significance of conserved and non-conserved amino acid residues has been extensively studied by site-directed mutagenesis (Table 1) (Fig. 7) (214, 215) and random mutagenesis (216). The amino acid residues that have an effect on mPPase function are listed in Table 1.

Figure 7. Amino acid residues (shown as pink stars) important for mPPase function and/or conformational stability according to the literature. The mPPases are from V. radiata (Vr-PPase), Flavobacterium johnsoniae (Fj-PPase), C. hydrogenoformans (Ch-PPase), Bacteroides vulgatus (Bv-PPase) and T. maritima (Tm-PPase). Both conserved and non-conserved residues were shown to be important for mPPase function.

Multiple alignments of PF03030 domain sequences distributed in the three superkingdoms (Archaea, Bacteria, and Eukarya). Shown above is the strongly conserved 57-residue region of the V-H+-PPase.


Last edited by Admin on Wed Sep 02, 2015 11:18 am; edited 11 times in total



The proteomic complexity and rise of the primordial ancestor of diversified life 1


The last universal common ancestor represents the primordial cellular organism from which diversified life was derived.

This urancestor accumulated genetic information before the rise of organismal lineages and is considered to be either a simple 'progenote' organism with a rudimentary translational apparatus or a more complex 'cenancestor' with almost all essential biological processes. Recent comparative genomic studies support the latter model and propose that the urancestor was similar to modern organisms in terms of gene content. However, most of these studies were based on molecular sequences, which are fast evolving and of limited value for deep evolutionary explorations.


The minimum urancestral FSF set reveals the urancestor had advanced metabolic capabilities, was especially rich in nucleotide metabolism enzymes, had pathways for the biosynthesis of membrane sn1,2 glycerol ester and ether lipids, and had crucial elements of translation, including a primordial ribosome with protein synthesis capabilities. It lacked however fundamental functions, including transcription, processes for extracellular communication, and enzymes for deoxyribonucleotide synthesis. Proteomic history reveals the urancestor is closer to a simple progenote organism but harbors a rather complex set of modern molecular functions.

Comparison of protein fold structures among lineages of the three superkingdoms supported an urancestor with functional complexity similar to that of extant life.


Last edited by Admin on Sun Nov 29, 2015 3:13 am; edited 2 times in total


7 Re: LUCA—The Last Universal Common Ancestor on Sun Oct 18, 2015 2:01 pm


Last Universal Common Ancestor had a complex cellular structure

CHAMPAIGN, Ill. - Scientists call it LUCA, the Last Universal Common Ancestor, but they don’t know much about this great-grandparent of all living things. Many believe LUCA was little more than a crude assemblage of molecular parts, a chemical soup out of which evolution gradually constructed more complex forms. Some scientists still debate whether it was even a cell.

New evidence suggests that LUCA was a sophisticated organism after all, with a complex structure recognizable as a cell, researchers report. Their study appears in the journal Biology Direct.

The study builds on several years of research into a once-overlooked feature of microbial cells, a region with a high concentration of polyphosphate, a type of energy currency in cells. Researchers report that this polyphosphate storage site actually represents the first known universal organelle, a structure once thought to be absent from bacteria and their distantly related microbial cousins, the archaea. This organelle, the evidence indicates, is present in the three domains of life: bacteria, archaea and eukaryotes (plants, animals, fungi, algae and everything else).

The existence of an organelle in bacteria goes against the traditional definition of these organisms, said University of Illinois crop sciences professor Manfredo Seufferheld, who led the study.

“It was a dogma of microbiology that organelles weren’t present in bacteria,” he said. But in 2003 in a paper in the Journal of Biological Chemistry, Seufferheld and colleagues showed that the polyphosphate storage structure in bacteria (they analyzed an agrobacterium) was physically, chemically and functionally the same as an organelle called an acidocalcisome (uh-SID-oh-KAL-sih-zohm) found in many single-celled eukaryotes.

Their findings, the authors wrote, “suggest that acidocalcisomes arose before the prokaryotic (bacterial) and eukaryotic lineages diverged.” The new study suggests that the origins of the organelle are even more ancient.

The study tracks the evolutionary history of a protein enzyme (called a vacuolar proton pyrophosphatase, or V-H+PPase) that is common in the acidocalcisomes of eukaryotic and bacterial cells. (Archaea also contain the enzyme and a structure with the same physical and chemical properties as an acidocalcisome, the researchers report.)

By comparing the sequences of the V-H+PPase genes from hundreds of organisms representing the three domains of life, the team constructed a “family tree” that showed how different versions of the enzyme in different organisms were related. That tree was similar in broad detail to the universal tree of life created from an analysis of hundreds of genes. This indicates, the researchers said, that the V-H+PPase enzyme and the acidocalcisome it serves are very ancient, dating back to the LUCA, before the three main branches of the tree of life appeared.

“There are many possible scenarios that could explain this, but the best, the most parsimonious, the most likely would be that you had already the enzyme even before diversification started on Earth,” said study co-author Gustavo Caetano-Anollés, a professor of crop sciences and an affiliate of the Institute for Genomic Biology at Illinois. “The protein was there to begin with and was then inherited into all emerging lineages.”

“This is the only organelle to our knowledge now that is common to eukaryotes, that is common to bacteria and that is most likely common to archaea,” Seufferheld said. “It is the only one that is universal.”

The study lends support to a hypothesis that LUCA may have been more complex even than the simplest organisms alive today, said James Whitfield, a professor of entomology at Illinois and a co-author on the study.

“You can’t assume that the whole story of life is just building and assembling things,” Whitfield said. “Some have argued that the reason that bacteria are so simple is because they have to live in extreme environments and they have to reproduce extremely quickly. So they may actually be reduced versions of what was there originally. According to this view, they’ve become streamlined genetically and structurally from what they originally were like. We may have underestimated how complex this common ancestor actually was.”

The study team also included Kyung Mo Kim, of the Korea Research Institute of Bioscience and Biotechnology; and Alejandro Valerio, of the Museum of Biological Diversity in Columbus, Ohio.

The National Institute of Allergy and Infectious Diseases and the National Science Foundation provided funding for this study.


8 Re: LUCA—The Last Universal Common Ancestor on Thu Jul 21, 2016 9:05 pm


The Last Universal Common Ancestor: emergence, constitution and genetic legacy of an elusive forerunner 1

Since the reclassification of all life forms in three Domains (Archaea, Bacteria, Eukarya), the identity of their alleged forerunner (Last Universal Common Ancestor or LUCA) has been the subject of extensive controversies: progenote or already complex organism, prokaryote or protoeukaryote, thermophile or mesophile, product of a protracted progression from simple replicators to complex cells or born in the cradle of "catalytically closed" entities?

There seems to be a limited number of indispensable cellular functions, but the number of unique realizations of the minimal gene-set for cellular life is likely to be astronomically large.

LUCApedia: a database for the study of ancient life 3
Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia, a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.



9 Re: LUCA—The Last Universal Common Ancestor on Sat Dec 02, 2017 2:44 pm


Koonin, the logic of chance, page 331

All the difficulties and uncertainties of evolutionary reconstructions notwithstanding, parsimony analysis combined with less formal efforts on the reconstruction of the deep past of particular functional systems leaves no serious doubts that LUCA already possessed at least several hundred genes. In addition to the aforementioned “golden 100” genes involved in expression, this diverse gene complement consists of numerous metabolic enzymes, including pathways of the central energy metabolism and the biosynthesis of amino acids, nucleotides, and some coenzymes, as well as some crucial membrane proteins, such as the subunits of the signal recognition particle (SRP) and the H+- ATPase.

The reconstructed gene repertoire of LUCA also has gaping holes. The two most shocking ones are (i) the absence of the key components of the DNA replication machinery, namely the polymerases that are responsible for the initiation (primases) and elongation of DNA replication and for gap-filling after primer removal, and the principal DNA helicases (Leipe, et al., 1999), and (ii) the absence of most enzymes of lipid biosynthesis. These essential proteins fail to make it into the reconstructed gene repertoire of LUCA because the respective processes in bacteria, on one hand, and archaea, on the other hand, are catalyzed by different, unrelated enzymes and, in the case of membrane phospholipids, yield chemically distinct membranes.

The DNA replication machinery is essential in all domains, and so is lipid biosynthesis for cell membranes. It is not possible that the first cells emerged in a LUCA without membranes and DNA replication, and then evolved distinct membranes and DNA replication machineries separately. Should that not be evidence that a LUCA never existed, and that the three domains of life had to emerge separately? That would mean that the several hundred genes shared by all three domains of life would have had to emerge convergently, and lipid biosynthesis and DNA replication separately, which is a hard sell if proposing unguided natural mechanisms. If the emergence of biological cells is unlikely in the extreme just once, imagine three times separately. Neither a LUCA is credible, nor three separate domains of life emerging naturally by partial convergent evolution. The only rational explanation is a designer creating the three domains of life separately, using the same toolkit where required and a separate, divergent toolkit in other parts.


A minimal estimate for the gene content of the last universal common ancestor--exobiology from a terrestrial perspective.

Using an algorithm for ancestral state inference of gene content, given a large number of extant genome sequences and a phylogenetic tree, we aim to reconstruct the gene content of the last universal common ancestor (LUCA), a hypothetical life form that presumably was the progenitor of the three domains of life. The method allows for gene loss, previously found to be a major factor in shaping gene content, and thus the estimate of LUCA's gene content appears to be substantially higher than that proposed previously, with a typical number of over 1000 gene families, of which more than 90% are also functionally characterized. More precisely, when only prokaryotes are considered, the number varies between 1006 and 1189 gene families while when eukaryotes are also included, this number increases to between 1344 and 1529 families depending on the underlying phylogenetic tree. Therefore, the common belief that the hypothetical genome of LUCA should resemble those of the smallest extant genomes of obligate parasites is not supported by recent advances in computational genomics. Instead, a fairly complex genome similar to those of free-living prokaryotes, with a variety of functional capabilities including metabolic transformation, information processing, membrane/transport proteins and complex regulation, shared between the three domains of life, emerges as the most likely progenitor of life on Earth, with profound repercussions for planetary exploration and exobiology.
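To illustrate why allowing for gene loss raises the estimate, here is a toy sketch in Python. It is NOT the algorithm used in the paper, which works on a phylogenetic tree. It only contrasts a strict rule (family present in every genome) with a loss-tolerant rule (family present on both sides of the root). The genome names, the gene families, and the placement of the root between Bacteria and Archaea are all hypothetical assumptions for illustration.

```python
# Hypothetical toy data: which domain each genome belongs to.
genomes = {
    "bact_A": "Bacteria",
    "bact_B": "Bacteria",
    "arch_A": "Archaea",
    "arch_B": "Archaea",
}

# Hypothetical gene families mapped to the genomes that carry them.
families = {
    "fam_in_all": {"bact_A", "bact_B", "arch_A", "arch_B"},
    "fam_lost_in_some": {"bact_A", "arch_B"},
    "fam_bacteria_only": {"bact_A", "bact_B"},
}

def strict_core(families, genomes):
    """Families found in every single genome (no gene loss tolerated)."""
    every_genome = set(genomes)
    return {fam for fam, present in families.items() if present == every_genome}

def loss_tolerant_core(families, genomes):
    """Families found in at least one genome on each side of the assumed root."""
    result = set()
    for fam, present in families.items():
        domains_seen = {genomes[g] for g in present}
        if {"Bacteria", "Archaea"} <= domains_seen:
            result.add(fam)
    return result

strict = strict_core(families, genomes)
tolerant = loss_tolerant_core(families, genomes)
print("strict estimate:", sorted(strict))
print("loss-tolerant estimate:", sorted(tolerant))
```

In this toy case the strict rule keeps one family, while the loss-tolerant rule keeps two, because a family that was lost in some lineages is still counted as ancestral. The same logic, applied with a proper tree and many genomes, is why such estimates come out higher than the "universal core" counts.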

The Very Early Stages of Biological Evolution and the Nature of the Last Common Ancestor of the Three Major Cell Domains


2. The presence of a core of highly conserved RNA-related sequences supports the hypothesis that the LCA was preceded by earlier entities in which RNA molecules played a more conspicuous role in cellular processes and in which ribosome-mediated protein synthesis had already evolved.

3. Whole-genome analysis has revealed high levels of sequence redundancy, which demonstrates the significance of paralogous duplications in shaping the size and complexity of cell genomes. This redundancy suggests that during early stages of biological evolution, anabolic pathways and other biological processes were catalyzed by less-specific enzymes that could react with a wide range of chemically related substrates.

4. The chemistry of ribonucleotide reduction, combined with sequence analysis, supports the hypothesis of a monophyletic origin of DNA that took place prior to the evolutionary divergence of the three main cell domains. The available evidence suggests that the bacterial DNA polymerase I palm domain, and its homologs, is a descendant of a component of an ancestral RNA-dependent RNA polymerase that may have played dual roles as a replicase and a transcriptase during the RNA/protein stage.

5. The availability of genomic data has revealed major discrepancies with the topology of rRNA trees. This has led researchers to question the early branching of Thermotoga and Aquifex, two bacterial thermophilic species. These conclusions suggest not only that heat-loving bacteria may have been recipients of archaeal hyperthermophilic traits, but also that the LCA was not an extremophile.

Introducing LUCA

LUCA is short for Last Universal Common Ancestor, and it is from this organism that every living cell on the planet has supposedly descended. LUCA does not represent the earliest stage in the evolution of life: it is widely accepted that before the evolution of proteins and DNA (which are common to all cellular life) there was a period where RNA carried out the roles now performed by proteins and DNA. There are a lot of uncertainties when we go this far back in evolutionary time and perhaps all we can be certain of is that, at a point in Earth’s history (probably over 3 billion years ago), cells emerged which stored recipes for making both proteins and RNA on a third molecule, DNA.
Nevertheless, studying LUCA is not science fiction. In the same way that humans and chimpanzees supposedly shared a common history until less than 10 million years ago, all modern lifeforms shared a common history back as far as the split that gave rise to the three ‘domains’ of life we now know of as archaea, bacteria and eukaryotes; that is, back as far as LUCA. That there are three domains was first established by Carl Woese and colleagues, who found that the group called prokaryotes was actually two groups, the archaea and bacteria. Amazingly, this work has largely stood the test of time, and though it is argued that there has been extensive gene swapping between these two groups, recent analyses using complete genomes support Woese and colleagues’ decision to split prokaryotes into archaea and bacteria.

Woese and colleagues’ discovery rested on an even more astounding one — all life stores its genetic information on DNA, using a common code which we call the genetic code. The information is stored as packets, called genes — recipes for making RNA, and proteins [see Appendix A: Making Protein]. The languages of DNA and RNA are so similar they may as well be called dialects, but both are markedly different from the language of protein. For RNA and DNA, the information-carrying part of both molecules is made up of four bases (analogous to letters in an alphabet) read in a linear fashion, as with written human languages. In RNA, the four bases are A, G, C and U. In DNA, A, G and C are also used, while T is used instead of U. Establishing the evolutionary basis for this change from U to T is not a trivial exercise, and is an interesting problem in itself [Poole et al. 2001]; but in terms of the actual language, the difference is as minor as the variant spelling of English words, e.g., civilisation and civilization.
The unearthing of the genetic code, and the subsequent demonstration that it is common to all life (a gene from a human can be read by the translation machinery of a bacterium) is without a doubt a key piece of evidence in establishing that there was a LUCA. But what else can we find out about LUCA? Knowing that the genetic code had arisen tells us there was probably a LUCA, but gives us very little information on the nature of LUCA.
In a nutshell, the study of LUCA broadly revolves around two questions:
What features are common to all cellular life?

What sets the three domains — archaea, bacteria and eukaryotes — apart from one another?

At first sight, building a list of LUCA features might seem a fairly straightforward process, especially now that advances in technology allow all the genes possessed by an organism to be identified by sequencing its genome. A sensible approach would perhaps be to compare all the genes from representative genomes of archaea, bacteria and eukaryotes. Those genes that are common to all three domains were in the LUCA and those that aren’t must have been added later. Unfortunately, it’s not that straightforward, for two main reasons:
Some genes appear to have moved from organism to organism like genetic gypsies, confounding our ability to distinguish between features that are universal and date back to LUCA, and features that are universal because of genes moving about.

Some genes which were found in LUCA may no longer be universal. That means it may be impossible to distinguish some LUCA features from genes that arose later, say in the evolution of eukaryotes.

Ask any two researchers to give an overview of what they think the LUCA was like, and you will no doubt get different answers. With such a tricky scientific endeavour as this — working out what an organism that lived billions of years ago was like — this is hardly surprising. Some of my own views on one aspect of the LUCA question are to be found in an earlier piece posted on which I co-wrote with Dan Jeffares [Jeffares & Poole 2000], but what follows is a broader overview of the fast-growing field of ‘LUCA biology.’

2. The minimal genome project

One hands-on approach to trying to uncover the biology of the LUCA has been to look for genes that are universal — that is, genes that all life forms possess. Once a list of these genes has been made, they also lead to another possibility: perhaps this list encapsulates the essence of cellular life — the minimum number of genes required to make a cell. In 1996, with the sequences of the first two bacterial genomes (Mycoplasma genitalium & Haemophilus influenzae) in hand, Arcady Mushegian & Eugene Koonin [Mushegian & Koonin 1996] tried exactly this. The most striking features of their minimal genome were:
A mere 256 genes

No biosynthetic machinery for making the building blocks of DNA

From this they tentatively concluded that LUCA stored its genetic information in RNA, not DNA, and made suggestions on how to further reduce the number of genes in their minimal genome. The work heralded the arrival of comparative genome studies, and there is no doubt that a good number of the genes in their 256-strong list do date back to the LUCA. However, the work was squarely criticised because of the omission of DNA [Becerra et al. 1997]. Both these bacteria are human parasites and it seems most likely that they did away with parts of the machinery for making their own DNA because they can steal from the host (i.e., the organism infected with these pathogens). Indeed, why put in the effort to make your own when it’s there for the taking?
Regardless of whether or not DNA was a part of the LUCA (I think it was [Poole et al. 2000], but there are plenty of researchers that beg to differ [e.g., Leipe et al. 1999]), this omission highlighted a wider problem with the minimal genome. Namely, the genomes you begin with probably affect the final set of genes. This is a problem for the following reasons:

How many genomes must be compared before we are confident we aren’t leaving anything out?

‘Lifestyle’ can affect the final list (in Mushegian & Koonin’s work, the minimal gene set may in fact be a generic set required for parasitism in humans, and has little in common with what was required for a free-living cell to go about its business billions of years ago)

Gene losses: if a gene was in the LUCA, but now remains in only one of the three domains, this method would consistently leave it off the list of LUCA genes.

Finally, if genes can move from organism to organism (so-called horizontal or lateral gene transfer), certain genes may have done such a good job of spreading that they sometimes appear to date back to the time of LUCA, whereas in actual fact, they arose more recently.

Despite its limitations, the minimal genome concept is probably the best attempt to put money where mouth is and come up with a hard list of genes that may have been a feature of the LUCA. It is also the only sensible framework we currently have. That said, if gene transfer is extreme, genes will have moved about so often that this and related methods are rendered futile [Doolittle 1999].
Koonin has recently published an updated minimal gene set, using 21 complete genomes [Koonin 2000]. Surprisingly, of the 256 genes in the original set, only 81 remain, and this list is clearly insufficient to describe either the minimum number of genes required for a cell to function, or the genetic makeup of LUCA.
While working out which genes were part of LUCA is no easy task, the various attempts [Mushegian & Koonin 1996; Kyrpides et al. 1999; Hutchison et al. 1999] are a good starting point, and have served to highlight important problems that must be dealt with in the field of ‘LUCA biology.’
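
The shrinking of a "universal" gene list as genomes are added can be illustrated with a short sketch. It assumes the list is simply the set of genes shared by every genome compared so far, which is a simplification of how the published minimal gene sets were built. The gene labels are hypothetical.

```python
from itertools import accumulate

def shrinking_universal_set(genomes):
    """Size of the shared gene set after each genome is added, in order."""
    running = accumulate(genomes, lambda shared, genome: shared & genome)
    return [len(shared) for shared in running]

# Hypothetical gene sets; each later genome has lost a different gene.
GENOMES = [
    {"g1", "g2", "g3", "g4", "g5"},
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g3", "g5"},
    {"g1", "g3", "g4", "g5"},
]

print(shrinking_universal_set(GENOMES))  # [5, 4, 3, 2]
```

Every genome here still carries most of the original five genes, yet the shared list falls to two, because a single loss in any one lineage removes a gene from the intersection.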

3. LUCA genomics

As the minimal genome work demonstrates, the major issues are:
How much does reliance on universal features underestimate the genetic makeup of LUCA?

How much gene swapping has gone on during the evolution of life from LUCA to the present?

The magic number of universal features is likely to shift about, and there have been plenty of criticisms of all the attempts to reconstruct the LUCA. Nevertheless, universal features are important because they describe a lower limit from which to build upon, and importantly, all these attempts converge on agreement insofar as concluding that LUCA was quite complex. Some of the gaps will be relatively simple to fill, but others may be close to impossible.
LUCA biologists are aware that universal features may underestimate the complexity of LUCA to some extent, but another concern is emerging that could cause even more headaches — the spectre of ‘horizontal gene transfer’ (also called lateral gene transfer):
If there is lots of gene swapping between organisms, the tree of life becomes more like a web, and it may not be possible to disentangle the branches.
If genes are extremely nomadic, truly universal features cannot be distinguished from genes that have successfully spread themselves by gene transfer.

The problem of gene transfer was made apparent in a landmark study of two bacterial genomes, Escherichia coli and Salmonella. Jeffery Lawrence & Howard Ochman [Lawrence & Ochman 1998] concluded that, since diverging from a shared ancestor 100 million years ago, at least 10% of the E. coli genome has been acquired in more than 200 horizontal gene transfer events.
In an equally insightful commentary, William Martin [Martin 1999] has discussed the implications of this work for our ability to reconstruct phylogenetic trees:

The further back in time an evolutionary divergence, the greater the likelihood that any given gene in a genome has been transferred.
Indeed, it may be the case that all bacterial genes have been subject to horizontal gene transfer at some point in their evolutionary history.
This could undermine the utility of phylogenetic tree reconstruction for deep divergences.
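
The first of these points can be made concrete with a small sketch. It assumes, purely for illustration, that transfers hit a given gene at a constant average rate (a Poisson model), which is not a claim made by Martin. The rate used is hypothetical.

```python
import math

def prob_transferred(rate_per_myr, age_myr):
    """P(at least one transfer within age_myr) under a constant-rate Poisson model."""
    return 1.0 - math.exp(-rate_per_myr * age_myr)

RATE = 0.001  # hypothetical transfers per gene per million years

for age in (100, 1000, 3000):
    print(age, round(prob_transferred(RATE, age), 3))
```

Whatever rate is chosen, the probability only rises with the age of the divergence, which is why deep branches are the most exposed to this problem.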

Currently, there is a lot of debate over whether gene transfer is so rampant that evolutionary trees cannot be built, or whether the levels of gene transfer are negligible. Both extremes are currently championed in the literature, and ironically, when it comes to the LUCA, Carl Woese’s work is central to both — many of those that view Woese’s three domains as correct have been arguing for little or insignificant levels of transfer, whereas Woese has recently suggested that very early in evolution, gene transfer between organisms was more important than inheritance from generation to generation [Woese 1998].
While it sounds like Woese is being inconsistent, his more recent claim is limited to the earliest periods in the evolution of life, and arose from concerns that LUCA was beginning to appear totipotent, a crazy notion that would have LUCA as the ultimate source for all life’s diversity:

A fertilised egg is totipotent — from a single cell it will develop into all the different cells and tissues that make up an adult human being.

If genes move between organisms, LUCA might mistakenly appear totipotent because many features would be incorrectly counted as universal.

Extrapolating Lawrence & Ochman’s result back billions of years may not be realistic. But what if horizontal transfer was the default state? This is the idea Woese has developed. His argument is that genes were so free to exchange that there were no distinct lineages — genes moved more through horizontal transfer than by vertical inheritance. As the genetic system becomes more accurate and as the complexity increases, more genes become interdependent, and transfer gives way to vertical inheritance. Woese argues that translation (and therefore the genetic code) was the first thing to be fixed or crystallised, with other cellular functions following later. From this horizontal transfer dominated system, the three domains (archaea, bacteria and eukaryotes) each emerged independently as lineages.
This is certainly food for thought, but there are several issues:

Gathering evidence to support it is not exactly easy since there’s no real way to establish that horizontal transfer was the initial state.
Koonin’s shrinking minimal gene set [Koonin 2000] shows that the minimal genome approach is not creating a totipotent LUCA. Instead, the number of genes ascribable to LUCA is becoming smaller as more genomes are added.

Another problem has to do with the switch from horizontal transfer to vertical inheritance. How many genes would have been able to partake in global transfers before becoming crystallised and therefore unable to transfer? Would it really have been complex enough for the ancestors of the three domains of life to have emerged independently as distinct lineages?

Gene transfer is going to be a hotly debated topic for a while, and will continue to confound the reconstruction of the LUCA. The problem is a complex one:

Horizontal gene transfer has been demonstrated — e.g., the spread of antibiotic resistance.
Limitations of the methods for building evolutionary trees can give false evidence for gene transfer.
Methods that don’t make use of evolutionary information are being used to examine genetic relationships and, in many cases, the data that have been used to argue for horizontal gene transfer are weak.
There is little consensus on the reliability of methods for detecting horizontal gene transfer.
What data are required to demonstrate ancient horizontal gene transfer events?

If natural selection is considered, most horizontal gene transfers will probably result in the gene being lost — by analogy, the organism needs a new gene like a fish needs a bicycle! For instance, antibiotic resistance genes won’t spread and be maintained by selection unless the organisms with the genes are being assaulted with the antibiotic.
This last point has been too often ignored, and there has been little attempt to establish patterns (e.g., are all genes equally nomadic?). So how should LUCA biologists deal with horizontal gene transfer?
If we accept that there is or has been massive unbridled horizontal gene transfer between the three domains [e.g., Doolittle 1999], we must conclude that all our tools for looking into the evolutionary past are invalidated, which means we might as well give up on the question of the LUCA. We know that there are demonstrated cases of horizontal gene transfer, but this extreme position is like throwing the baby out with the bathwater.

If we take as our starting point the opposite extreme, that the effect of horizontal gene transfer has been negligible, we are in a much better position — we still have our tools in place, and any suggestions of horizontal gene transfer will need to be backed up with good evidence.

There is no doubt a middle ground can be found, but amidst the furore over horizontal gene transfer, a number of researchers, making use of whole genome sequences, have reported results suggesting gene transfers have minimal effect on the ability to recover evolutionary trees [e.g., Snel et al. 1999; Sicheritz-Pontén & Andersson 2001]. These results suggest that it is possible to reconstruct the tree of life, and moreover, conclude that the 3 domain structure of the tree, as first reported by Woese and his colleagues, is supported by whole genomes.
In a timely article, Chuck Kurland has firmly criticised the eagerness of many to invoke horizontal gene transfer [Kurland 2000]. One particularly interesting aspect of his exposition is that he suggests a number of non-scientific factors that have contributed to the hype around horizontal gene transfer, so the article is as much a comment on how science currently operates as it is about gene transfer.
The root of the tree of life is hard to pin down [Pennisi 1999], and unbridled horizontal transfers early in the evolution of life can’t easily be distinguished from the limits of the sensitivity of our phylogenetic tools — that researchers have failed to reach a consensus on the shape of the tree does not mean that there must therefore have been horizontal transfer.
Indeed, there is another issue here: the reliability of the methods used for building evolutionary trees. Many researchers are very confident of the reliability of these methods, yet it is well known that these are based on mathematical algorithms which are convenient, but which do not necessarily accurately model real biological sequence evolution. These methods are likely to be robust for recent evolutionary events, and are definitely the most robust of the methods for detecting gene transfers. The problem is that near the root of the tree of life, they may be just too inaccurate to be useful for scrutinizing the very earliest events in evolution. In a worst-case scenario, the situation might in fact be a bit like timing the 100 m sprint at the Olympics with a sundial!
David Penny, Bennet McComish and their coworkers have recently tried to address this question by investigating how far back in time the standard models used in evolutionary tree building can go before they start to go wrong. Their overall conclusion is that the models used do seem to do a little better than might be expected from theory, but that the models still do poorly for very early evolutionary events. Penny and colleagues also criticise the recent trend in reporting conflicting trees as evidence for horizontal gene transfer — given how hard it seems to accurately reconstruct the tree of life, it is hard to say whether conflicting answers are evidence for gene transfer, or just reflect the limitations of the methods for building the trees. Their testing of the models suggests that it is just not reasonable to say that there is horizontal gene transfer just because two trees made with two different genes don’t come back with the same relationships between organisms. They make the following comment, which sums up the problem very succinctly:

“… there are major difficulties between data sets for ancient divergences. It is difficult to see why researchers are so confident in their results when the relatively recent divergences within mammals, birds, or flowering plants are only now being resolved.”
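
What it means for two gene trees to "not come back with the same relationships" can be sketched as a count of groupings found in one tree but not the other, in the spirit of the Robinson-Foulds distance (here in a simple rooted, clade-based form). This is an illustration only, not the test used by Penny and colleagues. The taxa A to D and the tree shapes are hypothetical, and both trees are assumed to contain the same taxa.

```python
def clades(tree):
    """Return the non-trivial clades of a nested-tuple tree as frozensets."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found

def clade_distance(tree_a, tree_b):
    """Number of clades found in one tree but not in the other."""
    return len(clades(tree_a) ^ clades(tree_b))

GENE_TREE_1 = (("A", "B"), ("C", "D"))
GENE_TREE_2 = (("A", "C"), ("B", "D"))
GENE_TREE_3 = (("B", "A"), ("D", "C"))  # same groupings as tree 1, drawn differently

print(clade_distance(GENE_TREE_1, GENE_TREE_2))  # 4
print(clade_distance(GENE_TREE_1, GENE_TREE_3))  # 0
```

A non-zero distance only says that the trees disagree. As the passage above argues, it cannot by itself tell apart a real gene transfer from an error in tree reconstruction.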

This work of Penny et al. [2001] and the picture coming from evolutionary trees of whole genomes [Snel et al. 1999; Sicheritz-Pontén & Andersson 2001] seems to bolster Kurland’s provocative assertion that horizontal transfer is ‘an ideology that is begging for deconstruction.’
Nevertheless, horizontal gene transfer does occur to some extent — Lawrence & Ochman’s 1998 paper is but one of many demonstrating this. Moreover, many of the technologies biologists use for inserting genes are simply human exploitation of what has been described as natural genetic engineering. The following naturally occurring mechanisms of ‘genetic engineering’ are routinely used in molecular biology laboratories:

Plasmids: small, usually circular, pieces of DNA that often carry genes that enable them to move from one bacterium to another.
Viruses: many will naturally insert themselves into the DNA of the organism they are infecting, and can be engineered to carry extra pieces of DNA.
Natural or assisted DNA uptake by bacterial cells.
Restriction endonucleases: molecular scissors that allow precise ‘cutting’ of DNA.

We also know it is possible to identify ancient gene transfers that may have occurred as far back as 2 billion years ago. Biologists can readily identify genes in the eukaryote repertoire that have come in via the mitochondrion, a compartment in the eukaryote cell which is bacterial in origin. Indeed, the handful of genes remaining in this compartment have been shown to be bacterial in origin, as have some that have since taken up residence in the eukaryote nucleus [Lang et al. 1999].
Returning for a moment to the biology of nomadic genes, the consensus emerging from studies of bacteria is that we should indeed start thinking of bacterial (and perhaps archaeal) genomes as being an ever changing collection of genes, but only to a degree [Hacker & Carniel 2001]:

Genes which are central to the running of any cell (often referred to as housekeeping genes) make up the ‘core’ genome.
Genes which come and go make up the flexible genome.

The flexible genome might be a window into the nomadic gene pool of bacteria — Lan & Reeves [2000] point out that closely related strains of bacteria differ in the genes they carry by as much as 20%, and this requires we reevaluate how we categorise species of bacteria. The hope is that the flexible genome can tell us how a particular bug is currently making its living. Losing genes isn’t in itself particularly surprising for organisms that are in competition to reproduce as fast as they can [Jeffares & Poole 2000] — genes that aren’t being used aren’t kept. ‘Use it or lose it’ is the maxim of natural selection, and there are plenty of examples wherever you look in biology (our appendix and tail bone both appear to have headed in that direction for instance).
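
The core/flexible split described above can be sketched with plain set operations. This assumes the simplest possible definition (core means present in every strain compared, flexible means everything else), which may be stricter than the definitions used in the cited work. Strain names and gene labels are hypothetical.

```python
def core_and_flexible(strains):
    """Split genes into those shared by all strains and those that are not."""
    gene_sets = list(strains.values())
    core = set.intersection(*gene_sets)
    flexible = set.union(*gene_sets) - core
    return core, flexible

# Hypothetical strains: "hk" genes stand for housekeeping genes, "x" for nomadic ones.
STRAINS = {
    "strain_1": {"hk1", "hk2", "hk3", "x1"},
    "strain_2": {"hk1", "hk2", "hk3", "x2"},
    "strain_3": {"hk1", "hk2", "hk3", "x1", "x3"},
}

core, flexible = core_and_flexible(STRAINS)
print(sorted(core))      # ['hk1', 'hk2', 'hk3']
print(sorted(flexible))  # ['x1', 'x2', 'x3']
```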
If this picture of core and flexible genomes is correct, it is good news for LUCA research because many universal features can in theory be recovered. This goes too for the ancient horizontal transfers seen with the mitochondrion. We should be optimistic that some patterns of horizontal gene transfer can be analyzed, though we still need to exercise care when looking so far back in time.

4. Fusion

When it comes to horizontal gene transfer, the hype about archaea and bacteria is arguably a case of squabbling over crumbs when compared to what seems to have happened in the early evolution of the eukaryotic cell. The now popular idea that eukaryotes emerged from a massive fusion (the ultimate gene transfer) event between a bacterium and an archaeon is also raising problems for LUCA biology.
While fusion is all about the origin of the eukaryotes, it is also about LUCA:
Fusion scenarios challenge Woese’s division of the living world into three domains.

Rather than a tree with three branches all tracing back to the LUCA, fusion has two lineages, with the eukaryotes emerging by fusion.
Fusion is in conflict with the emerging picture of the direct link between eukaryote biology and the RNA world.

A number of researchers have argued that the genes in the average eukaryote look to be a mixture of bacterial-like and archaeal-like. That is to say, at the genetic level, eukaryotes look to be some sort of genetic fusion between archaea and bacteria [Ribeiro & Golding 1998; Rivera et al. 1998; Horiike et al. 2001].
In understanding this, it has been helpful to divide genes into two categories: informational or operational [Rivera et al. 1998]. Informational genes are those which are involved with the copying, storing and regulation of genetic information, while operational genes are the recipes for making proteins for synthesis and breakdown of molecules in the cell, and are largely involved in energy metabolism.
Consistent with earlier research [see Gupta & Golding 1996] Rivera and colleagues found that there was rhyme and reason to the mixture of bacterial and archaeal genes in eukaryotes:

For informational genes — archaea and eukaryotes share more in common
For operational genes — bacteria and eukaryotes share more in common

Mark Ridley [2000] has suggested a good analogy for what many think has happened — a business merger. Instead of doubling up and having two departments for every aspect of the new company (Eukaryote Inc.), only one of each was kept, with the result being that the informational department came from Archaea Inc. and the ‘operational department’ from Bacteria Inc.
With the ongoing debate on how much horizontal gene transfer there is between organisms, the most exciting contribution to this picture looks not at the genes, but at gene networks. Taking a page from the study of complex networks such as the Internet, Eörs Szathmáry and colleagues [Podani et al. 2001] have recently shown that, while eukaryotic operational genes appear bacterial in origin, the structure of the metabolic network that these genes make up is in fact much more like what is observed in archaea. In keeping with the business merger analogy, this is perhaps equivalent to keeping the management structures of Archaea Inc. in place.
This is an exciting picture, and there is no question that modern-day eukaryotes are the product of some sort of fusion [Ribeiro & Golding 1998; Horiike et al. 2001]. However, the tricky thing is working out what it all means for the origin of the eukaryotic cell. These are some of the outstanding issues:

Why has such a merger apparently only happened once?
No one has ever observed modern bacteria and archaea fusing.
Why is it we don’t see ‘anti-eukaryotes’ (that is, organisms which have the operational genes of archaea and the informational genes of bacteria)?
A number of features found exclusively in eukaryotes are tricky to explain by a fusion event.

Indeed, there are a number of ways of explaining the fusion data, and consequently, there are quite a few different opinions on how the eukaryotes came to be [Minkel 2001].
If eukaryotes are the result of a fusion between a bacterium and an archaeon, then the 3 domain picture that Carl Woese’s work supports would be wrong. Fusion would imply that everything in eukaryote biology is either a recent innovation specific to this domain, or an offshoot of the biology of archaea and bacteria. In other words, if you want to know about LUCA, archaea and bacteria are the only two domains worth looking at. This is an assumption that is often made, regardless of fusion, and a point against which some researchers, myself included, have argued [see Jeffares & Poole 2000; also Forterre & Philippe 1999; Poole et al. 1999].
To make sense of the motivation behind the many emerging fusion scenarios for the origin of the eukaryote cell and how these might impact on LUCA biology, it helps to concentrate on the big picture, rather than wading through the details of the various scenarios. Laura Katz [1998] has written a good overview of the various fusion scenarios, though several new scenarios have been published since then [e.g., Margulis, et al., 2000; Horiike et al. 2001; Bell 2001; Hartman & Fedorov 2002]. Fusion theories have developed out of the endosymbiotic theory for the origin of the mitochondrion:

The endosymbiotic theory was first formulated by Mereschkowsky at the beginning of the 20th century, but reintroduced and updated by Lynn Margulis in the 1970s [Martin et al. 2001].
This theory argues that the mitochondrion, sometimes called the powerhouse of the cell, was originally a bacterial cell that took up residence in the ancestor of modern eukaryotes.
Both structural and genetic similarities have shown without a shadow of a doubt that the endosymbiotic theory is correct — the DNA in the mitochondrion is more closely related to bacteria than to the DNA stored in the eukaryotic cell nucleus.
It is now widely accepted that this event happened once only.
Despite much agreement, there is ongoing debate surrounding the endosymbiotic theory:
How was this partnership founded (e.g., oxygen-based or hydrogen-based metabolism)?
Was the host that ultimately engulfed the bacterium a eukaryote or an archaeon?

The first question opens up a whole can of worms (which we’ll avoid here), and is a current source of intense debate [Andersson & Kurland 1999; Rotte et al. 2000]. The second question is what has the major impact on LUCA biology, but these two questions have been unnecessarily muddled. The bottom line is that the genomic & gene network data supporting fusion between an archaeon and a bacterium can as easily be made to fit a fusion between a eukaryote and a bacterium.
The state of the field is as follows:

Everyone agrees that the mitochondrion evolved from a bacterial ancestor (though there is current debate as to what the bacterial ancestor was like, and how it interacted with its host).
There is disagreement as to whether the host was a eukaryote, or an archaeon.
Archaea-Bacteria fusion hypotheses require all genes found only in eukaryotes to have arisen post-LUCA, post-fusion — that is, they are indirectly descended from LUCA.

This comes into conflict with the picture of LUCA from RNA [Jeffares & Poole 2000], and Woese’s tree of life [Woese et al. 1990] — both require that eukaryotes were directly descended from LUCA.
So how do we distinguish between an archaeal and a eukaryotic host? The key is in two parts — one is historical and the other requires careful thought about how archaea and eukaryotes might be related:
The historical aspect centres around understanding the shift in thinking from the original picture of an ancient eukaryote playing host to the now largely agreed-upon picture of an archaeon playing host. This shift largely revolves around the changing branches in the eukaryote evolutionary tree [Dacks 2002].
The relationship between archaea and eukaryotes cuts to the heart of how researchers view the evolution of cells.

Archaezoa - missing links lost

So why is it that fusion hypotheses have become so popular? Indeed, this goes against the classical interpretation, most thoroughly espoused by Tom Cavalier-Smith [1987], who identified a disparate group of eukaryotes that appeared to him to be missing links: the so-called Archaezoa, which look like eukaryotes but lack mitochondria. His hypothesis, that the Archaezoa evolved before the introduction of mitochondria into the eukaryote lineage, held sway for many years, though it has recently been dropped in favour of fusion:
There is growing evidence that all eukaryotes once harboured mitochondria

Thus, the Archaezoa have probably all lost their mitochondria, rather than never having had them [Embley & Hirt 1998].

For instance, one group of the Archaezoa called the microsporidia are now widely accepted to have been incorrectly placed very deep on the eukaryotic tree. Indeed, probably most of the Archaezoa, if not all, are incorrectly placed on the tree. Rather than being missing links leading back to the origin of eukaryotes, they probably arose more recently [see Dacks & Doolittle 2001; Keeling 1998; Dacks 2002].

If the Archaezoa aren’t a series of missing links, the origin of eukaryotes may have been concurrent with the endosymbiosis that gave rise to mitochondria.

The conclusion from the above is that all eukaryotes probably had a mitochondrion, and without the Archaezoa, the only ancestor of eukaryotes is archaea. Voilà! We have fusion.

A case of throwing the baby out with the bathwater?

The important point to keep in mind about the picture for fusion is that it is a partial one, based largely on gene data. There are a large number of differences in the general structure of eukaryotic and prokaryotic (archaeal & bacterial) cells that aren’t explained by fusion [Poole & Penny 2001]. However, the major inconsistency is that the picture provided from trees is not the same for the relationship between archaea and eukaryotes, and that of bacteria and eukaryotes [Poole & Penny 2002, submitted]:
Margulis’ hypothesis is supported by trees. There is now overwhelming agreement that the mitochondrial branch lies within the bacterial tree, specifically within a subgroup called the alpha-proteobacteria [Lang et al. 1999], and a bacterial origin is also observed for chloroplasts (where photosynthesis takes place in plants and other photosynthetic eukaryotes).

Comparisons of relevant genes from eukaryotes and archaea should give this picture also, yet the evidence points to archaea and eukaryotes being very distinct domains.

This has strong parallels to the way the Archaezoa case is being treated — if there are no modern groups of archaea that appear to have split from the trunk of the tree of life before the appearance of eukaryotes, should we accept fusion? Stronger evidence was certainly required in testing the origin of the mitochondrion!

Another issue has to do with missing links. If the disappearance of the missing links (the Archaezoa) is used to suggest fusion, it is surely just as reasonable to argue against fusion on exactly the same grounds — there are no intermediates between eukaryotes with mitochondria and the archaea. For example, we don’t see examples of archaea with mitochondria in them, or archaea with nucleus-like structures.

With perhaps a couple of billion years separating the divergence of archaea and eukaryotes, it would be incorrect to require that the archaeon in the fusion must have been just like modern archaea. This cuts right to the heart of the problem — there is no inherent requirement that evolution leaves behind a series of intermediates for us to use to piece together the different evolutionary trajectories of archaea and eukaryotes. As with Chinese whispers, the end point may be very different from the starting phrase, but with evolution, all we have to look at is a number of different endpoints, from which we can only guess at the starting phrase!
While the specifics of the Archaezoan hypothesis are most probably wrong, it should not be thrown out completely. Explanations of eukaryote origins by fusion or via Margulis’ original scenario each suffer from the disappearance of intermediate forms, but this is expected. As Stephen Jay Gould often said, evolution results in bushes, not ladders.
A number of researchers [Forterre & Philippe 1999; Andersson & Kurland 1999; Penny & Poole 1999] maintain that the data for fusion can be reconciled with Lynn Margulis’ endosymbiotic theory and Carl Woese’s three-domain tree. Indeed, David Penny and I have argued that fusion does the worst job of explaining the available data [Penny & Poole 1999; Poole & Penny 2001]. For instance, fusion doesn’t fit with the hypothesis that some eukaryote features, which have since been lost in archaea and bacteria, actually date back to the LUCA (see Jeffares & Poole 2000).
What it very tentatively implies is that archaea and eukaryotes may have shared a more recent ancestor than either shares with bacteria, as is often shown in textbooks, but this too is not certain, since the relationships between these three groups are also a point of controversy [see Forterre & Philippe 1999; Pennisi 1999]!

5. Conclusions

We are now entering a very exciting period in uncovering the history of the LUCA — the field has been given a major boost from a broader range of ideas being applied to the problem:
Acknowledging the technical challenges with building the tree of life is an important step in the right direction. So is the idea of using the RNA world period in the origin of life for establishing aspects of the nature of the LUCA (see Jeffares & Poole 2000).

Despite the deluge of genome data available, it is hard to say whether we will ever actually manage to get a complete list of genes which LUCA possessed. We may pick out certain characteristics, but each needs to be evaluated with extreme care. The minimal genome study of Mushegian & Koonin [1996] demonstrates this and, in its failure, it stands as a strong caveat. Most researchers have toyed with this idea, and many were probably disappointed (then later relieved) that Mushegian & Koonin beat them to it!

Horizontal gene transfer is likely to be a factor in confounding such efforts, but it is better to err on the side of caution with respect to how pervasive this is in the history of life. The emerging picture from genome research suggests that not all genes transfer equally easily, and that there may be an ecological underpinning to the nature of gene transfer.

The fusion hypothesis has important consequences for the LUCA — if correct, the LUCA must have been like bacteria and/or archaea, because those unique features of the biology of eukaryotes had not yet evolved.

The three-domain tree that emerged from Woese’s original work permits features of all three domains to trace back to the LUCA, while the fusion hypothesis, in its strictest form, does not. Unless an argument for loss of a feature in all modern archaea can be made, the fusion hypothesis is diametrically opposed to the nature of the LUCA as suggested from RNA world fossils [Jeffares & Poole 2000].

Currently, many major assumptions are being questioned:

Were there three domains or two, with the third arising by fusion?
Was LUCA prokaryote-like or eukaryote-like or even a mixture?
Is the genetic code the only one possible?
Was early evolution more reliant on horizontal gene transfer than inheritance?
Was there one or more LUCAs?

Each of these questions could easily fill a book, and it has become impossible to cover every aspect of LUCA biology in one article. To the casual observer, the field of LUCA biology looks to be in disarray, with everyone having their own pet theory. This can be exciting, frustrating, and, at times, bordering on the absurd, but above all it is a sign of healthy debate! Many views and varied approaches to the problem mean some exciting answers to some fundamental questions about life’s origins are just around the corner…

Origin of membranes and cells 3

The notion that the common ancestor of the F‑ and V‑type ATPases had a different function, such as nucleic acid or protein translocation, is consistent with the differences in membrane biogenesis and DNA replication systems between archaea and bacteria. Archaeal phospholipids are chemically distinct from those that are present in bacterial and eukaryotic membranes; the glycerol moieties possess opposite chiralities, and the corresponding biosynthetic enzymes are either unrelated or are, at least, not orthologous. The core proteins of the DNA-replication systems (most notably, the elongating DNA polymerases and primases) are non-homologous in archaea and bacteria. These observations led to radical proposals on the nature of the Last Universal Common Ancestor (LUCA), namely, that it had neither membrane organization nor DNA replication and, accordingly, was not a typical cell. [Or maybe there was never a LUCA.] Furthermore, the origin of the cellular membrane itself seems to involve a catch‑22: for a membrane to function in a cell, it must be endowed with at least a minimal repertoire of transport systems, but it is unclear how such systems could evolve in the absence of a membrane. However, the model of a non-membrane-bound LUCA faces substantial difficulties. The principal issue is the ubiquitous conservation of several membrane proteins and complex, membrane-associated molecular machines, such as the SRP, core proteins of the Sec system and F‑ and V‑type ATPases themselves. The primitive function of an RNA and protein translocase, proposed above for an ancestor of the F‑ and V‑type ATPases, could [ahm, the typical guesswork] represent a potential solution to the primordial-membrane conundrum. The modern ion-impermeable membranes might have been preceded by primordial, ion-permeable proto-membranes that had the capacity to sequester RNA and proteins, and consisted of, for example, polyprenyl phosphates.
These structures would have had the potential to host the first membrane enzymes: initially, translocases of macromolecules, and subsequently, ion-translocating ATPases or ATP synthases and small molecule transporters. It has been extensively argued that pervasive horizontal gene exchange between primordial genetic systems was both an intrinsic feature of early, pre-cellular evolution [how could evolution act prior to cell life and replication? That's speculation and pseudo-science based on no evidence] and a necessary requirement for the evolution of increasingly complex entities. For this to occur in conjunction with the evolution of biological membranes [the cell membrane could not be the result of evolution either, since it had to be present, fully operational, when life began], nucleic acid translocation devices would seem to be an essential prerequisite. These first nucleic acid translocases mediated the import and export of RNA molecules in virus-like entities that contained several RNA segments, a primitive membrane and, possibly, a capsid-like structure. Conceptually, at least, such primitive translocases could have been analogous to the hexameric P4 ATPase that is detected in modern, lipid-containing dsRNA bacteriophages. This scenario for the origin of the F‑ and V‑type ATPases describes a succession of events, from a soluble helicase and a membrane channel, to RNA and protein translocases and, finally, to the ion-translocating ATPases. However, the divergence of these scenarios at the penultimate stage, leading to several alternatives with respect to the nature of the common ancestor of the F‑ and V‑type ATPases, has substantially different implications for the status of membranes in the LUCA. A protein translocase as a common ancestor (FIG. 2a) implies primitive, ion-leaky membranes in LUCA.

If, instead, the common ancestor of the F‑ and V‑ATPases was an ion translocase (FIG. 2b,c), a more conventional, cell-like LUCA with ion-impermeable membranes would be implied. Although such a cell-like LUCA is arguably the most efficient explanation for the existence of ubiquitous, membrane-associated structures, a major difficulty that is faced by this model is the necessity to explain the non-orthologous displacement of a considerable number of key enzymes as well as the membrane lipids.
