ElShamah - Reason & Science: Defending ID and the Christian Worldview

Otangelo Grasso: This is my library, where I collect information and present arguments developed by myself that lead, in my view, to the Christian faith, creationism, and Intelligent Design as the best explanation for the origin of the physical world.


The origin of viruses is another mystery besides the origin of life


Nejc Kejzar (2022): Viruses play a central role in all ecological niches; the origin of viruses, however, remains an open question. Phylogenetic analysis of distantly related viruses is hampered by a lack of detectable sequence similarity. 1

Viruses are essential agents for life
C. A. Suttle (2005): Viruses exist wherever life is found. They are a driver of global geochemical cycles and a reservoir of the greatest genetic diversity on Earth. In the oceans, viruses probably infect all living things, from bacteria to whales. They affect the form of available nutrients and the termination of algal blooms. Viruses can move between marine and terrestrial reservoirs, raising the spectre of emerging pathogens. Because viruses are significant agents of microbial mortality, they have an effect on nutrient cycling. Moreover, the narrow host range of most viruses suggests that infection is important in controlling the composition of planktonic communities. Viruses are catalysts that accelerate the transformation of nutrients from particulate (living organisms) to dissolved states, where it can be incorporated by microbial communities. 2

Viruses are rarely in the spotlight when it comes to elucidating biological origins. Unjustifiably so, since they are essential for life. Hugh Ross (2020): Without viruses, bacteria would multiply and, within a relatively short time period, occupy every niche and cranny on Earth’s surface. The planet would become a giant bacterial slime ball. Those sextillions of bacteria would consume all the resources essential for life and die. Viruses keep Earth’s bacterial population in check. They break up and kill bacteria at the just-right rates and in the just-right locations so as to maintain a population and diversity of bacteria that is optimal for both the bacteria and for all the other life-forms. It is important to note that all multicellular life depends on bacteria being present at the optimal population level and optimal diversity. We wouldn’t be here without viruses! Viruses also play a crucial role in Earth’s carbon cycle. They and the bacterial fragments they create are carbonaceous substances. Through their role in precipitation, they collect as vast carbonaceous sheets on the surfaces of the world’s oceans. These sheets or mats of viruses and bacterial fragments sink slowly and eventually land on the ocean floors. As they are sinking they provide important nutrients for deep-sea and benthic (bottom-dwelling) life. Plate tectonics drive much of the viral and bacterial fragments into Earth’s crust and mantle where some of that carbonaceous material is returned to the atmosphere through volcanic eruptions. 3

Virus-archaea interactions play a central role in global biogeochemical cycles. Ramesh K Goel (2021): Viruses play vital biogeochemical and ecological roles by (a) expressing auxiliary metabolic genes during infection, (b) enhancing the lateral transfer of host genes, and (c) inducing host mortality. Even in harsh and extreme environments, viruses are major players in carbon and nutrient recycling from organic matter. 4 

Eugene V. Koonin (2020): Lytic infections (involving the replication of a viral genome) of cellular organisms, primarily bacteria, by viruses play a central role in the biological matter turnover in the biosphere. Considering the enormous abundance and diversity of viruses and other mobile genetic elements (MGEs), and the ubiquitous interactions between MGEs and cellular hosts, a thorough investigation of the evolutionary relationships among viruses and mobile genetic elements (MGEs) is essential to advance our understanding of the evolution of life 5 

Eugene V Koonin (2013): Virus killing of marine bacteria and protists largely determines the composition of the biota, provides a major source of organic matter for consumption by heterotrophic organisms, and also defines the formation of marine sediments through the deposition of skeletons of killed plankton organisms such as foraminifera and diatoms. 6 

Rachel Nuwer (2020): “If all viruses suddenly disappeared, the world would be a wonderful place for about a day and a half, and then we’d all die – that’s the bottom line.” The vast majority of viruses are not pathogenic to humans, and many play integral roles in propping up ecosystems. Others maintain the health of individual organisms – everything from fungi and plants to insects and humans. “We live in a balance, in a perfect equilibrium.” In 2018, for example, two research teams independently made a fascinating discovery. A gene of viral origin encodes a protein that plays a key role in long-term memory formation by moving information between cells in the nervous system. 7

P. Forterre (2008): Historically, three hypotheses have been proposed to explain the origin of viruses: (1) they originated in a precellular world (‘the virus-first hypothesis’); (2) they originated by reductive evolution from parasitic cells (‘the reduction hypothesis’); and (3) they originated from fragments of cellular genetic material that escaped from cell control (‘the escape hypothesis’). All these hypotheses had specific drawbacks. The virus-first hypothesis was usually rejected firsthand since all known viruses require a cellular host. The reduction hypothesis was difficult to reconcile with the observation that the most reduced cellular parasites in the three domains of life, such as Mycoplasma in Bacteria, Microsporidia in Eukarya, or Nanoarchaea in Archaea, do not look like intermediate forms between viruses and cells. Finally, the escape hypothesis failed to explain how such elaborate structures as complex capsids and nucleic acid injection mechanisms evolved from cellular structures since we do not know any cellular homologs of these crucial viral components. 

Much like the concept of prokaryotes became the paradigm on how to think about bacterial evolution, the escape hypothesis became the paradigm favored by most virologists to solve the problem of virus origin. This scenario was chosen mainly because it was apparently supported by the observation that modern viruses can pick up genes from their hosts. In its classical version, the escape theory suggested that bacteriophages originated from bacterial genomes and eukaryotic viruses from eukaryotic genomes. This led to a damaging division of the virologist community into those studying bacteriophages and those studying eukaryotic viruses, ‘phages’ and viruses being somehow considered to be completely different entities. The artificial division of the viral world between ‘viruses’ and bacteriophages also led to much confusion on the nature of archaeal viruses. Indeed, although most of them are completely unrelated to bacterial viruses, they are often called ‘bacteriophages’, since archaea (formerly archaebacteria) are still considered by some biologists as ‘strange bacteria’. For instance, archaeal viruses are grouped with bacteriophages in the drawing that illustrates viral diversity in the last edition of the Virus Taxonomy Handbook. Hopefully, these outdated visions will finally succumb to the accumulating evidence from molecular analyses. 

Viruses Are Not Derived from Modern Cells 
Abundant data are now already available to discredit the escape hypothesis in its classical adaptation of the prokaryote/eukaryote paradigm. This hypothesis indeed predicts that proteins encoded by bacterial viruses (avoiding the term bacteriophage here) should be evolutionarily related to bacterial proteins, whereas proteins encoded by viruses infecting eukaryotes should be related to eukaryotic proteins. This turned out to be wrong since, with a few exceptions (that can be identified as recent transfers from their hosts), most viral encoded proteins have either no homologs in any cell or only distantly related homologs. In the latter cases, the most closely related cellular homolog is rarely from the host and can even be from cells of a domain different from the host. More and more biologists are thus now fully aware that viruses form a world of their own, and that it is futile to speculate on their origin in the framework of the old prokaryote/eukaryote dichotomy.

A more elaborate version has been proposed by William Martin and Eugene Koonin, who suggested that life originated and evolved in the cell-like mineral compartments of a warm hydrothermal chimney. In that model, viruses emerged from the assemblage of self-replicating elements using these inorganic compartments as the first hosts. The formation of true cells occurred twice independently only at the end of the process (and at the top of the chimney), producing the first archaea and bacteria. The latter escaped from the same chimney system as already fully elaborated modern cells. In the model, viruses first co-evolved with acellular machineries producing nucleotide precursors and proteins.

The emergence of the RNA world involves at least the existence of complex mechanisms to produce ATP, RNA, and proteins. This means an elaborated metabolism to produce ribonucleotide triphosphate (rNTP) and amino acids, RNA polymerases, and ribosomes, as well as an ATP-generating system. If such a complex metabolism was present, it appears unlikely that it was unable to produce lipid precursors, hence membranes. If this is correct, then ‘modern’ viruses did not predate cells but originated in a world populated by primitive cells. 

Viruses and the Origin of DNA 
Considering the possibility that at least some DNA viruses originated from RNA viruses, it has been suggested that DNA itself could have appeared in the course of virus evolution (in the context of competition between viruses and their cellular hosts). Indeed, DNA is a modified form of RNA, and both viruses and cells often chemically modify their genomes to protect themselves from nucleases produced by their competitor. It is usually considered that DNA replaced RNA in the course of evolution simply because it is more stable (thanks to the removal of the reactive oxygen in position 2' of the ribose) and because cytosine deamination (producing uracil) can be corrected in DNA (where uracil is recognized as an alien base) but not in RNA.

Anyone who studies biochemistry knows the enormous complexity of ribonucleotide reductase (RNR) enzymes, which remove the oxygen from the 2' position of ribose, the sugar of the RNA backbone, and so convert ribonucleotides into the deoxyribonucleotides from which DNA is built. There is no scientific explanation for how RNA could have transitioned to DNA, nor for the origin of the ultra-complex machinery needed to catalyze the required reactions. Molecules have no goals and no foresight. They did not think about the advantage of stability when transitioning to DNA. Nothing about inert chemicals and physical forces says that they want to become part of a living, self-replicating entity called a cell at the end of a chemical evolutionary process. Molecules have no "drive"; they have no urge or "want" to find ways to become information-bearing biomolecules, to harness energy as ATP molecules, to become more efficient, or to become part of a molecular machine or, in the end, a complex organism. There is a further hurdle to overcome. More and more biologists are now fully aware that viruses form a world of their own. Proteins encoded by bacterial viruses are mostly not related to bacterial proteins. Modern viruses exhibit very different types of genomes (RNA, DNA, single-stranded, double-stranded), including highly modified DNA, whereas all modern cellular organisms have double-stranded DNA genomes. So the question becomes how viruses with a DNA genome originated, given that they had an origin independent of living cells. Even more: P. Forterre (2008): Many DNA viruses encode their own enzymes for deoxynucleotide triphosphate (dNTP) production, ribonucleotide reductases (the enzymes that produce deoxyribonucleotides from ribonucleotides), and thymidylate synthases (the enzymes that produce deoxythymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP)).
That would mean RNR enzymes evolved independently, in a convergent manner, twice!

The replacement of RNA by DNA as cellular genetic material would have thus allowed genome size to increase, with a concomitant increase in cellular complexity (and efficiency) leading to the complete elimination of RNA cells by the ancestors of modern DNA cells. This traditional textbook explanation has been recently criticized as incompatible with Darwinian evolution since it does not explain what immediate selective advantage allowed the first organism with a DNA genome to predominate over former organisms with RNA genomes. Indeed, the newly emerging DNA cell could not have immediately enlarged its genome and could not have benefited straight away from a DNA repair mechanism to remove uracil from DNA. Instead, if the replacement of RNA by DNA occurred in the framework of the competition between cells and viruses, either in an RNA virus or in an RNA cell, modification of the RNA genome into a DNA genome would have immediately produced a benefit for the virus or the cell. It has been argued that the transformation of RNA genomes into DNA genomes occurred preferentially in viruses because it was simpler to change in one step the chemical composition of the viral genome than that of the cellular genomes (the latter interacting with many more proteins). Furthermore, modern viruses exhibit very different types of genomes (RNA, DNA, single-stranded, double-stranded), including highly modified DNA, whereas all modern cellular organisms have double-stranded DNA genomes. This suggests a higher degree of plasticity for viral genomes compared to cellular ones. The idea that DNA originated first in viruses could also explain why many DNA viruses encode their own enzymes for deoxynucleotide triphosphate (dNTP) production, ribonucleotide reductases (the enzymes that produce deoxyribonucleotides from ribonucleotides), and thymidylate synthases (the enzymes that produce deoxythymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP)).
Because in modern cells, dTMP is produced from dUMP, the transition from RNA to DNA occurred likely in two steps, first with the appearance of ribonucleotide reductase and production of U-DNA (DNA containing uracil), followed by the appearance of thymidylate synthases and formation of T-DNA (DNA containing thymine). The existence of a few bacterial viruses with U-DNA genomes has been taken as evidence that they could be relics of this period of evolution. If DNA first appeared in the ancestral virosphere, one has also to explain how it was later on transferred to cells. One scenario posits the co-existence for some time of an RNA cellular chromosome and a DNA viral genome (episome) in the same cell, with the progressive transfer of the information originally carried by the RNA chromosome to the DNA ‘plasmid’ via retro-transposition. 8

What came first, cells or viruses? 
This is a classic chicken-and-egg problem. Gladys Kostyrka (2016): Cells depend on viruses, but viruses depend on cells as a host for replication. What came first? How could viruses play critical roles in the origin of life (OL) if life relies on cellular organization and if viruses are defined as parasites of cells? In other words, how could viruses play a role in the emergence of cellular life if the existence of cells is a prerequisite for the existence of viruses? 9

Virus origins: From what did viruses evolve or how did they initially arise? The answer to this question is not simple, because, while viruses all share the characteristics of being obligate intracellular parasites that use host cell machinery to make their components which then self-assemble to make particles that contain their genomes, they most definitely do not have a single origin.

Virus origins: From what did viruses initially arise?
E.Rybicki: The graphic depicts a possible scenario for the evolution of viruses: “wild” genetic elements could have escaped, or even been the agents for transfer of genetic information between, both RNA-containing and DNA-containing “protocells”, to provide the precursors of retroelements and of RNA and DNA viruses.  Later escapes from Bacteria, Archaea and their progeny Eukarya would complete the virus zoo. It is generally accepted that many viruses have their origins as “escapees” from cells; rogue bits of nucleic acid that have taken the autonomy already characteristic of certain cellular genome components to a new level.  Simple RNA viruses are a good example of these: their genetic structure is far too simple for them to be degenerate cells; indeed, many resemble renegade messenger RNAs in their simplicity. 10


Viruses, the most abundant biological entities on earth
Steven W. Wilhelm (2012): Viruses are the most abundant life forms on Earth, with an estimated 10^31 total viruses globally. 11 

Eugene V. Koonin (2020): Viruses appear to be the dominant biological entities on our planet, with the total count of virus particles in aquatic environments alone at any given point in time reaching the staggering value of 10^31, a number that is at least an order of magnitude greater than the corresponding count of cells. The genetic diversity of viruses is harder to assess, but, beyond doubt, the gene pool of viruses is, at the least, comparable to that of hosts. The estimates of the number of distinct prokaryotes on earth differ widely, in the range of 10^7 to 10^12, and accordingly, estimation of the number of distinct viruses infecting prokaryotes at 10^8 to 10^13 is reasonable. Even assuming the lowest number in this range and even without attempting to count viruses of eukaryotes, these estimates represent vast diversity. Despite the rapid short-term evolution of viruses, the key genes responsible for virion formation and virus genome replication are conserved over the long term due to selective constraints. Genetic parasites inescapably emerge even in the simplest molecular replicator systems and persist through their subsequent evolution. Together with the ubiquity and enormous diversity of viruses in the extant biosphere, these findings lead to the conclusion that viruses and other mobile genetic elements (MGEs) played major roles in the evolution of life ever since its earliest stages. 5

G.Witzany (2015): If we imagine that 1ml of seawater contains one million bacteria and ten times more viral sequences it can be determined that 10^31 bacteriophages infect 10^24 bacteria per second. 12
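The per-millilitre figures in the Witzany quote can be checked against the global count of about 10^31 virus particles with some simple arithmetic. The sketch below is only a rough consistency check. The total ocean volume (about 1.3 billion cubic kilometres) is an assumption that does not come from the quoted sources, and applying surface-like concentrations to the whole ocean is a deliberate oversimplification. All names in the code are made up for illustration.

```python
import math

# Figures taken from the Witzany (2015) quote above.
BACTERIA_PER_ML = 1e6          # one million bacteria per ml of seawater
VIRUSES_PER_BACTERIUM = 10     # "ten times more viral sequences"

# ASSUMPTION (not from the quoted sources): total ocean volume of
# roughly 1.3e9 cubic kilometres; 1 km^3 = 1e15 ml.
OCEAN_VOLUME_KM3 = 1.3e9
ML_PER_KM3 = 1e15
ocean_volume_ml = OCEAN_VOLUME_KM3 * ML_PER_KM3

# Crude simplification: the same concentration at every depth.
total_bacteria = BACTERIA_PER_ML * ocean_volume_ml
total_viruses = total_bacteria * VIRUSES_PER_BACTERIUM

# Order of magnitude (power of ten) of each total.
bacteria_order = math.floor(math.log10(total_bacteria))
virus_order = math.floor(math.log10(total_viruses))

print(f"bacteria: about 10^{bacteria_order}")
print(f"viruses:  about 10^{virus_order}")
```

Under these assumptions the virus total lands on the same order of magnitude as the 10^31 quoted by Wilhelm and Koonin. The bacterial total is ten times smaller, which follows directly from the assumed 10:1 ratio.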

Eugene V. Koonin (2022): We argue that viruses emerge on a number (even if far from astronomical) of independent occasions, so that the number of realms will considerably increase from the current 6, by splitting some of the current realms, giving the realm status to some of the currently unclassified groups of viruses and discovery of new distinct groups. Viruses are often considered to be the most numerous entities in the global biosphere. The most common estimates suggest that there are on the order of 10^31 virus particles on the planet at any given moment, about an order of magnitude greater than the total number of cells. To the best of our current understanding, all organisms on earth are hosts to multiple viruses, with the possible exception of some endosymbiotic bacteria. Empirical observations on the ubiquity of viruses are buttressed by theoretical arguments on the inevitable emergence of genetic parasites in any replicator system. Virus genomes are also extremely diverse, and the ongoing metagenomic and metatranscriptomic revolution reveals the vast scale of that diversity. The case of RNA viruses can serve as an apt illustration. Astonishingly, analysis of a single metatranscriptome, apparently coming from an environment rich in unicellular eukaryotes hosting RNA viruses, resulted in a twofold expansion of the known RNA virome. Three independent subsequent studies exploring thousands of metatranscriptomes from diverse environments each led to a further, several-fold increase in the number of known distinct RNA viruses (distinct, in this case, means not too closely related to each other, more specifically, clusters of genomes with similar sequences that roughly correspond to a virus species level), which combined, would amount to more than an order of magnitude expansion. Rarefaction analysis shows that saturation of the RNA virus diversity is not yet in sight.
Metagenomic studies indicate that the case of DNA viruses is similar, and expansion of some groups, for instance, tailless bacteriophages, or tailed phages of the expansive order Crassvirales has been even more dramatic.

So how many distinct viruses, or virus species, are there in the global virome altogether? Given that metagenomic and metatranscriptomic analyses (below we refer to these collectively as metaviromics insofar as applied to virus discovery) are not yet approaching saturation, this number cannot be inferred by extrapolation from available data. However, to obtain a rough, back-of-the-envelope estimate, we can take a different approach modeled over that employed previously to estimate the number of unique microbial genes. The great majority of viruses on earth are tailed and tailless phages infecting bacteria; viruses of archaea and eukaryotes are only relatively small additions. Let us conservatively assume that there are 10^6 to 10^7 bacterial species on earth (some estimates are orders of magnitude higher). Most if not all bacteria are hosts to multiple viruses. For Escherichia coli alone, about a hundred bacteriophages have been identified, whereas for Mycobacterium smegmatis mc2155, more than 10,000 individual mycobacteriophages have been isolated, although only 2,100 of these have been sequenced and thus it remains to be determined how many different virus species they represent. Furthermore, analysis of CRISPR spacers, the majority of which appear to be virus-derived but do not match known viruses, implies large, host species-specific viromes. Let us assume 10 to 100 virus species per host species as a conservative estimate. Then, the size of the global virome can be crudely estimated at 10^7 to 10^9 distinct virus species – obviously, even the low bound in this range, probably, a vast underestimate, is a huge number. The upper bound appears more realistic, so there is likely to be about a billion virus species if not more on earth – evidently, a long way to go from the currently recognized 10^4 species until we know them all. 13
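The back-of-the-envelope estimate in the quote above is a product of two ranges, and it can be reproduced in a few lines. This is only a sketch of the arithmetic as described in the quote. The function and variable names are made up for illustration and do not come from the paper.

```python
def virome_species_range(hosts_low, hosts_high, per_host_low, per_host_high):
    """Multiply a range of host species counts by a range of
    virus species per host to get a (low, high) virome estimate."""
    return hosts_low * per_host_low, hosts_high * per_host_high


# Inputs as stated in the quote: 10^6 to 10^7 bacterial species,
# 10 to 100 virus species per host species.
low, high = virome_species_range(10**6, 10**7, 10, 100)

# Currently recognized virus species, as stated in the quote.
RECOGNIZED_SPECIES = 10**4

# How many times larger the upper estimate is than what is catalogued.
gap_factor = high // RECOGNIZED_SPECIES

print(f"estimated virome: {low:.0e} to {high:.0e} virus species")
print(f"upper bound is {gap_factor:.0e} times the recognized species count")
```

The result matches the range given in the quote, 10^7 to 10^9. The upper bound of about a billion species is a hundred thousand times the roughly 10^4 species recognized so far.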

Capsid-encoding organisms in contrast to ribosome-encoding organisms
Eugene V. Koonin (2014): Viruses were defined as one of the two principal types of organisms in the biosphere, namely, as capsid-encoding organisms in contrast to ribosome-encoding organisms, i.e., all cellular life forms. Structurally similar, apparently homologous capsids are present in a huge variety of icosahedral viruses that infect bacteria, archaea, and eukaryotes. These findings prompted the concept of the capsid as the virus “self” that defines the identity of deep, ancient viral lineages. This “capsidocentric” perspective on the virus world is buttressed by observations on the extremely wide spread of certain capsid protein (CP) structures that are shared by an enormous variety of viruses, from the smallest to the largest ones, that infect bacteria, archaea, and all divisions of eukaryotes. The foremost among such conserved capsid protein structures is the so-called jelly roll capsid (JRC) protein fold, which is represented, in a variety of modifications, in extremely diverse icosahedral (spherical) viruses that infect hosts from all major groups of cellular life forms. In particular, the presence of the double-beta-barrel JRC (JRC2b) in a broad variety of double-stranded DNA (dsDNA) viruses infecting bacteria, archaea, and eukaryotes has been touted as an argument for the existence of an “ancient virus lineage,” of which this type of capsid protein is the principal signature (9). Under this approach, viruses that possess a single beta-barrel JRC (JRC1b)—primarily RNA viruses and single-stranded DNA (ssDNA) viruses— could be considered another major viral lineage. A third lineage is represented by dsDNA viruses with icosahedral capsids formed by the so-called HK97-like capsid protein (after bacteriophage HK97, in which this structure was first determined), with a fold that is unrelated to the jelly roll fold. 
This assemblage of viruses is much less expansive than those defined by either JRC2b or JRC1b, but nevertheless, it unites dsDNA viruses from all three domains of cellular life. The capsid-based definition of a virus does capture a quintessential distinction between the two major empires of life forms, i.e., viruses and cellular life forms.    14

Replication-expression classes of viruses and homologous, capsidless selfish elements. (A) RNA and reverse-transcribing elements. (B) DNA elements. The three shades of the blue background denote approximate relative prevalences of capsidless selfish elements in the respective Baltimore class (i.e., low for ssRNA genomes, moderate for dsDNA genomes, and high for retroelements and ssDNA genomes; so far, there are no capsidless elements with negative-strand RNA or dsRNA genomes). The abbreviations for the virus hallmark genes are as follows: RdRp, RNA-dependent RNA polymerase; S3H, superfamily 3 helicase; JRC, jelly roll capsid protein; RT, reverse transcriptase; INT, retro-type integrase; RCRE, rolling circle replication endonuclease; A-E DNA primase, archaeo-eukaryotic DNA primase; UL9-like S2H, UL9-like superfamily 2 helicase; FtsK pack-ATPase, FtsK-family packaging ATPase; ATPase suT, ATPase subunit of terminase; ppPolB, protein-primed DNA polymerase B; Ad-like Pro, adeno-like protease; and mat-Pro, maturation protease. The hallmark genes that are present in all known members of the given class are rendered in bold. For negative-strand RNA viruses, the RdRp is indicated in parentheses to emphasize the tentative relationship between the RNA polymerases of these viruses and the RdRp/RT. Helitrons are marked by an asterisk because of their distinct replication cycle: unlike other RCRE-encoding ssDNA selfish elements, helitrons are transposed as dsDNA. DdDp, DNA-dependent DNA polymerase.

Viruses with a different genetic alphabet
Stephen Freeland (2022): The genetic material of more than 200 bacteriophage viruses uses 2-aminoadenine (Z) instead of adenine (A). This minor difference in chemical structures is nevertheless a fundamental deviation from the standard alphabet of four nucleobases established by biological evolution at the time of life's Last Universal Common Ancestor (LUCA). Placed into broader context, the finding illustrates a deep shift taking place in our understanding of the chemical basis for biology. 15

What is the best explanation for viral origin?
Edward C. Holmes (2011): The central debating point in discussions of the origin of viruses is whether they are ancient, first appearing before the last universal cellular ancestor (LUCA), or evolved more recently, such that their ancestry lies with genes that “escaped” from the genomes of their cellular host organisms and subsequently evolved independent replication. The escaped gene theory has traditionally dominated thinking on viral origins (reviewed in reference 37), in large part because viruses are parasitic on cells now and it has been argued that this must always have been the case. However, there is no gene shared by all viruses, and recent data are providing increasingly strong support for a far more ancient origin. 16

Koonin mentions three possible scenarios for their origin. One of them: 

Eugene V. Koonin (2017): The virus-first hypothesis, also known as the primordial virus world hypothesis, regards viruses (or virus-like genetic elements) as intermediates between prebiotic chemical systems and cellular life and accordingly posits that virus-like entities originated in the precellular world. The second: The regression hypothesis, in contrast, submits that viruses are degenerated cells that have succumbed to obligate intracellular parasitism and in the process shed many functional systems that are ubiquitous and essential in cellular life forms, in particular the translation apparatus. The third: The escape hypothesis postulates that viruses evolved independently in different domains of life from cellular genes that embraced selfish replication and became infectious. 17

The second and third are questionable, given that evolution would weed out degenerated cell parts that harm the cell's survival. The hypothesis that these parts would become parasites runs directly against the evolutionary paradigm, since evolution is about the survival of the fittest, not about evolving parasites that would kill the cell. Furthermore, if viruses were not present right from the beginning, how would ecological homeostasis have been guaranteed?

Koonin agrees that the first is the most plausible. He writes:  The diversity of genome replication-expression strategies in viruses, contrasting the uniformity in cellular organisms, had been considered to be most compatible with the possibility that the virus world descends directly from a precellular stage of evolution, and an updated version of the escape hypothesis states that the first viruses have escaped not from contemporary but rather from primordial cells, predating the last universal cellular ancestor. The three evolutionary scenarios imply different timelines for the origin of viruses but offer little insight into how the different components constituting viral genomes might have combined to give rise to modern viruses.

The conclusion that can be drawn is that viruses co-emerged with life, and that this occurred multiple times. If emerging just once is extremely unlikely based on the odds, how much more unlikely is emerging multiple times?

Koonin continues: A typical virus genome encompasses two major functional modules, namely, determinants of virion formation and those of genome replication. Understanding the origin of any virus group is possible only if the provenances of both components are elucidated. Given that viral replication proteins often have no closely related homologs in known cellular organisms, it has been suggested that many of these proteins evolved in the precellular world or in primordial, now extinct, cellular lineages. The ability to transfer the genetic information encased within capsids, the protective proteinaceous shells that comprise the cores of virus particles (virions), is unique to bona fide viruses and distinguishes them from other types of selfish genetic elements such as plasmids and transposons. Thus, the origin of the first true viruses is inseparable from the emergence of viral capsids. Studies on the origin of viral capsids are severely hampered by the high sequence divergence among these proteins.

Analysis of the available sequences and structures of major capsid proteins (CP) and nucleocapsid (NC) proteins encoded by representative members of 135 virus taxa (117 families and 18 unassigned genera) allowed us to attribute structural folds to 76.3% of the known virus families and unassigned genera. The remaining taxa included viruses that do not form viral particles (3%) and viruses for which the fold of the major virion proteins is not known and could not be predicted from the sequence data (20.7%). The former group includes capsidless viruses of the families Endornaviridae, Hypoviridae, Narnaviridae, and Amalgaviridae, all of which appear to have evolved independently from different groups of full-fledged capsid-encoding RNA viruses. The latter category includes eight taxa of archaeal viruses with unique morphologies and genomes, pleomorphic bacterial viruses of the family Plasmaviridae, and 19 diverse taxa of eukaryotic viruses. It should be noted that, with the current explosion of metagenomics studies, the number and diversity of newly recognized virus taxa will continue to rise. Although many of these viruses are expected to have previously observed CP/NC protein folds, novel architectural solutions doubtlessly will be discovered as well. 17
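The percentages in the passage above can be cross-checked against the taxon counts it gives. The sketch below assumes that the 3% group is exactly the four capsidless families named in the text and that Plasmaviridae counts as a single taxon. Both are readings of the quote, not figures stated in it. The variable names are made up for illustration.

```python
# Counts stated in the quote: 117 families + 18 unassigned genera.
TOTAL_TAXA = 117 + 18  # 135 virus taxa

# ASSUMPTION: the "do not form viral particles" group is exactly the
# four named families (Endornaviridae, Hypoviridae, Narnaviridae,
# Amalgaviridae).
no_particle_taxa = 4

# Unknown-fold group: 8 archaeal taxa + Plasmaviridae (ASSUMED to be
# one taxon) + 19 eukaryotic virus taxa.
unknown_fold_taxa = 8 + 1 + 19

# Whatever remains must be the taxa with an attributed structural fold.
assigned_fold_taxa = TOTAL_TAXA - no_particle_taxa - unknown_fold_taxa


def percent_of_total(count):
    """Share of all 135 taxa, rounded to one decimal place."""
    return round(100 * count / TOTAL_TAXA, 1)


for label, count in [
    ("fold attributed", assigned_fold_taxa),
    ("no viral particle", no_particle_taxa),
    ("fold unknown", unknown_fold_taxa),
]:
    print(f"{label}: {count} taxa, {percent_of_total(count)}%")
```

Under these assumptions the counts come to 103, 4 and 28 taxa, which reproduce the quoted 76.3%, 3% and 20.7%. The figures in the passage are therefore internally consistent.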

Gladys Kostyrka (2016): To the French molecular biologist and microbiologist Patrick Forterre, viruses could not exist without cells because he endorses their definition as intracellular obligate parasites. However, this does not mean that viruses did not exist prior to DNA cells. On the basis of comparative sequence analyses of proteins and nucleic acids from viruses and their cellular hosts, Forterre hypothesized that viruses originated before DNA cells and before LUCA (the Last Universal Cellular Ancestor). Forterre’s hypothesis was first formulated in the 1990s and was inspired by protein phylogenies. “Comparative sequence analyses of type II DNA topoisomerases and DNA polymerases from viruses, prokaryotes and eukaryotes suggest that viral genes diverged from cellular genes before the emergence of the last common ancestor (LCA) of prokaryotes and eukaryotes”. At least some viruses originated not from the known cellular domains (Bacteria, Eukarya, and Archaea) but before these three domains were formed. In other words, these viruses must have originated before LUCA.

There are several genes shared by many groups of viruses with extremely diverse replication-expression strategies, genome size and host ranges. In other words, there are several “hallmark genes”, coding for several hallmark proteins present in many viruses. Yet these genes and proteins are not supposed to be shared by viruses that do not have the same origin, given their diversity. This “key observation” of several hallmark viral genes is thus problematic. It is even more problematic if one takes into account the fact that these genes are not found in any cellular life forms. It is then highly improbable that these viral hallmark genes were originally cellular genes that were transferred to viruses. Koonin assumes that these genes originated in a primordial viral world and were conserved. “The simplest explanation for the fact that the hallmark proteins involved in viral replication and virion formation are present in a broad variety of viruses but not in any cellular life forms seems to be that the latter actually never possessed these genes. Rather, the hallmark genes, probably, antedate cells and descend directly from the primordial pool of virus-like genetic elements” 17

If Koonin's hypothesis were correct, nucleotides in a primordial pool would have needed foresight to assemble into genes that would only later encode virions, which in turn depend on host cells to replicate. That is simply not tenable. The evidence is better explained by the creation model: it agrees with the hypothesis that God created each species/kind, and the viruses, separately. Invoking multiple independent origins by natural means, together with the later emergence of symbiotic and parasitic relationships, only multiplies the improbabilities, and naturalistic proposals become more and more untenable.

Achieving the same function through different molecular assembly routes refutes an evolutionary-naturalistic origin of viruses
Eugene V. Koonin (2015): The ability to form virions is the key feature that distinguishes viruses from other types of mobile genetic elements, such as plasmids and transposons. The origin of bona fide viruses thus appears to be intimately linked to and likely concomitant with the origin of the capsids. However, tracing the provenance of viral capsid proteins (CPs) proved to be particularly challenging because they typically do not display sequence or structural similarity to proteins from cellular life forms. Over the years, a number of structural folds have been discovered in viral CPs. Strikingly, morphologically similar viral capsids, in particular, icosahedral, spindle-shaped and filamentous ones, can be built from CPs which have unrelated folds. Thus, viruses have found multiple solutions to the same problem. Nevertheless, the process of de novo origin of viral CPs remains largely enigmatic.  18

Stephen J. Gould (1990): …No finale can be specified at the start, none would ever occur a second time in the same way, because any pathway proceeds through thousands of improbable stages. Alter any early event, ever so slightly, and without apparent importance at the time, and evolution cascades into a radically different channel. 19

Fazale Rana (2001): Gould’s metaphor of “replaying life’s tape” asserts that if one were to push the rewind button, erase life’s history, and let the tape run again, the results would be completely different.  The very essence of the evolutionary process renders evolutionary outcomes as nonreproducible (or nonrepeatable). Therefore, “repeatable” evolution is inconsistent with the mechanism available to bring about biological change. 20

William Schopf (2002): Because biochemical systems comprise many intricately interlinked pieces, any particular full-blown system can only arise once…Since any complete biochemical system is far too elaborate to have evolved more than once in the history of life, it is safe to assume that microbes of the primal LCA cell line had the same traits that characterize all its present-day descendants. 21 22

Hugh M. B. Harris (2021): Viruses are ubiquitous. They infect almost every species and are probably the most abundant biological entities on the planet, yet they are excluded from the Tree of Life (ToL). Viruses may well be essential for ecosystem diversity. 23

Matti Jalasvuori (2012): Viruses play a vital role in all cellular and genetic functions, and we can therefore define viruses as essential agents of life. Viruses provide the largest reservoir of genes known in the biosphere but were not ‘stolen’ from the host. Such capsids cannot be of host origin. It is well accepted by virologists that viruses often contain many complex genes (including core genes) that cannot be attributed to having been derived from host genes. 24

Julia Durzyńska (2015): Many attempts have been made to define nature of viruses and to uncover their origin.   As the origin of viruses and that of living cells are most probably interdependent, we decided to reveal ideas concerning nature of cellular last universal common ancestor (LUCA).   Many viral particles (virions) contain specific viral enzymes required for replication. A few years ago, a new division for all living organisms into two distinct groups has been proposed: ribosome-encoding organisms (REOs) and capsid-encoding organisms (CEOs). 25

Eugene V. Koonin (2012): Probably an even more fundamental departure from the three-domain schema is the discovery of the Virus World, with its unanticipated, astonishing expanse and the equally surprising evolutionary connectedness. Virus-like parasites inevitably emerge in any replicator systems, so THERE IS NO EXAGGERATION IN THE STATEMENT THAT THERE IS NO LIFE WITHOUT VIRUSES. And in quite a meaningful sense, not only viruses taken together, but also major groups of viruses seem to be no less (if not more) fundamentally distinct as the three (or two) domains of cellular life forms, given that viruses employ different replication-expression cycles, unlike cellular life forms which, in this respect, are all the same. 26

Shanshan Cheng (2013): Viral capsid proteins protect the viral genome by forming a closed protein shell around it. Most of currently found viral shells with known structure are spherical in shape and observe icosahedral symmetry. Comprised of a large number of proteins, such large, symmetrical complexes assume a geometrically sophisticated architecture not seen in other biological assemblies. The geometry of the complex architecture aside, another striking feature of viral capsid proteins lies in the folded topology of the monomers, with the canonical jelly-roll β barrel appearing most prevalent (but not sole) as a core structural motif among capsid proteins that make up these viral shells of varying sizes. Our study provided support for the hypothesis that viral capsid proteins, which are functionally unique in viruses in constructing protein shells, are also structurally unique in terms of their folding topology. 27

Eugene V. Koonin (2020): In a seminal 1971 article, Baltimore classified all then-known viruses into six distinct classes that became known as Baltimore classes (BCs) (a seventh class was introduced later), on the basis of the structure of the virion's nucleic acid (traditionally called the virus genome):

The seven Baltimore classes (BCs): information flow. For each BC, the processes of replication, transcription, translation, and virion assembly are shown by color-coded arrows (see the inset). Host enzymes that are involved in virus genome replication or transcription are prefixed with “h-,” and in cases when, in a given BC, one of these processes can be mediated by either a host- or a virus-encoded enzyme, the latter is prefixed with “v-.” Otherwise, virus-encoded enzymes are not prefixed. CP, capsid protein; DdDp, DNA-directed DNA polymerase; DdRp, DNA-directed RNA polymerase; gRNA, genomic RNA; RdRp, RNA-directed RNA polymerase; RT, reverse transcriptase; RCRE, rolling-circle replication (initiation) endonuclease.

1. Double-stranded DNA (dsDNA) viruses, with the same replication-expression strategy as in cellular life forms
2. Single-stranded DNA (ssDNA) viruses that replicate mostly via a rolling-circle mechanism
3. dsRNA viruses
4. Positive-sense RNA [(+)RNA] viruses that have ssRNA genomes with the same polarity as the virus mRNA(s)
5. Negative-sense RNA [(−)RNA] viruses that have ssRNA genomes complementary to the virus mRNA(s)
6. RNA reverse-transcribing viruses that have (+)RNA genomes that replicate via DNA intermediates synthesized by reverse transcription of the genome
7. DNA reverse-transcribing viruses replicating via reverse transcription but incorporating into virions a dsDNA or an RNA-DNA form of the virus genome.
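The seven classes listed above amount to a small lookup table keyed on the packaged nucleic acid and on whether replication goes through reverse transcription. The Python sketch below is one hypothetical way to encode it. The class and function names are invented for this illustration, and class 7 is simplified to "dsDNA" even though the list notes that an RNA-DNA form may also be packaged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaltimoreClass:
    number: int                   # class number as in the list above
    genome: str                   # nucleic acid packaged in the virion
    reverse_transcribing: bool    # True if replication uses reverse transcription


# Illustrative table only; class 7 is simplified to "dsDNA".
BALTIMORE_CLASSES = [
    BaltimoreClass(1, "dsDNA", False),
    BaltimoreClass(2, "ssDNA", False),
    BaltimoreClass(3, "dsRNA", False),
    BaltimoreClass(4, "(+)RNA", False),
    BaltimoreClass(5, "(-)RNA", False),
    BaltimoreClass(6, "(+)RNA", True),
    BaltimoreClass(7, "dsDNA", True),
]


def baltimore_class(genome, reverse_transcribing=False):
    """Return the Baltimore class number for a genome type, or raise ValueError."""
    for bc in BALTIMORE_CLASSES:
        if bc.genome == genome and bc.reverse_transcribing == reverse_transcribing:
            return bc.number
    raise ValueError(f"no Baltimore class for {genome!r}")


print(baltimore_class("(+)RNA"))          # positive-sense RNA virus
print(baltimore_class("(+)RNA", True))    # RNA reverse-transcribing virus
```

The point of the sketch is only that the scheme sorts viruses by information flow, not by ancestry, which is why the text goes on to ask which classes are monophyletic.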

Evidence supports monophyly for some of the BCs but refutes it for others. Generally, the evolution of viruses and MGEs is studied with methods of molecular evolutionary analysis that are also used for cellular organisms. However, the organizations of the genetic spaces dramatically differ between viruses and their cellular hosts.

Representation of the 6 “superviral hallmark genes” in virus genomes of the seven Baltimore classes. The “superviral hallmark proteins” are shown by ribbon diagrams of the representative protein structures. The lines connect the proteins with the viruses of BCs in which they are present. The thickness of each connecting line roughly reflects the abundance of a given “superhallmark” gene in a given BC. DJR-CP, double-jelly-roll capsid protein; RCRE, rolling-circle replication (initiation) endonuclease; RdRp, RNA-directed RNA polymerase; RT, reverse transcriptase; S3H, superfamily 3 helicase; SJR-CP, single-jelly-roll capsid protein.

Rob Phillips (2018): The origins of superviral hallmark genes (VHGs) appear to be widely different. In particular, RdRps, RTs, and RCREs most likely represent the heritage of the primordial, precellular replicator pool as indicated by the absence of orthologs of these proteins in cellular life-forms. At the top of the megataxonomy are the four effectively independent realms that, however, are connected at an even higher rank of unification through the super-VHG domains.

The International Committee on Taxonomy of Viruses or ICTV classifies viruses into seven orders:

Herpesvirales, large eukaryotic double-stranded DNA viruses;
Caudovirales, tailed double-stranded DNA viruses typically infecting bacteria;
Ligamenvirales, linear double-stranded viruses infecting archaea;
Mononegavirales, nonsegmented negative (or antisense) strand single-stranded RNA viruses of plants and animals;
Nidovirales, positive (or sense) strand single-stranded RNA viruses of vertebrates;
Picornavirales, small positive strand single-stranded RNA viruses infecting plants, insects, and animals;
Tymovirales, monopartite positive single-stranded RNA viruses of plants.

In addition to these orders, there are ICTV families, some of which have not been assigned to an ICTV order. Only those ICTV viral families with more than a few members present in our dataset are explored. 28

Structure and Assembly of Complex Viruses
Carmen San Martin (2013): Viral particles consist essentially of a proteinaceous capsid protecting a genome and involved also in many functions during the virus life cycle. In simple viruses, the capsid consists of a number of copies of the same, or a few different proteins organized into a symmetric oligomer. Structurally complex viruses present a larger variety of components in their capsids than simple viruses. They may contain accessory proteins with specific architectural or functional roles; or incorporate non-proteic elements such as lipids. They present a range of geometrical variability, from slight deviations from the icosahedral symmetry to complete asymmetry or even pleomorphism. Putting together the many different elements in the virion requires an extra effort to achieve correct assembly, and thus complex viruses require sophisticated mechanisms to regulate morphogenesis. This chapter provides a general view of the structure and assembly of complex viruses.

A viral particle consists essentially of a proteinaceous capsid with multiple roles in the protection of the viral genome, cell recognition and entry, intracellular trafficking, and controlled uncoating. Viruses adopt different strategies to achieve these goals. Simple viruses generally build their capsids from a number of copies of the same, or a few different proteins, organized into a symmetric oligomer. In the case of complex viruses, capsid assembly requires further elaborations. What are the main characteristics that define a structurally complex virus? Structural complexity on a virus often, but not necessarily, derives from the need to house a large genome, in which case a larger capsid is required. However, capsid or genome sizes by themselves are not determinants of complexity. For example, flexible filamentous viruses can reach lengths in the order of microns, but most of their capsid mass is built by a single capsid protein arranged in a helical pattern. On the other hand, architecturally complex viruses such as HIV have moderate-sized genomes (7–10 kb of single-stranded (ss) RNA). Structurally complex viruses incorporate a larger variety of components into their capsids than simple viruses. They may contain accessory proteins with specific architectural or functional roles or incorporate non-proteic elements such as lipids. 29

Forming viral symmetric shells
Roya Zandi (2020): The process of formation of virus particles in which the protein subunits encapsidate genome (RNA or DNA) to form a stable, protective shell called the capsid is an essential step in the viral life cycle. The capsid proteins of many small single-stranded RNA viruses spontaneously package their wild-type (wt) and other negatively charged polyelectrolytes, a process basically driven by the electrostatic interaction between positively charged protein subunits and negatively charged cargo.  Regardless of the virion size and assembly procedures, most spherical viruses adopt structures with icosahedral symmetry. How exactly capsid proteins (CPs) assemble to assume a specific size and symmetry have been investigated for over half a century now. As the self-assembly of virus particles involves a wide range of thermodynamics parameters, different time scales, and an extraordinary number of possible pathways, the kinetics of assembly has remained elusive, linked to Levinthal’s paradox for protein folding. The role of the genome on the assembly pathways and the structure of the capsid is even more intriguing. The kinetics of virus growth in the presence of RNA is at least 3 orders of magnitude faster than that of empty capsid assembly, indicating that the mechanism of assembly of CPs around RNA might be quite different. Some questions then naturally arise: What is the role of RNA in the assembly process, and by what means then does RNA preserve assembly accuracy at fast assembly speed? Two different mechanisms for the role of the genome have been proposed: (i) en masse assembly and (ii) nucleation and growth.

The assembly interfaces in many CPs are principally short-ranged hydrophobic in character, whereas there is a strong electrostatic, nonspecific long-ranged interaction between RNA and CPs. To this end, the positively charged domains of CPs associate with the negatively charged RNA quite fast and form an amorphous complex. Hydrophobic interfaces then start to associate, which leads to the assembly of a perfect icosahedral shell. Based on the en masse mechanism, the assembly pathways correspond to situations in which intermediates are predominantly disordered. They found that, at neutral pH, a considerable number of CPs were rapidly (∼28 ms) adsorbed to the genome, which more slowly (∼48 s) self-organized into compact but amorphous nucleoprotein complexes (NPC). By lowering the pH, they observed a disorder−order transition as the protein−protein interaction became strong enough to close up the capsid and to overcome the high energy barrier separating NPCs from virions. 30
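The two time scales quoted for neutral pH can be put side by side with a one-line calculation. The Python sketch below is a hedged illustration with made-up variable names. It compares only the adsorption and self-organization steps mentioned in the passage, and it is a separate comparison from the "at least 3 orders of magnitude" figure given for RNA-assisted versus empty-capsid assembly.

```python
import math

# Approximate values quoted in the passage (neutral pH).
adsorption_s = 28e-3          # ~28 ms for CPs to adsorb to the genome
self_organization_s = 48.0    # ~48 s to form compact nucleoprotein complexes

# How much slower the self-organization step is than the adsorption step.
ratio = self_organization_s / adsorption_s
orders_of_magnitude = math.log10(ratio)

print(f"ratio ~ {ratio:.0f}, i.e. about {orders_of_magnitude:.1f} orders of magnitude")
```

On these figures the slow step takes roughly 1,700 times longer than the fast one, which illustrates why the passage treats adsorption and ordering as two distinct phases.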

No common ancestor for Viruses

Viruses and the tree of life (2009): Viruses are polyphyletic: In a phylogenetic tree, the characteristics of members of taxa are inherited from previous ancestors. Viruses cannot be included in the tree of life because they do not share characteristics with cells, and no single gene is shared by all viruses or viral lineages. While cellular life has a single, common origin, viruses are polyphyletic – they have many evolutionary origins. Viruses don’t have a structure derived from a common ancestor.  Cells obtain membranes from other cells during cell division. According to this concept of ‘membrane heredity’, today’s cells have inherited membranes from the first cells.  Viruses have no such inherited structure.  They play an important role by regulating population and biodiversity. 31

Eugene V. Koonin (2017): The entire history of life is the story of virus–host coevolution. Therefore the origins and evolution of viruses are an essential component of this process. A signature feature of the virus state is the capsid, the proteinaceous shell that encases the viral genome. Although homologous capsid proteins are encoded by highly diverse viruses, there are at least 20 unrelated varieties of these proteins. Viruses are the most abundant biological entities on earth and show remarkable diversity of genome sequences, replication and expression strategies, and virion structures. Evolutionary genomics of viruses revealed many unexpected connections but the general scenario(s) for the evolution of the virosphere remains a matter of intense debate among proponents of the cellular regression, escaped genes, and primordial virus world hypotheses. A comprehensive sequence and structure analysis of major virion proteins indicates that they evolved on about 20 independent occasions. Virus genomes typically consist of distinct structural and replication modules that recombine frequently and can have different evolutionary trajectories. The present analysis suggests that, although the replication modules of at least some classes of viruses might descend from primordial selfish genetic elements, bona fide viruses evolved on multiple, independent occasions throughout the course of evolution by the recruitment of diverse host proteins that became major virion components.

The importance of the admission that viruses do not share a common ancestor cannot be emphasized enough. Researchers also admit that, under a naturalistic framework, the origin of viruses remains obscure and unexplained. One reason is that viruses depend on a host cell in order to replicate. Another is that the capsid shells that protect the viral genome are unique: they have no counterpart in cellular life. A paper quoted above (Cheng 2013) describes capsids with a "geometrically sophisticated architecture not seen in other biological assemblies". This seems to be interesting evidence of design. The claim that their origin has something to do with evolution is also misleading: evolution plays no role in explaining either the origin of life or the origin of viruses. The fact that "no single gene is shared by all viruses or viral lineages" rules out drawing a tree of viruses leading back to a common ancestor. 17

Edward C. Holmes (2011): The discovery of mimivirus has undoubtedly had a major impact on theories of viral origins. More striking is that most (∼70% at the time of writing) mimivirus genes have no known homologs, in either virus or cellular genomes, so their origins are unknown. More importantly, the discovery of mimivirus highlights our profound ignorance of the virosphere. It is therefore a truism that a wider sampling of viruses in nature is likely to tell us a great deal more about viral origins. Although perhaps less lauded, the discovery of conserved protein structures among diverse viruses with little if any primary sequence similarity has even grander implications for our understanding of viral origins. More recently, a common virion architecture has been proposed for some viruses that do not possess an icosahedral capsid, including the archaean virus Halorubrum pleomorphic virus 1 (HRPV-1) 16

Arturo Becerra (2016): There are many unresolved questions concerning the origin and evolution of viruses. Nonetheless, many researchers consider, as do we, that the origin of these biological entities is polyphyletic. 32

1. Nejc Kejzar: New Vista into Origins of Viruses from a Prototypic ssDNA Phage May 27, 2022
2. Curtis A. Suttle: Viruses in the sea  14 September 2005
3. Hugh Ross: Viruses and God’s Good Designs March 30, 2020
4. Ramesh K Goel: Viruses and Their Interactions With Bacteria and Archaea of Hypersaline Great Salt Lake 2021 Sep 28
5. Eugene V. Koonin: Global Organization and Proposed Megataxonomy of the Virus World 4 March 2020
6. Eugene V. Koonin: A virocentric perspective on the evolution of life October 2013
7. Rachel Nuwer: Why the world needs viruses to function (2020)
8. P.Forterre: Origin of Viruses 2008
9. Gladys Kostyrka: What roles for viruses in origin of life scenarios? 27 February 2016
10. Rybicki: Virus origins: from what did viruses evolve or how did they initially arise? 12th August 2015
11. Steven W. Wilhelm: Ocean viruses and their effects on microbial communities and biogeochemical cycles 2012 Sep 5.
12. G.Witzany: Viruses are essential agents within the roots and stem of the tree of life 21 February 2010
13. Eugene V. Koonin: The global virome: how much diversity and how many independent origins? 2022 Sep 12
14. Eugene V. Koonin: Virus World as an Evolutionary Network of Viruses and Capsidless Selfish Elements 2, June 2014
15. Stephen Freeland: Undefining life's biochemistry: implications for abiogenesis 23 February 2022
16. Edward C. Holmes: What Does Virus Evolution Tell Us about Virus Origins? 2011 Jun; 85
17. Eugene V. Koonin: Multiple origins of viral capsid proteins from cellular ancestors March 6, 2017
18. Eugene V. Koonin:  Evolution of an archaeal virus nucleocapsid protein from the CRISPR-associated Cas4 nuclease 2015
19. Stephen J. Gould, Wonderful Life: The Burgess Shale and the Nature of History 1990
20. Fazale Rana: Repeatable Evolution or Repeated Creation? 2001
21. J. William Schopf: Life’s Origin 2002
22. Fazale Rana: Newly Discovered Example of Convergence Challenges Biological Evolution 2008
23. Hugh M. B. Harris: A Place for Viruses on the Tree of Life 14 January 2021
24. Matti Jalasvuori: Viruses: Essential Agents of Life (2012)
25. Julia Durzyńska: Viruses and cells intertwined since the dawn of evolution (2015)
26. Eugene V. Koonin: The Logic of Chance: The Nature and Origin of Biological Evolution (2012)
27. Shanshan Cheng: Viral Capsid Proteins Are Segregated in Structural Fold Space February 7, 2013
28. Rob Phillips: A comprehensive and quantitative exploration of thousands of viral genomes 2018 Apr 19
29. Carmen San Martin: Structure and Assembly of Complex Viruses  19 April 2013
30. Roya Zandi: How a Virus Circumvents Energy Barriers to Form Symmetric Shells March 2, 2020
31. Viruses and the tree of life 19 March 2009

Last edited by Otangelo on Wed Sep 13, 2023 6:30 pm; edited 73 times in total




Viruses with a different genetic alphabet
Stephen Freeland (2022): The genetic material of more than 200 bacteriophage viruses uses 2-aminoadenine (Z) instead of adenine (A). This minor difference in chemical structures is nevertheless a fundamental deviation from the standard alphabet of four nucleobases established by biological evolution at the time of life's Last Universal Common Ancestor (LUCA). Placed into broader context, the finding illustrates a deep shift taking place in our understanding of the chemical basis for biology. 1

Yasemin Saplakoglu (2021): These viruses use a unique genetic alphabet not found anywhere else on the planet. The blueprint for life on our planet is typically written by DNA molecules using a four-letter genetic alphabet. But some bacteria-invading viruses carry around DNA with a different letter — Z — that may help them survive. And new studies show it is much more widespread than previously thought. A series of new papers describe how this strange chemical letter enters into viral DNA, and researchers have now demonstrated that the "Z-genome" is much more widespread in bacteria-invading viruses across the globe — and may have even evolved to help the pathogens survive the hot, harsh conditions of our early planet. DNA is almost always made up of the same four-letter alphabet of chemical compounds known as nucleotides: Guanine (G), cytosine (C), thymine (T) and adenine (A). A DNA molecule consists of two strands of these chemicals that are tied together into a double-helix shape. DNA's alphabet is the same whether it's coding for frogs, humans or the plant by the window, but the instructions are different.

In 1977, a group of scientists in Russia first discovered that a cyanophage, or a virus that invades a group of bacteria known as cyanobacteria, had substituted all of its As for the chemical 2-aminoadenine (Z). In other words, a genetic alphabet that typically consists of ATCG in most organisms on our planet was ZTCG in these viruses.  For decades, this was a head-scratching discovery — as weird as spelling apples “zpples” — and little was known about how this one-letter substitution may have impacted the virus. In the late 1980s, researchers found that this Z nucleotide actually gave the virus some advantages: it was more stable at higher temperatures, it helped one strand of DNA bind more accurately to the second strand of DNA after replication (DNA is double-stranded), and Z-DNA could resist certain proteins present in bacteria that would normally destroy viral DNA. Now, two research groups in France and one in China have discovered another piece of the puzzle: how this Z-nucleotide ends up in the genomes of bacteriophages — viruses that invade bacteria and use its machinery to replicate.

Factory Z
All three research groups, using a variety of genomic techniques, identified a part of the pathway that leads to the Z-genome in bacteriophages. The first two groups found two major proteins known as PurZ and PurB that are involved in making the Z-nucleotide. Once the cyanophage injects its DNA into bacteria to replicate itself, a series of transformations take place: Those two proteins make a precursor Z-molecule and then convert the Z precursor molecule into the Z-nucleotide. Other proteins then modify it so that it can be incorporated into DNA. The third group identified the enzyme responsible for assembling new DNA molecules from the parent DNA molecule: a DNA polymerase known as DpoZ. They also found that this enzyme specifically excludes the A-nucleotide and always adds the Z instead. For decades, the Z-genome was only known to exist in one species of cyanophage. "People believed that this Z-genome was so rare," Suwen Zhao, an assistant professor in the school of life science and technology at ShanghaiTech University and the senior author of one of the studies, said. Zhao and her team analyzed sequences of the phages with the Z-genome and compared them to other organisms. They discovered that Z-genomes are actually much more widespread than previously thought. The Z-genome was present in more than 200 different types of bacteriophages. The phages carrying this Z-genome "could be considered as a different form of life," Pierre Alexandre Kaminski, a researcher at the Institut Pasteur in France, senior author of another one of the studies and co-author on the third, said. But "it's difficult to know the exact origin," and it's necessary to explore the extent that this PurZ protein exists across bacteriophages, and maybe even organisms, he told Live Science.

Kaminski and his group analyzed the evolutionary history of the PurZ protein and discovered that it is related to a protein called PurA found in archaea that synthesizes the A-nucleotide. This "distant" evolutionary connection raises the question of whether the proteins involved in making the Z-nucleotide first arose in bacteria and were eventually adapted by viruses, or whether they occurred more frequently in preliminary lifeforms on the planet, perhaps even within cells. PurZ and DpoZ are often inherited together, which suggests that the Z-genome has existed alongside normal DNA since the early days of life on our planet, before 3.5 billion years ago, they wrote. What's more, an analysis conducted in 2011 of a meteorite that fell in Antarctica in 1969 discovered the Z-nucleotide alongside some standard and nonstandard nucleotides likely of extraterrestrial origin, "raising a potential role for Z in early forms of life," they wrote.

Future Z
It's possible that this Z-genome, if it existed that early in our planet's history, could have conferred an advantage to early lifeforms. "I think it's more suitable for Z-genome organisms to survive in the hot and the harsh environment" of the early planet, Zhao said.  The Z-genome is very stable. When two strands of normal DNA join together to form a double helix, two hydrogen bonds bind A to T, and three hydrogen bonds bind G to C. But when A is replaced with Z, three hydrogen bonds bind them together, making the tie stronger. This is the only non-normal DNA that modifies the hydrogen bonding, Kaminski said. But it's no surprise that the Z-genome is not widespread across species today. The Z-genome creates very stable, but not flexible, DNA, Zhao said. For many biological events, such as replicating DNA, we need to unzip the double-strand, and the extra hydrogen bond makes unzipping more difficult, she said. "I think it's more suitable for hot and harsh environments, but not this more comfortable environment right now," Zhao said.  Still, the Z-genome's stability makes it an ideal candidate for certain technologies. Now that researchers know which proteins the virus uses to make these Z-genomes, scientists can make them themselves. "Now we can produce the Z-genome on a large scale," Zhao said.
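The bond counts given above (two hydrogen bonds for A:T, three for G:C, three for Z:T) make the stability argument easy to illustrate. The Python sketch below uses hypothetical function names and assumes, as a simplification, that a Z-genome replaces every A on both strands, so every A:T pair becomes a Z:T pair.

```python
def to_z_genome(strand):
    """Rewrite an ATCG strand in the ZTCG alphabet by replacing every A with Z."""
    return strand.upper().replace("A", "Z")


def duplex_hydrogen_bonds(strand, z_genome=False):
    """Count hydrogen bonds in the duplex formed by `strand` and its complement.

    `strand` is written in ATCG letters. With z_genome=True every A:T pair is
    treated as a Z:T pair (three bonds instead of two).
    """
    at_pair_bonds = 3 if z_genome else 2
    total = 0
    for base in strand.upper():
        if base in ("A", "T"):
            total += at_pair_bonds
        elif base in ("G", "C"):
            total += 3
        else:
            raise ValueError(f"unexpected base {base!r}")
    return total


example = "ATGCAT"  # arbitrary illustrative sequence
print(to_z_genome(example))
print(duplex_hydrogen_bonds(example), duplex_hydrogen_bonds(example, z_genome=True))
```

For this arbitrary six-base example the duplex goes from 14 to 18 hydrogen bonds, which is the kind of gain in stability, and loss of ease in unzipping, that Zhao describes.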

A third purine biosynthetic pathway encoded by aminoadenine-based viral DNA genomes

Dona Sleiman (2021): Cells have two purine pathways that synthesize adenine and guanine ribonucleotides from phosphoribose via inosinate. A chemical hybrid between adenine and guanine, 2-aminoadenine (Z), replaces adenine in the DNA of the cyanobacterial virus S-2L. We show that S-2L and Vibrio phage PhiVC8 encode a third purine pathway catalyzed by PurZ, a distant paralog of succinoadenylate synthase (PurA), the enzyme condensing aspartate and inosinate in the adenine pathway. PurZ condenses aspartate with deoxyguanylate into dSMP (N6-succino-2-amino-2′-deoxyadenylate), which undergoes defumarylation and phosphorylation to give dZTP (2-amino-2′-deoxyadenosine-5′-triphosphate), a substrate for the phage DNA polymerase. Crystallography and phylogenetics analyses indicate a close relationship between phage PurZ and archaeal PurA enzymes. Our work elucidates the biocatalytic innovation that remodeled a DNA building block beyond canonical molecular biology.

Bacteriophage genomes contain many modified nucleotides that are enzymatically synthesized and then incorporated by polymerization. The most conspicuous is 2-aminoadenine (hereafter referred to as Z), which was found in Synechococcus phage S-2L (1) and has also been detected in meteorites, suggesting a prebiotic existence. Z completely replaces the canonical adenine in S-2L DNA, increasing its thermostability because of a third hydrogen bond in the pair Z:T and altering the conformational properties of the double helix because of the presence of a 2-amino group in the minor groove, which renders S-2L DNA resistant to most restriction enzymes. The substitution of adenine (A) in S-2L DNA suggested an aminoadenine biosynthetic pathway encoded by the phage. This is in line with the fact that the S-2L genome encodes a putative homolog of succinoadenylate synthase PurA, which catalyzes the first step of de novo biosynthesis of adenosine 5′-monophosphate (AMP) by coupling the hydrolysis of guanosine 5′-triphosphate (GTP) with the synthesis of succinoadenylate from L-aspartate and inosine 5′-monophosphate (IMP). We identified PurZ homologs in several bacteriophages, notably in the PhiVC8 phage infecting Vibrio cholerae, whose DNA contains amino adenine instead of adenine. This prompted us to characterize the activity of the PurZ enzymes, to elucidate the biosynthetic pathway for amino adenine nucleotides, and to probe it in vivo. We first expanded our search for PurA and PurZ homologs and found candidates in 60 other phages infecting distantly related bacteria (mostly Actinobacteria, Firmicutes, and Proteobacteria). Phylogenetic analysis revealed a clear distinction between prokaryotic and eukaryotic canonical PurA sequences on the one hand and phage PurZ sequences (pink) falling into a clade embedded within archaeal PurA sequences on the other hand (green) (Fig. 1 and fig. S1). The PurZ clade also includes some bacterial homologs (blue), which could originate from phages. 
This evolutionary closeness is consistent with two specific deletions that are not found in bacterial and eukaryotic PurA. The evolutionary distinction between the PurZ clade and cellular PurA is also confirmed by a number of specific sequence signatures corresponding to Vibrio phage PhiVC8 amino acids S14, R230, I234, G238, L255, G256, and T262. In particular, the essential D13 residue (S14 in PhiVC8) in Escherichia coli PurA  is not conserved in PurZ sequences. 

1. Stephen Freeland: Undefining life's biochemistry: implications for abiogenesis, 23 February 2022
2. Yasemin Saplakoglu: Some viruses have a mysterious 'Z' genome, 29 April 2021
3. Dona Sleiman et al.: A third purine biosynthetic pathway encoded by aminoadenine-based viral DNA genomes, 30 April 2021

Last edited by Otangelo on Wed Sep 13, 2023 6:55 pm; edited 7 times in total




If we imagine that 1 ml of seawater contains one million bacteria and ten times as many viral sequences, it can be estimated that 10^31 bacteriophages infect 10^24 bacteria per second.
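As a rough plausibility check on the 10^31 figure, the per-millilitre numbers can be scaled up to the whole ocean. The sketch below uses the two values from the sentence above plus one outside assumption: a global ocean volume of about 1.3 x 10^21 litres. It reproduces only the order of magnitude of the total virus count. The infection rate of 10^24 per second depends on further inputs not given here and is not derived.

```python
import math

BACTERIA_PER_ML = 1e6          # from the text
VIRUSES_PER_BACTERIUM = 10     # from the text ("ten times")
OCEAN_VOLUME_LITRES = 1.3e21   # assumption: approximate global ocean volume
ML_PER_LITRE = 1000


def ocean_totals():
    """Return (bacteria, viruses) in the whole ocean under the assumptions above."""
    ocean_ml = OCEAN_VOLUME_LITRES * ML_PER_LITRE
    bacteria = BACTERIA_PER_ML * ocean_ml
    viruses = bacteria * VIRUSES_PER_BACTERIUM
    return bacteria, viruses


bacteria, viruses = ocean_totals()
print(f"bacteria ~ 10^{round(math.log10(bacteria))}")
print(f"viruses  ~ 10^{round(math.log10(viruses))}")
```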

Identification of Capsid/Coat Related Protein Folds and Their Utility for Virus Classification

Viral penetration into host nucleus
Most DNA viruses and a few RNA viruses target their genomes to the host nucleus. The nuclear envelope is crossed in several ways:
- RNA virus, dsDNA virus and lentivirus genomes enter via the nuclear pore complex (NPC) using cellular importin transport.
- ssDNA virus capsids appear to be small enough to cross the NPC and enter the nucleus intact.
- Hepadnaviridae capsids are thought to enter the NPC but remain attached to it, releasing the viral genomic DNA into the nucleoplasm.
- Herpesvirales capsids are too large to enter the NPC; the capsid docks on the NPC and the viral genome is injected directly through it.
- All Retroviridae except lentiviruses are thought to enter the nucleus during mitosis, when the nuclear envelope temporarily disintegrates.

All these strategies for crossing the nuclear envelope involve different degrees of capsid disassembly: the virus can pass intact (e.g. Parvoviridae) or, in the case of injection, only the viral genome enters the nucleus (e.g. herpesviruses). Integration into the host genome may eventually follow.

Last edited by Otangelo on Tue Jul 26, 2022 9:05 am; edited 2 times in total




Eugene V. Koonin (2016): Almost all cellular life forms are hosts to diverse genetic parasites with various levels of autonomy including plasmids, transposons and viruses. Theoretical modeling of the evolution of primordial replicators indicates that parasites (cheaters) necessarily evolve in such systems and can be kept at bay primarily via compartmentalization. Given the (near) ubiquity, abundance and diversity of genetic parasites, the question becomes pertinent: are such parasites intrinsic to life? At least in prokaryotes, the persistence of parasites is linked to the rate of horizontal gene transfer (HGT). We mathematically derive the threshold value of the minimal transfer rate required for selfish element persistence, depending on the element duplication and loss rates as well as the cost to the host. Estimation of the characteristic gene duplication, loss and transfer rates for transposons, plasmids and virus-related elements in multiple groups of diverse bacteria and archaea indicates that most of these rates are compatible with the long-term persistence of parasites. Notably, a small but non-zero rate of HGT is also required for the persistence of non-parasitic genes. We hypothesize that cells cannot tune their horizontal transfer rates to be below the threshold required for parasite persistence without experiencing highly detrimental side-effects. As a lower boundary to the minimum DNA transfer rate that a cell can withstand, we consider the process of genome degradation and mutational meltdown of populations through Muller’s ratchet. A numerical assessment of this hypothesis suggests that microbial populations cannot purge parasites while escaping Muller’s ratchet. Thus, genetic parasites appear to be virtually inevitable in cellular organisms.

All or nearly all cellular life forms appear to harbor diverse genetic parasites including transposable elements, plasmids, viruses, and others. Genetic parasites are non-cellular replicators that possess their own genomes but to various degrees depend on the host cells for information processing systems, in particular translation. Parasite–host interaction and co-evolution undoubtedly are major aspects of all evolution of life that to a large extent drive evolutionary transitions. A major aspect of this co-evolution is the perennial arms race whereby cellular hosts evolve multiple mechanisms of defense to which the parasites respond by evolving counter-defense systems. However, cooperation between hosts and parasites is the other side of the co-evolution coin. Parasitic genetic elements can be beneficial to the host by providing resistance to superinfection but also in other, still incompletely characterized ways. Importantly, genetic material of parasitic elements is often recruited for host functions. Indeed, sequences derived from transposable elements constitute large fractions of the genomes of diverse eukaryotes, (e.g., up to 90% of many plant genomes) and are considered to be important drivers of genome evolution. Although genomic parasites are not quite as prominent in prokaryotes, there are few bacterial or archaeal genomes that are free of selfish, parasitic elements including transposons and prophages. A major role of viruses in the host biology has also been demonstrated in prokaryotes as illustrated by the transfer of photosystem genes by cyanophages, the utilization of defective prophages as vehicles for horizontal gene transfer known as Gene Transfer Agents, and more generally by the recruitment of viral genes for diverse host functions.

The relationships between genomic parasites and their hosts are highly differentiated such that the parasites span a wide range of “selfishness,” i.e., the cost incurred on the host by the parasite reproduction. At one end of the spectrum are benign elements that are incapable of autonomous replication and only replicate with the host genome, so that the only cost associated with the element is that of the replication itself which is near negligible compared to the total energy expenditure of the host. At the opposite end are lytic viruses that replicate to extremely high copy numbers and rapidly kill the host. In between are selfish elements with various degrees of autonomy including transposons that have the capacity to proliferate within the host genomes, low and high copy number plasmids and others. There are multiple, tight evolutionary links between selfish elements that differ in terms of the cost to the host, such as lytic viruses and transposons. Moreover, the same element often alternates between different lifestyles, e.g., between low-cost prophages and high-cost lytic viruses, as in the thoroughly studied case of temperate bacteriophages, or between a bacteriophage and a plasmid. The same cellular organism typically hosts various classes of parasitic selfish elements resulting in complex ecosystems of interacting replicators.

Because parasites exist at all levels of biological organization and apparently accompany (nearly) all cellular life forms, the question of whether parasites are inherent to life becomes pertinent. The inevitability of the emergence of parasites has figured prominently in theoretical studies on the origin of life. In particular, considerable effort has been dedicated to understanding how pre-cellular communities of simple replicators could survive the onslaught of “cheaters” that replicate faster than “cooperators” by not coding for resources required for replication but rather exploiting resources produced by the cooperators. A major and striking theoretical result is that a spatially homogeneous population of replicators is prone to takeover by cheaters (parasites) and hence is generally doomed to collapse. The only path to stability for a population of replicators in the face of the parasite onslaught appears to be compartmentalization: generally, parasites can be eliminated from a population or at least kept in check if the rate at which the parasite is transferred to new hosts is insufficient to compensate for the loss of the parasite. Thus, the primordial parasite-host arms race could have been a key driver of the evolution of biological complexity.

Assuming that selfish genetic elements evolved within the simplest, pre-cellular replicator systems, the emergence of cells resulted in a steep new barrier to the spread of parasites, which thereafter needed to undergo horizontal gene transfer (HGT) to infect new hosts. The question, then, emerges: can cellular organisms purge parasites by lowering their HGT rates? Parasites are lost at two levels: at the genome level, the intrinsic loss bias observed in prokaryotic genomes leads to attrition of non-beneficial genes; at the population level, purifying selection eliminates the hosts with the larger parasitic load. In the case of relatively passive elements, fitness costs are likely associated with the additional energy required for DNA replication, increased genomic instability, and interference with functional genes. Although these costs are presumably small for many if not most non-viral selfish elements, the relatively low copy numbers of such elements in prokaryotic genomes imply that host populations are able to keep them under control. The intuitive idea that a combination of proliferation, loss, transfer, and selection governs the internal dynamics of genomes can be formalized to obtain an expression that defines the conditions for parasites to survive in a population. In this work, we derive such an expression and evaluate its consistency with genomic data. To this end, we estimate characteristic duplication, loss and transfer rates for transposons, plasmids and virus-related elements, as well as “non-selfish” genes, in bacterial and archaeal genomes. As predicted by the theory, we find that the estimated transfer rates are large enough to compensate for losses. Similar qualitative patterns of loss and transfer are observed for genetic parasites and non-selfish genes, which suggests that the causes that underlie the long-term persistence of parasites are not essentially different from those that allow long-term persistence of most other genes.
According to this view, genetic parasites with a higher effective loss rate (which includes the negative contribution to the fitness of the host) require a higher transfer rate to survive, which might explain the division between parasites that possess or lack autonomous transfer mechanisms. Moreover, the existence of the latter class of parasites raises the question whether cells could curtail genetic transfer to levels incompatible with the persistence of such parasites.
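The balance described here, in which duplication and transfer have to outweigh loss and the cost to the host, can be illustrated with a toy model. This is not the expression derived in the paper, which is not reproduced in the excerpt. It is a deliberately simple linear balance in which an element's copy number grows when duplication + transfer > loss + cost. All names and rate values are hypothetical.

```python
def min_transfer_rate(duplication: float, loss: float, cost: float) -> float:
    """Smallest transfer rate that keeps net growth non-negative.

    Toy assumption: net growth rate = duplication + transfer - loss - cost,
    where 'loss + cost' plays the role of the effective loss rate.
    """
    return max(0.0, loss + cost - duplication)


def persists(duplication: float, loss: float, transfer: float, cost: float) -> bool:
    """True if the element's net growth rate is positive in the toy model."""
    return duplication + transfer > loss + cost


# Hypothetical rates per genome per unit time.
threshold = min_transfer_rate(duplication=0.1, loss=0.3, cost=0.05)
print(threshold)
print(persists(0.1, 0.3, transfer=0.30, cost=0.05))
print(persists(0.1, 0.3, transfer=0.20, cost=0.05))
```

Even in this simplified form, raising the effective loss rate raises the transfer rate needed for persistence, which is the qualitative point of the passage.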

In prokaryotes, the processes that lead to the acquisition of external genetic material have been traditionally divided according to the mechanism by which DNA enters the cell, the origin of such DNA, and the consequences of its acquisition. Thus, transformation is associated with the intake of DNA from close relatives, which leads to homologous recombination, whereas conjugation and transduction are related to the transfer of heterologous DNA from plasmids and phages. Recent research reveals a more complex picture, whereby any of these DNA uptake mechanisms can result in homologous and/or non-homologous recombination. For example, large heterologous sequences can be acquired through transformation as long as flanking homology regions exist, as has been observed with pathogenicity islands in S. pneumoniae. Therefore, transformation can be not only a source of genetic innovation but also a means of transmission of genetic parasites. Conversely, plasmids and prophages are capable of mobilizing other (non-parasitic) genes, some of which can share homology with the resident genes if the infection occurs between related lineages. For these reasons, we refer to HGT as a catchall term that comprises the entry of DNA into a cell and its subsequent incorporation into the genome via homologous or non-homologous recombination.

Horizontal gene transfer (HGT) is a dominant process in the evolution of bacteria and archaea that is the source of most innovations and adaptations to new environments. Apart from the acquisition of new genes that is crucial in the long term but can be beneficial only sporadically in evolving microbial populations, HGT appears to be important for a more immediate reason. In the absence of recombination, finite populations are subject to irreversible deterioration through accumulation of deleterious mutations, a process known as Muller’s ratchet, that eventually leads to the collapse of a population via mutational meltdown. Even if, in some cases, Muller’s ratchet can be partly offset by compensatory mutations (e.g., in the case of rapidly mutating RNA viruses), it still causes a dramatic drop in the fitness of susceptible populations. Muller’s ratchet is more powerful than neutral drift and can keep acting in regimes where selection is able to counteract the drift. Specifically, population genetics models show that, in a population of size N, Muller’s ratchet drives the accumulation of mutations with the cost s' such that $Ns' > 1 > Ns'e^{-U/s'}$, where U is the rate at which such mutations occur (Gordo and Campos 2008). Therefore, with the possible exceptions of extremely fast mutating organisms or populations subject to severe bottlenecks, the main contribution to Muller’s ratchet comes from slightly deleterious mutations. In line with its major evolutionary impact, avoidance of Muller’s ratchet is thought to be one of the key driving forces in the evolution of sex. Mathematical modeling suggests that in microbes, a sufficiently high rate of environmental DNA (eDNA) uptake is a key condition for the long-term survival of microbial populations facing Muller’s ratchet.
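The window in which the ratchet operates, Ns' > 1 > Ns'·exp(-U/s'), can be explored numerically. The sketch below simply evaluates that inequality for a given population size N, mutation cost s' and mutation rate U. The function name and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math


def ratchet_accumulates(N: float, s: float, U: float) -> bool:
    """Check the quoted condition N*s > 1 > N*s*exp(-U/s).

    N: population size, s: fitness cost per mutation (s' in the text),
    U: rate at which such mutations occur.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return N * s > 1 > N * s * math.exp(-U / s)


# Hypothetical parameter sets.
print(ratchet_accumulates(N=1e6, s=1e-4, U=1e-3))  # slightly deleterious
print(ratchet_accumulates(N=1e6, s=1e-2, U=1e-3))  # strongly deleterious
print(ratchet_accumulates(N=1e2, s=1e-4, U=1e-3))  # N*s below 1
```

With these example values only the first case falls inside the window, in line with the statement that the main contribution to the ratchet comes from slightly deleterious mutations.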

Here, we test the hypothesis that cells cannot tune their horizontal transfer rates to be below the threshold required for parasite persistence without compromising their long-term survival. Among all the proposed benefits of HGT, we considered prevention of genome degeneration through Muller’s ratchet because it constitutes a baseline (i.e., a minimum requisite) for the HGT rate that can be analytically estimated from population genetics models and tested through comparative genomics. The analysis of available data suggests that most microbial populations cannot purge parasites and escape from Muller’s ratchet at the same time. Susceptibility to selfish elements thus becomes a cost that cells have to pay to maintain an evolutionary regime that does not lead to mutational meltdown.

My comment: In the absence of recombination, finite populations are subject to irreversible deterioration through the accumulation of deleterious mutations, a process known as Muller’s ratchet, which eventually leads to the collapse of a population via mutational meltdown. From the outset, therefore, there would have had to be a population of diversified microbes: not just the descendants of a single progenitor, but variants with different genetic make-ups, internally compartmentalized and able to perform horizontal gene transfer and recombination. Unless these preconditions were met, the population would die out.

Given the indispensability of HGT for the survival of microbial populations, a plausible hypothesis seems to be that HGT is evolvable, i.e. is an adaptive, selectable trait. However, whether or not this is the case is not an easy question because HGT might be considered a by-product of the presence of substantial amounts of DNA in the environment combined with genetic processes such as transformation and bacteriophage infection that leads to gene transduction. Diverse bacteria and archaea are competent for natural transformation that is mediated by dedicated DNA intake pumps. These pumps can be legitimately considered devices for utilization of environmental DNA as a source of nucleotides (simply put, food), with HGT being a fringe benefit. However, the demonstration that at least in some bacteria the ingested DNA is protected against degradation, thus preventing its use as a nucleotide source and conversely facilitating HGT, implies that, at least in part, natural competence evolved as a gene transfer mechanism. The long-known existence of DNA uptake signal sequences and proteins that bind them, which jointly comprise a discrimination mechanism allowing bacteria to preferentially take up DNA from closely related organisms, is another piece of evidence in support of the view of transformation as an evolved route of gene transfer, apart from the nutritional value of DNA.

Bacterial and archaeal conjugation (prokaryotic sex) is a mechanism of genetic material transfer between microbial cells that combines features of selfish genetic elements and devices for gene transfer. Conjugative plasmids encode proteins required for autonomous replication, whereas integrative and conjugative elements (ICEs, or conjugative transposons) typically replicate while integrated into the host chromosome but have the ability to excise and form plasmid-like molecules. Both types of elements are transferred by the conjugation molecular machinery (type IV secretion systems) and typically carry ‘cargo’ genes unrelated to the transposon life cycle. Thus, these selfish elements are at the same time vehicles for HGT that mediate microbial adaptation by introducing new genes into the recipient genomes.

Perhaps the most striking showcase for dedicated vehicles of HGT are the gene transfer agents (GTAs), defective prophages that form virus particles in which they package apparently random fragments of the bacterial chromosome rather than the phage genome. The GTAs then infect other bacteria or archaea, and the transferred DNA integrates into the recipient genome. In marine bacterial communities, the rate of GTA-mediated gene transfer appears to be quite high and often involves distantly related organisms. Notably, the GTAs confer on their carriers the ability to donate rather than acquire genetic material. Such a capacity could be adaptive in the context of utilization of “public goods” by microbial communities. The wide spread of GTAs appears to present strong evidence of evolvability of HGT.

Both theoretical models and tantalizing experimental clues suggest that HGT is essential for microbial survival. 1

Eugene V. Koonin (2017): Genetic parasites, including viruses and mobile genetic elements, are ubiquitous among cellular life forms, and moreover, are the most abundant biological entities on earth that harbor the bulk of the genetic diversity. Here we examine simple thought experiments to demonstrate that both the emergence of parasites in simple replicator systems and their persistence in evolving life forms are inevitable because the putative parasite-free states are evolutionarily unstable.

Nearly all cellular life forms are hosts to various types of genetic parasites that exploit functional systems of the host cells to replicate their own genomes. Only some bacteria with highly reduced genomes that themselves lead a symbiotic or parasitic lifestyle seem to lack genetic parasites that undoubtedly have been lost during the reductive evolution of these bacteria from free-living ancestors. Genetic parasites include viruses, transposons, plasmids and other semi-autonomous genetic elements (SAGE) that display a broad range of relationships with the hosts, from acute antagonism, whereby a virus rapidly kills the host, to symbiosis when SAGE are not costly to the host and could even have beneficial effects.

Strikingly, virus particles appear to be the most common biological entities on earth. In most environments, the ratio between virus particles and cells varies between 10 and 100. This enormous physical abundance of viruses is matched by vast genetic diversity so that most of the gene repertoire of the biosphere appears to be concentrated in viruses, even as the exact numbers remain a matter of debate. The prevalence of viruses in the biosphere is also paralleled by the abundance of SAGE integrated in genomes of cellular life forms. Integrated SAGE are present in virtually all genomes of cellular organisms (again, missing only in some intracellular symbionts and parasites), and in genomes of multicellular eukaryotes, SAGE-derived sequences quantitatively dominate the genome, comprising at least 50% of the DNA in vertebrates and up to 90% in plants. Recruitment of sequences from SAGE for cellular functions is a common phenomenon that made substantial contributions to the evolution of cellular life forms.

The entire course of the evolution of life is a history of host-parasite co-evolution. Being subject to the constant onslaught of genetic parasites, cellular life forms have evolved a plethora of defense mechanisms. A typical organism harbors and interacts with multiple types of genetic parasites (e.g. viruses, different families of transposons, and plasmids) which it holds at bay thanks to multiple defense strategies that include parasite exclusion, innate immunity and adaptive immunity. The SAGE respond with counter-defense mechanisms that range from simple mutational escape from defense to dedicated multigene systems that specifically inactivate host defense systems. Notably, defense systems and SAGE including their counter-defense machineries are tightly linked in evolution. Enzymes involved in the mobility of SAGE, in particular, transposons are often recruited by host defense systems for roles in parasite genome inactivation and other functions, and conversely, SAGE recruit components of defense systems that then evolve to become agents of counter-defense.

Thus, the arms race, along with cooperation, between genetic parasites and their hosts are perennial features of the evolution of life. Why is this the case? Why do the parasites emerge in the first place? And, could some cellular organisms actually get rid of the parasites through highly efficient defense systems? Empirically, the answer to the latter question seems to be negative. Conceivably, the general cause of the inability of the hosts to eliminate the genetic parasites is the unescapable cost of maintaining sufficiently powerful defense systems. Analysis of theoretical models of parasite propagation suggests that an important source of this cost, perhaps the primary one in microbes, could be that efficient anti-parasite defense has the side effect of curtailing horizontal gene transfer (HGT), which is an essential process in microbial evolution that allows microbes to avoid deterioration via Muller’s ratchet . Another major factor could be the effectively unavoidable autoimmunity. However, what about the first, arguably, the most fundamental question: why do genetic parasites evolve to begin with? Again, empirically, there is a strong impression that the emergence of such parasites is inevitable. Not only are they ubiquitous in cellular life forms but they also evolve in various computer simulations of replicator system evolution. Furthermore, it appears intuitive: genetic parasites can be considered cheaters, in game-theoretical terms, and as soon as, in a replicator system, there is a distributable resource, such as a replicase, cheaters would emerge to steal that resource without producing their share of it. These, however, are informal considerations. Here we ask the question: is it possible to develop a theoretical framework that would allow a formal demonstration of the inevitability of the emergence of genetic parasites in evolving replicator systems, or else, that parasite-free replicator systems are after all possible?

The emergence, as well as persistence of genetic parasites, is an inalienable feature of evolving replicators and, as such, one of the central principles of biology. 2

Viruses and their hosts are engaged in a constant arms race leading to the evolution of antiviral defence mechanisms. Recent studies have revealed that the immune arsenal of bacteria against bacteriophages is much more diverse than previously envisioned. These discoveries have led to seemingly contradictory observations: on one hand, individual microorganisms often encode multiple distinct defence systems, some of which are acquired by horizontal gene transfer, alluding to their fitness benefit. On the other hand, defence systems are frequently lost from prokaryotic genomes on short evolutionary time scales, suggesting that they impose a fitness cost. In this Perspective article, we present the ‘pan-immune system’ model in which we suggest that, although a single strain cannot carry all possible defence systems owing to their burden on fitness, it can employ horizontal gene transfer to access immune defence mechanisms encoded by closely related strains. Thus, the ‘effective’ immune system is not the one encoded by the genome of a single microorganism but rather by its pan-genome, comprising the sum of all immune systems available for a microorganism to horizontally acquire and use.

Almost all cellular life forms, from prokaryotes (like bacteria) to eukaryotes (like plants and animals), harbor various genetic parasites. These parasites use the host's cellular machinery to replicate and can bring either harm or benefit to the host. They have played a significant role in shaping the evolution of life. They can be harmful, like lytic viruses that kill their host. However, they can also be beneficial, for example by providing resistance to superinfection or by contributing genes that become integrated into the host's genome, driving evolutionary change.

Parasites were inevitable in the early stages of life

Theoretical studies on life's origin indicate that parasites were almost inevitable in the early stages of life. Pre-cellular communities of replicators would have been vulnerable to "cheaters" (or parasitic forms) that used resources without contributing. To combat this, early life forms needed to develop mechanisms to protect against these cheaters, which in turn drove complexity in life. The relationship between cellular life forms and parasites is not merely parasitic but symbiotic. While parasites leverage the host's resources, the host can benefit from the genetic material and other advantages the parasite brings. Life as we know it has not evolved in isolation. Hosts and parasites have constantly influenced each other's evolutionary trajectories. As hosts developed defenses against parasitic invasions, parasites evolved counter-strategies, resulting in an ongoing evolutionary arms race. This constant push-and-pull has driven the emergence of new traits and capabilities in both hosts and parasites. One of the ways early life could have defended against parasitic cheaters was through compartmentalization, leading to the evolution of cellular structures. Genetic parasites have contributed vast amounts of genetic material to their hosts. In some eukaryotes, a significant fraction of their genome comes from transposable elements. These insertions can lead to new functions, genetic diversity, and evolutionary innovation. While we often view parasites in a negative light, their presence can offer advantages. For example, bacteria can acquire resistance genes against antibiotics from plasmids, and viral infections can sometimes provide immunity against more dangerous infections. Parasites can be lost at both the genome level and the population level. The proliferation, loss, transfer, and selection of genes govern the dynamic landscape of genomes. Both genetic parasites and non-selfish genes seem to follow similar patterns of loss and transfer. 
Genetic parasites with a higher effective loss rate would require a higher transfer rate for survival. HGT is fundamental in prokaryotic evolution. It can happen through transformation (uptake of DNA from the environment, often from related organisms), conjugation (direct transfer from one cell to another), and transduction (virus-mediated transfer). Each process can lead to either homologous or non-homologous recombination.

Without HGT or recombination, harmful mutations accumulate, leading to the eventual population collapse, a process known as Muller's ratchet. HGT provides an escape from this trap by introducing new genes and genetic variations. HGT is not just about adapting and evolving; it's a fundamental requirement for microbial survival. Microbial populations that cannot engage in HGT are doomed to suffer from Muller’s ratchet, eventually leading to their collapse. While it's clear that HGT plays a critical role in microbial survival and evolution, it's debated if HGT is an adaptive, selectable trait or a mere by-product of other processes. Evidence points to both, with some processes clearly evolved for gene transfer, while others may serve dual purposes. The interdependence of life forms and parasitic elements suggests a symbiotic evolution. In essence, without parasites and mechanisms like HGT, life may not have persisted as we know it. Viruses and cellular life forms might indeed be interdependent. Viruses, as genetic parasites, have played an indispensable role in shaping the evolution of cellular organisms. While they might seem purely destructive, they are pivotal in introducing genetic diversity, which is the cornerstone of evolution. The coexistence of viruses and cellular life points towards an intricate evolutionary dance where both partners have influenced each other's existence. Viruses have honed the adaptive responses of organisms, pushing them towards greater complexity. In return, organisms have served as hosts, providing an environment for viruses to evolve and adapt.
If one considers life as a vast interconnected web of interactions, then viruses form the critical nodes, connecting, disrupting, and stimulating the network. Without their influence, the diversity and resilience of life as we understand it would likely be vastly diminished. In that sense, suggesting that life and viruses had to emerge in tandem paints a picture of an ecosystem where every entity, no matter how minuscule or vast, has a role to play in the grand narrative of existence.
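The Muller's ratchet argument above can be made concrete with a small simulation. This is a minimal sketch under assumptions of my own: a fixed-size population, deleterious mutations at distinct loci with no back mutation, and free recombination between two parents standing in for gene exchange such as HGT. All parameter values are hypothetical.

```python
import random


def simulate(pop_size=200, loci=100, mut_rate=0.3, sel=0.02,
             generations=200, recombine=False, seed=0):
    """Return the smallest mutation load in the population after each generation."""
    rng = random.Random(seed)
    # Each genome is the set of loci that carry a deleterious mutation.
    pop = [frozenset() for _ in range(pop_size)]
    min_load = []
    for _ in range(generations):
        # Fitness falls by a factor (1 - sel) for every mutation carried.
        weights = [(1.0 - sel) ** len(g) for g in pop]
        new_pop = []
        for _ in range(pop_size):
            if recombine:
                a, b = rng.choices(pop, weights=weights, k=2)
                # Free recombination: each locus is inherited from one parent at random.
                child = {locus for locus in a | b
                         if (locus in a if rng.random() < 0.5 else locus in b)}
            else:
                # Clonal reproduction: the child is an exact copy of one parent.
                child = set(rng.choices(pop, weights=weights, k=1)[0])
            # At most one new mutation per birth; there is no back mutation.
            if rng.random() < mut_rate:
                child.add(rng.randrange(loci))
            new_pop.append(frozenset(child))
        pop = new_pop
        min_load.append(min(len(g) for g in pop))
    return min_load


if __name__ == "__main__":
    print("clonal      :", simulate(recombine=False)[::40])
    print("recombining :", simulate(recombine=True)[::40])
```

In the clonal runs the smallest load can never go down, because every child carries at least its parent's mutations; that one-way click is the ratchet. With recombination, a child can inherit fewer mutations than either parent, so less-loaded genomes can be reconstituted.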

1. Eugene V. Koonin: Inevitability of Genetic Parasites, 26 September 2016
2. Eugene V. Koonin: Inevitability of the emergence and persistence of genetic parasites caused by evolutionary instability of parasite-free states, 4 December 2017

Last edited by Otangelo on Wed Sep 13, 2023 7:21 pm; edited 1 time in total




Donald Pan (2022): Virus Origins and the Origin of Life

Viruses are enigmatic, with scientists debating, often heatedly, about the nature of their existence, such as whether viruses should be categorized as living or non-living. They are not even represented in most conventional trees of life. However, viruses have played a driving role in the co-evolution of life on Earth since its origins. Viruses can serve as molecular fossils, providing a possible glimpse at a pool of genes that may have been present from a time before the Last Universal Common Ancestor (LUCA). The study of virology and the origin of life have often been intertwined. Since the earliest studies of viruses, viruses have been hypothesized to be ancient, possibly anteceding cell-based life. Other hypotheses propose that cell-based life preceded viruses, with viruses “escaping” from cellular life forms. 

Any discussion on the origin and nature of viruses inevitably leads down a path toward a discussion of the origin and nature of life itself. Viruses have challenged the definitions of life and how scientists think about it, occupying the hazy boundary between living and non-living. Ever since the earliest days of virology, they have been hypothesized to play a role in the origin of life, predating cells. Soon after bacteriophage were discovered in the 1910s, they were proposed to be related to the earliest life forms, being given the name “protobe” (first life) by their French Canadian discoverer Felix D’Herelle (Vaughan 1927). One of the foundational figures in the modern study of the origin of life, J.B.S. Haldane, in his 1929 essay “The Origin of Life” suggested that life had existed in a “virus stage for many millions of years” before the first cell formed (Haldane 1929). Around the beginning of this century, discoveries of novel viruses like giant viruses and new families of archaeal viruses, as well as the development of sequencing technologies and metagenomics have revealed the global impact and importance of viruses to life on Earth. In this new era, there has been a revival of interest in the evolutionary origins of viruses and the roles that viruses may have played at the origin of life on Earth. There has also been increasing interest in viruses for astrobiology and the search for extraterrestrial life.

Viruses are intertwined with life on Earth. They are known to infect all domains of life. As far as we know, there are apparently no cellular organisms that do not harbor viruses or related genetic parasites. Their fingerprints are found within the genomes of all cellular organisms. Viruses and related selfish replicating genetic elements have been locked in an arms race with cellular life likely since its origins. Viruses are also abundant in extreme environments that may have been the setting for the origin of life, for example hydrothermal vents and terrestrial hot springs. Thus, viruses, genetic parasites, and other selfish replicating elements offer clues to early life. In addition, viruses have been implicated in various major evolutionary events in the history of the Earth.

What Are Viruses? Evolution of the Virus Concept
What are viruses? This seemingly simple question has been fraught with ambiguities and debate from the earliest days of virology to today. Even the definition of virus has evolved over time. The word “virus” derives from the Latin virus, meaning “poison”, and had originally referred to anything causing disease. Modern virology started in the late 1800s when Ivanovsky and Beijerinck found that an agent (now known to be tobacco mosaic virus), which had passed through Chamberland filters that no known bacteria could pass, was capable of causing disease in tobacco leaves. Viruses were redefined by their ability to pass through filters and were initially called “contagium vivum fluidum” (contagious living fluid) and later “filterable viruses”. Viruses infecting bacteria, which were named “bacteriophage”, were discovered by Twort and D’Herelle in the early 1900s. In 1926, Thomas Rivers found that viruses are obligate parasites. Wendell Stanley’s 1935 crystallization of the tobacco mosaic virus confirmed that viruses are particulate, not fluid, in nature, and in 1937 Bawden and Pirie found that virus particles are composed of protein and nucleic acid. During this early period of virology, the virus particle (virion) and the virus were considered one and the same. However, the virus concept shifted with the discovery of lysogeny. Initially discovered by Burnet & McKie in 1929 and further developed by André Lwoff, some bacteriophage were found to be incorporated into the host as prophage. The virus concept could no longer be limited to the virus particle; it must also include the virus integrated with the host cell. Some viruses can also reside in host cells without integrating into the host genome. It became clear that viruses can have a distinct intracellular and extracellular phase. This necessitated a shift in the virus concept to not only be centered on the virion, but to also include all the phases of the virus replication cycle, extracellular as well as intracellular. It was no longer appropriate to confuse the virion for the virus. This was recognized early on by Lwoff (1957); however, this confusion remains a major pitfall nowadays with much of the public (as well as many scientists).

Raoult and Forterre proposed a definition of viruses that would divide life into two categories: capsid-encoding (viruses) and ribosome-encoding organisms (cells). Under this definition, viruses necessarily encode capsids which serve to propagate the virus extracellularly. Cellular life forms all encode ribosomes, while there are no known viruses that encode ribosomes. This further led to the formation of the “virocell concept” which asserts that the infected virion-producing cell is in fact the living phase of the virus. Cells that do not produce virions are termed ‘ribocells’. The virus in the course of infection converts the cell into a ‘virion factory’. A virus co-existing with its host cell is termed a ‘ribovirocell’. Framing the virus-infected cell as a cellular form of the virus can offer a perspective on phenomena such as the infection of other cells by an infected cell via direct cell-to-cell contact rather than through extracellular virion release. The virocell concept also aligns with why a virus may encode genes for metabolism (auxiliary metabolic genes) that are seemingly useless for the metabolically inert virion (e.g. genes related to photosynthesis or fermentation).

For the purpose of examining the origin of viruses and their roles in the origin of life, I will utilize the replicator paradigm coined by Koonin which includes phylogenetically related replicons as part of a wider “viral world” in which the virus concept is “genetic, informational parasitism”. The reason for utilizing this paradigm is that capsidless selfish elements likely preceded bona fide viruses in evolution. As more convention-breaking viruses were discovered and vast amounts of genomic data became available, a rethinking of traditional notions of viruses became necessary. Recently, the views of viruses have shifted dramatically. The discovery of giant viruses eliminated the notion that virions are necessarily small. Further investigations of viruses have also shown that they play a more complex role than mere disease-causing agents. They also engage in mutualistic interactions with their hosts. It is still challenging to rigorously define what a virus is. There is also debate about whether viruses should be considered organisms and what the definition of an organism should be. Many have resorted to referring to viruses as “biological entities” to avoid this. However, this label still suffers from the same lack of a rigorous definition: what is a biological entity or biological individual? If viruses can integrate into hosts, is an endogenized virus still a virus or biological entity? The question of the nature of viruses will likely continue to be debated as it may belong more to the realm of philosophy.

Viruses Co-evolved with Cellular Life on Earth 
The modern era of viral ecology likely started with the discovery of the enormous quantity of virus particles in the ocean, exceeding the number of cells. Since then, virions have been revealed to be ubiquitous in every environment wherever cells exist. The development of viral metagenomics and the availability of massive sequence data have revealed the enormous genetic diversity of viruses, far greater than that of cellular life forms. Most of the genetic diversity on Earth resides in the virosphere. Viruses have shaped the genomes of all cellular life forms. There does not seem to be any organism that does not harbor viruses or related genetic parasites. Organisms for which no viruses are known more likely reflect a lack of effort in searching than an actual absence of viruses. Even in their genomes, there can still be remnants of past viral infections such as integrated proviruses and other parasitic genetic elements. Many eukaryotic genomes are filled with selfish elements, for instance, the human genome, in which parasitic genetic elements make up a significant portion. From the perspective of the virus, cells can be seen as a habitat and rare resource necessary for replication. Viruses engage in complex interactions of competition and cooperation in order to compete for space in the cellular habitat. And given the abundance and diversity of virions in the environment, it is no wonder that all cellular life forms on Earth have evolved systems to resist the constant invasion of genetic parasites, for example CRISPR-Cas, restriction modification, and RNAi. However, virus-host co-evolution is not only an arms race. Viruses also engage in mutualistic symbioses. Viruses can mediate horizontal gene transfer and can shape cellular genomes through integration. Some viruses can integrate into host genomes as proviruses which may become endogenized, losing their ability to produce virions. The viral components can then become exapted by the host for other uses. These interactions between virus and host have shaped all life on Earth.

Viruses Share Homologous Genes Not Found in Cellular Life Forms 
Viral sequence data has allowed the identification of virus “hallmark genes” shared across major viral lineages. The viruses of the 3 domains of life share hallmark genes that do not share any homology with those from cellular life forms, suggesting that viruses are an ancient lineage, deriving from before the Last Universal Cellular Ancestor (LUCA, also referred to as Last Universal Common Ancestor). Whereas the core genes of all cellular life forms are monophyletic, deriving from the same lineage as LUCA, core viral replication genes such as RNA-dependent RNA polymerase, reverse transcriptase, protein-primed DNA polymerase, and rolling-circle replication initiation endonuclease do not share homology with those from cellular life forms. In addition, homologous viral capsid proteins containing the double jelly roll fold and HK97 fold are encoded by diverse viruses infecting all 3 domains of life, but are not found in cellular life forms, suggesting that viruses infecting the 3 domains had evolved before the 3 domains diverged from LUCA.

Viruses Utilize Diverse Modes of Replication 
Several features of viruses make them attractive as models for pre-LUCA life forms. One primary feature is the utilization of a greater variety of genetic information storage and replication strategies than extant cellular life forms. Viruses can encode their genomes in linear or circular form, as double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), double-stranded RNA (dsRNA), or single-stranded RNA (ssRNA), whereas extant cells only utilize dsDNA. This is illustrated by the Baltimore classification of viruses (Baltimore 1971), which groups viruses by their genome types and modes of transcription and replication. All cells follow the “Central Dogma”, coined by Francis Crick (1958), in which genomic information is stored as dsDNA, transcribed into RNA, and translated into protein. However, viruses can store their genomic information in myriad forms and can replicate in ways that defy the Central Dogma, suggesting that they did not derive from LUCA. Furthermore, viruses do not only rely on host cell machinery; many also encode their own polymerases. The many modes of replication and the diversity of replication machineries encoded by viruses offer tantalizing clues to possible alternative modes of replication on Earth before the evolution of DNA-based genomes. Take for example reverse-transcribing viruses (Groups VI and VII of the Baltimore classification) that use the enzyme reverse transcriptase to produce DNA from RNA genomes or templates. Viruses encoding reverse transcriptase may have been instrumental during the transition from the RNA World to DNA World via RNA-dependent DNA synthesis.
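For readers who want the Baltimore classification referred to above laid out explicitly, it can be written as a small lookup table. This is only an illustrative sketch: the seven groups are the standard ones, while the example viruses and the names `BaltimoreGroup` and `BALTIMORE` are my own additions, not taken from Pan's text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaltimoreGroup:
    numeral: str
    genome: str                  # genome type packaged in the virion
    reverse_transcribing: bool   # replication goes through reverse transcriptase
    example: str                 # well-known member, added here for illustration


BALTIMORE = [
    BaltimoreGroup("I", "dsDNA", False, "bacteriophage lambda"),
    BaltimoreGroup("II", "ssDNA", False, "bacteriophage phiX174"),
    BaltimoreGroup("III", "dsRNA", False, "rotavirus"),
    BaltimoreGroup("IV", "(+)ssRNA", False, "tobacco mosaic virus"),
    BaltimoreGroup("V", "(-)ssRNA", False, "influenza virus"),
    BaltimoreGroup("VI", "ssRNA-RT", True, "HIV"),
    BaltimoreGroup("VII", "dsDNA-RT", True, "hepatitis B virus"),
]


def reverse_transcribing_groups():
    """Numerals of the groups that rely on reverse transcriptase."""
    return [g.numeral for g in BALTIMORE if g.reverse_transcribing]


def non_cellular_genome_types():
    """Genome types other than plain dsDNA, the only type cells use."""
    return [g.genome for g in BALTIMORE if g.genome != "dsDNA"]


if __name__ == "__main__":
    print(reverse_transcribing_groups())
    print(non_cellular_genome_types())
```

Six of the seven groups store their genome in a form that no extant cell uses, which is the contrast the paragraph above draws.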

Giant Viruses 
While large viruses infecting algae have been known since the 1980s, the discovery of giant viruses with genomes of over 1 million base pairs in length (as large as some prokaryotic genomes) has challenged the conception of what viruses are. Visible by light microscopy, the mimivirus was originally thought to be a bacterium (initially named “Bradfordcoccus”) when it was first observed in 1992. It remained classified as a bacterium until 2003 when it was found to actually be an enormous virus. Since then, numerous giant viruses have been discovered including Pandoravirus, Pithovirus, and Tupanvirus. Unlike other viruses, giant viruses encode thousands of genes including many involved in translation, leading some to claim that giant viruses represent an intermediate stage between a parasitic cell and typical smaller viruses or that giant viruses belong to a fourth domain of life that may have preceded LUCA. Some have also hypothesized that the giant viruses are an ancient lineage that arose before the origin of eukaryotes. Giant viruses were also found to be susceptible to infection by other viruses (virophage). Based on this evidence, some have suggested that viruses should be considered living organisms. More recently, “huge phages” with genomes as large as 735 kbp were detected based on metagenomic sequencing (Al-Shayeb et al. 2020), reigniting the question of whether they may have arisen from a pre-cellular stage of evolution.

Virus-First Hypothesis 
The idea that viruses are older than cellular life forms is as old as virology itself, with some of the foundational thinkers in virology and origin of life studies suggesting that viruses were the first forms of life on Earth (Haldane 1929). This “Virus-first hypothesis” asserts that viruses emerged before cellular life forms evolved. This idea had been disputed because viruses are obligate intracellular parasites of cells, so it would be impossible for bona fide viruses to precede cells in evolution. Another potential problem is that most virus capsid proteins seem to originate from cellular proteins. If one were to utilize a virus concept that strictly limits viruses to capsid-encoding genetic parasites, the virus-first hypothesis would have to be rejected. However, this is avoided with a more expansive virus concept by including related virus-like genetic replicators. The appearance of parasitic replicators at the origin of life is supported by studies demonstrating that parasites emerge in any system of replicating genetic elements. This was developed by Koonin in the “Primordial Virus World hypothesis”, in which diverse, virus-like selfish replicators first emerged in inorganic compartments such as those found in hydrothermal vents before the first cells evolved. Some portion of the initial replicators inevitably become parasites, initiating a continuous process of virus-host co-evolution. These compartments of replicators and parasites would eventually evolve into extant cells and viruses. Given that viruses can replicate in cell-free lysate, it is not hard to imagine that virus-like replicators may have replicated in primordial enzyme-containing pre-cellular compartments prior to the emergence of cells.

Viroids and RNA Viruses in the RNA World 
The RNA World hypothesis suggests that the earliest life forms on Earth were RNA replicons. RNA is attractive as a candidate for an early form of life on Earth because it carries both genetic information and enzymatic function in a single molecule. Viroids, which do not encode any proteins, are the simplest RNA replicons known and have been suggested to be “living fossils” of the RNA World. Viroids are a biological entity that may fill an evolutionary gap before ribosomes and proteins had evolved. However, some argue that viroids are unlikely to have been present during the early history of life on Earth because they are found only in plants and not in other domains of life, suggesting that they may be a more recent evolutionary development. Regardless, viroids and RNA viruses in general serve as models of replicators in the RNA World. In fact, the viral quasispecies model, which has been utilized extensively to study RNA viruses, was originally developed to model the behavior of autonomous prebiotic replicons on early Earth (Eigen 1971). The high error rates of RNA virus and viroid replication make them well suited to study as quasispecies.
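The link between high error rates and the quasispecies model can be illustrated with the error-threshold calculation usually attached to Eigen's model. This is a simplified sketch, not something spelled out in Pan's text: it assumes a single "master" sequence that replicates `superiority` times faster than its mutants, a uniform per-base error rate, and no back mutation. The numbers are illustrative only.

```python
import math


def error_free_copy_probability(error_rate, length):
    """Chance that a genome of `length` bases is copied without any error."""
    return (1.0 - error_rate) ** length


def master_persists(error_rate, length, superiority):
    """Simplified Eigen criterion: superiority * P(error-free copy) must exceed 1."""
    return superiority * error_free_copy_probability(error_rate, length) > 1.0


def max_genome_length(error_rate, superiority):
    """Longest genome that still satisfies the criterion (about ln(superiority) / error_rate)."""
    return math.log(superiority) / -math.log(1.0 - error_rate)


if __name__ == "__main__":
    u = 1e-4  # assumed per-base error rate, chosen only for illustration
    print(master_persists(u, 5_000, 2.0))
    print(master_persists(u, 10_000, 2.0))
    print(round(max_genome_length(u, 2.0)))
```

Under these assumptions the maximum sustainable genome length scales inversely with the error rate, which is why error-prone replicators such as RNA viruses and viroids are the natural test cases for the model.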

Cell Reduction or Regression Hypothesis 
The regression hypothesis asserts that viruses evolved from parasitic cells that underwent reductive evolution into viruses, akin to the reductive evolution of endosymbionts like the mitochondria. Robert Green and Sir Patrick Laidlaw had first proposed this idea in the 1930s. However, criticisms of this hypothesis focus on the lack of any plausible mechanism by which a cell would regress into a virion form. The discovery of giant viruses reignited contemporary discourse regarding this hypothesis. The virions of giant viruses, upon entry into the cell, form “virion factories” within the cytoplasm, reminiscent of endosymbiotic organelles. The genomes of giant viruses are as large as some prokaryotes and code for thousands of genes, including many of those involved in translation. The recently discovered Tupanvirus was found to encode a near-complete set of translation proteins with the exception of ribosomes. Some have claimed that the giant viruses represent the remnants of a fourth domain (later reassessed “Fourth TRUC—Things Resisting Uncompleted Classification”) that had existed prior to LUCA. However, this had been refuted by phylogenetic analyses suggesting that giant viruses had evolved from smaller viruses. Instead, giant virus lineages had recruited genes from their hosts, expanding their genomes.

Escape or Endogenous Hypothesis 
The escape hypothesis asserts that viruses emerged from cellular genes that became selfish and evolved into viruses capable of escaping the cell. This idea had first been proposed in the 1920s and had become widely accepted among virologists. The strong specificity between viruses and their hosts seems to support this idea: there do not seem to be any viruses that can infect hosts across domains. However, the escape hypothesis does not explain the apparent homologies between viruses infecting different domains of life. To account for this, a modified version of the escape hypothesis had been proposed, with viruses escaping pre-LUCA protocells rather than modern post-LUCA cells. In this modified scenario, selfish genes within the genomes of RNA World protocells became viruses by recruiting host proteins for encapsidation. The escape hypothesis was further modified by merging it with the virus-first hypothesis in which early pre-cellular replicators emerged first, symbiotically associating with the first protocells after they emerged. The intracellular replicators may then evolve into bona fide viruses by recruiting capsid proteins from those early protocells.




Roger W. Hendrix (2000): Genome analyses of double-strand DNA-tailed bacteriophages argue that they evolve by recombinational reassortment of genes and by the acquisition of novel genes as simple genetic elements termed morons. These processes suggest a model for early virus evolution, wherein viruses can be regarded less as having derived from cells and more as being partners in their mutual co-evolution.

Comment: That does not explain the origin of viruses in the first place.

The evolution and origin of viruses and how these processes could relate to the evolution of cellular life have exercised the imagination of biologists since viruses were discovered. In 1924, Félix d’Herelle, one of the discoverers of bacteriophages, proposed a scheme for the origin of life in which viruses were ancestral to cells. As more was learned about the biology of both cells and viruses this view became untenable, on the grounds that viruses’ dependence on host cells for growth makes them improbable precursors of those cells. Most virology texts now suggest rather vaguely that the first virus was derived from some pre-existing cellular structure or metabolic process that ‘learned’ how to be partially independent of the cell.

Recent comparative studies of bacteriophage genome sequences, particularly for the double-strand DNA-tailed phages, have given an increasingly detailed view of the evolutionary mechanisms that must have shaped contemporary bacteriophage genomes. Extrapolation of those mechanisms back in time allows us to suggest a speculative model for the origins of these phages, which, if correct, would argue that even though d’Herelle was not right, neither was he entirely wrong. Genetic mosaicism has been known in phage λ and its relatives since DNA heteroduplex studies were performed in the late 1960s. From these data, Susskind and Botstein (refs. 2, 3) proposed a ‘modular theory’ of phage evolution, in which recombination between different members of the family of phages took place

Roger W. Hendrix: The origins and ongoing evolution of viruses, 1 November 2000

