ElShamah - Reason & Science: Defending ID and the Christian Worldview

Otangelo Grasso: This is my personal virtual library, where I collect information which, in my view, leads to the Christian faith, creationism, and Intelligent Design as the best explanation of the origin of the physical Universe, life, and biodiversity.



Abiogenesis: What Might Be a Cell’s minimal requirement of parts ?


Otangelo



What might be a Cell’s minimal requirement of parts ?  1

https://reasonandscience.catsboard.com/t2110-what-might-be-a-protocells-minimal-requirement-of-parts

The simplest cells available to us for study have nothing "primitive" about them. They have a teleonomic apparatus so powerful that no vestiges of truly primitive structures are discernible.
J.Monod Chance and Necessity: An Essay on the Natural Philosophy of Modern Biology 1972

DIANA YATES (OCT 5, 2011) New evidence suggests that LUCA was a sophisticated organism after all, with a complex structure recognizable as a cell, researchers report. Their study appears in the journal Biology Direct. The study lends support to a hypothesis that LUCA may have been more complex even than the simplest organisms alive today, said James Whitfield, a professor of entomology at Illinois and a co-author on the study.
Yates, D. (2011, October 5). Last Universal Common Ancestor had a complex cellular structure. Illinois News.

Caetano-Anollés, G. (2011): The proteome of LUCA was recently reconstructed and was shown to embody the functions of a complex organism.
Evolution of vacuolar proton pyrophosphatase domains and volutin granules: clues into the early evolutionary origin of the acidocalcisome. Biology Direct, 6(50).


https://www.sciencedaily.com/releases/2011/10/111005112145.htm
Tiny genome: A somewhat vague term for a → genome that is smaller than 300 kb, or smaller than the genome of Mycoplasma genitalium, the free-living organism with the smallest genome. Five obligate symbionts of insects have such tiny genomes: ‘Candidatus Sulcia muelleri’ (245,530 bp), ‘Candidatus Zinderia insecticola’ (208,564 bp), ‘Candidatus Carsonella ruddii’ (159,662 bp), ‘Candidatus Hodgkinia cicadicola’ (143,795 bp) and ‘Candidatus Tremblaya princeps’ (138,927 bp).

The genome sequence reveals that “Ca. Tremblaya princeps” cannot be considered an independent organism but that the consortium with its gammaproteobacterial symbiotic associate represents a new composite living being. 3

The chance of unguided random natural events producing just a minimal functional proteome, not considering all the other essentials needed to get a first living, self-replicating cell, is estimated as follows:

Let's suppose we have fully operational raw materials and the genetic language in which to store genetic information. Only now can we ask: where did the information come from to make the first living organism? Various attempts have been made to lower the minimal information content needed to produce a fully operational cell. Mycoplasma is often mentioned as a reference for the threshold between the non-living and the living. Mycoplasma genitalium is held to be the smallest possible living self-replicating cell. It is, however, a pathogen, an endosymbiont that only lives and survives within the body or cells of another organism (humans). As such, it imports many nutrients from the host organism. The host provides most of the nutrients such bacteria require, hence the bacteria do not need the genes for producing such compounds themselves. It therefore does not require the same complexity of biosynthetic pathways to manufacture all its nutrients as a free-living bacterium does.

The simplest free-living bacterium is Pelagibacter ubique. 13 It is known to be one of the smallest and simplest self-replicating, free-living cells. It has complete biosynthetic pathways for all 20 amino acids. These organisms get by with about 1,300 genes: a genome of 1,308,759 base pairs that codes for 1,354 proteins. 14 That would be the size of a book with 400 pages, each page with 3,000 characters. They survive without any dependence on other life forms. Incidentally, these are also the most “successful” organisms on Earth. They make up about 25% of all microbial cells. If a chain could link up, what is the probability that the code letters might by chance be in some order which would be a usable gene, usable somewhere, anywhere, in some potentially living thing? If we take a model size of 1,200,000 base pairs, the chance of getting one specific sequence at random would be 1 in 4^1,200,000, or about 1 in 10^722,000. This probability is hard to imagine, but an illustration may help.

Imagine covering the whole of the USA with small coins, edge to edge. Now imagine piling other coins on each of these coins, and continuing to pile coins on each coin until the stacks reach the moon, about 400,000 km away. Suppose you were told that within this vast mountain of coins there was one coin different from all the others. With coins roughly 2 cm across and 1.5 mm thick, the pile would hold on the order of 10^28 coins, so the statistical chance of finding that one coin on a single blind draw is about 1 in 10^28.
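The two figures above can be checked with a short calculation. This is only a sketch under stated assumptions: the genome figure assumes that each of the 1,200,000 positions is independent, that the four bases are equally likely, and that exactly one sequence counts as a success (as the calculation in the text does). The coin figure depends on values that are not given in the text (the land area of the USA and the coin's diameter and thickness are assumptions chosen for illustration), and the function names are hypothetical.

```python
import math


def log10_sequence_space(length: int, alphabet_size: int = 4) -> float:
    """Return log10 of the number of distinct sequences of `length` symbols."""
    return length * math.log10(alphabet_size)


def log10_coins_in_pile(area_m2: float, height_m: float,
                        coin_diameter_m: float, coin_thickness_m: float) -> float:
    """Return log10 of the number of coins in stacks covering an area up to a height."""
    coins_per_layer = area_m2 / coin_diameter_m ** 2  # square grid, coins edge to edge
    layers = height_m / coin_thickness_m
    return math.log10(coins_per_layer * layers)


# Figures taken from the text.
GENOME_LENGTH = 1_200_000           # base pairs in the model genome
MOON_DISTANCE_M = 400_000 * 1000    # 400,000 km expressed in metres

# Assumed values, NOT given in the text.
USA_AREA_M2 = 9.8e12                # about 9.8 million square kilometres
COIN_DIAMETER_M = 0.019             # a coin about 19 mm across
COIN_THICKNESS_M = 0.0015           # a coin about 1.5 mm thick

genome_exponent = log10_sequence_space(GENOME_LENGTH)
coin_exponent = log10_coins_in_pile(
    USA_AREA_M2, MOON_DISTANCE_M, COIN_DIAMETER_M, COIN_THICKNESS_M
)

print(f"possible sequences: about 10^{genome_exponent:,.0f}")
print(f"coins in the pile:  about 10^{coin_exponent:.0f}")
```

With these inputs the sequence space comes out near 10^722,000, and the coin pile near 10^28. Changing the assumed coin size by a factor of a few shifts the second exponent by about one, so the comparison between the two numbers is not sensitive to the exact coin chosen.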

The argument of the cell
1. At least 1,300 proteins are required as building blocks for the simplest living cell to come into existence.
2. Proteins are highly complex structures.
3. The probability of the random creation of complex proteins, and of the assemblage of the needed 1,300 in one place in nature without any control, is less than 1 in 10^700,000, which is effectively impossible.
4. If the molecules required to make the four basic building blocks of life are left to themselves, they simply randomize and turn into asphalt.
5. Eliminative inductions argue for the truth of a proposition by arguing that competitors to that proposition are false. The impossibility of chance indicates the necessity of an intelligent designer to explain the origin of life.

How Many Genes Can Make a Cell: The Minimal-Gene-Set Concept
https://www.ncbi.nlm.nih.gov/books/NBK2227/
Several theoretical and experimental studies have endeavored to derive the minimal set of genes that are necessary and sufficient to sustain a functioning cell under ideal conditions, that is, in the presence of unlimited amounts of all essential nutrients and in the absence of any adverse factors, including competition. A comparison of the first two completed bacterial genomes, those of the parasites Haemophilus influenzae and Mycoplasma genitalium, produced a version of the minimal gene set consisting of ~250 genes.

The following irreducible processes and parts are required to keep cells alive, and they illustrate the "Mount Improbable" that must be climbed to get life at the first go:
Reproduction. Reproduction is essential for the survival of all living things.
Metabolism. Enzymatic activity allows a cell to respond to changing environmental demands and regulate its metabolic pathways, both of which are essential to cell survival. 
Nutrition. This is closely related to metabolism. Seal up a living organism in a box for long enough and in due course it will cease to function and eventually die. Nutrients are essential for life. 
Complexity. All known forms of life are amazingly complex. Even single-celled organisms such as bacteria are veritable beehives of activity involving millions of components. 
Organization. Maybe it is not complexity per se that is significant, but organized complexity. 
Growth and development. Individual organisms grow and ecosystems tend to spread (if conditions are right). 
Information content. In recent years scientists have stressed the analogy between living organisms and computers. Crucially, the information needed to replicate an organism is passed on in the genes from parent to offspring. 
Hardware/software entanglement. All life of the sort found on Earth stems from a deal struck between two very different classes of molecules: nucleic acids and proteins. 
Permanence and change. A further paradox of life concerns the strange conjunction of permanence and change.
Sensitivity. All organisms respond to stimuli— though not always to the same stimuli in the same ways.
Regulation. All organisms have regulatory mechanisms that coordinate internal processes.

Nature made at least three new types of inventions in assembling living cells from building blocks produced by prebiotic chemistry:
catalysis synchronized the necessary chemical reactions;
specificity put the building blocks together correctly; and
heritable blueprints – genetic coding – furnished sufficient continuity for complexity to grow.
The most dramatic of these inventions were all completed and probably overwritten before the first living cells appeared
http://serious-science.org/ancient-enzymes-reveal-the-dna-genesis-3234

Only self-aware, conscious, intelligent beings can invent things.

Bhavesh H Patel: Common origins of RNA, protein and lipid precursors in a cyanosulfidic protometabolism 2015 Jul 8
A minimal cell can be thought of as comprising informational, compartment-forming and metabolic subsystems. To imagine the abiotic assembly of such an overall system, however, places great demands on hypothetical prebiotic chemistry. The perceived differences and incompatibilities between these subsystems have led to the widely held assumption that one or other subsystem must have preceded the others.
https://pubmed.ncbi.nlm.nih.gov/26154881/

The discussion on what is an ideal chassis (a natural, robust cell or a minimized version) is still open (2015). There is a hierarchy of minimal cells depending on the chemical composition of the medium. 14 Which microorganism is the smallest known is debatable. 17 There isn’t a consensus over how small a free-living organism can be, and what the space optimization strategies may be for a cell at the lower size limit for life.

First Detailed Microscopy Evidence of Bacteria at the Lower Size Limit of Life
https://newscenter.lbl.gov/2015/02/27/ultra-small-bacteria/
Scientists have captured the first detailed microscopy images of ultra-small bacteria that are believed to be about as small as life can get. About 150 of these bacteria could fit inside an Escherichia coli cell.  The cells are close to and in some cases smaller than several estimates for the lower size limit of life. This is the smallest a cell can be and still accommodate enough material to sustain life.  The bacterial cells have densely packed spirals that are probably DNA, a very small number of ribosomes, hair-like appendages, and a stripped-down metabolism that likely requires them to rely on other bacteria for many of life’s necessities.

https://www.nature.com/articles/ncomms7372#Sec19

So these ultra-small bacteria are not candidates for a putative minimal free-living cell either.

A minimal (extant, non-primitive) cell has been defined as “a biological system that possesses only the necessary and sufficient attributes to be considered alive. Therefore, it must be able to maintain its own structures (homeostasis), self-reproduce, and evolve in a supportive, protected, and stable environment”.   Thus, the challenge is to demarcate those “necessary and sufficient attributes” of life, and a functional approach appears adequate for that purpose. The functional elements of a living cell are (lipid) membranes, proteins, and RNA molecules, and the instructions for making these parts, which are encrypted in genes (i.e., DNA) whose information must be “read” by the rest of the molecular machinery. For this reason, a major challenge in biology during the last decades has been to define the minimal number of genes necessary to keep a minimal cell alive, what has been called a minimal genome. Most studies have focused on bacteria, due to their apparent simplicity and the amount of information that has already been gathered about them. 14

Learning from Nature to Define a Minimal Genome
In order to define a set of essential and sufficient genes to keep a bacterial cell alive, it is first necessary to define which are the essential functions that need to be fulfilled. To approach this point, scientists have looked for functions that have been preserved in natural living bacteria with the most reduced genomes, because they must retain all genes involved in informational functions plus a minimal metabolic network for cellular maintenance and reproduction in their given niche. To date, all known cases of reduced bacterial genomes are associated with specific lifestyles linked to stable environments: cosmopolitan oceanic free-living bacteria and obligate symbionts (either parasitic or mutualistic), the latter being the most affected by reductive genome evolution. In other words, natural small genomes have been observed in diverse situations with remarkable dissimilarities, including a dramatic difference in population sizes, from large marine bacterioplankton populations to small populations of endosymbiotic bacteria inside a eukaryotic cell 14

Examples of microorganisms with small genomes completely sequenced

Mycoplasma genitalium is held to be the smallest possible living self-replicating cell and, as such, is used as a reference for the smallest possible living cell. Mycoplasma is, however, a pathogen, an endosymbiont 13 that only lives and survives within the body or cells of another organism (humans). As such, it imports many nutrients from the host organism. The host provides most of the nutrients such bacteria require, hence the bacteria do not need the genes for producing such compounds themselves. 12 Endosymbionts can only survive inside their host cells as they rely on their host (and, in some cases, on co-primary endosymbiotic partners) for metabolic and other functions. 14 Mycoplasma therefore does not require the same complexity of biosynthetic pathways to manufacture these nutrients as a free-living bacterium does.

Amino Acid Transport in Mycoplasma
The fact that the minute Mycoplasma cells lack many biosynthetic pathways and depend on the supply of many nutrients from the growth medium may indicate the presence of numerous transport systems in these organisms
https://jb.asm.org/content/jb/95/5/1685.full.pdf

This indicates that the Last Universal Common Ancestor would have had to be much more complex. Amino acids, for example, were not readily available on the early earth. In the Miller-Urey experiment, eight of the 20 amino acids were never produced, neither in 1953 nor in the subsequent experiments.

How Many Genes Can Make a Cell: The Minimal-Gene-Set Concept
https://www.ncbi.nlm.nih.gov/books/NBK2227/
(x) No biosynthetic pathways for amino acids, since we suppose that they can be provided by the environment.

My comment: This paper presupposes that amino acids could be provided by the environment. In prebiotic scenarios, such a supposition is not justified. What can be inferred from this is that the usually proposed minimal genome, proteome, and metabolome for a putative LUCA are incorrect, and that a complete set of metabolic pathways should be incorporated.

As for the metabolic aspect, an obvious difficulty is that there is no one minimal gene set for life but many, depending on the environment. Among cells with reduced genomes, we find a continuum of metabolic modes, from organic matter-dependent heterotrophy to the minimally demanding autotrophy. 14

The translation machinery is, by far, the most complex part of a modern minimal cell, both in its biogenesis and its function. Therefore, it was not surprising that half of the genes previously classified as poorly characterized have been associated with the maturation of the translation apparatus.

Ribosome biogenesis is fundamental for cellular life, but surprisingly little is known about the underlying pathway. 15  The biosynthesis of ribosomes is, therefore, an essential process for all living organisms. A highly complex interaction of a multiplicity of non-ribosomal proteins and small nucleolar RNAs (snoRNAs) facilitates ribosome formation. Prokaryotic ribosome synthesis is a complex, multistep process requiring the coordinated synthesis, cleavage, post-transcriptional modification and folding of ribosomal RNA (rRNA), and the translation, post-translational modification, folding and binding of approximately 50 ribosomal proteins (r-proteins). 16  Ribosome biogenesis is energetically costly, with the majority of cellular transcription and translational capacity dedicated to the production of new ribosomes.   This process is both rapid, requiring ∼2 minutes for production of a single ribosome, and efficient, with the vast majority of assembly events resulting in mature, translationally active complexes. The assembly of ribosomes is tightly regulated in a growth-rate–dependent manner primarily at the level of rRNA synthesis


Network fragility increases with metabolic minimization. 
Gene essentiality was determined in in silico knock-out experiments using Flux Balance Analysis (FBA) on metabolic models inferred from complete genomes, except for the minimal theoretical network, based on CMG (Gil et al., 2004), where Elementary Flux Mode analysis was used. From right to left, the data points correspond to E. coli (Belda et al., 2012), ancestral and extant S. glossinidius network (Belda et al., 2012), M. pneumoniae (Wodke et al., 2013), Blattabacterium (González-Domenech et al., 2012), B. aphidicola BAp (Thomas et al., 2009), B. aphidicola BCc (Belda et al., 2012), and the minimal theoretical metabolism (Gabaldón et al., 2007). CDS, protein coding sequences.

My comment: This indicates that robustness and homeostasis of the metabolome are only reached once a certain size and complexity has been achieved. This poses a paradox for any gradual scenario: it would have been extremely unlikely for the essential metabolic network to have grown step by step, from small to big, until reaching a minimal functional size. Rather than integrate, the molecules would have disintegrated.
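The idea behind an in silico knock-out experiment can be sketched with a toy example. The code below is not Flux Balance Analysis: it swaps in a much simpler reachability test (can the target still be produced from the nutrients once one reaction is removed?) as a stand-in. The two networks, the metabolite names, and the function names are hypothetical and chosen only for illustration.

```python
def can_make(target, nutrients, reactions, knocked_out=None):
    """Return True if `target` is reachable from `nutrients` without `knocked_out`."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name == knocked_out:
                continue  # this reaction has been deleted in silico
            # Fire the reaction if all substrates exist and it adds something new.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def essential_fraction(target, nutrients, reactions):
    """Fraction of reactions whose single knock-out abolishes production of `target`."""
    essential = [
        name for name in reactions
        if not can_make(target, nutrients, reactions, knocked_out=name)
    ]
    return len(essential) / len(reactions)


NUTRIENTS = {"glucose"}

# Hypothetical minimal network: a single linear route, no redundancy.
minimal = {
    "r1": ({"glucose"}, {"A"}),
    "r2": ({"A"}, {"B"}),
    "r3": ({"B"}, {"biomass"}),
}

# Hypothetical larger network: the same route plus two alternative reactions.
redundant = dict(minimal)
redundant["r2_alt"] = ({"A"}, {"B"})
redundant["r4"] = ({"glucose"}, {"B"})

print(essential_fraction("biomass", NUTRIENTS, minimal))
print(essential_fraction("biomass", NUTRIENTS, redundant))
```

In the linear network every reaction is essential, while in the larger network only the final step is, which mirrors the trend described in the figure caption: the more a network is minimized, the larger the share of its parts that cannot be lost.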

Surface area (SA) and volume (V) ratios in three selected species of different sizes: Escherichia coli, “Candidatus Pelagibacter ubique,” and Nanoarchaeum equitans. The microorganism with the smallest dimensions is “Ca. P. ubique”. The habitat of “Ca. P. ubique” is the open ocean (oligotrophs occupy environments where the available nutrients offer little to sustain life). The total numbers of proteins encoded by the genomes of E. coli (NCBI Reference Sequence: NC_000913.3), “Ca. P. ubique” (GenBank: CP000084.1) and N. equitans (GenBank: AE017199.1) are given and related to the proteins with membrane-spanning domains. For prediction of transmembrane helices in proteins, the above genomes were analyzed using the TMHMM 2.0 Server at http://www.cbs.dtu.dk/services/TMHMM/ (Krogh et al., 2001; Möller et al., 2001). ∗Dimensions and calculations of surface area and volume were obtained from Young (2006). ∗∗The diameter was obtained from Huber et al. (2002), and the surface area and volume were calculated with the equations for a sphere, $SA = 4\pi r^2$ and $V = \frac{4}{3}\pi r^3$, where r is the radius.
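The sphere equations in the caption imply that the surface-area-to-volume ratio equals 3/r, so it rises as a cell gets smaller. A minimal sketch follows; the radii used are hypothetical round numbers for illustration, not the measured dimensions from the figure, and the function names are assumptions.

```python
import math


def sphere_surface_area(radius: float) -> float:
    """Surface area of a sphere, SA = 4 * pi * r^2."""
    return 4.0 * math.pi * radius ** 2


def sphere_volume(radius: float) -> float:
    """Volume of a sphere, V = (4/3) * pi * r^3."""
    return (4.0 / 3.0) * math.pi * radius ** 3


def surface_to_volume(radius: float) -> float:
    """SA/V ratio of a sphere; algebraically this simplifies to 3 / r."""
    return sphere_surface_area(radius) / sphere_volume(radius)


# Hypothetical radii in micrometres (NOT the values from the figure).
for radius_um in (1.0, 0.5, 0.2):
    ratio = surface_to_volume(radius_um)
    print(f"r = {radius_um} um -> SA/V = {ratio:.1f} per um")
```

Halving the radius doubles the ratio, which is why very small cells have proportionally more membrane surface available for transport per unit of cytoplasm.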





Proteome Organization in a Genome-Reduced Bacterium
The bacterium Mycoplasma pneumoniae, a human pathogen, has a genome of reduced size and is one of the simplest organisms that can reproduce outside of host cells. As such, it represents an excellent model organism in which to attempt a systems-level understanding of its biological organization.

From proteomics to the cell. By a combination of pattern recognition and classification algorithms, the following TAP-identified complexes from M. pneumoniae, matching to existing electron microscopy and x-ray and tomogram structures (A), were placed in a whole-cell tomogram (B): the structural core of pyruvate dehydrogenase in blue (~23 nm), the ribosome in yellow (~26 nm), RNA polymerase in purple (~17 nm), and GroEL homomultimer in red (~20 nm). Cell dimensions are ~300 nm by 700 nm. The cell membrane is shown in light blue. The rod, a prominent structure filling the space of the tip region, is depicted in green. Its major structural elements are HMW2 (Mpn310) in the core and HMW3 (Mpn452) in the periphery, stabilizing the rod. The individual complexes (A) are not to scale, but they are shown to scale within the bacterial cell (B). 11

Higher level of proteome organization.
(A) The RNA polymerase–ribosome assembly. Core components are represented by circles, attachments by diamonds. The line attribute corresponds to socio-affinity indices: dashed lines, 0.5 to 0.86; plain lines, >0.86. Color code and shaded yellow circles around groups of proteins refer to individual complexes: RNA polymerase (pink), ribosome (purple), and translation elongation factor (green). The bottom graph shows that the ribosomal protein RpsD (23 kD) and the α subunit of the RNA polymerase, RpoA-TAP (57 kD), co-elute in high molecular weight fractions (MD range) during gel filtration chromatography. 
(B) DNA topoisomerase (diameter ~ 12 nm) is a heterodimer in bacteria: ParE (ATPase and DNA binding domains) and ParC (cleavage and C-terminal domains). The interaction between ParE-DNA–binding and ParC–cleavage domains was modeled by using yeast topoisomerase II as a template [Protein Data Bank (PDB) code 2rgr], and ParE-ATPase and ParC–C-terminal domains were modeled separately on structures of gyrase homologs (PDB 1kij and 1suu). All four domains were fitted into the electron microscopy density. Gyrase (~12 nm) is similarly split in bacteria into GyrA/GyrB, which are paralogs of ParE/ParC, and was modeled and fitted by using PDB 1bjt as a template for the GyrB-DNA–binding and GyrA-cleavage domains interaction. (C) Protein multifunctionality in M. pneumoniae illustrated with the AARS complexes.



According to a peer-reviewed paper published in Science in 2016, “Design and synthesis of a minimal bacterial genome”, the authors' best approximation to a minimal cell has a 531,000-base-pair genome that encodes 473 gene products. It is substantially smaller than that of M. genitalium (580 kbp), which has the smallest genome of any naturally occurring cell that has been grown in pure culture, and it contains the core set of genes that are required for cellular life. That means that all its genes are essential and irreducible. It encodes 438 proteins.

https://sci-hub.ren/10.1126/science.aad6253



 

From the book: Lateral gene transfer in evolution, page 6
To control and process DNA as an information and storage apparatus, an organism REQUIRES AT LEAST a minimal set of DNA polymerase, DNA ligase, DNA helicase, DNA primase, DNA topoisomerase, and a DNA-dependent RNA polymerase.

A fairly complex genome similar to those of free-living prokaryotes, with a variety of functional capabilities including metabolic transformation, information processing, membrane/transport proteins and complex regulation, shared between the three domains of life, emerges as the most likely progenitor of life on Earth, with profound repercussions for planetary exploration and exobiology. The estimate of LUCA's gene content appears to be substantially higher than that proposed previously, with a typical number of over 1000 gene families, of which more than 90% are also functionally characterized.
http://sci-hub.ren/https://www.sciencedirect.com/science/article/pii/S0923250805002676

How Structure Arose in the Primordial Soup
Primitive organisms began to split into the different branches that make up the tree of life. In between those two seminal events, some of the greatest innovations in existence emerged: the cell, the genetic code and an energy system to fuel it all. All three of these are essential to life as we know it, yet scientists know disappointingly little about how any of these remarkable biological innovations came about.
https://www.scientificamerican.com/article/how-structure-arose-in-the-primordial-soup/

How small can a genome get and still run a living organism? 
12 October 2006 

Researchers now say that a symbiotic bacterium called Carsonella ruddii, which lives off sap-feeding insects, has taken the record for smallest genome with just 159,662 'letters' (or base pairs) of DNA and 182 protein-coding genes. At one-third the size of previously found 'minimal' organisms, it is smaller than researchers thought they would find.
https://www.nature.com/news/2006/061009/full/news061009-10.html

The physiology and habitat of the last universal common ancestor
SEPTEMBER 2016
Among 286,514 protein clusters, we identified 355 protein families (∼0.1%) that trace to LUCA by phylogenetic criteria. Because these proteins are not universally distributed, they can shed light on LUCA’s physiology. Their functions, properties and prosthetic groups depict LUCA as anaerobic, CO2-fixing, H2-dependent with a Wood–Ljungdahl pathway, N2-fixing and thermophilic. LUCA’s biochemistry was replete with FeS clusters and radical reaction mechanisms. Its cofactors reveal dependence upon transition metals, flavins, S-adenosyl methionine, coenzyme A, ferredoxin, molybdopterin, corrins and selenium. Its genetic code required nucleoside modifications and S-adenosyl methionine-dependent methylations
http://sci-hub.ren/10.1038/nmicrobiol.2016.116
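The "∼0.1%" in the quoted passage can be reproduced directly from the two counts it gives. A minimal sketch, with a hypothetical helper name:

```python
TOTAL_CLUSTERS = 286_514   # protein clusters examined, from the quoted abstract
LUCA_FAMILIES = 355        # protein families traced to LUCA, from the quoted abstract


def percentage(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


luca_share = percentage(LUCA_FAMILIES, TOTAL_CLUSTERS)
print(f"{luca_share:.2f}% of the clusters trace to LUCA")
```

The result is a little over a tenth of one percent, consistent with the rounded figure in the abstract.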

For a nonliving system, questions about irreducible complexity are even more challenging for a totally natural non-design scenario, because natural selection — which is the main mechanism of Darwinian evolution — cannot exist until a system can reproduce.  For an origin of life we can think about the minimal complexity that would be required for reproduction and other basic life-functions.  Most scientists think this would require hundreds of biomolecular parts.  And current science has no plausible theories to explain how the minimal complexity required for life (and the beginning of biological natural selection) could have been produced by natural process before the beginning of biological natural selection.

In order to make life, and especially complex multicellular life, the building blocks of life, cells, which are the tiniest living entities, have to be made. Building cells requires information and programming, complex protein-manufacturing machines and assembly lines, energy, nutrient supply chains, quality control, waste bins, the ability to adapt to the environment and to react to stimuli, the ability to replicate, and housing (the cell membrane).

“The complexity of the simplest known type of cell is so great that it is impossible to accept that such an object could have been thrown together suddenly by some kind of freakish, vastly improbable, event. Such an occurrence would be indistinguishable from a miracle.” 
― Michael Denton, Evolution: A Theory In Crisis

Determination of the Core of a Minimal Bacterial Gene Set
http://mmbr.asm.org/content/68/3/518.full.pdf
Based on the conjoint analysis of several computational and experimental strategies designed to define the minimal set of protein-coding genes that are necessary to maintain a functional bacterial cell, we propose a minimal gene set composed of 206 genes. Such a gene set will be able to sustain the main vital functions of a hypothetical simplest bacterial cell.


Quinones 
Important components of the chemiosmotic energy-converting mechanisms are the lipophilic quinones that can diffuse in the lipid bilayer and connect the redox enzymes. Menaquinones are widely used in Prokaryotes in general and specifically in all the deep branching prokaryotic phyla (Schoepp-Cothenet et al., this issue) and consequently have been proposed to be present already in LUCA. 2

Molybdenum utilization is very likely an ancient trait present in LUCA because (i) it is utilized by almost all phyla of Archaea and Bacteria and (ii) a number of molybdo-enzymes, including the arsenite oxidase, the formate dehydrogenase, the nitrate reductase and the polysulfide reductase, have been predicted to have existed before the Archaea/Bacteria divergence. 

Iron is essential to most life forms. To date, the only organisms known not to depend on iron belong to Lactobacillus spp.


A common feature of all life forms is their ability to maintain homeostasis in a given environment. Moreover, to accomplish cellular growth and division, a minimal cell would also require the ability to transform and assemble its building blocks using the energy provided by the environment. It seems, therefore, that a minimal cell would require a minimal metabolism to fulfill both essential aspects. A first approximation to this core metabolism is provided by the analysis of the enzymatic functions encoded by the theoretically inferred minimal gene set from the abovementioned combined approach. Figure 16.1 (color plate 12) provides a representation of the metabolic network encoded by the theoretically inferred minimal gene set, which is thought to comprise the minimal set of metabolic reactions to sustain a bacterial cell under ideal nutrient supply conditions (i.e., glucose, fatty acids, amino acids, nucleobases, and vitamins). The comparison of this theoretically inferred minimal metabolism, in terms of metabolic capacities, with naturally reduced genomes reveals many parallels, since the procedure to determine this minimal set includes genes that are shared by most endosymbiotic bacteria. In the minimal gene set, the intermediary metabolism is mainly reduced to ATP synthesis by substrate-level phosphorylation during glycolysis and the nonoxidative pentose phosphate pathway, whereas amino acid biosynthesis is virtually absent. So it is with de novo biosynthesis of nucleotides, although the complete salvage pathways for most of them can be found. Lipid biosynthesis is limited to condensation of fatty acids with glycerol phosphate, and there are no pathways for biosynthesis of fatty acids. Altogether the minimal metabolic core seems devoted to the production of energy from glucose and the interconversion, rather than the net biosynthesis, of essential cellular building blocks, most of which would be readily provided by a rich environment. However, adding some complexity to this heterotrophic metabolism, one could envisage a hypothetical autotrophic minimal metabolism, like the one conjectured by Benner (1999).

Marcello Barbieri Code Biology A New Science of Life, page 26
Organic information is an irreducible entity, because it cannot be described by anything simpler than its sequence, and the same is true for organic meaning, which cannot be defined by anything simpler than its coding rules.  Organic information and organic meaning, in short, belong to the same class of entities because they have the same defining characteristics: they both are objective but- not-measurable entities, and they both are fundamental entities because they cannot be reduced to anything simpler. They are the twin pillars of life because organic information comes from the copying process that produces genes, while organic meaning comes from the coding process that generates proteins.

A primitive cell like an E. coli bacterium, one of the simplest life forms in existence today, is amazingly complex.

Proteins are essential building blocks of living cells; indeed, life can be viewed as resulting substantially from the chemical activity of proteins. Because of their importance, it is hardly surprising that ancestors for most proteins observed today were already present at the time of the 'last common ancestor', a primordial organism from which all life on Earth is descended. How did the first proteins arise? How can we bring a taxonomic order to the diversity of forms that evolved from them? These two questions are at the center of our scientific efforts, on which we bring to bear methods in bioinformatics, protein biochemistry and structural biology.

Based on the conjoint analysis of several computational and experimental strategies designed to define the minimal set of protein-coding genes that are necessary to maintain a functional bacterial cell, we propose a minimal gene set composed of 206 genes. Such a gene set will be able to sustain the main vital functions of a hypothetical simplest bacterial cell with the following features.

(i) A virtually complete DNA replication machinery, composed of one nucleoid DNA binding protein, SSB, DNA helicase, primase, gyrase, polymerase III, and ligase. No initiation and recruiting proteins seem to be essential, and the DNA gyrase is the only topoisomerase included, which should perform both replication and chromosome segregation functions.

(ii) A very rudimentary system for DNA repair, including only one endonuclease, one exonuclease, and a uracil-DNA glycosylase.

(iii) A virtually complete transcriptional machinery, including the three subunits of the RNA polymerase, a σ factor, an RNA helicase, and four transcriptional factors (with elongation, antitermination, and transcription-translation coupling functions). Regulation of transcription does not appear to be essential in bacteria with reduced genomes, and therefore the minimal gene set does not contain any transcriptional regulators.

(iv) A nearly complete translational system. It contains the 20 aminoacyl-tRNA synthases, a methionyl-tRNA formyltransferase, five enzymes involved in tRNA maturation and modification, 50 ribosomal proteins (31 proteins for the large ribosomal subunit and 19 proteins for the small one), six proteins necessary for ribosome function and maturation (four of which are GTP binding proteins whose specific function is not well known), 12 translation factors, and 2 RNases involved in RNA degradation.

(v) Protein-processing, -folding, secretion, and degradation functions are performed by at least three proteins for posttranslational modification, two molecular chaperone systems (GroEL/S and DnaK/DnaJ/GrpE), six components of the translocase machinery (including the signal recognition particle, its receptor, the three essential components of the translocase channel, and a signal peptidase), one endopeptidase, and two proteases.

(vi) Cell division can be driven by FtsZ only, considering that, in a protected environment, the cell wall might not be necessary for cellular structure.

(vii) A basic substrate transport machinery cannot be clearly defined, based on our current knowledge. Although it appears that several cation and ABC transporters are always present in all analyzed bacteria, we have included in the minimal set only a PTS for glucose transport and a phosphate transporter. Further analysis should be performed to define a more complete set of transporters.

(viii) The energetic metabolism is based on ATP synthesis by glycolytic substrate-level phosphorylation.

(ix) The nonoxidative branch of the pentose pathway contains three enzymes (ribulose-phosphate epimerase, ribosephosphate isomerase, and transketolase), allowing the synthesis of pentoses (PRPP) from trioses or hexoses.

(x) No biosynthetic pathways for amino acids, since we suppose that they can be provided by the environment.

(xi) Lipid biosynthesis is reduced to the biosynthesis of phosphatidylethanolamine from the glycolytic intermediate dihydroxyacetone phosphate and activated fatty acids provided by the environment.

(xii) Nucleotide biosynthesis proceeds through the salvage pathways, from PRPP and the free bases adenine, guanine, and uracil, which are obtained from the environment.

(xiii) Most cofactor precursors (i.e., vitamins) are provided by the environment. Our proposed minimal cell performs only the steps for the syntheses of the strictly necessary coenzymes tetrahydrofolate, NAD, flavin adenine dinucleotide, thiamine diphosphate, pyridoxal phosphate, and CoA.
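
To make the weight of the translational system in item (iv) concrete, the counts given there can be added up and compared with the 206-gene total. The following is only a sketch: the dictionary layout and function names are illustrative, and it assumes that each protein counted in item (iv) corresponds to one gene of the proposed set.

```python
# Counts copied from item (iv) of the list above.
# Assumption: one gene per protein listed; names below are illustrative only.
MINIMAL_GENE_SET_SIZE = 206

TRANSLATION_COMPONENTS = {
    "aminoacyl-tRNA synthases": 20,
    "methionyl-tRNA formyltransferase": 1,
    "tRNA maturation and modification enzymes": 5,
    "large-subunit ribosomal proteins": 31,
    "small-subunit ribosomal proteins": 19,
    "ribosome function and maturation proteins": 6,
    "translation factors": 12,
    "RNases involved in RNA degradation": 2,
}


def translation_total(components=TRANSLATION_COMPONENTS):
    """Sum the protein counts listed for the translational system."""
    return sum(components.values())


def share_of_gene_set(count, total=MINIMAL_GENE_SET_SIZE):
    """Fraction of the proposed minimal gene set represented by `count` genes."""
    return count / total


if __name__ == "__main__":
    total = translation_total()
    print(f"Translation-related proteins listed: {total}")
    print(f"Share of the 206-gene set: {share_of_gene_set(total):.1%}")
```

Under that one-gene-per-protein assumption, the translational system alone accounts for a little under half of the proposed minimal set.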

A brave estimate of the minimal components for the translational apparatus today comprises no more than 200 genes, of which more than 120 are associated with the translational apparatus, encoding about 40 genes for ribosomal proteins, two rRNAs (omitting the 5S rRNA), 21 tRNAs, 20 synthetases, six factors and at least 20 tRNA modifying enzymes. In addition a minimum of 30 genes are needed for both the generation of household energy and the synthesis of at least some of the amino acids (note: since some of the amino acids were formed in the Stanley Miller type experiments mimicking the atmosphere and the physical environment of more than 3 billion years ago,8 they could be taken up from the primordial soup by the earliest cells and thus did not need to be synthesized).
Prebiotic Evolution and Astrobiology, page 28.

Johnson DE 2010, Programming of Life, p37&49.
"life's original alphabet must have used a coding system at least as symbolically complex as the current codon alphabet. There has been no feasible natural explanation proposed to produce such an alphabet since chance or physicality cannot produce functional information or a coding system, let alone a system as complex as that in life"

Jack T. Trevors – Theoretical Biology & Medical Modelling, Vol. 2, 11 August 2005, page 8 1
“No man-made program comes close to the technical brilliance of even Mycoplasmal genetic algorithms. Mycoplasmas are the simplest known organism with the smallest known genome, to date. How was its genome and other living organisms’ genomes programmed?”

http://creation.com/origin-of-life
Donald E. Johnson  (Ph.D: Computer & Information Science; Ph.D: Chemistry)

Abstract. The origin of life's biggest mystery is the origin of the genome which contains the information to cybernetically control all aspects of cellular life today. Without formal control, no life would exist. The genetics-first and metabolism-first models will be examined, each having characteristics that strain scientific credibility. Major physical science limitations and the formidable information science problems are examined. These problems usually result in over-simplifications in speculative  scenarios. More serious are the peer-reviewed scientific null  hypotheses that require falsification before any of the naturalistic scenarios can be considered as serious science. Assuming the problems can be resolved, the requirements for a minimal "genome" can be discussed in the areas of initial generation of programmed controls, replication of the genome and needed components that make it useful, regulation of "life's" processes, and evolvability. Life is an intersection of the physical sciences of chemistry and physics and the nonphysical formalism of information science. Each domain must be investigated using that domain's principles. Yet most scientists have been attempting to use physical science to explain life's nonphysical information domain, a practice that has no scientific justification.


1. http://www.asa3.org/ASA/education/origins/ic-cr.htm
2. http://www.sciencedirect.com.https.sci-hub.hk/science/article/pii/S0005272812010407
3. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/21914892
4. https://web.archive.org/web/20180224154833/http://www.forerunner.com/forerunner/X0728_Evolutionary_Improba.html
5. https://web.archive.org/web/20111113032919/http://www.arn.org/blogs/index.php/2/2009/11/10/minimal_complexity_relegates_life_origin
6. https://www.panspermia.org/chandra.htm
7. https://web.archive.org/web/20170423032439/http://creationsafaris.com/epoi_c06.htm#ec06f12x
8. https://pdfs.semanticscholar.org/5650/aaa06de4de11c36a940cf29c07f5f731f63c.pdf
9. https://books.google.com.br/books?id=Wfpdui3JRTcC&pg=PT458&lpg=PT458&dq=Functional+proteins+from+a+random-sequence+library&source=bl&ots=IpQN_2y-w_&sig=ACfU3U1mDnB-pA6lJUXRedrLwCJY5C6gKQ&hl=en&sa=X&ved=2ahUKEwjh1OK73vLgAhXZK7kGHT_rCvEQ6AEwBnoECF0QAQ#v=onepage&q&f=false
10. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3642372/
11. https://science.sciencemag.org/content/326/5957/1235.full?sid=bcd72d87-8ba8-4210-957f-96cd54b50b84
12. https://en.wikipedia.org/wiki/Minimal_genome
13. https://en.wikipedia.org/wiki/Endosymbiont
14. https://www.frontiersin.org/articles/10.3389/fevo.2015.00123/full
15. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3919561/
16. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5311925/
17. https://en.wikipedia.org/wiki/Smallest_organisms



Last edited by Otangelo on Tue Oct 17, 2023 6:15 am; edited 123 times in total


Introduction: Pseudo-Scientific Speculations or Science?
A hundred years ago, the title's question wouldn't have been needed, since a cell was thought to be a bag of plasm originating in a "warm little pond". Fifty years ago, protein and DNA structures had been determined so science "knew" the secrets of the genome. With the Miller/Urey synthesis, many thought that the origin of life explanation was near. Fifteen years ago, it started to be realized that "junk DNA" was a misnomer. Five years ago, epigenetic control systems largely determined by non-coding DNA began to be discovered. As new knowledge of functional complexity is revealed, we realize that our knowledge of that complexity has been increasing exponentially, with no end in sight. As one layer is peeled back, a new level of functional complexity is exposed. Rather than getting simpler, the more we know, the more we know we don't know! "As sequencing and other new technologies spew forth data, the complexity of biology has seemed to grow by orders of magnitude". There seems to be an exponential increase in knowledge, with the target of understanding the origin receding ever faster.

The origin of life (OOL) is unknown and is obscured by the lack of knowledge of the prebiotic conditions that existed as life "developed." "Most of the (bio)chemical processes found within all the living organisms are well understood at the molecular level, whereas the origin of life remains one of the most vexing issues in chemistry, biology, and philosophy". "The origin of life remains one of the humankind's last great unanswered questions, as well as one of the most experimentally challenging research areas". Any speculation inevitably involves science as we don't know it. It is metaphysically presumed that since life obviously exists, there must have been a time when non-life developed into life through natural mechanisms. It is also presumed (with no substantiating reasons) that Pasteur's Law of Biogenesis, all life is from life ("Omne vivum ex vivo"), must not have been applicable during life's formation from inanimate material. Pasteur's warning that "Spontaneous generation is a dream" ("La génération spontanée est une chimère") is perhaps appropriate to consider with the various speculations. It is important to realize that "we don't yet know, but the answers will be coming" isn't a scientific statement, but rather expresses faith in naturalism-of-the-gaps, which is no more scientific than the god-of-the-gaps explanation that most scientists would dismiss out-of-hand.

Speculation on a particular aspect of life may not prove fruitful without consideration of the origin of cybernetic processes, since all known life is a carefully orchestrated cybernetic system. Cybernetic systems are "systems and processes that interact with themselves and produce themselves from themselves". Michael Polanyi argued that life is not reducible to physical and chemical principles, but rather that "the information content of a biological whole exceeds that of the sum of its parts". "A whole exists when it acts like a whole, when it produces combined effects that the parts cannot produce alone". "Understanding the origin of life requires knowledge not only of the origin of biological molecules such as amino acids, nucleotides and their polymers, but also the manner in which those molecules are integrated into the organized systems that characterize cellular life".

It should be noted that speculation is important within science, since that is the way that new lines of thought are proposed in order to test scenarios for possibility and feasibility. While the dream of becoming a Nobel laureate may encourage wide dissemination of a speculation, it seems appropriate to warn about spreading such speculations outside the scientific community. The public too often views a scientist's speculation as validated science, so that the speculative nature is overlooked. The public may value a scientist's view in much the same way that a movie star's endorsement of a product is seen as important. There seems to be a widespread belief in chemical predestination, even though its chief promoter has repudiated its possibility. For example, when signs of water on Mars were discovered, the media proclaimed that there must therefore be life. Our collective preoccupation with the Search for Extraterrestrial Intelligence illustrates the belief in the inevitability of life.

1. Overview 
The approach of this essay will be to consider scenarios for developing the minimal replication and control information ("proto-genome") for a protocell, since even "protolife" would require self-replication and control ability. Note that the ability to use the "genomic" information for functionality is also critical. Metabolic cycles, homochirality, cell membranes, and other required components will not be the primary thrust, even though all are indirectly controlled by today's genome. An excellent review of the organic chemistry for biomolecular origin is available. Each proponent's scenario will be briefly highlighted, with the primary arguments against the scenario coming from proponents of an alternative scenario, typically as quotes. Finally, we'll examine principles that are almost universally ignored in OOL scenarios, but are in critical need of scientific explanation.


1.1 RNA (Genetics) First Scenarios 
A ribosome, "a molecular fossil", can join amino acids without additional enzymes except for those that are imbedded in the ribosome itself to make it a ribozyme (enzymes needed to manufacture tRNAs presumably developed later). "An appeal of the RNA world hypothesis is that it solves the 'chicken and egg' problem; it shows that in an earlier, simplified biota the genotype/replicator and phenotype/catalyst could have been one and the same molecule" (but the RNA/enzyme of a ribozyme is another chicken/egg problem). "RNA appears well suited to have served as the first replicative polymer on this planet". The origin of the RNA World by stringing together optimistic extrapolations of prebiotic chemistry achievements and experimenter-directed RNA "evolution" (a misnomer) has been described as "the 'Molecular Biologists' Dream ... [and] the prebiotic chemist's nightmare". The "difficulties in nucleobase ribosylation can be overcome with directing, blocking, and activating groups on the nucleobase and ribose. These molecular interventions are synthetically ingenious, but serve to emphasize the enormous difficulties that must be overcome if ribonucleosides are to be efficiently produced by nucleobase ribosylation under prebiotically plausible conditions. This impasse has led many scientists to abandon the idea that a RNA "genome" might have assembled abiotically, and has prompted a search for potential pre-RNA informational molecules". It has been pointed out that "what is essential, therefore, is a reasonably detailed description, hopefully supported by experimental evidence, of how an evolvable family of cycles might operate. The scheme should not make unreasonable demands on the efficiency and specificity of the various external and internally generated catalysts that are supposed to be involved. Without such a description, acceptance of the possibility of complex non-enzymatic cyclic organizations that are capable of evolution can only be based on faith, a notoriously dangerous route to scientific progress". The experimenter-directed "side products would have amounted to a fatal and committed step in the synthesis of a nascent proto-RNA. This problem illustrates a difficulty in non-enzymatic polymerization that must be taken into account when considering how the nature of the synthetic routes to and structural identities of early genetic polymers: irreversible linkages are adaptive for an informational polymer only when mechanisms exist to make them conditionally reversible."

"No physical law need be broken for spontaneous RNA formation to happen, but the chances against it are so immense, that the suggestion implies that the non-living world had an innate desire to generate RNA. There is no reason to presume that an indifferent nature would not combine units at random, producing an immense variety of hybrid short, terminated chains, rather than the much longer one of uniform backbone geometry needed to support replicator and catalytic functions". "The RNA molecule is too complex, requiring assembly first of the monomeric constituents of RNA, then assembly of strings of monomers into polymers. As a random event without a highly structured chemical context, this sequence has a forbiddingly low probability and the process lacks a plausible chemical explanation, despite considerable effort to supply one". "It has been challenging to identify possible prebiotic chemistry that might have created RNA. Organic molecules, given energy, have a well-known propensity to form multiple products, sometimes referred to collectively as 'tar' or 'tholin.' These mixtures appear to be unsuited to support Darwinian processes, and certainly have never been observed to spontaneously yield a homochiral genetic polymer. To date, proposed solutions to this challenge either involve too much direct human intervention to satisfy many in the community, or generate molecules that are unreactive 'dead ends' under standard conditions of temperature and pressure". Some believe that inorganic crystals or clay served as a template for the original RNA. The "replication of clay 'information' has remained hypothetical, and transfer of replicated clay properties to nucleic acids even more so". Crystals contain a very small quantity of information in their regular structures, so that any significant information would have to be in irregularities. How would inanimate nature produce those irregularities to serve as templates for functional information in replicative polymers?
"The reaction system... is a purified reconstituted system in which all of the components and their concentrations are defined. The number of components is amazingly large, yet this is one of the simplest encapsulated systems for carrying out protein translation and RNA replication. With regard to the origin of life, the first living systems would have had functionally identical translation and replication systems, but they must have been simpler and contained machinery for nutrient transport. The complexity of our system implies that extant translation machinery has become highly sophisticated during the evolutionary process".

1.2 Metabolism-First Scenarios 
Metabolism-first scenarios involve development of a self-replicating, self-sustaining chemical system that is able to capture energy and that is contained within a protocell [24] or geothermal vent [38-39]. Perhaps energy transfer used an "osmosis first" paradigm [40, 26]. Unlike RNA first, there is no nucleotide genome to control replication or component construction so that selection would have favored "not the best replicator, but the reaction that sucked in fuel the quickest, denying energy to other chemical processes" [41]. The "bag of chemicals" (composome) presumably would grow until it reaches a size that enables it to divide, with each "daughter" inheriting about half the chemical contents. "The origin of life was marked when a rare few protocells happened to have the ability to capture energy from the environment to initiate catalyzed heterotrophic growth directed by heritable genetic information in the polymers ... The origin of life occurred when a subset of these molecules was captured in a compartment and could interact with one another to produce the properties we associate with the living state" [39]. There have been simulations [42-43] in which the composomes "undergo mutation-like compositional changes" that are claimed to illustrate evolution, but these have never been experimentally observed. 

Although metabolism-first avoids the infeasibility of forming functional RNA by chance, "replication of compositional information is so inaccurate that fitter compositional genomes cannot be maintained by selection and, therefore, the system lacks evolvability (i.e., it cannot substantially depart from the asymptotic steady-state solution already built-in in the dynamical equations). We conclude that this fundamental limitation of ensemble replicators cautions against metabolism-first theories of the origin of life" [44]. Concerning the chemical cycles required, "These are chemically very difficult reactions ... One needs, therefore, to postulate highly specific catalysts for these reactions. It is likely that such catalysts could be constructed by a skilled synthetic chemist, but questionable that they could be found among naturally occurring minerals or prebiotic organic molecules. The lack of a supporting background in chemistry is even more evident in proposals that metabolic cycles can evolve to 'life-like' complexity. The most serious challenge to proponents of metabolic cycle theories—the problems presented by the lack of specificity of most non-enzymatic catalysts—has, in general, not been appreciated. If it has, it has been ignored. Theories of the origin of life based on metabolic cycles cannot be justified by the inadequacy of competing theories: they must stand on their own" [20]. 

2. Major Unresolved Difficulties 
Nearly all scenarios presented as science during this author's education using the American Chemical Society's "From Molecules to Man" have been shown to be incorrect by today's science. Scientists need to use much caution during speculative dreaming about mechanisms that can be considered as explanations for the observations that are currently available. Some of the major difficulties requiring scientific explanation will be highlighted in this section.


2.1 Physical Science Limitations 
What natural interactions produced homochirality, α-linkage-only amino acids, and non-enzymatic peptide bonds and other dehydration reactions in aqueous solutions to produce proteins and RNAs? What physical laws could integrate biochemical pathways and cycles into a formal protometabolic scheme? How did the enzymes required to level life's 10^19 range of uncatalyzed reactions [45] spontaneously polymerize and self-assemble?

2.2 Formidable Information Science Problems
"Biological information is not a substance ... biological information is not identical to genes or to DNA (any more than the words on this page are identical to the printers ink visible to the eye of the reader). Information, whether biological or cultural, is not a part of the world of substance" [46]. "All the equations of physics taken together cannot describe, much less explain, living systems. Indeed, the laws of physics do not even contain any hints regarding cybernetic processes or feedback control" [10]. The argument for abiogenesis "simply says it happened. As such, it is nothing more than blind belief. Science must provide rational theoretical mechanism, empirical support, prediction fulfillment, or some combination of these three. If none of these three are available, science should reconsider that molecular evolution of genetic cybernetics is a proven fact and press forward with new research approaches which are not obvious at this time" [47]. "The challenge for an undirected origin of such a cybernetic complex interacting computer system is the need to demonstrate that the rules, laws, and theories that govern electronic computing systems and information don't apply to the even more complex digital information systems that are in living organisms. Laws of chemistry and physics, which follow exact statistical, thermodynamic, and spatial laws, are totally inadequate for generating complex functional information or those systems that process that information using prescriptive algorithmic information" [48].

It is important to realize that data generated by regular fluctuations (such as seasons or light/dark cycles) have extremely low information content, offering no explanation for life's functional information. Communication of information requires that both sender and receiver know the arbitrary protocol determined by rules, not law. A functioning protocell would have needed formal organization, not redundant order. Organization requires control, which requires formalism as a reality. Each protein is currently the result of the execution of a real computer program running on the genetic operating system. How did inanimate nature write those programs and operating systems? The genome would be useless without the processing systems needed to carry out its prescriptive instructions.
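
The statement that regular fluctuations carry very little information can be illustrated with a rough compressibility comparison. This is a minimal sketch under stated assumptions: compressed size (using Python's standard zlib module) serves only as a crude proxy for information content, the two-letter day/night alphabet is hypothetical, and the sketch says nothing about functional or prescriptive information, which compression cannot measure.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller = more redundant)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical encoding: "D" = day, "N" = night.
# A strictly regular light/dark cycle of 10,000 symbols.
periodic = b"DN" * 5000

# An irregular sequence over the same two symbols, from a seeded generator
# so the example is repeatable.
rng = random.Random(0)
irregular = bytes(rng.choice(b"DN") for _ in range(10000))

if __name__ == "__main__":
    print(f"periodic sequence ratio:  {compression_ratio(periodic):.4f}")
    print(f"irregular sequence ratio: {compression_ratio(irregular):.4f}")
```

The periodic sequence shrinks to a tiny fraction of its length because a short rule ("repeat DN") describes all of it, which is the sense in which redundant order carries little information. The irregular sequence compresses far less, but note that this alone does not make it functional.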

2.3 Over-Simplification of Information Requirements 
"Whatever the source of life (which is scientifically unknowable), the alphabet involved with the origin of life, by the necessary conditions of information theory, had to be at least as symbolically complex as the current codon alphabet. If intermediate alphabets existed (as some have speculated), each predecessor also would be required to be at least as complex as its successor, or Shannon's Channel Capacity [49] would be exceeded for information transfer between the probability space of alphabets with differing Shannon capacity. Therefore, life's original alphabet must have used a coding system at least as symbolically complex as the current codon alphabet. There has been no feasible natural explanation proposed to produce such an alphabet since chance or physicality cannot produce functional information or a coding system, let alone a system as complex as that in life" [50]. Coded information has never been observed to originate from physicality. "Due to the abstract character of function and sign systems [semiotics -- symbols and their meaning], life is not a subsystem of natural laws. This suggests that our reason is limited in respect to solving the problem of the origin of life and that we are left accepting life as an axiom... Life express[es] both function and sign systems, which indicates that it is not a subsystem of the [physical] universe, since chance and necessity cannot explain sign systems, meaning, purpose, and goals" [51]. "The reductionist approach has been to regard information as arising out of matter and energy. Coded information systems such as DNA are regarded as accidental in terms of the origin of life and that these then led to the evolution of all life forms as a process of increasing complexity by natural selection operating on mutations on these first forms of life" [52]. 
"From the information perspective, the genetic system is a pre-existing operating system of unknown origin that supports the storage and execution of a wide variety of specific genetic programs (the genome applications), each program being stored in DNA. DNA is a storage medium, not a computer, that specifies all information needed to support the growth, metabolism, parts manufacturing, etc. for a specific organism via gene subprograms" [50]. 

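The quotations above refer to the symbolic complexity of the current codon alphabet without putting a number on it. One simple way to do so, offered here only as an illustrative sketch, is to count the distinct symbols and take the base-2 logarithm. The figures used are the standard ones (4 nucleotides, codons of length 3, 20 amino acids plus a stop signal); the function names are hypothetical, and this measure is an assumption of the sketch, not necessarily the one the quoted authors had in mind.

```python
import math

NUCLEOTIDES = 4        # A, C, G, U (or T in DNA)
CODON_LENGTH = 3       # nucleotides per codon
ENCODED_MEANINGS = 21  # 20 amino acids plus "stop" in the standard code


def symbol_count(alphabet_size: int, word_length: int) -> int:
    """Number of distinct words of a given length over an alphabet."""
    return alphabet_size ** word_length


def bits_per_symbol(symbols: int) -> float:
    """Maximum information per symbol if all symbols were equally likely."""
    return math.log2(symbols)


if __name__ == "__main__":
    codons = symbol_count(NUCLEOTIDES, CODON_LENGTH)
    print(f"distinct codons: {codons}")
    print(f"bits per codon (upper bound): {bits_per_symbol(codons):.2f}")
    print(f"bits per encoded meaning (upper bound): "
          f"{bits_per_symbol(ENCODED_MEANINGS):.2f}")
```

The gap between the two figures reflects the redundancy of the code: 64 codons map onto 21 meanings.
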
There are many features in current life that are extremely difficult to envision as arising from a protocell. The smallest genome (though not autonomous) found so far is in "the psyllid symbiont Carsonella ruddii, which consists of a circular chromosome of 159,662 base pairs... The genome has a high coding density (97%) with many overlapping genes and reduced gene length" [53]. "The origin and evolution of overlapping genes are still unknown" [54]. Since they are prevalent in the simplest known genome, a big question is how and why did overlapping genes arise? Recently, sub-coded information [55] and a second genetic code [56] characterizing alternative splicing have been discovered. Various transcribed RNAs are mixed and matched and spliced into mRNAs for specifying protein construction and other controls. MicroRNAs regulate large networks of genes by acting as master control switches [57]. Tiny polypeptides (with 11-32 amino acids) can function as "micro-protein" gene expression regulators [58]. Were these features required initially, or by what interactions of nature did they arise later? Scientists are investigating "the organization of information in genomes and the functional roles that non-protein coding RNAs play in the life of the cell. The most significant challenges can be summarized by two points: a) each cell makes hundreds of thousands of different RNAs and a large percent of these are cleaved into shorter functional RNAs demonstrating that each region of the genome is likely to be multifunctional and b) the identification of the functional regions of a genome is difficult because not only are there many of them but because the functional RNAs can be created by taking sequences that are not near each other in the genome and joining them together in an RNA molecule. The order of these sequences that are joined together need not be sequential. The central mystery is what controls the temporal and coordinated expression of these RNAs" [59].
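
The Carsonella ruddii figures quoted above (159,662 base pairs, 97% coding density) can be turned into rough absolute numbers. The sketch below is illustrative only: it treats the 97% figure as exact, and the raw storage estimate assumes 2 bits per base pair (four equally likely bases), which is a simplifying assumption and not a claim from the cited paper.

```python
GENOME_LENGTH_BP = 159_662  # from the quotation above
CODING_DENSITY = 0.97       # from the quotation above, treated as exact
BITS_PER_BASE = 2           # assumption: four equally likely bases


def coding_base_pairs(length_bp: int, density: float) -> int:
    """Approximate number of base pairs that fall within coding regions."""
    return round(length_bp * density)


def raw_capacity_bytes(length_bp: int, bits_per_base: int = BITS_PER_BASE) -> float:
    """Raw storage capacity of the sequence in bytes, ignoring all structure."""
    return length_bp * bits_per_base / 8


if __name__ == "__main__":
    coding = coding_base_pairs(GENOME_LENGTH_BP, CODING_DENSITY)
    print(f"coding base pairs:     {coding}")
    print(f"non-coding base pairs: {GENOME_LENGTH_BP - coding}")
    print(f"raw capacity:          {raw_capacity_bytes(GENOME_LENGTH_BP):.1f} bytes")
```

On these assumptions fewer than 5,000 base pairs lie outside coding regions, which gives a sense of how tightly packed this genome is.
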
"It is very difficult to wrap your head around how big the genome is and how complicated ... It's very confusing and intimidating ... The coding parts of genes come in pieces, like beads on a string, and by splicing out different beads, or exons, after RNA copies are made, a single gene can potentially code for tens of thousands of different proteins, although the average is about five ... It's the way in which genes are switched on and off, though, that has turned out to be really mind-boggling, with layer after layer of complexity emerging" [60]. When and how did these  features arise? Were any present in the first life? 

2.4 Scientific Hypotheses Requiring Falsification
In addition to falsifying the Shannon Capacity Theorem [49] if a proposed original information system isn't as complex as today's codon-based system, the following testable null hypotheses (proposed in peer-reviewed papers) may require falsification. No scenario should be accepted as science if it violates one or more of these unfalsified null hypotheses [60-61, 11-12].

Stochastic ensembles of physical units cannot program algorithmic/cybernetic function.
Dynamically-ordered sequences of individual physical units (physicality patterned by natural law causation) cannot program algorithmic/cybernetic function.
Statistically weighted means (e.g., increased availability of certain units in the polymerization environment) giving rise to patterned (compressible) sequences of units cannot program algorithmic/cybernetic function.
Computationally successful configurable switches cannot be set by chance, necessity, or any combination of the two, even over large periods of time.
Self-ordering phenomena cannot generate cybernetic organization.
Randomness cannot generate cybernetic organization.
PI (prescriptive information [12]) cannot be generated from/by the chance and necessity of inanimate physicodynamics.
PI cannot be generated independent of formal choice contingency.
Formal algorithmic optimization, and the conceptual organization that results, cannot be generated independent of PI.
Physicodynamics cannot spontaneously traverse The Cybernetic Cut [11]: physicodynamics alone cannot organize itself into formally functional systems requiring algorithmic optimization, computational halting, and circuit integration.

3. Could a Protocell Live and Reproduce Without a "Genome?"
Assuming that the problems highlighted in the previous sections can be overcome (including falsifications of 2.4), this section will discuss the key topic of this essay. The protocell will be assumed to have an appropriate boundary (membrane, microtubule, etc.) that separates the "living" protocell from its environment. This section  will highlight what would be required of a "proto-genome," without regard as to whether such a "genome" is feasible (not operationally falsified). "There seems to be little general agreement as to how the molecular apparatus needed to implement genetics within a cell could have come about. In fact, there seems to be nothing but puzzlement on such questions with virtually no chemically founded suggestions being made at all" [63]. We will be examining the functional requirements of the proto-genome, as opposed to hypothetical implementations. A proto-genome may have little resemblance to today's DNA-based genome since it will be assumed that life's origin didn't involve DNA. Consequently, we will be attempting to examine life as we don't know it, an exercise that should always be accompanied by healthy scientific skepticism. It is important to realize that John von Neumann proposed and proved the requirements for a self-replicating automaton long before the discovery of DNA's information [64]. A self-reproducing system must contain the necessary components of any computer system, as well as the program for its own construction with the hardware needed to accomplish that construction. 

3.1 Replication Requirements 
A mechanism is needed to divide the protocell into two approximately equal daughters with each daughter being capable of growth and eventual division for exponential population potential. The "proto-genome" with its processing system must replicate itself, along with all cellular controls (functional information and senders/receivers/processors) into each daughter. Unless the "proto-genome" has replisome capabilities included in the "proto-genome" rather than a separate enzyme, the self-contained capability is required to duplicate all other needed components for "life" with high fidelity. Each daughter also needs a replicated (or split) cell boundary.
  
Science knows that the current replication hardware and software require all the components to be fully functional for replication to occur at all. All known errors during replication result in a decrease of both Shannon and functional information [65], usually producing a cell that is no longer able to reproduce. Reliable replication is fundamental to life, a characteristic lacking in composomes [44]. 

3.2 Control Requirements 
Controlled chemical metabolic networks are needed that can selectively admit "fuel" (redox, heat, photons, etc.) into the cell and process the "fuel" to harness the energy for growth, reproduction, manufacturing of needed components that can't migrate in, and other useful work. Both the sender and the receiver of each control signal are needed, along with knowledge of the protocol rules for correct communication. The manufacturing control for needed cellular components would probably require enzymatic functionality for polymerization, along with producing homochiral components. In addition, control is required for cell division. Without control, organization (as opposed to self-ordering) is impossible, and functionality would disintegrate, with "tar" a likely result. 

Cellular control is a cybernetic process, so all of the requirements  of the first eight chapters need instantiation into the protocell. While the proto-genome may contain the control instructions, those instructions must be read by other components (unless the proto-genome has expanded capabilities so that it can read itself), and communicated reliably (using "agreed upon" arbitrary protocols between sender and receiver, source and destination) to the components effecting the control operations. This is not an easily-dismissed prerequisite since control in known life is critical to make the chemical components "alive." In addition, mere physicodynamic constraints cannot generate formal biological controls [66]. 

3.3 Evolvability Requirements 
The system would have to be capable of accurate duplication, but also capable of gradual changes that would permit evolution to life-as-we-know-it. A robust information structure that can be self-maintained (including error-correction), such as in a long genetic polymer, would be required. The feasibility of formation of such a polymer has yet to be shown with any prebiotic mixture proposed to date. The enzyme- and template-independent 120-mer polymers recently generated in water at high temperatures [67] are non-informational homopolymers similar to those adsorbed onto montmorillonite clay surfaces [68]. The aqueous polymers are also cyclic and require some experimenter engineering to achieve 120-mer length. 

The proto-genome would also need to be able to effect highly accurate duplication of the entire proto-cell, with only an occasional "error" that could produce a very similar proto-cell, still possessing all three of the requirements in section 3. The proto-genome, along with all the proto-cell components, would need to have a feasible path to eventually produce cells with the functional complexity of today's life. It does little good to speculate a "simple" initial system unless there are feasible scenarios that can traverse from the proposed initial system to life as we know it, including coded information and other features highlighted previously. For example, one could envision dipping a finger into a bottle of ink and flicking the ink toward a white sheet would eventually produce a pattern that looks like an English letter. That would not explain the formal rules and meaningful syntax of letters that you are currently observing in this book, however. 

4. Conclusions 
While scenarios for the first cell can be envisioned purely from  physicality, a "proto-genome" introduces cybernetic aspects that can have no origination from inanimate material. In particular, organization, prescriptive information, and control require traversing The Cybernetic Cut on a one-way CS (Configurable Switch) Bridge [11] that allows traffic only from formalism to physicality. Just as formalism needs recognition as reality, it is also critical to recognize the limits of physical science, such as physics and chemistry, whose spontaneous inanimate mass/energy interaction behavior is constrained by laws, not formal controls. Initial starting constraints chosen by an experimenter become controls for an experiment, but those chosen constraints are instantiations into physicality of nonphysical formalisms. Life is an intersection of physical science and information science.  Both domains are critical for any life to exist, and each must be investigated using that domain's principles. Yet most scientists have been attempting to use physical science to explain life's information domain, a practice that has no scientific justification. Since the chemistry and physics of life are controlled by prescriptive information (not just constrained by laws), biology is really an information science, not a physical science.

One Way To Think About the Complexity of the “Simplest” Life Form 1

I have always been fascinated by the question, “How simple can life get?” After all, anything that is alive has to perform certain functions such as reacting to external stimuli, taking in energy and converting that energy to its own use, reproducing, etc. Exactly how simple can a living system be if it has to perform such tasks? Many biologists have investigated this question, but there isn’t a firm answer. Typically, biologists talk about how simple a genome can be. The simplest genome belongs to a bacterium known as Carsonella ruddii. It has 159,662 base pairs in its genome, which is thought to contain 182 genes.1 However, it is not considered a real living organism, as it cannot perform all the functions of life without the help of cells found in jumping plant lice.

The bacterium known as Pelagibacter ubique has the smallest genome of any truly free-living organism. It weighs in at 1,308,759 base pairs and 1,354 genes.2 However, there is something in between these two bacteria that might qualify as a real living organism. It is the bacterium Mycoplasma genitalium. Its genome has 582,970 base pairs and 525 genes.3 While it is a parasite, it performs all the standard functions of life on its own. It just uses other organisms (people as well as animals of the order Primates) for food and housing. Thus, while it cannot exist without other organisms, it might be the best indicator of how “simple” life can get.
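
The three genome sizes quoted so far can be put side by side with a little arithmetic. The following is only a rough sketch in Python: it uses the base-pair and gene counts exactly as given in the text, while the dictionary layout and the helper names `bp_per_gene` and `relative_size` are illustrative assumptions, not anything taken from the cited papers.

```python
# Genome figures exactly as quoted in the text: (base pairs, genes).
GENOMES = {
    "Carsonella ruddii": (159_662, 182),
    "Mycoplasma genitalium": (582_970, 525),
    "Pelagibacter ubique": (1_308_759, 1_354),
}


def bp_per_gene(base_pairs: int, genes: int) -> float:
    """Average stretch of genome per gene (a rough proxy for gene density)."""
    return base_pairs / genes


def relative_size(name: str, reference: str = "Carsonella ruddii") -> float:
    """How many times larger one genome is than the reference genome."""
    return GENOMES[name][0] / GENOMES[reference][0]


if __name__ == "__main__":
    for name, (bp, genes) in GENOMES.items():
        print(f"{name}: {bp_per_gene(bp, genes):.0f} bp per gene, "
              f"{relative_size(name):.1f}x the Carsonella genome")
```

On these figures, all three organisms average roughly one gene per thousand base pairs, and the Mycoplasma genitalium genome is between three and four times the size of the Carsonella genome.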

If you follow science news at all, you might recognize the name. Two years ago, Dr Craig Venter and his team constructed their own version of that bacterium with the help of living versions of the bacterium, yeast cells, and bacteria of another species from the same genus. Well, now a scientist from Venter’s lab teamed up with several scientists from Stanford University to produce a computer simulation of the bacterium!

Their work, which seems truly marvelous, gives us deep insight into how complex the “simplest” living organism really is.

Let’s start with what the computer simulation actually accomplished. It modeled all the inputs and outputs of the bacterium’s 525 genes throughout a single cell cycle. In other words, it simulated how the genome produces proteins, how those proteins interact with other proteins, and how the entire system is regulated. It followed these processes through all the events leading up to and including the cell reproducing itself.4

Now that’s a lot of work! How did the authors do it? Well, they looked at over 900 different scientific papers that had been produced on the inner workings of Mycoplasma genitalium, and they identified 1,900 specific parameters that seem to govern how the cell operates. There were several discrepancies that were found among the papers involved, and as a result, there was a lot of reconciliation that had to be done. The details of this reconciliation and other matters are found in a 120-page supplement to the 12-page scientific paper.

Once the reconciliation of these studies was accomplished, the essential workings of the cell were split into 28 separate modules that each governed specific functions of the cell. For example, one module dealt with metabolism, while another dealt with the activation of proteins once they were produced. Once each module was built and tested individually, the modules were then joined by looking at what they produced every second. If the products of one module were the kinds of chemicals used by a second module, those products were then treated as inputs to the second module for the next second of computation. The computation proceeded like this (checking the inputs and outputs of each module) for about 10 hours, which is roughly the time it takes a real Mycoplasma genitalium to reproduce.

Why would a group want to undertake such a complex endeavor? Well, one obvious reason is the reconciliation that I mentioned previously. As independent papers, each of the 900 studies to which the authors referred made sense. However, when the authors started using the results of those studies in a model that tries to take all the molecular processes of a cell into account, they found that some results didn’t mesh well with others. The reconciliation that had to take place to get the simulation working will help us better understand the limits of many of the studies related to Mycoplasma genitalium and hopefully will lead to more detailed studies that will slowly wipe away such discrepancies. Also, as the authors state, these kinds of models will:

…accelerate biological discovery and bioengineering by facilitating experimental design and interpretation. Moreover, [this study and others] raise the exciting possibility of using whole-cell models to enable computer-aided rational design of novel microorganisms.

So in the end, not only will such models help us better design and interpret experiments, they might one day lead us to ways that we can engineer new microorganisms.

This is fantastic work, and I do think it opens up new vistas in cell and molecular biology. However, we need to pull back for a moment and think about the direct implications of this computer simulation. It simulated, in very basic terms, the molecular interactions that occur in a cell that might be a good analog for the simplest possible life form. It skipped over a lot of details, of course, so it is not a complete simulation by any means. Nevertheless, it is a great first step towards understanding how a living system really works.

Now let’s look at this in very practical terms. In order to be able to match the speed at which the organism operates, this less-than-complete simulation required a cluster of 128 computers to get the job done. Think about that for a moment. In order to simulate most (but not all) of the processes that take place in an analog for what might be the simplest possible living organism, the authors needed the power of 128 computers running together! That should tell us something very clearly:

There is no such thing as a simple living organism.
The more we understand life, the more clear it becomes that even the “simplest” version of it has to be the result of design.

REFERENCES

1. Atsushi Nakabachi, et al., “The 160-Kilobase Genome of the Bacterial Endosymbiont Carsonella,” Science 314:267, 2006.

2. Stephen J. Giovannoni, et al., “Genome Streamlining in a Cosmopolitan Oceanic Bacterium,” Science 309:1242-1245, 2005.

3. According to the Comprehensive Microbial Resource Manual.

4. Jonathan R. Karr, et al., “A Whole-Cell Computational Model Predicts Phenotype from Genotype,” Cell 150(2):389-401, 2012.

1) http://blog.drwile.com/?p=8161



Last edited by Admin on Sat Feb 09, 2019 2:28 pm; edited 4 times in total


http://www.uncommondescent.com/intelligent-design/both-genetics-first-and-metabolism-first-origin-of-life-models%E2%80%9D-strain-scientific-credibility-%E2%80%9D/

The Stanford investigators determined that the essential genome of C. crescentus consisted of just over 492,000 base pairs (genetic letters), which is close to 12 percent of the overall genome size. About 480 genes comprise the essential genome, along with nearly 800 sequence elements that play a role in gene regulation... When the researchers compared the C. crescentus essential genome to other essential genomes, they discovered a limited match. For example, 320 genes of this microbe’s basic genome are found in the bacterium E. coli. Yet, of these genes, over one-third are nonessential for E. coli. This finding means that a gene is not intrinsically essential. Instead, it’s the presence or absence of other genes in the genome that determines whether or not a gene is essential.

Jack T. Trevors – Theoretical Biology & Medical Modelling, Vol. 2, 11 August 2005, page 8 1
“No man-made program comes close to the technical brilliance of even Mycoplasmal genetic algorithms. Mycoplasmas are the simplest known organism with the smallest known genome, to date. How was its genome and other living organisms’ genomes programmed?”

The Archaea and Bacteria share a large number of metabolic genes that are not found in eukaryotes. If these two “prokaryotic” groups span the primary phylogenetic divide and their genes are vertically (genealogically) inherited, then the universal ancestor must have had all of these genes, these many functions. This distribution of genes would make the ancestor a prototroph with a complete tricarboxylic acid cycle, polysaccharide metabolism, both sulfur oxidation and reduction, and nitrogen fixation; it was motile by means of flagella; it had a regulated cell cycle, and more. This is not the simple ancestor, limited in metabolic capabilities, that biologists originally intuited. That ancestor can explain neither this broad distribution of diverse metabolic functions nor the early origin of autotrophy implied by this distribution. The ancestor that this broad spread of metabolic genes demands is totipotent, a genetically rich and complex entity, as rich and complex as any modern cell, seemingly more so.



Last edited by Admin on Fri Feb 10, 2017 11:35 am; edited 1 time in total


http://spectrummagazine.org/article/book-reviews/2009/10/06/signature-cell

According to Meyer the “simplest extant cell, Mycoplasma genitalium — a tiny bacterium that inhabits the human urinary tract — requires ‘only’ 482 proteins to perform its necessary functions….” If, for the sake of argument, we assume the existence of the 20 biologically occurring amino acids, which form the building blocks for proteins, the amino acids have to congregate in a definite specified sequence in order to make something that “works.” First of all they have to form a “peptide” bond and this seems to only happen about half the time in experiments. Thus, the probability of building a chain of 150 amino acids containing only peptide links is about one chance in 10 to the 45th power.

In addition, another requirement for living things is that the amino acids must be the “left-handed” version. But in “abiotic amino-acid production” the right- and left-handed versions are created in equal amounts. Thus, the chance of a chain of 150 amino acids having only left-handed residues and only peptide bonds would be about one in 10 to the 90th. Moreover, in order to create a functioning protein the “amino acids, like letters in a meaningful sentence, must link up in functionally specified sequential arrangements.” It turns out that the probability for this is about one in 10 to the 74th. Thus, the probability of one functional protein of 150 amino acids forming by random chance is about one in 10 to the 164th. If we assume some minimally complex cell requires 250 different proteins, then the probability of this arrangement happening purely by chance is one in 10 to the 164th raised to the 250th power, or one in 10 to the 41,000th power.

There are about 10 to the 80th elementary particles in our observable universe. Assuming a Big Bang about 13 billion years ago, there have been about 10 to the 16th seconds of time. Finally, if we take the time required for light to travel one Planck length we will have found “the shortest time in which any physical effect can occur.” This turns out to be 10 to the minus 43rd seconds. Or, turning it around, we can say that the most interactions possible in a second is 10 to the 43rd. Thus, the “probabilistic resources” of the universe are found by multiplying the total number of seconds by the total number of interactions per second by the total number of particles theoretically interacting. The math turns out to be 10 to the 139th.
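
The bookkeeping in the three paragraphs above is easier to follow when done in powers of ten, where multiplying probabilities becomes adding exponents. The sketch below is only an illustration of that arithmetic in Python: the exponents are the ones quoted in the review, the factors are treated as independent (as the passage does), and all constant and function names are assumptions introduced here.

```python
import math

CHAIN_LENGTH = 150         # amino acids in the hypothetical protein
BONDS = CHAIN_LENGTH - 1   # links between neighbouring residues


def log10_odds_half(n_events: int) -> float:
    """log10 of 2**n_events: the odds against n independent 50/50 outcomes."""
    return n_events * math.log10(2)


def log10_seconds(years: float) -> float:
    """log10 of the number of seconds in the given number of years."""
    return math.log10(years * 365.25 * 24 * 3600)


# Exponents as quoted in the text (not derived here).
PEPTIDE_EXP = 45      # every bond is a peptide bond
CHIRALITY_EXP = 45    # every residue is left-handed (90 - 45 in the text)
SEQUENCE_EXP = 74     # functionally specified sequence
PROTEINS_NEEDED = 250

one_protein_exp = PEPTIDE_EXP + CHIRALITY_EXP + SEQUENCE_EXP   # 164
all_proteins_exp = one_protein_exp * PROTEINS_NEEDED           # 41,000

# "Probabilistic resources" exponents, again as quoted.
PARTICLES_EXP = 80
SECONDS_EXP = 16
EVENTS_PER_SECOND_EXP = 43
resources_exp = PARTICLES_EXP + SECONDS_EXP + EVENTS_PER_SECOND_EXP  # 139

if __name__ == "__main__":
    print(f"149 coin-flip bonds: about 1 in 10^{log10_odds_half(BONDS):.1f}")
    print(f"one protein: 1 in 10^{one_protein_exp}")
    print(f"250 proteins: 1 in 10^{all_proteins_exp:,}")
    print(f"probabilistic resources: 10^{resources_exp}")
    print(f"13 billion years: about 10^{log10_seconds(13e9):.1f} seconds")
```

The coin-flip helper shows where the figure of 10 to the 45th comes from: 149 independent 50/50 bonds give odds of about one in 10 to the 44.9. A direct conversion of 13 billion years gives roughly 4 x 10^17 seconds, so the 10 to the 16th in the passage is a rounded-down figure; the sketch keeps the quoted exponent so that the total matches the quoted 10 to the 139th.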


Determination of the Core of a Minimal Bacterial Gene Set 1



http://reasonandscience.catsboard.org/t2110-what-might-be-a-protocells-minimal-requirement-of-parts#3797

Proteins are essential building blocks of living cells; indeed, life can be viewed as resulting substantially from the chemical activity of proteins. Because of their importance, it is hardly surprising that ancestors for most proteins observed today were already present at the time of the 'last common ancestor', a primordial organism from which all life on Earth is descended. How did the first proteins arise? How can we bring a taxonomic order to the diversity of forms that evolved from them? These two questions are at the center of our scientific efforts, on which we bring to bear methods in bioinformatics, protein biochemistry and structural biology.

Based on the conjoint analysis of several computational and experimental strategies designed to define the minimal set of protein-coding genes that are necessary to maintain a functional bacterial cell, we propose a minimal gene set composed of 206 genes. Such a gene set will be able to sustain the main vital functions of a hypothetical simplest bacterial cell with the following features.

(i) A virtually complete DNA replication machinery, composed of one nucleoid DNA binding protein, SSB, DNA helicase, primase, gyrase, polymerase III, and ligase. No initiation and recruiting proteins seem to be essential, and the DNA gyrase is the only topoisomerase included, which should perform both replication and chromosome segregation functions.

(ii) A very rudimentary system for DNA repair, including only one endonuclease, one exonuclease, and a uracil-DNA glycosylase.

(iii) A virtually complete transcriptional machinery, including the three subunits of the RNA polymerase, a σ factor, an RNA helicase, and four transcriptional factors (with elongation, antitermination, and transcription-translation coupling functions). Regulation of transcription does not appear to be essential in bacteria with reduced genomes, and therefore the minimal gene set does not contain any transcriptional regulators.

(iv) A nearly complete translational system. It contains the 20 aminoacyl-tRNA synthases, a methionyl-tRNA formyltransferase, five enzymes involved in tRNA maturation and modification, 50 ribosomal proteins (31 proteins for the large ribosomal subunit and 19 proteins for the small one), six proteins necessary for ribosome function and maturation (four of which are GTP binding proteins whose specific function is not well known), 12 translation factors, and 2 RNases involved in RNA degradation.

(v) Protein-processing, -folding, secretion, and degradation functions are performed by at least three proteins for posttranslational modification, two molecular chaperone systems (GroEL/S and DnaK/DnaJ/GrpE), six components of the translocase machinery (including the signal recognition particle, its receptor, the three essential components of the translocase channel, and a signal peptidase), one endopeptidase, and two proteases.

(vi) Cell division can be driven by FtsZ only, considering that, in a protected environment, the cell wall might not be necessary for cellular structure.

(vii) A basic substrate transport machinery cannot be clearly defined, based on our current knowledge. Although it appears that several cation and ABC transporters are always present in all analyzed bacteria, we have included in the minimal set only a PTS for glucose transport and a phosphate transporter. Further analysis should be performed to define a more complete set of transporters.

(viii) The energetic metabolism is based on ATP synthesis by glycolytic substrate-level phosphorylation.

(ix) The nonoxidative branch of the pentose pathway contains three enzymes (ribulose-phosphate epimerase, ribose-phosphate isomerase, and transketolase), allowing the synthesis of pentoses (PRPP) from trioses or hexoses.

(x) No biosynthetic pathways for amino acids, since we suppose that they can be provided by the environment.

(xi) Lipid biosynthesis is reduced to the biosynthesis of phosphatidylethanolamine from the glycolytic intermediate dihydroxyacetone phosphate and activated fatty acids provided by the environment.

(xii) Nucleotide biosynthesis proceeds through the salvage pathways, from PRPP and the free bases adenine, guanine, and uracil, which are obtained from the environment.

(xiii) Most cofactor precursors (i.e., vitamins) are provided by the environment. Our proposed minimal cell performs only the steps for the syntheses of the strictly necessary coenzymes tetrahydrofolate, NAD, flavin adenine dinucleotide, thiamine diphosphate, pyridoxal phosphate, and CoA.
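
Item (iv) is the only entry in the list that spells out its gene counts one by one, so it can be tallied directly against the proposed total of 206 genes. The snippet below is a minimal sketch of that tally in Python. It assumes each listed component corresponds to one gene, which is how item (iv) reads; the dictionary keys are shorthand labels chosen here, not names from the paper.

```python
# Counts exactly as listed in item (iv); keys are shorthand labels.
TRANSLATION_GENES = {
    "aminoacyl-tRNA synthases": 20,
    "methionyl-tRNA formyltransferase": 1,
    "tRNA maturation and modification enzymes": 5,
    "ribosomal proteins, large subunit": 31,
    "ribosomal proteins, small subunit": 19,
    "ribosome function and maturation proteins": 6,
    "translation factors": 12,
    "RNases": 2,
}

MINIMAL_GENE_SET = 206  # total size of the proposed minimal gene set


def translation_total() -> int:
    """Number of genes item (iv) assigns to the translational system."""
    return sum(TRANSLATION_GENES.values())


def share_of_minimal_set() -> float:
    """Fraction of the 206-gene set taken up by translation."""
    return translation_total() / MINIMAL_GENE_SET


if __name__ == "__main__":
    print(f"translation genes: {translation_total()}")
    print(f"share of minimal set: {share_of_minimal_set():.1%}")
```

On this reading, translation alone accounts for 96 of the 206 genes, a little under half of the proposed minimal set.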



1) http://mmbr.asm.org/content/68/3/518.full.pdf
http://www.ncbi.nlm.nih.gov/books/NBK2227/
2) http://www.eb.tuebingen.mpg.de/research/departments/protein-evolution.html



Last edited by Otangelo on Fri Nov 04, 2022 3:00 pm; edited 3 times in total


Essential genes of a minimal bacterium 1


Metabolic pathways and substrate transport mechanisms encoded by M. genitalium. Metabolic products are colored red, and mycoplasma proteins are black. White letters on black boxes mark nonessential functions or proteins based on our current gene disruption study. Question marks denote enzymes or transporters not identified that would be necessary to complete pathways, and those missing enzyme and transporter names are colored green. Transporters are colored according to their substrates: yellow, cations; green, anions and amino acids; orange, carbohydrates; purple, multidrug and metabolic end product efflux. The arrows indicate the predicted direction of substrate transport. The ABC type transporters are drawn as follows: rectangle, substrate-binding protein; diamonds, membrane-spanning permeases; circles, ATP-binding subunits.


1) http://www.pnas.org/content/103/2/425



Last edited by Admin on Mon Oct 29, 2018 5:50 am; edited 2 times in total


A Whole-Cell Computational Model Predicts Phenotype from Genotype 1


M. genitalium Whole-Cell Model Integrates 28 Submodels of Diverse Cellular Processes
(A) Diagram schematically depicts the 28 submodels as colored words—grouped by category as metabolic (orange), RNA (green), protein (blue), and DNA (red)—in the context of a single M. genitalium cell with its characteristic flask-like shape. Submodels are connected through common metabolites, RNA, protein, and the chromosome, which are depicted as orange, green, blue, and red arrows, respectively.
(B) The model integrates cellular function submodels through 16 cell variables. First, simulations are randomly initialized to the beginning of the cell cycle (left gray arrow). Next, for each 1 s time step (dark black arrows), the submodels retrieve the current values of the cellular variables, calculate their contributions to the temporal evolution of the cell variables, and update the values of the cellular variables. This is repeated thousands of times during the course of each simulation. For clarity, cell functions and variables are grouped into five physiologic categories: DNA (red), RNA (green), protein (blue), metabolite (orange), and other (black). Colored lines between the variables and submodels indicate the cell variables predicted by each submodel. The number of genes associated with each submodel is indicated in parentheses. Finally, simulations are terminated upon cell division when the septum diameter equals zero (right gray arrow).
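
The stepping scheme described in (B) can be illustrated with a toy loop. The following is a minimal sketch in Python, not the published model: the two submodels, their rules, and the variable names (`nutrient`, `energy`, `septum_diameter`) are invented for illustration, and having every submodel read the same snapshot before the updates are merged is an assumption made here to keep the example simple.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Submodel = Callable[[State], State]  # reads the state, returns changes to it


def metabolism(state: State) -> State:
    # Toy rule: convert up to one unit of nutrient per second into energy.
    used = min(1.0, state["nutrient"])
    return {"nutrient": -used, "energy": +used}


def cytokinesis(state: State) -> State:
    # Toy rule: spend energy to pinch the septum closed.
    if state["energy"] < 0.5:
        return {}
    shrink = min(0.25, state["septum_diameter"])
    return {"energy": -0.5, "septum_diameter": -shrink}


def simulate(state: State, submodels: List[Submodel],
             max_steps: int = 100_000) -> int:
    """Advance in 1 s steps until the septum diameter reaches zero.

    Returns the number of simulated seconds.
    """
    for step in range(1, max_steps + 1):
        # Every submodel sees the same snapshot of the shared variables ...
        snapshot = dict(state)
        deltas = [model(snapshot) for model in submodels]
        # ... and only then are the contributions merged back in, so one
        # submodel's products become another's inputs on the next step.
        for delta in deltas:
            for key, change in delta.items():
                state[key] += change
        if state["septum_diameter"] <= 0:
            return step
    raise RuntimeError("cell did not divide within max_steps")


if __name__ == "__main__":
    cell = {"nutrient": 100.0, "energy": 0.0, "septum_diameter": 1.0}
    print("divided after", simulate(cell, [metabolism, cytokinesis]), "s")
```

Even in this two-submodel toy, division depends on the coupling: energy produced by `metabolism` in one second is what `cytokinesis` consumes in the next, and with no nutrient the loop never terminates on its own.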



1) http://sci-hub.tw/https://www.sciencedirect.com/science/article/pii/S1355219815000441



Last edited by Admin on Sat Feb 09, 2019 11:24 am; edited 2 times in total


In order to make life, and especially complex multicellular life, the building blocks of life, cells, have to be made, which are the tiniest living entities. Building cells requires information and programming, complex protein manufacturing machines and assembly lines, energy, nutrient supply chains, quality control, waste bins, the ability to adapt to the environment and to react to stimuli, the ability to replicate, and housing (the cell membrane).

“The complexity of the simplest known type of cell is so great that it is impossible to accept that such an object could have been thrown together suddenly by some kind of freakish, vastly improbable, event. Such an occurrence would be indistinguishable from a miracle.”
― Michael Denton, Evolution: A Theory In Crisis

A primitive cell like an E. coli bacterium, one of the simplest life forms in existence today, is amazingly complex.

Following the E. coli model, a cell would have to contain at an absolute minimum:

A cell wall of some sort to contain the cell
A genetic blueprint for the cell (in the form of DNA)
RNA polymerase capable of copying information out of the genetic blueprint to manufacture new proteins and enzymes
Ribosomes capable of manufacturing new enzymes, along with all of the building blocks for those enzymes
An enzyme that can build cell walls
An enzyme able to copy the genetic material in preparation for cell splitting (reproduction)
An enzyme or enzymes able to take care of all of the other operations of splitting one cell into two to implement reproduction (For example, something has to get the second copy of the genetic material separated from the first, and then the cell wall has to split and seal over in the two new cells.)
Enzymes able to manufacture energy molecules to power all of the previously mentioned enzymes   18


The cell compares to a factory:

The cell membrane separates the interior of all cells from the outside environment. That's the exterior factory wall that protects the factory.

The Nucleus is the Chief Executive Officer (CEO). It controls all cell activity and determines what proteins will be made.

Plasma membrane gates regulate what enters and leaves the cell. That's the Shipping/Receiving Department. It also functions as the communications department because it is where the cell contacts the external environment.

The Cytoplasm includes everything between the cell membrane and the nucleus. It contains various kinds of cell structures and is the site of most cell activity. The cytoplasm is similar to the factory floor where most of the products are assembled, finished, and shipped.

Mitochondria/chloroplasts: The power plant. They transform one form of energy into another.

Mitochondrial membranes keep protein assembly lines together for efficient energy production.

Membrane-enclosed vesicles form packages for cargo so that they may quickly and efficiently reach their destinations.

Internal membranes divide the cell into specialized compartments, each carrying out a specific function inside the cell. These are the compartments in a manufacturing facility.

The cytoplasm contains the organelles and is the site of most cell activity. It's like the space inside the factory.

The Endoplasmic Reticulum (ER) is the compartment where the assembly lines reside (where workers do their work).

The Golgi apparatus: What happens to all the products that are built on the assembly line of a factory? The final touches are put on them in the finishing and packing department. Workers in this part of the plant are responsible for making minor adjustments to the finished products.

Ribosomes build the proteins; they are equivalent to the workers on the assembly line.

Signal-Recognition Particles (SRP) and signal receptors provide a variety of instructions informing the cell as to what destination and pathway the protein must follow. That's the address on the parcel where it has to be delivered.

Kinesin Motors: These are the cargo carriers in the cell, like the forklift carriers in a factory.

Microtubules: They provide platforms for intracellular transport, amongst other things. These are the internal factory highways.

Lysosomes: They are capable of breaking down virtually all kinds of biomolecules, including proteins, nucleic acids, carbohydrates, lipids, and cellular debris. That's the maintenance crew. It gets rid of the trash and dismantles and disposes of the outmoded machinery.

Hormones: They permit communication between cells. That's the cellphone-to-cellphone communication.


On the other side, inside of cells:


Highest organisation, order, and efficiency in all manufacturing stages and processes
Highest information storage capacity in the nucleus
Highest possible storage density, down to the atomic scale. DNA can store in 1 gram the information of 570 billion 8 MB pen drives!
DNA as a storage medium permits data to be stored uncorrupted for centuries.
DNA is volumetric (beaker) rather than planar (hard disk)
Highly economic, effective and proper material flow inside the cell
Maximal flexibility for demand and supply fluctuation
Simple material delivery routes and pathways throughout the cell that connect the various internal and external parts
Flexibility to external changes and stimuli, since volumes and demand are variable
High efficiency in the regulation of cell size and growth
Lowest energy consumption
High efficiency in breaking down waste in the cell, and in reutilisation and recycling
Unmatched energy efficiency, approximately 10,000 times more energy-efficient than any nanoscale digital transistor
Highest adaptability of the manufacturing process to external changes and pressures
Fast repair of damaged or broken parts

Highest complexity of "products"
Fidelity in reproduction and replication (exact copies)
Highest adaptability of the products to the environment
Complete reproduction autonomy without continuing intelligence input
High-efficiency signaling systems and communication pathways
High efficiency
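
To put the storage figure in the list above into familiar units, here is a small Python sketch. It does nothing more than multiply out the quoted numbers (570 billion pen drives of 8 MB each). The function name, and the assumption that 1 MB means 10^6 bytes, are mine and not taken from the cited source.

```python
BYTES_PER_MB = 10**6  # assumption: decimal megabytes, not 2**20


def pen_drive_equivalent_bytes(drive_count: int, drive_size_mb: int) -> int:
    """Total bytes held by drive_count pen drives of drive_size_mb megabytes each."""
    return drive_count * drive_size_mb * BYTES_PER_MB


if __name__ == "__main__":
    # Figures quoted in the list: 570 billion pen drives, 8 MB each, per gram of DNA.
    total = pen_drive_equivalent_bytes(570_000_000_000, 8)
    print(f"{total:.3e} bytes, i.e. {total / 10**18:.2f} exabytes per gram")
```

Under that assumption the quoted figure works out to roughly 4.56 exabytes per gram of DNA.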

Cells incorporate the highest possible production efficiency, far beyond imagination. Many life forms are unicellular, but the most complex organisms are multicellular. One stem cell stores the information to make a body consisting of a vast array of specialized cells, all interlocked, connected and interdependent, producing a harmonic whole, each cell exercising its specific function, producing a goal-directed adult, able to reproduce and adapt to the environment. So life goes on for thousands of years, without direct intelligent intervention.

It's a very complex integrated system with hierarchical layers of regulation and gene expression, similar to the programs and sub-programs of computer software but much more sophisticated. You can imagine a simple evolutionary pathway, but when you get down to the details, it's far from simple. Each embryo follows a precisely choreographed developmental road map in order to get to the final goal -- the reproductive adult. Each step is necessary but not sufficient by itself. Turn aside from this developmental pathway and the result is likely to be a damaged worm or a dead one. Skip some steps and the same is true. How did this process come about? We would say this goal-directedness is evidence for a designer who had the final end in mind, and arranged the proper developmental steps appropriately. 17

Evolutionary biologists disagree. They say this exquisitely refined developmental pathway evolved gradually, a little at a time. First there was a cell, then a eukaryotic cell, one with a nucleus, organelles, and a cytoskeleton. Then along came multicellularity -- cells living together to make an organism, with some cells set aside to make the next generation, and others free to specialize. As time went on, new digestive, muscle, nerve, and sensory cells evolved and were successfully coordinated into functioning whole organisms.

New genes and proteins must be invented. The cytoskeleton, Hox genes, desmosomes, cell adhesion molecules, growth factors, microtubules, microfilaments, neurotransmitters, whatever it takes to get cells to stick together, form different shapes, specialize, and communicate must all come from somewhere. Regulatory proteins and RNAs must be made to control the expression in time and space of these new proteins so that they all work together with existing pathways. In fact, in order for development to proceed in any organism, a whole cascade of coordinated genetic and biochemical events is necessary so that cells divide, change shape, migrate, and finally differentiate into many cell types, all in the right sequence at the right time and place. These cascades and the resulting cell divisions, shape changes, etc., are mutually interdependent. Interrupting one disrupts the others.



Their product is the replica itself, a new daughter cell. It's as if we

There is no known compelling mechanism of transition from unicellular to multicellular life. In a multicellular organism, stem cells know how to replicate and produce all the specialized daughter cells with amazing efficiency, when to produce them, where they belong, and how to deliver them to the right place. So an organism with just two cells is already perfect in regard to organisation, complexity, and build-up correctness in its developing stage, in the same manner as a fully grown organism, such as a human with tens of trillions of cells. There is another interesting aspect. Living beings are always finished and fully apt for survival (unless sudden violent, or sometimes slow, habitat changes happen to which the organism cannot react fast enough). A child, 10 years old, has a body with all its members and capable faculties, even if not fully grown. Human artifacts are only finished when fully built; during the manufacturing process they are unfinished and unusable. So the whole process of growth of the biological organism is consummate and perfect, even if not finished, while that of human artifacts is not.


The major conceptual flaw of naturalistic evolution models is that they build on a foundation that cannot be backed up rationally. It's a fact that major gaps exist in the understanding of how the first cells could have arisen. Fantastic scenarios are hypothesized, like naturally arising, three-dimensional compartmentations observed within fossilized seepage-site metal sulphide precipitates to explain the rise of the first cell membranes, self-replicating RNA strands, coevolution of dozens of various cell components at the same time, and "quantum evolution": ideas which have no scientific backing, but are just scenarios of science fiction in the fertile minds of naturalists. In the same way as the foundation of a building must be ready in order to build the house, so it is with the ToE. Although a division is made between them, abiogenesis and biodiversity through the ToE stand and fall together. If one isn't true, the other most probably isn't either. There is no reason to invoke the idea that a creator used evolution and natural selection to create all biodiversity. Not only because, in my view, that would diminish his glory, and because a capable and powerful God who creates the universe should also be able to specially create the incredibly various kinds of animals and plants, but principally because the overall concept and layout of biological machines indicate that there must be planning in the forefront, conceptualisation of the whole process, and visualisation of interdependent parts which work as an interlocking whole, like machines designed and engineered by man. Besides, there are the empirical tests which show that evolution isn't able to produce new functions for enzymes and proteins. 19

The final product of the cell is a faithful copy of itself through replication. While human-made factories produce things different from themselves, the cell makes a copy of itself as its final product. When it divides into two, one daughter cell goes on to make a more specialized type of cell, or even gives rise to several different cell types. Multicellular organisms are more complex than unicellular organisms.


Important considerations for an economic, effective and proper material flow must be thought through and brought in when planning the concept and layout design of a new factory assembly line: for example, maximal flexibility in the line for demand and supply fluctuation, and planning deep enough to answer all possible aspects of a new line so as to get maximum efficiency afterwards. There should be simple material delivery routes and pathways throughout the facility that connect the processes. Also, there needs to be a plan for flexibility and changes, since volumes and demand are variable. Awareness of the many factors involved right in the planning process of the factory is key. Right-sized equipment and facilities must be planned and considered as well. All equipment and facilities should be designed to the demand rate, or takt time. Projects and facility designs that do not take these considerations into account start out great, but quickly bog down in unresolved issues, lack of consensus, confusion and delay.

If we look at human assembly lines, the more they are automated, the less new and continuous external information input is required, and the exponentially more complex is the programming required to make the process flow happen automatically. The most sophisticated factory plants of man still constantly require huge amounts of human workforce. That's the best we have been able to come up with in a hundred years. The future will move towards more and more automation and robotics, and less and less direct human intelligence and intervention will be required. Cars will drive on their own to their destination: just program in the address at the beginning. But cells perform every metabolic action FULLY automated. If there were a scale from zero to a hundred, where no automation and simple manufacturing processes done by hand would be 0, and full automation would be 100, the cell would be at 100, at the very top position in regard to automation and complexity, and likewise in regard to storage capacity.

As a paper in Nature Reviews Molecular Cell Biology states, “Today biology is revealing the importance of ‘molecular machines’ and of other highly organized molecular structures that carry out the complex physico-chemical processes on which life is based.” Likewise, a paper in Nature Methods observed that “[m]ost cellular functions are executed by protein complexes, acting like molecular machines.”
Cells and their metabolic pathways, comparable to fully automated assembly lines in factories, are far more advanced, complex, better structured and organized in every aspect than the most advanced robotic assembly facility ever created by man. 16

“Can all of life be fit into Darwin’s theory of evolution?,” and answered: "The complexity of life's foundation has paralyzed science's attempt to account for it; molecular machines raise an as-yet impenetrable barrier to Darwinism's universal reach."

In 1998, former president of the U.S. National Academy of Sciences Bruce Alberts wrote the introductory article to an issue of Cell, one of the world’s top biology journals, celebrating molecular machines. Alberts praised the “speed,” “elegance,” “sophistication,” and “highly organized activity” of “remarkable” and “marvelous” structures inside the cell. He went on to explain what inspired such words:

The entire cell can be viewed as a factory that contains an elaborate network of interlocking assembly lines, each of which is composed of a set of large protein machines. . . . Why do we call the large protein assemblies that underlie cell function protein machines? Precisely because, like machines invented by humans to deal efficiently with the macroscopic world, these protein assemblies contain highly coordinated moving parts.

A few years later, a review article in the journal Biological Chemistry demonstrated the difficulty evolutionary scientists have faced when trying to understand molecular machines. Essentially, they must deny their scientific intuitions when trying to grapple with the complexity of the fact that biological structures appear engineered to the schematics of blueprints:

Molecular machines, although it may often seem so, are not made with a blueprint at hand. Yet, biochemists and molecular biologists (and many scientists of other disciplines) are used to thinking as an engineer, more precisely a reverse engineer. But there are no blueprints … ‘Nothing in biology makes sense except in the light of evolution’: we know that Dobzhansky (1973) must be right. But our mind, despite being a product of tinkering itself strangely wants us to think like engineers.


Denton, p. 329.
We would see [in cells] that nearly every feature of our own advanced machines had its analogue in the cell: artificial languages and their decoding systems, memory banks for information storage and retrieval, elegant control systems regulating the automated assembly of parts and components, error fail-safe and proof-reading devices utilized for quality control, assembly processes involving the principle of prefabrication and modular construction. In fact, so deep would be the feeling of deja-vu, so persuasive the analogy, that much of the terminology we would use to describe this fascinating molecular reality would be borrowed from the world of late twentieth-century technology.
  “What we would be witnessing would be an object resembling an immense automated factory, a factory larger than a city and carrying out almost as many unique functions as all the manufacturing activities of man on earth. However, it would be a factory which would have one capacity not equalled in any of our own most advanced machines, for it would be capable of replicating its entire structure within a matter of a few hours. To witness such an act at a magnification of one thousand million times would be an awe-inspiring spectacle.”


― Michael Denton, Evolution: A Theory In Crisis
To grasp the reality of life as it has been revealed by molecular biology, we must magnify a cell a thousand million times until it is twenty kilometers in diameter and resembles a giant airship large enough to cover a great city like London or New York. What we would then see would be an object of unparalleled complexity and adaptive design. On the surface of the cell we would see millions of openings, like the port holes of a vast space ship, opening and closing to allow a continual stream of materials to flow in and out. If we were to enter one of these openings we would find ourselves in a world of supreme technology and bewildering complexity.

Unmatched energy efficiency of the cell
A single cell in the human body is approximately 10,000 times more energy-efficient than any nanoscale digital transistor, the fundamental building block of electronic chips. In one second, a cell performs about 10 million energy-consuming chemical reactions, which altogether require about one picowatt (one millionth millionth of a watt) of power.
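
The two figures quoted above (about one picowatt of power and about 10 million reactions per second) imply an average energy budget per reaction. The sketch below only divides one by the other. The function names are my own, and the result is an average that assumes all of the quoted power goes into the quoted reactions.

```python
# Exact SI value of the elementary charge, used only to convert joules to electronvolts.
JOULES_PER_ELECTRONVOLT = 1.602176634e-19


def energy_per_reaction_joules(power_watts: float, reactions_per_second: float) -> float:
    """Average energy per reaction: power (J/s) divided by reaction rate (1/s)."""
    return power_watts / reactions_per_second


def joules_to_electronvolts(energy_joules: float) -> float:
    """Convert an energy in joules to electronvolts."""
    return energy_joules / JOULES_PER_ELECTRONVOLT


if __name__ == "__main__":
    # Quoted figures: ~1 picowatt, ~10 million reactions per second.
    energy = energy_per_reaction_joules(1e-12, 1e7)
    print(f"{energy:.1e} J per reaction, about {joules_to_electronvolts(energy):.2f} eV")
```

With the quoted numbers this comes to about 10^-19 joule per reaction on average.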


In contrast to most man-made factories, cells continually dismantle and reassemble their machines at different stages of the cell cycle and in response to environmental challenges, such as infections. Cells use a mixed strategy of prefabricating core elements of machines and then synthesizing additional, snap-on molecules that give each machine a precise function. That provides an economic way to diversify biological processes and also to control them. Thus if the cell needs to respond quickly, such as in a disease or another emergency, it may only need to produce a few parts to switch on or tune the machine. On the other hand, if something shouldn't happen, it may only need to block the production of a few molecules. Patrick Aloy and Rob Russell at EMBL used sophisticated computer techniques to reveal the modular organisation of these cellular machines.


The cell is the most complex system mankind has ever confronted. Today we know that the cell contains power stations producing the energy to be used by the cell, factories manufacturing the enzymes and hormones essential for life, a databank where all the necessary information about all products to be produced is recorded, complex transportation systems and pipelines for carrying raw materials and products from one place to another, advanced laboratories and refineries for breaking down external raw materials into their useable parts, and specialized cell membrane proteins to control the incoming and outgoing materials. And these constitute only a small part of this incredibly complex system.

Cellular transport systems: Gated transport is so called because of its similarity to our everyday experience of passing through a guarded (electronically or otherwise) gate. This system requires three basic components to work: an identification tag, a scanner (to verify identification) and a gate (that is activated by the scanner). The system needs all three components, otherwise it will not work. Thus in a cell, when a protein is to be manufactured, one of the first steps is for the mRNA to be transported out from the nucleus into the cytoplasm. This requires gated transport of the mRNA at the nuclear pore. Proteins in the pore read a signal from the RNA (the scanner reads the identification tag) and open the pore (the gate is opened).

The only reason that DNA functions as well as it does is that cells come equipped with an amazing array of cooperative DNA repair mechanisms. For example, polymerase replication during cell division might produce 6 million errors per cell, but then proofreading machinery can reduce this to 10,000 and then mis-match repair machinery could reduce this to 100.  
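
The error counts in the example above can be turned into fold reductions per repair stage. This is a minimal sketch that uses only the three quoted numbers (6 million, 10,000 and 100 errors per cell); the function names are hypothetical and the stage labels in the comments simply follow the text.

```python
from typing import List


def fold_reductions(error_counts: List[float]) -> List[float]:
    """Fold reduction achieved by each stage, given error counts before and after it."""
    return [before / after for before, after in zip(error_counts, error_counts[1:])]


def overall_fold_reduction(error_counts: List[float]) -> float:
    """Fold reduction from the first count to the last."""
    return error_counts[0] / error_counts[-1]


if __name__ == "__main__":
    # Quoted figures: raw polymerase errors, after proofreading, after mismatch repair.
    errors = [6_000_000, 10_000, 100]
    proofreading, mismatch_repair = fold_reductions(errors)
    print(f"proofreading: {proofreading:.0f}-fold")
    print(f"mismatch repair: {mismatch_repair:.0f}-fold")
    print(f"overall: {overall_fold_reduction(errors):.0f}-fold")
```

On the quoted figures, proofreading accounts for a 600-fold reduction and mismatch repair for a further 100-fold, which multiplies to 60,000-fold overall.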

Question: How could this enormously efficient repair mechanism, which finds its analogy in our computer programs for spelling correction, have evolved?

So what are the answers in the mainstream science literature in regard to the origin and evolution of metabolic pathways?

http://flipper.diff.org/app/pathways/info/3461
How the major metabolic pathways actually originated is still an open question.

A nice admission!


Several different theories, however, have been suggested to account for the establishment of metabolic routes: the Retrograde hypothesis (Horowitz, 1945), the Granick hypothesis, the Patchwork hypothesis (Ycas, 1974; Jensen, 1976), the Semienzymatic origin of metabolic pathways (Lazcano and Miller, 1996), the bioinformatic approach, and the directed evolution experiments. All these ideas are based on gene duplication.

http://onlinelibrary.wiley.com/doi/10.1002/cplx.20365/abstract
Is gene duplication a viable explanation for the origination of biological information and complexity?
Although the process of gene duplication and subsequent random mutation has certainly contributed to the size and diversity of the genome, it is alone insufficient in explaining the origination of the highly complex information pertinent to the essential functioning of living organisms.  8


If a certain line of reasoning is not persuasive or convincing, then why do atheists not change their mind because of it? The more evolution papers are published, the less likely the scenario of gene duplication (questioned even by peer-reviewed papers, as shown above), mutation, and natural selection becomes. We should consider the fact that modern biological research may have reached its limits on several key subjects, to which biosynthesis pathways belong. All discussions on principal theories and experiments in the field end either in vague suppositions and guesswork, statements of blind faith, made-up scenarios, or in a confession of ignorance. The fact is that there remains a huge gulf in our understanding. This lack of understanding is not just ignorance about some technical details; it is a big conceptual gap. The end of the road has evidently been reached in regard to many, if not almost all, major questions. The big questions of macroevolutionary changes and abiogenesis are very far from being clearly formulated, or even understood, and nowhere near being solved, and for most there is no solution in sight at all. But proponents of evolution firmly believe that one day a solution will be found. Not only that, but it seems that the less people understand the subject, the more they believe they have the right answers and philosophical position, almost like religious fundamentalists. Isn't that a prima facie case of an "evolution of the gaps" position? We don't know yet, but evolution and naturalism must be true anyway? So the God hypothesis remains out of the equation as a real possibility at the beginning, and so at the end, and never receives serious and honest consideration. If the scientific evidence does not lead towards naturalism providing satisfactory explanations, why should we not change our minds and look somewhere else?

15) http://alumnus.caltech.edu/~raj/writing/mass-craft.html
16) http://www.discovery.org/a/14791
17) http://www.evolutionnews.org/2015/04/the_white_space_1095671.html
18) http://science.howstuffworks.com/life/evolution/evolution11.htm
19) http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2011.1

https://reasonandscience.catsboard.com

Otangelo


Admin

How Many Genes Can Make a Cell: The Minimal-Gene-Set Concept 1

Several theoretical and experimental studies have endeavored to derive the minimal set of genes that are necessary and sufficient to sustain a functioning cell under ideal conditions, that is, in the presence of unlimited amounts of all essential nutrients and in the absence of any adverse factors, including competition. A comparison of the first two completed bacterial genomes, those of the parasites Haemophilus influenzae and Mycoplasma genitalium, produced a version of the minimal gene set consisting of ~250 genes.

Very similar estimates were obtained by analyzing viable gene knockouts in Bacillus subtilis, M. genitalium, and Mycoplasma pneumoniae. With the accumulation and comparison of multiple complete genome sequences, it became clear that only ~80 genes of the 250 in the original minimal gene set are represented by orthologs in all life forms. For ~15% of the genes from the minimal gene set, viable knockouts were obtained in M. genitalium; unexpectedly, these included even some of the universal genes. Thus, some of the genes that were included in the first version of the minimal gene set, based on a limited genome comparison, could be, in fact, dispensable. The majority of these genes, however, are likely to encode essential functions but, in the course of evolution, are subject to nonorthologous gene displacement, that is, recruitment of unrelated or distantly related proteins for the same function. Further theoretical and experimental studies within the framework of the minimal-gene-set concept and the ultimate construction of a minimal genome are expected to advance our understanding of the basic principles of cell functioning by systematically detecting nonorthologous gene displacement and deciphering the roles of essential but functionally uncharacterized genes.



1. https://www.ncbi.nlm.nih.gov/books/NBK2227/


Otangelo


Admin

Three Subsets of Sequence Complexity and Their Relevance to Biopolymeric Information - David L. Abel and Jack T. Trevors - Theoretical Biology & Medical Modelling, Vol. 2, 11 August 2005, page 8
"No man-made program comes close to the technical brilliance of even Mycoplasmal genetic algorithms. Mycoplasmas are the simplest known organism with the smallest known genome, to date. How was its genome and other living organisms' genomes programmed?"
http://www.biomedcentral.com/content/pdf/1742-4682-2-29.pdf

First-Ever Blueprint of 'Minimal Cell' Is More Complex Than Expected - Nov. 2009
Excerpt: A network of research groups,, approached the bacterium at three different levels. One team of scientists described M. pneumoniae's transcriptome, identifying all the RNA molecules, or transcripts, produced from its DNA, under various environmental conditions. Another defined all the metabolic reactions that occurred in it, collectively known as its metabolome, under the same conditions. A third team identified every multi-protein complex the bacterium produced, thus characterising its proteome organisation.
"At all three levels, we found M. pneumoniae was more complex than we expected,"
http://www.sciencedaily.com/rele.../2009/11/091126173027.htm

There’s No Such Thing as a ‘Simple’ Organism - November 2009
Excerpt: In short, there was a lot going on in lowly, supposedly simple M. pneumoniae, and much of it is beyond the grasp of what’s now known about cell function.
http://www.wired.com/wiredscience/2009/11/basics-of-life/

Simplest Microbes More Complex than Thought - Dec. 2009
Excerpt: PhysOrg reported that a species of Mycoplasma,, “The bacteria appeared to be assembled in a far more complex way than had been thought.” Many molecules were found to have multiple functions: for instance, some enzymes could catalyze unrelated reactions, and some proteins were involved in multiple protein complexes."
http://www.creationsafaris.com/crev200912.htm#20091229a

To Model the Simplest Microbe in the World, You Need 128 Computers - July 2012
Excerpt: Mycoplasma genitalium has one of the smallest genomes of any free-living organism in the world, clocking in at a mere 525 genes. That's a fraction of the size of even another bacterium like E. coli, which has 4,288 genes.,,,
The bioengineers, led by Stanford's Markus Covert, succeeded in modeling the bacterium, and published their work last week in the journal Cell. What's fascinating is how much horsepower they needed to partially simulate this simple organism. It took a cluster of 128 computers running for 9 to 10 hours to actually generate the data on the 25 categories of molecules that are involved in the cell's lifecycle processes.,,,
,,the depth and breadth of cellular complexity has turned out to be nearly unbelievable, and difficult to manage, even given Moore's Law. The M. genitalium model required 28 subsystems to be individually modeled and integrated, and many critics of the work have been complaining on Twitter that's only a fraction of what will eventually be required to consider the simulation realistic.,,,
http://www.theatlantic.com/.../to-model-the.../260198/

Twitter discussion criticizing the cell model - 2012
Umm – claims of first full computer simulation of an organism seem, well, way way overhyped… one of the worst NY Times science articles I have seen in a while… I do not think they made a complete model …
Another commenter, Steffen Christensen, voiced his agreement:
Aye: a model is NOT a complete simulation…There are what, 1000s of molecule types in a typical cell, and their model tracks less than 30?!? They might’ve done a better job of it. You know, modeled spatial interactions, 1000s of moieties, etc… As it is, I just feel… disappointed.
http://phylogenomics.blogspot.jp/.../for-those-interested...

Microbe with stripped-down DNA may hint at secrets of life - Mar 24, 2016
Excerpt: The newly created bacterium has a smaller genetic code than does any natural free-living counterpart, with 531,000 DNA building blocks containing 473 genes. (Humans have more than 3 billion building blocks and more than 20,000 genes).
But even this stripped-down organism is full of mystery. Scientists say they have little to no idea what a third of its genes actually do.
"We're showing how complex life is, even in the simplest of organisms," researcher J. Craig Venter told reporters. "These findings are very humbling.",,,
The genome is not some one-and-only minimal set of genes needed for life itself. For one thing, if the researchers had pared DNA from a different bacterium they would probably have ended up with a different set of genes.,,,
The genome is "as small as we can get it and still have an organism that is ... useful," Hutchison said.,,,
http://hosted.ap.org/dynamic/stories/U/US_SCI_SKINNY_GENES
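
The gene counts quoted in the excerpts above can be compared with a few lines of Python. This is only a rough sketch: the function names are hypothetical, the human figures are the "more than" values from the AP excerpt and so act as lower bounds, and "building blocks per gene" is a crude density over the whole genome, not a gene length.

```python
def gene_fraction(smaller_gene_count: int, larger_gene_count: int) -> float:
    """Gene count of one organism as a fraction of another's."""
    return smaller_gene_count / larger_gene_count


def building_blocks_per_gene(dna_building_blocks: int, gene_count: int) -> float:
    """Average number of DNA building blocks per gene across the whole genome."""
    return dna_building_blocks / gene_count


if __name__ == "__main__":
    # Quoted in the July 2012 excerpt: M. genitalium 525 genes, E. coli 4,288 genes.
    print(f"M. genitalium vs E. coli: {gene_fraction(525, 4288):.1%} of the gene count")
    # Quoted in the March 2016 excerpt: 531,000 building blocks and 473 genes.
    print(f"stripped-down bacterium: {building_blocks_per_gene(531_000, 473):.0f} per gene")
    # Same excerpt, lower bounds: more than 3 billion blocks, more than 20,000 genes.
    print(f"human (approx.): {building_blocks_per_gene(3_000_000_000, 20_000):.0f} per gene")
```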


Otangelo


Admin

What might be a Cell’s minimal requirement of parts?  

Essential genes of a minimal bacterium 


How Many Genes Can Make a Cell: The Minimal-Gene-Set Concept
https://www.ncbi.nlm.nih.gov/books/NBK2227/
Several theoretical and experimental studies have endeavoured to derive the minimal set of genes that are necessary and sufficient to sustain a functioning cell under ideal conditions, that is, in the presence of unlimited amounts of all essential nutrients and in the absence of any adverse factors, including competition. A comparison of the first two completed bacterial genomes, those of the parasites Haemophilus influenzae and Mycoplasma genitalium, produced a version of the minimal gene set consisting of ~250 genes.

The following irreducible processes and parts are required to keep cells alive, and illustrate the "Mount Improbable" of getting life at the first go:
Reproduction. Reproduction is essential for the survival of all living things.
Metabolism. The enzymatic activity allows a cell to respond to changing environmental demands and regulate its metabolic pathways, both of which are essential to cell survival. 
Nutrition. This is closely related to metabolism. Seal up a living organism in a box for long enough and in due course, it will cease to function and eventually die. Nutrients are essential for life. 
Complexity. All known forms of life are amazingly complex. Even single-celled organisms such as bacteria are veritable beehives of activity involving millions of components. 
Organization. Maybe it is not complexity per se that is significant but organized complexity. 
Growth and development. Individual organisms grow and ecosystems tend to spread (if conditions are right). 
Information content. In recent years scientists have stressed the analogy between living organisms and computers. Crucially, the information needed to replicate an organism is passed on in the genes from parent to offspring.
Hardware/software entanglement. All life of the sort found on Earth stems from a deal struck between two very different classes of molecules: nucleic acids and proteins. 
Permanence and change. A further paradox of life concerns the strange conjunction of permanence and change.
Sensitivity. All organisms respond to stimuli— though not always to the same stimuli in the same ways.
Regulation. All organisms have regulatory mechanisms that coordinate internal processes.


Otangelo


Admin

A minimal estimate of the proteins of the last universal common ancestor

https://reasonandscience.catsboard.com/t2110-what-might-be-a-cells-minimal-requirement-of-parts#6558

A minimal estimate for the gene content of the last universal common ancestor—exobiology from a terrestrial perspective 
19 December 2005
Using an algorithm for ancestral state inference of gene content, given a large number of extant genome sequences and a phylogenetic tree, we aim to reconstruct the gene content of the last universal common ancestor (LUCA), a hypothetical life form that presumably was the progenitor of the three domains of life. The common belief that the hypothetical genome of LUCA should resemble those of the smallest extant genomes of obligate parasites is not supported by recent advances in computational genomics. Instead, a fairly complex genome similar to those of free-living prokaryotes, with a variety of functional capabilities including metabolic transformation, information processing, membrane/transport proteins and complex regulation, shared between the three domains of life, emerges as the most likely progenitor of life on Earth, with profound repercussions for planetary exploration and exobiology. 2

A truly minimal estimate of the gene content of the last universal common ancestor, obtained by three different tree construction methods and the inclusion or not of eukaryotes (in total, there are 669 ortholog families distributed in 561 functional annotation descriptions, including 52 which remain uncharacterized)

Now either you believe that all these proteins came about randomly by chance, that is, by self-assembling spontaneously through orderly aggregation, in a sequentially correct manner and without external direction, or you can stick to design. If you think chance is more compelling, then consider that way beyond 10^100,000 (that is, a one with 100,000 zeroes) attempts would be necessary for a cosmic lottery to mix a set of 20 amino acids, already pre-selected out of 500 different ones naturally extant on a prebiotic earth, sort the left-handed from the right-handed ones, use only the left-handed ones, and then connect them in the right order into at least 560 proteins. But we are not there yet. Amino acids are just one of the metabolites and basic building blocks required for life. We need DNA, RNA, hydrocarbons, and phospholipids as well.


Replication/recombination/repair/modification
The gene content of LUCA with respect to DNA processing (replication, recombination, modification and repair) contains a wide range of functions. The following families/functions are identified: 

DNA polymerase
excinuclease ABC
DNA gyrase 
topoisomerase
NAD-dependent DNA ligase
DNA helicases 
DNA mismatch repair MutS  
MutT 
endonucleases 
RecA 
chromosome segregation SMC 
methyltransferase 
methyladenine glycosylase
adenine glycosylase
adenine phosphoribosyltransferase
deoxyribodipyrimidine photolyase
integrase
HAM1
Sir2  involved in various aspects of genomic stability
TatD a recently discovered DNase
histone deacetylase  and restriction modification

Thus, one can reason that most aspects of DNA metabolism and information processing are well represented in the minimal reconstruction of LUCA.

Transcription/regulation

RNA polymerase
AsnC
ArsR
iron-dependent repressor
and ferric uptake regulator

Two processing families are represented by

transcription-repair coupling factor mfd
RNA helicase
bi-functional transcriptional regulator-GntR-aminotransferase class I  
wRBA, a trp-repressor binding protein family

Translation/ribosome

17 aminoacyl-tRNA synthetases, including bi-functional Gln/Glu-tRNA synthetase and the two subunits of Phe-tRNA synthetase, covering 18 amino acids (or 19 if Asn-tRNA is considered to be covered by a dual-specificity enzyme or Gln amidotransferases), with the sole exception of Tyr-tRNA synthetase.
Certain, but not all, ribosomal proteins (12 small and 9 large subunits) are also identified, along with ribosome modification enzymes. Furthermore, a set of key translation initiation factors as well as elongation factors EF-G and EF-Tu are found.
translation-associated protein SUA5,
rRNA methyltransferase sun family
modification enzymes queuine tRNA-guanine ribosyltransferase transglycosylase  
tRNA pseudouridine synthase

RNA processing

ribonucleases
RNA methyltransferase
HIT pyrophosphatases, with a role in RNA processing.

Cellular processes

A number of families/functions involved in various aspects of cell division, thermoprotection, signaling and proteolysis are detected. Namely, cell division is represented by

FtsH/Z/Y
DnaJ
DnaK/HSP70
chaperonin GroES and GroEL
heat shock hsp20 and cold-shock
CheW/A/R two-component systems
GTP binding
Ser/Thr kinase
tyrosine kinase TrkA
GGDEF domain and sensor histidine kinase
proteases
terminal and amino-peptidases
oligoendopeptidase
peptidases
peptidyl-cis-trans isomerases and inhibitors.

Transport/membrane

Most importantly, LUCA appears to have been a complete cell with well-established membrane systems.

ABC transporters:
cobalt
iron
molybdenum
glycine
spermidine ABC
sugar
oligopeptide ABC
phosphate
amino acid and dipeptide
other non-specific ABC transporters
ammonium
heavy metal

ATPases:
copper and other P-ATPase
magnesium and/or cobalt
multidrug resistance

ion ATPases:
potassium ATPase A/B/C chains
sodium
permeases: transport system kinases
L-lactate permease
glutathione-Na antiporter
sodium symporter
non-specific antiporters
non-specific efflux systems

ion channels:

chloride channel
mechanosensitive channel
Trk (126) and other potassium channel and uptake
protein translocases: export SecD/F  and SecY
translocase TatC

general secretion pathway components

bacterioferritin comigratory protein
non-specific membrane protein families
SRP54
arsenical pump membrane
Mrp subfamily of ABC transporters and the rhomboid family

Unclassified function

CrcB camphor resistance
inorganic pyrophosphatase
TPR-containing proteins
ankyrin repeat proteins

Electron transport

A number of key electron transport systems also appear to be part of LUCA’s genomic signature.  These include

ferredoxin oxidoreductase components
ferredoxin
flavoproteins
NADH dehydrogenase components
iron-sulfur proteins
thioredoxin reductase
thioredoxin
ferrochelatase
HesB
alkyl hydroperoxide reductase
arsenate reductase
superoxide dismutase  Fe/Mn type

Metabolism

The Enzymatic and Metabolic Capabilities of Early Life 1
All reactions stored in the KEGG database that are catalyzed by metaconsensus enzyme functions:

EC code 1.3.1.-:
Acyl-ACP <=> Dehydroacyl-ACP
Acyl-CoA <=> Dehydroacyl-CoA
Acyl-oleoylglycerophosphocholine <=> Acyl-linoleoylglycerophosphocholine
Androstanedione <=> Androstenedione
Anthracene <=> Anthracenediol
Bilirubin <=> Biliverdin
Bromomaleylacetate <=> Maleylacetate
Butanoic <=> Butenoate
Butyryl-ACP <=> Butenoyl-ACP
Chlorodihydroxycyclohexadiene <=> Chlorocatechol
Chloromaleylacetate <=> Maleylacetate
Chlorophyllide <=> Protochlorophyllide
Cholestanone <=> Cholestenone
Cholestanone <=> Cholestenone
Cholesterol <=> Cholestadienol
Decanoyl-ACP <=> Decenoyl-ACP
Decanoyl-CoA  <=> Decenoyl-CoA
Dehydroacyl-CoA <=> Tetradehydroacyl-CoA
Deoxycorticosterone <=> Dihydrodeoxycorticosterone
Dichlorooxohexenedioate <=> Chlorooxoadipate
Dihydrobenzene <=> Catechol
Dihydrobenzenediol <=> Catechol
Dihydrocortisone <=> Cortisone
Dihydrocucurbitacin <=> Cucurbitacin
Dihydrodihydroxybenzoate <=> Dihydroxybenzoate
Dihydrodihydroxykynurenate <=> Dihydroxykynurenate
Dihydronaphthalenediol <=> Naphthalenediol
Dihydroorotate <=> Orotate
Dihydrophthalic <=> Phthalate
Dihydropyrimidine <=> Pyrimidine
Dihydrotestosterone <=> Testosterone
Dihydroxycholestanone <=> Cholestendiolone
Dihydroxycyclohexadienecarboxylic <=> Catechol
Dihydroxydihydrohydroxymethylnaphthalene <=> Dihydroxyhydroxymethylnaphthalene
Dihydroxydihydromethylnaphthalene <=> Dihydroxymethylnaphthalene
Dihydroxyoxopregnanal <=> Aldosterone
Dihydroxypregnanedione <=> Corticosterone
Dihydroxypregnanetrione <=> Cortisone
Divinyl <=> Divinylprotochlorophyllide
Dodecanoyl-ACP <=> Dodecenoyl-ACP
Fluorocyclohexadienediolcarboxylate <=> Fluorocatechol
Fluoropyrimidine <=> Dihydrofluoropyrimidine
Geissoschizine <=> Dehydrogeissoschizine
Hexadecanal <=> Hexadecenal
Hexadecanoyl-ACP <=> Hexadecenoyl-ACP
Hexanoyl-ACP <=> Hexenoyl-ACP
Hexenoyl-CoA <=> Hexanoyl-CoA
Hydroxycholestanone <=> Hydroxycholestenone
Hydroxyphenylpropanoate <=> Hydroxycinnamate
Lauroyl-CoA <=> Dodecenoyl-CoA
meso-Tartrate <=> Dihydroxyfumarate
Methyloxindole <=> Methyleneoxindole
Nitropropanoate <=> Nitroacrylate
Octanoyl-ACP <=> Octenoyl-ACP
Octanoyl-CoA <=> Octenoyl-CoA
Oxoadipate <=> Maleylacetate
Palmitoyl-CoA <=> Hexadecenoyl-CoA
Phenylpropanoate <=> Cinnamate
Pregnanedione <=> Progesterone
Prephenate <=> Hydroxyphenylpyruvate
Styrene <=> Vinylcatechol
Succinate <=> Fumarate
Tetradecanoyl-ACP <=> Tetradecenoyl-ACP
Tetradecanoyl-CoA <=> Tetradecenoyl-CoA
Tetrahydrodipicolinate <=> Dihydrodipicolinate
Toluenedihydrodiol <=> Dihydroxytoluene
Trichlorodihydroxycyclohexadiene <=> Trichlorocatechol
Trihydroxypregnanedione <=> Cortisol

EC code 2.4.1.-:
Amylose <=> Dextran
Amylose <=> Glucose
Maltose <=> Glucose
Starch <=> Glucose
Sucrose <=> Fructose + Dextran
Sucrose <=> Fructose + Glucose
Sucrose <=> Glucose + Inulin
Sucrose <=> Glucose + Levan
Sucrose <=> Glucose + Starch
Sucrose <=> Fructose + Amylose

EC code 2.7.1.-:
Hexose <=> Hexose-phosphate
Glucose <=> Glucose-phosphate
Fructose <=> Fructose-phosphate
Mannose <=> Mannose-phosphate
Glucosamine <=> Glucosamine-phosphate
Sorbitol <=> Sorbitol-phosphate

EC code 2.7.7.-:
dNTP <=> DNA
FMN <=> FAD
NDP <=> Adenylylsulfate
NDP-glucose + Galactose <=> Glucose + NDP-galactose
Nicotinamide <=> NAD+
Nicotinate <=> Deamino-NAD+
NTP + Aminoethylphosphonate <=> NMP-aminoethylphosphonate
NTP + Choline <=> NDP-choline
NTP + Ethanolamine <=> NDP-ethanolamine
NTP + Galactose <=> NDP-galactose
NTP + Glucose <=> NDP-glucose
NTP + Mannose <=> NDP-mannose
NTP + TrimethylAminoethylphosphonate <=> NMP-trimethylaminoethylphosphonate
NTP + Xylose <=> NDP-xylose
NTP <=> Adenylylselenate
NTP <=> Adenylylsulfate
NTP <=> RNA
RNA + Orthophosphate <=> RNA + NDP
NMP + RNA <=> Diphosphate + RNA
Pantetheine <=> Dephospho-CoA

EC code 3.1.2.-:
Acetoacetyl-CoA <=> Acetoacetate
Acetyl-Citrate-lyase <=>  Thiol-Citrate-lyase
Acetyl-CoA <=> Acetate
Acyl-CoA <=> Carboxylate
Acylglutathione <=> Carboxylate + Glutathione
Arachidonyl-CoA <=> Icosatetraenoate
Dihydroxyacylglutathione <=> Glutathione + Hydroxycarboxylate
Docosahexaenoyl-CoA <=> Docosahexaenoate
Dodecanoyl-ACP <=> Dodecanoic
Eicosanoyl-CoA <=> Icosanoate
Eicosatrienoyl-CoA <=> Icosatrienoate
Formyl-CoA <=> Formate
Formylglutathione <=> Formate + Glutathione
Hexadecanoyl-ACP <=> Hexadecanoate
Hydroxyisobutyryl-CoA <=> Hydroxyisobutyrate
Hydroxymethylglutaryl-CoA <=> Hydroxymethylglutarate
Hydroxymethylpropanoyl-CoA <=> Hydroxymethylpropanoate
Hydroxypropionyl-CoA <=> Hydroxypropanoate
Icosapentaenoyl-CoA <=> Icosapentaenoate
Lactoylglutathione <=> Glutathione + Lactate
Linolenoyl-CoA <=> Octadecatrienoate
Linoleoyl-CoA <=> Linoleate
Methylmalonyl-CoA <=> Methylmalonate
Octadecanoyl-ACP <=> Octadecanoate
Octadecatrienoyl-CoA <=> Octadecatrienoate
Oleoyl-ACP <=> Octadecenoate
Oleoyl-CoA <=> Octadecenoate
Palmitoyl-CoA <=> Hexadecanoate
Stearoyl-CoA <=> Octadecanoate
Succinyl-CoA <=> Succinate
Succinylglutathione <=> Glutathione + Succinate
Tetradecanoyl-ACP <=> Tetradecanoic
Ubiquitin <=> Ubiquitin

EC code 3.1.4.-:
Glycerophosphocholine <=> Choline + Glycerol
Glycerophosphoethanolamine <=> Ethanolamine + Glycerol
Phosphatidylcholine <=> Diacylglycerol + Choline
Phosphatidylglycerol <=> Diacylglycerol + Glycerol
Phosphatidylethanolamine <=> Diacylglycerol + Ethanolamine
Phosphatidylmyoinositol <=> Inositol + Diacylglycerol
Alkenylacylglycerophosphoethanolamine <=> Alkenylacylglycerol + Ethanolamine
Phosphatidylcholine <=> Phosphatidate + Choline
Phosphatidylethanolamine <=> Ethanolamine + Phosphatidate
Alkenylacylglycerophosphoethanolamine <=> Acylalkenylglycerophosphate + Ethanolamine
Phosphatidylmyoinositol <=> Myoinositol + Diacylglycerol
Sphingomyelin <=> Acylsphingosine + Choline
Serinephosphoethanolamine <=> Serine + Ethanolamine
ACP <=> Pantetheine + Apo-ACP
Adenylylglutamate <=> AMP + Glutamate
Cyclic-NMP <=> NMP
Cyclic-NTP <=> NMP

EC code 3.2.1.-:
Starch <=> Dextrin + Starch
Starch <=> Maltodextrin + Maltose
Starch <=> Amylose + Maltose
Starch <=> Glucose + Starch
Dextrin <=> Glucose + Dextrin
Cellulose <=> Cellulose + Cellobiose
Isomaltose <=> Glucose

EC code 3.5.1.-:
Asparagine <=> Aspartate
Amide <=> Carboxylate
Glutamine <=> Glutamate

EC code 4.1.2.-:
Erythrulose <=> Glycerone + Formaldehyde
Deoxyribose <=> Glyceraldehyde + Acetaldehyde
Threonine <=> Glycine + Acetaldehyde
Allothreonine <=> Glycine + Acetaldehyde
Indoleglycerol <=> Indole + Glyceraldehyde
Xylulose <=> Glyceraldehyde
Mandelonitrile <=> Cyanide + Benzaldehyde
Cyanohydrin <=> Cyanide + Aldehyde
Cyanohydrin <=> Ketone + Cyanide
Hydroxymandelonitrile <=> Cyanide + Hydroxybenzaldehyde

EC code 6.3.2.-:
Pantoate + Alanine <=> Pantothenate
Glutamate + Cysteine <=> Peptide
Peptide + Glycine <=> Glutathione
Alanine <=> Peptide
PhosphoPantothenate + Cysteine <=> Phosphopantothenoyl-cysteine
Phosphoribosylaminoimidazolecarboxylate + Aspartate <=> Phosphoribosylaminosuccinocarboxamideimidazole
UDP-acetylmuramoylpeptide + Lysine <=> UDP-acetylmuramoylpeptide
UDP-acetylmuramate + Alanine <=> UDP-acetylmuramoylpeptide
UDP-acetylmuramoylpeptide + Glutamate <=> UDP-acetylmuramoylpeptide
UDP-acetylmuramoylpeptide + Dipeptide <=> UDP-acetylmuramoylpeptide
UDP-acetylmuramoylpeptide-diaminopimelate + Dipeptide <=> UDP-acetylmuramoylpeptide
Histidine + Alanine <=> Carnosine
Lysine + Alanine <=> Dipeptide
Arginine + Alanine <=> Dipeptide
Histidine + Aminobutanoate <=> Homocarnosine
Methylhistidine + Alanine <=> Alanylmethylhistidine
Dihydropteroate + Glutamate <=> Dihydrofolate
Formyltetrahydrofolate + Glutamate <=> Formyltetrahydrofolyl
UDP-acetylmuramoylpeptide + meso-Diaminoheptanedioate <=> UDP-acetylmuramoylpeptide-meso-diaminopimelate
Dihydroxybenzoate + Serine <=> Dihydroxybenzoylserine
Alanine + Alanyl-polyglycerolphosphate <=>  Alanyl-alanyl-polyglycerolphosphate
Tetrahydrofolyl-Glu(n) + L-Glutamate <=> Tetrahydrofolyl-Glu(n)
Tetrahydrofolate + Glutamate <=> Tetrahydrofolyl-Glu(n)
Histidine + Alanine <=> Carnosine



A reconstructed metabolism composed of reactions imparted by metaconsensus enzyme functions.
Nodes represent reactants and products while the edges connecting them represent metaconsensus enzyme functions. The network is composed of 119 nodes and 135 edges. Reactions were assembled from the KEGG reactions database and small molecules and cofactors were removed. Yellow edges represent metaconsensus enzyme functions predicted by the universal sequence, universal structure, and universal reaction datasets. Green edges represent metaconsensus enzyme functions predicted by the universal sequence and universal structure datasets, but not the universal reaction dataset. Subnetworks circled in red roughly reflect subsets of metabolism related to amino acids and peptides, nucleotides and RNA, sugars and starches, and phospholipids. This reconstructed metabolism demonstrates that significant metabolic complexity is possible with only these ten metaconsensus enzyme functions.
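The node-and-edge description above can be made concrete with a short Python sketch that turns reaction lines written in this post's "A <=> B + C" style into a graph. The sample text, the function names, and the choice of drawing one edge per reactant-product pair are assumptions for illustration. The sketch does not attempt to reproduce the published 119-node network, which also had small molecules and cofactors removed.

```python
# A few reaction lines copied from the lists in this post, used as sample input.
SAMPLE = """\
EC code 3.5.1.-:
Asparagine <=> Aspartate
Glutamine <=> Glutamate

EC code 4.1.2.-:
Threonine <=> Glycine + Acetaldehyde
"""


def parse_reactions(text):
    """Return a list of (ec_code, reactants, products) tuples."""
    reactions = []
    ec_code = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("EC code"):
            # Header such as "EC code 3.5.1.-:" sets the enzyme class label.
            ec_code = line[len("EC code"):].strip().rstrip(":")
        elif "<=>" in line:
            left, right = line.split("<=>", 1)
            # Split on " + " (with spaces) so names like "NAD+" stay intact.
            reactants = [name.strip() for name in left.split(" + ")]
            products = [name.strip() for name in right.split(" + ")]
            reactions.append((ec_code, reactants, products))
    return reactions


def build_graph(reactions):
    """Nodes are compounds; each edge links a reactant to a product via an EC code."""
    nodes, edges = set(), set()
    for ec_code, reactants, products in reactions:
        nodes.update(reactants, products)
        for reactant in reactants:
            for product in products:
                edges.add((reactant, product, ec_code))
    return nodes, edges


nodes, edges = build_graph(parse_reactions(SAMPLE))
print(len(nodes), "nodes,", len(edges), "edges")
```

Labelling each edge with its EC code mirrors the idea that the edges stand for enzyme functions, so the same ten codes can be seen linking many different compounds.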


1. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0039912#pone-0039912-t001
2. https://www.sciencedirect.com/science/article/pii/S0923250805002676

http://manet.illinois.edu/pathways.php



Last edited by Admin on Mon Sep 02, 2019 12:40 pm; edited 1 time in total


Otangelo


Admin

1. According to the latest estimation of a minimal protein set for the first living organism, the requirement would be about 560 proteins, the absolute minimum needed to maintain the basic functions of a cell.
2. There is an average of about 400 amino acids per protein.
3. Each of the 400 positions in the amino acid chain could be occupied by any one of the 20 amino acids used in cells, so if we suppose that proteins emerged randomly on prebiotic earth, the total number of possible arrangements, or the odds of getting one that would fold into a functional 3D protein, would be 1 in 20^400, or about 1 in 10^520. A truly enormous, super-astronomical number.
4. Since we need 560 proteins in total to make a first living cell, we would have to repeat the shuffle 560 times to get all the proteins required for life. Because every one of the 560 draws has to succeed, the individual probabilities multiply: (1/10^520)^560 = 1 in 10^291,200. We arrive at a probability far beyond 1 in 10^150,000.
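The arithmetic in the four steps above can be checked with a few lines of Python. This is only a sketch under the assumptions stated in the list (20 amino acids, 400 positions, 560 proteins, each drawn independently with one target sequence per protein); the constant and function names are made up for illustration. The exponent is computed from the unrounded value of 400 x log10(20), so it comes out slightly larger than multiplying the rounded 520 by 560.

```python
import math

AMINO_ACIDS = 20       # amino acids used in cells (from the list above)
PROTEIN_LENGTH = 400   # average positions per protein (from the list above)
PROTEIN_COUNT = 560    # minimal protein set (from the list above)


def log10_sequence_space(alphabet, length):
    """log10 of alphabet**length, the number of possible chains of that length."""
    return length * math.log10(alphabet)


def log10_joint_odds(alphabet, length, count):
    """log10 of the odds against `count` independent draws all hitting their target."""
    # Independent probabilities multiply, so their exponents add.
    return count * log10_sequence_space(alphabet, length)


single = log10_sequence_space(AMINO_ACIDS, PROTEIN_LENGTH)
joint = log10_joint_odds(AMINO_ACIDS, PROTEIN_LENGTH, PROTEIN_COUNT)

print(f"one protein:  1 in 10^{single:.0f}")
print(f"all proteins: 1 in 10^{joint:,.0f}")
```

Working with logarithms avoids handling numbers with hundreds of thousands of digits and makes it plain that the 560 draws multiply the exponent rather than the probability.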

It's obvious that this is such a remote probability, far beyond any reasonable probability. We can conclude with high certainty that design is the best inference to explain the origin of life.


Otangelo


Admin

Abiogenesis: What Might Be a Cell’s minimal requirement of parts ?

https://reasonandscience.catsboard.com/t2110-abiogenesis-what-might-be-a-cells-minimal-requirement-of-parts#6631

Prebiotic Evolution and Astrobiology, J. Tze-Fei Wong, PhD Antonio Lazcano, PhD  2009 Landes Bioscience

Clusters of Orthologous Groups (COG) in the LUCA Genome [48]. COG Nos. that are present in all of the ancient eight are shown in bold font; those present in all of the ancient six but not in all of the ancient eight are shown in nonbold font; those not included in any of the minimal gene sets I-VI are shown in italics; the underlined COGs are common to the methanogenesis pathways of Mka, Mth and Mja.

Translation, ribosomal structure and biogenesis
8 Glutamyl-tRNA synthetase
9 Putative translation factor (SUA5)
12 Predicted GTPase, probable translation factor
13 Alanyl-tRNA synthetase
16 Phenylalanyl-tRNA synthetase alpha subunit
17 Aspartyl-tRNA synthetase
18 Arginyl-tRNA synthetase
23 Translation initiation factor 1 (eIF-1/SUI1) and related proteins
24 Methionine aminopeptidase
30 Dimethyladenosine transferase (rRNA methylation)
48 Ribosomal protein S12
49 Ribosomal protein S7
51 Ribosomal protein S10
52 Ribosomal protein S2
60 Isoleucyl-tRNA synthetase
72 Phenylalanyl-tRNA synthetase beta subunit
80 Ribosomal protein L11
81 Ribosomal protein L1
87 Ribosomal protein L3
88 Ribosomal protein L4
89 Ribosomal protein L23
90 Ribosomal protein L2
91 Ribosomal protein L22
92 Ribosomal protein S3
93 Ribosomal protein L14
94 Ribosomal protein L5
96 Ribosomal protein S8
97 Ribosomal protein L6P/L9E
98 Ribosomal protein S5
99 Ribosomal protein S13
100 Ribosomal protein S11
101 Pseudouridylate synthase
102 Ribosomal protein L13
103 Ribosomal protein S9
124 Histidyl-tRNA synthetase
130 Pseudouridine synthase
143 Methionyl-tRNA synthetase
162 Tyrosyl-tRNA synthetase
172 Seryl-tRNA synthetase
180 Tryptophanyl-tRNA synthetase
182 Predicted translation initiation factor 2B subunit, eIF-2B α/β/δ family
184 Ribosomal protein S15P/S13E
185 Ribosomal protein S19
186 Ribosomal protein S17
197 Ribosomal protein L16/L10E
198 Ribosomal protein L24
199 Ribosomal protein S14
200 Ribosomal protein L15
231 Translation elongation factor P (EF-P)/translation initiation factor 5A
244 Ribosomal protein L10
252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D
255 Ribosomal protein L29
256 Ribosomal protein L18
343 Queuine/archaeosine tRNA-ribosyltransferase
361 Translation initiation factor 1 (IF-1)
423 Glycyl-tRNA synthetase (class II)
441 Threonyl-tRNA synthetase
442 Prolyl-tRNA synthetase
480 Translation elongation factors (GTPases)
495 Leucyl-tRNA synthetase
522 Ribosomal protein S4 and related proteins
525 Valyl-tRNA synthetase
532 Translation initiation factor 2 (IF-2; GTPase)
621 2-Methylthioadenine synthetase
1093 Translation initiation factor 2, alpha subunit (eIF-2alpha)
1258 Predicted pseudouridylate synthase
1325 Predicted exosome subunit
1358 Ribosomal protein HS6-type (S12/L30/L7a)
1369 RNase P/RNase MRP subunit POP5
1383 Ribosomal protein S17E
1384 Lysyl-tRNA synthetase (class I)
1471 Ribosomal protein S4E
1491 Predicted RNA-binding protein
1498 Protein implicated in ribosomal biogenesis Nop56p homolog
1499 NMD protein affecting ribosome stability and mRNA decay
1500 Predicted exosome subunit
1503 Peptide chain release factor 1 (eRF1)
1514 2ʹ-5ʹ RNA ligase
1534 Predicted RNA-binding protein containing KH domain possibly ribosomal protein
1549 Queuine tRNA-ribosyltransferases, contain PUA domain
1552 Ribosomal protein L40E
1588 RNase P/RNase MRP subunit p29
1601 Translation initiation factor 2, beta subunit (eIF-2beta)/eIF-5 N-terminal domain
1603 RNase P/RNase MRP subunit p30
1631 Ribosomal protein L44E
1632 Ribosomal protein L15E
1676 tRNA splicing endonuclease
1717 Ribosomal protein L32E
1727 Ribosomal protein L18E
1736 Diphthamide synthase subunit DPH2
1746 tRNA nucleotidyltransferase (CCA-adding enzyme)
1798 Diphthamide biosynthesis methyltransferase
1841 Ribosomal protein L30/L7E
1867 Dimethylguanosine tRNA methyltransferase
1889 Fibrillarin-like rRNA methylase
1890 Ribosomal protein S3AE
1911 Ribosomal protein L30E
1976 Translation initiation factor 6 (eIF-6)
1997 Ribosomal protein L37AE/L43A
1998 Ribosomal protein S27AE
2004 Ribosomal protein S24E
2007 Ribosomal protein S8E
2016 Predicted RNA-binding protein (contains PUA domain)
2023 RNase P subunit RPR2
2051 Ribosomal protein S27E
2053 Ribosomal protein S28E/S33
2058 Ribosomal protein L12E/L44/L45/RPP1/RPP2
2075 Ribosomal protein L24E
2092 Translation elongation factor EF-1beta
2097 Ribosomal protein L31E
2125 Ribosomal protein S6E (S10)
2126 Ribosomal protein L37E
2139 Ribosomal protein L21E
2147 Ribosomal protein L19E
2157 Ribosomal protein L20A (L18A)
2163 Ribosomal protein L14E/L6E/L27E
2167 Ribosomal protein L39E
2174 Ribosomal protein L34E
2238 Ribosomal protein S19E (S16A)
2260 Predicted Zn-ribbon RNA-binding protein
2263 Predicted RNA methylase
2511 Archaeal Glu-tRNAGln amidotransferase subunit E (contains GAD domain)
2519 tRNA(1-methyladenosine) methyltransferase and related methyltransferases
2888 Predicted Zn-ribbon RNA-binding protein with a function in translation
2890 Methylase of polypeptide chain release factors
3277 RNA-binding protein involved in rRNA processing
5256 Translation elongation factor EF-1alpha (GTPase)
5257 Translation initiation factor 2, gamma subunit (eIF-2gamma; GTPase)

RNA processing and modification
430 RNA 3ʹ-terminal phosphate cyclase
2136 Predicted exosome subunit/U3 small nucleolar ribonucleoprotein (snoRNP) component, contains IMP4 domain

Transcription
85 DNA-directed RNA polymerase, beta subunit/140 kD subunit
86 DNA-directed RNA polymerase, beta’ subunit/160 kD subunit
195 Transcription elongation factor
202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit
250 Transcription antiterminator
640 Predicted transcriptional regulators
864 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain
1095 DNA-directed RNA polymerase, subunit E’
1243 Histone acetyltransferase
1293 Predicted RNA-binding protein homologous to eukaryotic snRNP
1308 Transcription factor homologous to NACalpha-BTF3
1321 Mn-dependent transcriptional regulator
1378 Predicted transcriptional regulators
1395 Predicted transcriptional regulator
1405 Transcription initiation factor TFIIIB, Brf1 subunit
1522 Transcriptional regulators
1581 Archaeal DNA-binding protein
1644 DNA-directed RNA polymerase, subunit N (RpoN/RPB10)
1675 Transcription initiation factor IIE, alpha subunit
1758 DNA-directed RNA polymerase, subunit K/omega
1761 DNA-directed RNA polymerase, subunit L
1813 Predicted transcription factor, homolog of eukaryotic MBF1
1846 Transcriptional regulators
1996 DNA-directed RNA polymerase, subunit RPC10 (contains C4-type Zn-finger)
2012 DNA-directed RNA polymerase, subunit H, RpoH/RPB5
2093 DNA-directed RNA polymerase, subunit E’’
2101 TATA-box binding protein (TBP), component of TFIID and TFIIIB

Replication, recombination and repair
84 Mg-dependent DNase
164 Ribonuclease HII
177 Predicted EndoIII-related endonuclease
258 5ʹ-3ʹ exonuclease (including N-terminal domain of PolI)
270 Site-specific DNA methylase
350 Methylated DNA-protein cysteine methyltransferase
358 DNA primase (bacterial type)
417 DNA polymerase elongation subunit (family B)
419 ATPase involved in DNA repair
420 DNA repair exonuclease
468 RecA/RadA recombinase
470 ATPase involved in DNA replication
550 Topoisomerase IA
592 DNA polymerase sliding clamp subunit (PCNA homolog)
608 Single-stranded DNA-specific exonuclease
1041 Predicted DNA modification methylase
1107 Archaea-specific RecJ-like exonuclease, contains DnaJ-type Zn finger domain
1111 ERCC4-like helicases
1112 Superfamily I DNA and RNA helicases and helicase subunits
1241 Predicted ATPase involved in replication control, Cdc46/Mcm family
1311 Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B
1389 DNA topoisomerase VI, subunit B
1423 ATP-dependent DNA ligase, homolog of eukaryotic ligase III
1467 Eukaryotic-type DNA primase, catalytic (small) subunit
1525 Micrococcal nuclease (thermonuclease) homologs
1591 Holliday junction resolvase—archaeal type
1599 Single-stranded DNA-binding replication protein A (RPA), large (70 kD) subunit and related ssDNA-binding proteins
1637 Predicted nuclease of the RecB family
1697 DNA topoisomerase VI, subunit A
1793 ATP-dependent DNA ligase
1933 Archaeal DNA polymerase II, large subunit
1948 ERCC4-type nuclease
2219 Eukaryotic-type DNA primase, large subunit

Chromatin structure and dynamics
123 Deacetylases, including yeast histone deacetylase and acetoin utilization protein
2036 Histones H3 and H4

Cell cycle control, cell division, chromosome partitioning
37 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control
206 Cell division GTPase
455 ATPases involved in chromosome partitioning
489 ATPases involved in chromosome partitioning
1192 ATPases involved in chromosome partitioning
1718 Serine/threonine protein kinase involved in cell cycle control

Signal transduction mechanisms
467 RecA-superfamily ATPases implicated in signal transduction
589 Universal stress protein UspA and related nucleotide-binding proteins
3642 Mn2+-dependent serine/threonine protein kinase

Cell wall/membrane/envelope biogenesis
438 Glycosyltransferase
449 Glucosamine 6-phosphate synthetase
451 Nucleoside-diphosphate-sugar epimerase
463 Glycosyltransferases involved in cell wall biogenesis
472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase
668 Small-conductance mechanosensitive channel
750 Predicted membrane-associated Zn-dependent proteases 1
794 Predicted sugar phosphate isomerase involved in capsule formation
1208 Nucleoside-diphosphate-sugar pyrophosphorylase

Intracellular trafficking, secretion and vesicular transport
201 Preprotein translocase subunit SecY
341 Preprotein translocase subunit SecF
342 Preprotein translocase subunit SecD
541 Signal recognition particle GTPase
552 Signal recognition particle GTPase
681 Signal peptidase I
1400 Signal recognition particle 19 kDa protein
1989 Type II secretory pathway, prepilin signal peptidase PulO
2064 Flp pilus assembly protein TadC
2443 Preprotein translocase subunit Sss1
4962 Flp pilus assembly protein, ATPase CpaF

Posttranslational modification, protein turnover, chaperones
68 Hydrogenase maturation factor
71 Molecular chaperone (small heat shock protein)
298 Hydrogenase maturation factor
309 Hydrogenase maturation factor
330 Membrane protease subunits, stomatin/prohibitin homologs
396 ABC-type transport system involved in Fe-S cluster assembly, ATPase component
409 Hydrogenase maturation factor
459 Chaperonin GroEL (HSP60 family)
464 ATPases of the AAA+ class
492 Thioredoxin reductase
501 Zn-dependent protease with chaperone function
533 Metal-dependent proteases with possible chaperone activity
555 ABC-type sulfate transport system, permease component
602 Organic radical activating enzymes
638 20S proteasome, alpha and beta subunits
1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2
1067 Predicted ATP-dependent protease
1180 Pyruvate-formate lyase-activating enzyme
1222 ATP-dependent 26S proteasome regulatory subunit
1370 Prefoldin, molecular chaperone implicated in de novo protein folding, alpha subunit
1382 Prefoldin, chaperonin cofactor
1730 Predicted prefoldin, molecular chaperone implicated in de novo protein folding
1899 Deoxyhypusine synthase
2518 Protein-L-isoaspartate carboxylmethyltransferase

Energy production and conversion
39 Malate/lactate dehydrogenases
45 Succinyl-CoA synthetase, beta subunit
74 Succinyl-CoA synthetase, alpha subunit
243 Anaerobic dehydrogenases, typically selenocysteine-containing
247 Fe-S oxidoreductase
371 Glycerol dehydrogenase and related enzymes
473 Isocitrate/isopropylmalate dehydrogenase
479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit
543 2-Polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K
644 Dehydrogenases (flavoproteins)
650 Formate hydrogenlyase subunit 4
674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit
680 Ni,Fe-hydrogenase maturation factor
716 Flavodoxins
731 Fe-S oxidoreductases
778 Nitroreductase
1012 NAD-dependent aldehyde dehydrogenases
1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit
1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit
1029 Formylmethanofuran dehydrogenase subunit B
1032 Fe-S oxidoreductase
1035 Coenzyme F420-reducing hydrogenase, beta subunit
1036 Archaeal flavoproteins
1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
1142 Fe-S-cluster-containing hydrogenase components 2
1144 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, delta subunit
1145 Ferredoxin
1146 Ferredoxin
1148 Heterodisulfide reductase, subunit A and related polyferredoxins
1149 MinD superfamily P-loop ATPase containing an inserted ferredoxin domain
1150 Heterodisulfide reductase, subunit C
1151 6Fe-6S prismane cluster-containing protein
1152 CO dehydrogenase/acetyl-CoA synthase alpha subunit
1153 Formylmethanofuran dehydrogenase subunit D
1155 Archaeal/vacuolar-type H+-ATPase subunit A
1156 Archaeal/vacuolar-type H+-ATPase subunit B
1229 Formylmethanofuran dehydrogenase subunit A
1249 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component and related enzymes
1269 Archaeal/vacuolar-type H+-ATPase subunit I
1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases
1390 Archaeal/vacuolar-type H+-ATPase subunit E
1394 Archaeal/vacuolar-type H+-ATPase subunit D
1436 Archaeal/vacuolar-type H+-ATPase subunit F
1456 CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein)
1527 Archaeal/vacuolar-type H+-ATPase subunit C
1592 Rubrerythrin
1614 CO dehydrogenase/acetyl-CoA synthase beta subunit
1625 Fe-S oxidoreductase, related to NifB/MoaA family
1819 Glycosyl transferases, related to UDP-glucuronosyltransferase
1880 CO dehydrogenase/acetyl-CoA synthase epsilon subunit
1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain
1908 Coenzyme F420-reducing hydrogenase, delta subunit
1927 Coenzyme F420-dependent N(5), N(10)-methenyltetrahydromethanopterin dehydrogenase
1941 Coenzyme F420-reducing hydrogenase, gamma subunit
1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain
2037 Formylmethanofuran:tetrahydromethanopterin formyltransferase
2048 Heterodisulfide reductase, subunit B
2055 Malate/L-lactate dehydrogenases
2069 CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein)
2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases
2218 Formylmethanofuran dehydrogenase subunit C
2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits
2710 Nitrogenase molybdenum-iron protein, alpha and beta chains
3259 Coenzyme F420-reducing hydrogenase, alpha subunit
3260 Ni,Fe-hydrogenase III small subunit
3261 Ni,Fe-hydrogenase III large subunit
4074 H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase

Carbohydrate transport and metabolism
57 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase
61 Predicted sugar kinase
63 H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase
120 Ribose 5-phosphate isomerase
126 3-Phosphoglycerate kinase
148 Enolase
149 Triosephosphate isomerase
235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
269 3-Hexulose-6-phosphate synthase and related proteins
483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family
524 Sugar kinases, ribokinase family
574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase
662 Mannose-6-phosphate isomerase
1082 Sugar phosphate isomerases/epimerases
1109 Phosphomannomutase
1363 Cellulase M and related proteins
1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes
1980 Archaeal fructose 1,6-bisphosphatase
2074 2-phosphoglycerate kinase
3635 Predicted phosphoglycerate mutase, AP superfamily
3839 ABC-type sugar transport systems, ATPase components

Amino acid transport and metabolism
2 Acetylglutamate semialdehyde dehydrogenase
6 Xaa-Pro aminopeptidase
10 Arginase/agmatinase/formiminoglutamate hydrolase, arginase family
65 3-Isopropylmalate dehydratase large subunit
66 3-Isopropylmalate dehydratase small subunit
75 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase
76 Glutamate decarboxylase and related PLP-dependent proteins
78 Ornithine carbamoyltransferase
79 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase
111 Phosphoglycerate dehydrogenase and related dehydrogenases
112 Glycine/serine hydroxymethyltransferase
119 Isopropylmalate/homocitrate/citramalate synthases
136 Aspartate-semialdehyde dehydrogenase
174 Glutamine synthetase
289 Dihydrodipicolinate reductase
329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
367 Asparagine synthase (glutamine-hydrolyzing)
436 Aspartate/tyrosine/aromatic aminotransferase
440 Acetolactate synthase, small (regulatory) subunit
460 Homoserine dehydrogenase
498 Threonine synthase
527 Aspartokinases
548 Acetylglutamate kinase
560 Phosphoserine phosphatase
620 Methionine synthase II (cobalamin-independent)
1387 Histidinol phosphatase and related hydrolases of the PHP family
1812 Archaeal S-adenosylmethionine synthetase
4992 Ornithine/acetylornithine aminotransferase

Nucleotide transport and metabolism
5 Purine nucleoside phosphorylase
15 Adenylosuccinate lyase
34 Glutamine phosphoribosyl-pyrophosphate amidotransferase
41 Phosphoribosyl-carboxy-aminoimidazole mutase
44 Dihydroorotase and related cyclic amidohydrolases
46 Phosphoribosylformyl glycinamidine synthase, synthetase domain
47 Phosphoribosylformylglycinamidine synthase, glutamine amidotransferase domain
104 Adenylosuccinate synthase
105 Nucleoside diphosphate kinase
125 Thymidylate kinase
127 Xanthosine triphosphate pyrophosphatase
150 Phosphoribosylaminoimidazole synthetase
151 Phosphoribosylamine-glycine ligase
152 Phosphoribosyl-aminoimidazole-succinocarboxamide synthase
167 Dihydroorotate dehydrogenase
284 Orotidine-5ʹ-phosphate decarboxylase
402 Cytosine deaminase and related metal-dependent hydrolases
461 Orotate phosphoribosyltransferase
462 Phosphoribosylpyrophosphate synthetase
503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins
504 CTP synthase
516 IMP dehydrogenase/GMP reductase
518 GMP synthase—Glutamine amidotransferase domain
519 GMP synthase, PP-ATPase domain/subunit
528 Uridylate kinase
540 Aspartate carbamoyltransferase, catalytic chain
717 Deoxycytidine deaminase
1051 ADP-ribose pyrophosphatase
1102 Cytidylate kinase
1328 Oxygen-sensitive ribonucleoside-triphosphate reductase
1437 Adenylate cyclase, class 2 (thermophilic)
1618 Predicted nucleotide kinase
1781 Aspartate carbamoyltransferase, regulatory subunit
1828 Phosphoribosyl-formylglycinamidine synthase, PurS component
1936 Predicted nucleotide kinase (related to CMP and AMP kinases)
2019 Archaeal adenylate kinase

Coenzyme transport and metabolism
43 3-Polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases
142 Geranylgeranyl pyrophosphate synthase
157 Nicotinate-nucleotide pyrophosphorylase
163 3-Polyprenyl-4-hydroxybenzoate decarboxylase
171 NAD synthase
214 Pyridoxine biosynthesis enzyme
237 Dephospho-CoA kinase
294 Dihydropteroate synthase and related enzymes
301 Thiamine biosynthesis ATP pyrophosphatase
303 Molybdopterin biosynthesis enzyme
311 Predicted glutamine amidotransferase involved in pyridoxine biosynthesis
315 Molybdenum cofactor biosynthesis enzyme
351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase
368 Cobalamin-5-phosphate synthase
379 Quinolinate synthase
382 4-Hydroxybenzoate polyprenyltransferase and related prenyltransferases
452 Phosphopantothenoylcysteine synthetase/decarboxylase
499 S-adenosylhomocysteine hydrolase
521 Molybdopterin biosynthesis enzymes
611 Thiamine monophosphate kinase
720 6-Pyruvoyl-tetrahydropterin synthase
746 Molybdopterin-guanine dinucleotide biosynthesis protein A
1270 Cobalamin biosynthesis protein CobD/CbiB
1339 Transcriptional regulator of a riboflavin/FAD biosynthetic operon
1635 Flavoprotein involved in thiazole biosynthesis
1763 Molybdopterin-guanine dinucleotide biosynthesis protein
1767 Triphosphoribosyl-dephospho-CoA synthetase
2038 NaMN:DMB phosphoribosyltransferase
2266 GTP:adenosylcobinamide-phosphate guanylyltransferase
2896 Molybdenum cofactor biosynthesis enzyme

Lipid transport and metabolism
20 Undecaprenyl pyrophosphate synthase
170 Dolichol kinase
183 Acetyl-CoA acetyltransferase
575 CDP-diglyceride synthetase
615 Cytidylyltransferase
671 Membrane-associated phospholipid phosphatase
1257 Hydroxymethylglutaryl-CoA reductase
1267 Phosphatidylglycerophosphatase A and related proteins
1577 Mevalonate kinase
3425 3-Hydroxy-3-methylglutaryl CoA synthase

Inorganic ion transport and metabolism
168 Trk-type K+ transport systems, membrane components
306 Phosphate/sulphate permeases
370 Fe2+ transport system protein B
477 Permeases of the major facilitator superfamily
530 Ca2+/Na+ antiporter
569 K+ transport systems, NAD-binding component
619 ABC-type cobalt transport system, permease component CbiQ and related transporters
704 Phosphate uptake regulator
725 ABC-type molybdate transport system, periplasmic component
1122 ABC-type cobalt transport system, ATPase component
1226 3-Hydroxy-3-methylglutaryl CoA synthase
1918 Fe2+ transport system protein A

Secondary metabolite biosynthesis, transport and catabolism
179 2-Keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway)
500 SAM-dependent methyltransferases

General function prediction only
73 EMAP domain
312 Predicted Zn-dependent proteases and their inactivated homologs
375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression)
433 Predicted ATPase
446 Uncharacterized NAD(FAD)-dependent dehydrogenases
456 Acetyltransferases
491 Zn-dependent hydrolases, including glyoxylases
517 FOG: CBS domain
535 Predicted Fe-S oxidoreductases
603 Predicted PP-loop superfamily ATPase
622 Predicted phosphoesterase
663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily
714 MoxR-like ATPases
1011 Predicted hydrolase (HAD superfamily)
1019 Predicted nucleotidyltransferase
1078 HD superfamily phosphohydrolases
1084 Predicted GTPase
1094 Predicted RNA-binding protein (contains KH domains)
1100 GTPase SAR1 and related small G proteins
1163 Predicted GTPase
1201 Lhr-like helicases
1204 Superfamily II helicase
1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster
1234 ATPase components of various ABC-type transport systems, contain duplicated ATPase
1235 Metal-dependent hydrolases of the beta-lactamase superfamily I
1237 Metal-dependent hydrolases of the beta-lactamase superfamily II
1245 Predicted ATPase, RNase L inhibitor (RLI) homolog
1313 Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins
1326 Uncharacterized archaeal Zn-finger protein
1355 Predicted dioxygenase
1365 Predicted ATPase (PP-loop superfamily)
1407 Predicted ICC-like phosphoesterases
1412 Uncharacterized proteins of PilT N-term superfamily
1418 Predicted HD superfamily hydrolase
1439 Predicted nucleic acid-binding protein, consists of a PIN domain and a Zn-ribbon module
1458 Predicted DNA-binding protein containing PIN domain
1537 Predicted RNA-binding proteins
1545 Predicted nucleic-acid-binding protein containing a Zn-ribbon
1571 Predicted DNA-binding protein containing a Zn-ribbon domain
1608 Predicted archaeal kinase
1646 Predicted phosphate-binding enzymes, TIM-barrel fold
1759 ATP-utilizing enzymes of ATP-grasp superfamily (probably carboligases)
1779 C4-type Zn-finger protein
1782 Predicted metal-dependent RNase, consists of a metallo-beta-lactamase domain and an RNA-binding KH domain
1818 Predicted RNA-binding protein, contains THUMP domain
1829 Predicted metal-dependent RNase, consists of a metallo-beta-lactamase domain and an RNA-binding KH domain
1831 Predicted metal-dependent hydrolase (urease superfamily)
1855 ATPase (PilT family)
1907 Predicted archaeal sugar kinases
1938 Archaeal enzymes of ATP-grasp superfamily
1964 Predicted Fe-S oxidoreductases
1988 Predicted membrane-bound metal-dependent hydrolases
2047 Uncharacterized protein (ATP-grasp superfamily)
2102 Predicted ATPases of PP-loop superfamily
2118 DNA-binding protein
2129 Predicted phosphoesterases, related to the Icc protein
2151 Predicted metal-sulfur cluster biosynthetic enzyme
2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold
2244 Membrane protein involved in the export of O-antigen and teichoic acid
2520 Predicted methyltransferase
3269 Predicted RNA-binding protein, contains TRAM domain


Otangelo


Admin

To Model the Simplest Microbe in the World, You Need 128 Computers 1

https://www.theatlantic.com/technology/archive/2012/07/to-model-the-simplest-microbe-in-the-world-you-need-128-computers/260198/?fbclid=IwAR1HmIkPbuzCmxnIZCuP-uyAzMhTYA9XFZGxKZ57LStZRokenijwIv1SutA

Mycoplasma genitalium has one of the smallest genomes of any free-living organism in the world, clocking in at a mere 525 genes. That's a fraction of the size of even another bacterium like E. coli, which has 4,288 genes. M. genitalium's diminutive genome made it the first target for Stanford and J. Craig Venter Institute researchers who wanted to simulate an organism in software.

The bioengineers, led by Stanford's Markus Covert, succeeded in modeling the bacterium, and published their work last week in the journal Cell. What's fascinating is how much horsepower they needed to partially simulate this simple organism. It took a cluster of 128 computers running for 9 to 10 hours to actually generate the data on the 25 categories of molecules that are involved in the cell's lifecycle processes.

This has a direct bearing on one of the big questions about technology over the next 50 years: how successful will biotechnologies be? On the one hand, we've made tremendous strides in describing the molecular processes that power life. I'm not just talking about genomics, but whole sets of other molecules and interactions (see: proteomics, metabolomics, epigenomics, transcriptomics). The new work stands as a testament to how far we've come. We can now simulate most known interactions within the cell: how the code of its DNA becomes proteins, how those proteins interact, and how the cell uses energy.

On the other hand, the depth and breadth of cellular complexity has turned out to be nearly unbelievable, and difficult to manage, even given Moore's Law. The M. genitalium model required 28 subsystems to be individually modeled and integrated, and many critics of the work have been complaining on Twitter that's only a fraction of what will eventually be required to consider the simulation realistic.

"Right now, running a simulation for a single cell to divide only one time takes around 10 hours and generates half a gigabyte of data," lead scientist Covert told the New York Times. "I find this fact completely fascinating, because I don't know that anyone has ever asked how much data a living thing truly holds."

One cell. One division. Half a gig of data. Now figure that millions of bacteria could fit on the head of a pin and that many of them are an order of magnitude more complex than M. genitalium. Or ponder the idea that the human body is made up of 10 trillion (big, complex) human cells, plus about 90 or 100 trillion bacterial cells. That's about 100,000,000,000,000 cells in total. That'd take a lot of computers to model, eh? If it were possible, that is.

It's not that I think this level of biological complexity makes it impervious to human engineering. Clearly, that's not the case. But, it does seem that it is very difficult to manipulate or optimize living systems without causing major, unintended consequences. We can only simulate one of the simplest cells in the world through years of research, but we change trillions of trillions of cells with ease.
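
The scaling argument in the article ("One cell. One division. Half a gig of data.") can be made explicit with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: it uses the figures quoted above (128 computers, about 10 hours, half a gigabyte, about 100 trillion cells) and assumes, purely for illustration, that cost scales linearly with cell count and that every cell is as cheap to simulate as M. genitalium. The function names are hypothetical and do not come from the Cell paper.

```python
# Figures quoted in the article; everything else is an illustrative assumption.
COMPUTERS = 128                 # size of the cluster
HOURS_PER_DIVISION = 10         # "around 10 hours" per simulated division
GB_PER_DIVISION = 0.5           # "half a gigabyte of data"
CELLS_IN_BODY = 100 * 10**12    # "about 100,000,000,000,000 cells in total"


def machine_hours(cells: int) -> int:
    """Machine-hours needed to simulate one division of each cell.

    Assumes naive linear scaling: no interaction between cells and no
    increase in per-cell complexity.
    """
    return cells * COMPUTERS * HOURS_PER_DIVISION


def data_gb(cells: int) -> float:
    """Gigabytes of output for one division of each cell (same assumption)."""
    return cells * GB_PER_DIVISION


if __name__ == "__main__":
    print(f"one cell:    {machine_hours(1):,} machine-hours, {data_gb(1)} GB")
    print(f"whole body:  {machine_hours(CELLS_IN_BODY):.2e} machine-hours, "
          f"{data_gb(CELLS_IN_BODY):.1e} GB")
```

Since the article notes that many cells are an order of magnitude more complex than M. genitalium, these numbers are better read as a floor than as an estimate.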

1. The peer-reviewed paper on which this article is based:
A Whole-Cell Computational Model Predicts Phenotype from Genotype
https://www.cell.com/fulltext/S0092-8674%2812%2900776-3?fbclid=IwAR343XnwaYqJTPmHUiNHWqB7-q3KaFLZKlo-yAenoWhFUGyKc9vK4LquZvc


Otangelo


Admin

Requirements for life:

https://www.geol.umd.edu/~tholtz/G331/lectures/331arche.html

Energy source: For most that's the Sun, but some deep-sea ecosystems depend on chemical energy from hydrothermal vents.
Proteins: Polymers of amino acids. Structural elements and catalysts.
Nucleic acids: Regulate synthesis of proteins in proper cells, acting both as information templates and enzymes.
Semipermeable membranes: In which to package and isolate the components of life.

Proteins: These end up being surprisingly easy to form under natural circumstances.

Miller and Urey, 1959, in a classic experiment, showed that amino acids are readily synthesized from presumed primordial components of Earth's atmosphere. Indeed, the Murchison meteorite (a CM carbonaceous chondrite) has been shown to contain numerous amino acids (some of types employed in protein synthesis and others not) (Matson, 2010).
Fox et al., 1959 saw that concentrated solutions of amino acids form proteinoids (short polymers of 18 common amino acids) if heated to 140 deg. C. When cooled, proteinoids form suspiciously cell-like spheres. Fox ultimately found "wild" proteinoids in pools associated with Hawaiian volcanoes. It's not a huge stretch to speculate on a similar origin of proper proteins.

Nucleic acids: Here we have a more complex problem, as the nucleotides of which they are composed are more complex than amino acids, and coaxing them to polymerize is more difficult. Research has focused on the identification of non-living substrates that could serve as a polymerization template:

A. G. Cairns-Smith observed that RNA nucleotides can bind to the edges of clay minerals like smectite to form RNA-like polymers. (For an overview, see Genetic Takeover: And the Mineral Origins of Life).
RNA World? - Altman et al., 1986 demonstrated that RNA is capable of acting not only as a template for protein synthesis, but, in limited ways, as a biochemical catalyst (particularly interacting with phospholipids like those occurring in cell membranes). Sidney Altman went on to propose an early stage in the origin of life, called RNA World, in which simple "biochemical" processes were carried out entirely by RNA. In this scenario, double-strand DNA is derived from RNA at a later time. Recently, however, Gavette et al., 2016 observed that intermediate forms of nucleic acid that would have figured in this transition are unstable, making the RNA -> DNA transition problematic.
Only later did nucleic acids and proteins join forces.
PNAs: Because ribose, the sugar component of the RNA polymer, is difficult to synthesize from a Miller and Urey-style primordial soup, but amino acids are easy, some researchers have proposed that the first gene-bearing molecules were "peptide-nucleic acids" that used amino acids instead of ribose as the polymerizing "backbone" of nucleic acids. Not crazy: PNAs have been synthesized, and their backbone molecule, N-(2-aminoethyl)glycine, was recently identified in cyanobacteria by Banack et al., 2012. And yet, the substitution to RNA seems to have happened very early.

Membranes:

In living cells, these are made of highly impermeable phospholipid bilayers. (Indeed, protein channels regulate transport across the membrane.) The lab of Jack Szostak of Harvard has shown that fatty acids that would have been common in the Archean oceans can form vesicles that are permeable to nucleotide monomers and amino acids, but not to polymers of these. (See Szostak, 2012.)
Osmotic pressure from nucleic acid polymers causes larger vesicles to "steal" fatty acids from smaller ones that they encounter. Mechanical forces cause larger vesicles to fission.
The result: the beginning of natural selection, in which fatty acid vesicles that grow faster dominate.

Energy source: The crucial fact in the foregoing is that both cell membranes and proteins seem to have originated in environments that are at least intermittently hot.

Synergies: The foregoing research suggests that the major components of life were able to self-assemble independently of one another in the primordial soup. Don't assume from this that the components of life necessarily evolved independently. Black et al., 2013 show that the simple fatty acid decanoic acid binds preferentially to the four nucleobases found in RNA (adenine, guanine, cytosine, and uracil). Moreover, in their bound state, the bases buffer decanoic acid against the disruptive effects of salt water. The result is a natural affinity between fatty acids and the building blocks of RNA.

The Ultimate result: Cells in which information encoded in DNA is the template for the synthesis of proteins. The structure of DNA was discovered in 1953, and its role as the physical repository of genes illuminated in the following decades. A brief review of how information encoded as nucleic acid is expressed as proteins goes like this:
DNA at rest in the nucleus is a double helix (spiral staircase) whose corresponding base pairs are connected through weak bonds.
RNA polymerase separates the two DNA strands and uses one of them as a template for assembling a single-strand messenger RNA molecule (mRNA).
mRNA passes from the nucleus into the cytoplasm, where it encounters the two components of the ribosome.
The ribosome grabs passing transfer RNA (tRNA) molecules. Each tRNA has a nucleotide triplet (anticodon) exposed at one end. Each anticodon is associated with a specific amino acid, which is bound to the opposite end of the tRNA.
The ribosome matches nucleotide triplets (codons) on the mRNA to the anticodons of corresponding tRNAs, moving down the mRNA strand. Amino acids connected to adjacent tRNAs bind to form a protein, an amino acid polymer, that forms part of the cell's structure or does work in it.
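
The steps above can be sketched in a few lines of code. This is a deliberately simplified illustration, not a model of the real machinery: the template strand is assumed to be written 3'->5' so that complementing it base by base yields the mRNA 5'->3', the codon table is only a small excerpt of the standard genetic code (enough for the demo sequence), and the function names and the demo sequence are made up for the example.

```python
# Base pairing used when RNA polymerase copies the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between an mRNA codon and a tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Partial excerpt of the standard genetic code (assumption: enough for the demo).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def anticodon(codon: str) -> str:
    """Anticodon written 3'->5', so it lines up base by base with the codon."""
    return "".join(RNA_PAIR[base] for base in codon)


def translate(mrna: str) -> list[str]:
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in excerpt
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    mrna = transcribe("TACAAACCGTTTATT")   # hypothetical demo template
    print(mrna)                             # AUGUUUGGCAAAUAA
    print(translate(mrna))                  # ['Met', 'Phe', 'Gly', 'Lys']
```

The sketch leaves out everything that makes the real process demanding (promoters, ribosome assembly, tRNA charging by aminoacyl-tRNA synthetases, proofreading); it only shows the information flow from DNA triplets to amino acids.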


Otangelo


Admin

Synthesizing from a Minimal Gene Set
A minimal gene set contains the smallest number of genes required to sustain a functional organism, one that can self-replicate in the presence of all necessary nutrients and in the absence of any environmental stress. Two popular methods have been used to estimate the minimal gene set required for a self-replicating cell. One involves identifying a core set of genes shared by different existing organisms, those fundamental for a broad range of life. The other systematically inactivates individual genes in an organism to determine which genes are essential for survival and/or reproduction. The first method aims at identifying homologous genes, that is, genes that have similar sequences but are found in different organisms. Homologous genes are generally thought to have evolved from a common ancestral gene. Therefore, evolution would predict that all organisms share the remnants of a core set of essential genes. After complete sequencing of the first two bacterial genomes, a comparison of the 1,727 protein-coding genes of Haemophilus influenzae and the 468 Mycoplasma genitalium genes identified 240 homologous genes between the two. This suggested that those 240 genes might constitute a minimal genome for life. However, when the number of included prokaryotic genomes increased to one hundred, the number of homologous genes decreased to sixty-three. Finally, with the inclusion of one thousand prokaryotic genomes, the number of homologous genes became zero: not a single protein-coding gene was conserved across the thousand prokaryotes that were compared. Therefore, searching for homologous genes failed to determine the minimal gene set required to make a self-replicating cell. This finding also casts doubt on the belief that a common ancestor underlies all prokaryotes because they do not all share a fundamental set of essential genes.
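
The first method amounts to taking the intersection of the gene sets of all compared genomes. A minimal sketch of that idea follows. The genomes and gene-family labels are invented toy data, not the 240/63/0 results described above, and real comparisons depend on how homology is detected (sequence-similarity thresholds and the like), which this sketch skips entirely.

```python
def core_gene_set(genomes: list[set[str]]) -> set[str]:
    """Gene families present in every genome (empty if no genomes are given)."""
    if not genomes:
        return set()
    return set.intersection(*genomes)


def core_size_curve(genomes: list[set[str]]) -> list[int]:
    """Size of the shared core after adding each genome in turn."""
    sizes = []
    core = None
    for genome in genomes:
        core = set(genome) if core is None else core & genome
        sizes.append(len(core))
    return sizes


if __name__ == "__main__":
    # Hypothetical gene-family labels; not taken from any real organism.
    toy_genomes = [
        {"fam1", "fam2", "fam3", "fam4", "fam5"},
        {"fam1", "fam2", "fam3", "fam5"},
        {"fam1", "fam3", "fam5", "fam6"},
        {"fam1", "fam5"},
        {"fam2", "fam6"},
    ]
    print(core_size_curve(toy_genomes))   # [5, 4, 3, 2, 0]
    print(core_gene_set(toy_genomes))     # set()
```

Because an intersection can only shrink or stay the same as sets are added, a single genome that lacks a recognizable homolog of a gene is enough to remove that gene from the core.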

The second method to arrive at a minimal genome strives to identify genes that are essential by assessing survivability of an organism after inactivating or deleting individual genes [35–44]. One such study suggested that Mycoplasma genitalium, the free-living organism that has the smallest genome, contains 425 essential genes (382 protein-coding genes and 43 RNA-coding genes). However, this is an inappropriately low estimate because the work did not account for synthetic lethality. The finding that at least four hundred genes are required for the survival and propagation of even the simplest cells came as somewhat of a surprise to those who imagine a simple start to life. This also casts substantial doubt on the possibility of creating a self-replicating cell by random chance. A minimal set of essential genes for life, collected from a variety of organisms, has not been identified.
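
The point about synthetic lethality can be illustrated with a toy model. In the sketch below a cell is viable as long as every required function is still covered by at least one remaining gene; two genes that back each other up both look dispensable in single-gene knockouts, yet deleting both is lethal. The gene names, functions, and viability rule are hypothetical and chosen only to show why single knockouts undercount the genes a minimal cell needs.

```python
from itertools import combinations

# Hypothetical toy cell: each required function is provided by one or more genes.
FUNCTIONS = {
    "replication": {"g1"},
    "translation": {"g2"},
    "transport": {"g3", "g4"},   # g3 and g4 are redundant backups
}
# g5 provides no required function in this toy model.
ALL_GENES = set().union(*FUNCTIONS.values()) | {"g5"}


def viable(deleted) -> bool:
    """True if every required function still has at least one provider."""
    remaining = ALL_GENES - set(deleted)
    return all(providers & remaining for providers in FUNCTIONS.values())


def single_knockout_essentials() -> set[str]:
    """Genes whose individual deletion kills the toy cell."""
    return {gene for gene in ALL_GENES if not viable({gene})}


def synthetic_lethal_pairs() -> set[frozenset]:
    """Pairs that are individually dispensable but lethal when deleted together."""
    return {
        frozenset(pair)
        for pair in combinations(sorted(ALL_GENES), 2)
        if viable({pair[0]}) and viable({pair[1]}) and not viable(pair)
    }


if __name__ == "__main__":
    print(sorted(single_knockout_essentials()))             # ['g1', 'g2']
    print([sorted(p) for p in synthetic_lethal_pairs()])    # [['g3', 'g4']]
```

In this toy cell the single-knockout screen reports two essential genes, but no viable cell has fewer than three, since one of g3 or g4 must be kept.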





