Intelligent Design, the best explanation of Origins

This is my personal virtual library, where I collect information that, in my view, leads to Intelligent Design as the best explanation of the origin of the physical Universe, life, and biodiversity

Origins - what cause explains best our existence, and why?

26 What Is the Metabolic Fate of Ammonium? on Thu Mar 29, 2018 9:04 am


What Is the Metabolic Fate of Ammonium?

Plants and microorganisms that use nitrate as a nitrogen source must reduce it to the level of ammonia before incorporating it into the various organic compounds of their cells, because nitrogen is present in cellular constituents in a reduced state. 7 Reduction of nitrate to ammonia converts nitrogen from its most oxidized state (+5) to its most reduced state (-3), which requires the transfer of 8 electrons. This conversion is thought to take place in 4 steps, each consisting of a two-electron transfer reaction.
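The electron bookkeeping in the paragraph above can be checked with a few lines of Python. This is only a minimal sketch: the endpoint oxidation states (+5 and -3) and the two-electron step size come from the text, while the intermediate states are simply what four equal steps imply, and the function name is hypothetical.

```python
def reduction_steps(start_state, end_state, electrons_per_step):
    """Return the oxidation state of nitrogen before and after each reduction step."""
    # Each electron gained lowers the oxidation state by one.
    total_electrons = start_state - end_state
    if total_electrons % electrons_per_step != 0:
        raise ValueError("total electrons is not a whole number of steps")
    n_steps = total_electrons // electrons_per_step
    return [start_state - electrons_per_step * i for i in range(n_steps + 1)]


states = reduction_steps(5, -3, 2)
n_steps = len(states) - 1
total_electrons = states[0] - states[-1]
print(states, n_steps, total_electrons)
```

Going from +5 to -3 in steps of two electrons gives four steps and eight electrons in total, matching the numbers quoted above.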

The above sequence is known as the assimilatory nitrate reduction pathway, which is distinguished from the dissimilatory pathway operating in nitrate respiration (also known as denitrification). The first step in the assimilatory pathway is the reduction of nitrate to nitrite.

The reaction is catalysed by nitrate reductase. In microorganisms, plants and fungi, nitrate reductase is a soluble cytoplasmic enzyme. In Neurospora, the enzyme is a molybdenum-containing flavoprotein. The probable pathway of electron flow to nitrate is NADPH → FAD → Mo → NO3–.

Molybdenum undergoes a valency change from Mo5+ to Mo6+ during electron transfer for the reduction of nitrate to nitrite. Reduction of nitrite is catalysed by nitrite reductase. Further electron transport for the production of NH3 requires a highly electronegative reductant. In green plants, reduced ferredoxin produced by the light reaction of photosynthesis probably acts as the terminal reductant of nitrite to ammonia. The probable path for electron flow to nitrite is ferredoxin (reduced) → NADP → FAD → NO2–.

The final product of nitrate reduction, ammonia, is then incorporated into organic compounds by the several alternative routes described below:

Incorporation of Ammonia into Organic Compounds
The key entry point is the amino acid glutamate. Glutamate and glutamine are the nitrogen donors in a wide range of biosynthetic reactions. Glutamine synthetase, which catalyzes the formation of glutamine from glutamate, is a main regulatory enzyme of nitrogen metabolism. 17

Glutamate dehydrogenase, Glutamine Synthetase & Aminotransferases play central roles in amino acid biosynthesis. The combined action of the enzymes glutamate dehydrogenase, glutamine synthetase, and the aminotransferases (Figure below) converts inorganic ammonium ion into the α-amino nitrogen of amino acids.

Glutamate, the precursor of the so-called “glutamate family” of amino acids, is formed by the reductive amination of the citric acid cycle intermediate α-ketoglutarate, a reaction catalyzed by mitochondrial glutamate dehydrogenase (first reaction, picture above). The reaction strongly favors glutamate synthesis, which lowers the concentration of cytotoxic ammonium ion. The amidation of glutamate to glutamine is catalyzed by glutamine synthetase (second reaction, figure above) and involves the intermediate formation of γ-glutamyl phosphate (third reaction, figure above). Following the ordered binding of glutamate and ATP, glutamate attacks the γ-phosphorus of ATP, forming γ-glutamyl phosphate and ADP. NH4+ then binds, and uncharged NH3 attacks γ-glutamyl phosphate. Release of Pi and of a proton from the γ-amino group of the tetrahedral intermediate then allows release of the product, glutamine.

Ammonia is incorporated into biomolecules through Glutamate and Glutamine. Reduced nitrogen in the form of NH4+ is assimilated into amino acids and then into other nitrogen-containing biomolecules. Two amino acids, glutamate and glutamine, provide the critical entry point. These same two amino acids play central roles in the catabolism of ammonia and amino groups in amino acid oxidation. Glutamate is the source of amino groups for most other amino acids, through transamination reactions. The amide nitrogen of glutamine is a source of amino groups in a wide range of biosynthetic processes.  An Escherichia coli cell requires so much glutamate that this amino acid is one of the primary solutes in the cytosol. Its concentration is regulated not only in response to the cell’s nitrogen requirements but also to maintain an osmotic balance between the cytosol and the external medium. The biosynthetic pathways to glutamate and glutamine are simple, and all or some of the steps occur in most organisms. The most important pathway for the assimilation of NH4+ into glutamate requires two reactions. First, glutamine synthetase catalyzes the reaction of glutamate and NH4+ to yield glutamine. This reaction takes place in two steps. 
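The two steps of the glutamine synthetase reaction are not written out above. Following the mechanism described earlier in this post (formation of γ-glutamyl phosphate from glutamate and ATP, then attack by ammonia with release of Pi and a proton), they can be sketched as follows. This is a reconstruction from that description, not a quotation from the cited textbook.

```latex
\begin{align*}
\text{(1)}\quad & \text{glutamate} + \text{ATP} \longrightarrow \gamma\text{-glutamyl phosphate} + \text{ADP} \\
\text{(2)}\quad & \gamma\text{-glutamyl phosphate} + \mathrm{NH_4^+} \longrightarrow \text{glutamine} + \mathrm{P_i} + \mathrm{H^+} \\
\text{Sum:}\quad & \text{glutamate} + \mathrm{NH_4^+} + \text{ATP} \longrightarrow \text{glutamine} + \text{ADP} + \mathrm{P_i} + \mathrm{H^+}
\end{align*}
```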

Given the prevalence of Nitrogen atoms in cellular components, it is surprising that only three enzymatic reactions introduce ammonium into organic molecules. Of these three, glutamate dehydrogenase and glutamine synthetase are responsible for most of the ammonium assimilated into carbon compounds. 

Glutamate dehydrogenase (GDH)
Glutamate dehydrogenase (GDH) catalyzes the reductive amination of α-ketoglutarate to yield glutamate. Reduced pyridine nucleotides (NADH or NADPH) provide the reducing power.
Alpha-ketoglutarate (AKG) is a key molecule in the Krebs cycle determining the overall rate of the citric acid cycle of the organism. It is a nitrogen scavenger and a source of glutamate and glutamine that stimulates protein synthesis and inhibits protein degradation in muscles. 5
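The GDH reaction described above can be written as a single balanced equation. This is a sketch of the overall stoichiometry only; NAD(P)H stands for either NADH or NADPH, as the text allows both.

```latex
\[
\alpha\text{-ketoglutarate} + \mathrm{NH_4^+} + \mathrm{NAD(P)H} + \mathrm{H^+}
\;\rightleftharpoons\;
\text{glutamate} + \mathrm{NAD(P)^+} + \mathrm{H_2O}
\]
```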

Untangling the glutamate dehydrogenase allosteric nightmare
Glutamate dehydrogenase (GDH) is found in all living organisms, but only animal GDH is regulated by a large repertoire of metabolites. More than 50 years of research to better understand the mechanism and role of this allosteric network has been frustrated by its sheer complexity. However, recent studies have begun to tease out how and why this complex behavior evolved. Much of GDH regulation probably occurs by controlling a complex ballet of motion necessary for catalytic turnover and has evolved concomitantly with a long antenna-like feature of the structure of the enzyme. Ciliates, the ‘missing link’ in GDH evolution, might have created the antenna to accommodate changing organelle functions and was refined in humans to, at least in part, link amino acid catabolism with insulin secretion. 20

A model for GDH allostery. 
This figure shows that GDH allostery might be a form of exaptation. Here, the antenna and some regulation could have been created in Ciliates in response to the requirement of organelle function. These features might have then been further adapted for a different function, insulin homeostasis, in animals.

Nitrogen transporter
Another function is to combine with nitrogen released in the cell, therefore preventing nitrogen overload. α-Ketoglutarate is one of the most important nitrogen transporters in metabolic pathways. The amino groups of amino acids are attached to it (by transamination) and carried to the liver where the urea cycle takes place. α-Ketoglutarate is transaminated, along with glutamine, to form the excitatory neurotransmitter glutamate. Glutamate can then be decarboxylated (requiring vitamin B6) into the inhibitory neurotransmitter GABA. 6

To a first approximation, two regulatory controls are paramount (Figure below):

(1) ADP inhibits the activity of nitrogenase; thus, as the ATP/ADP ratio drops, nitrogen fixation is blocked. 
(2) NH4+ represses the expression of the nif genes, the genes that encode the proteins of the nitrogen-fixing system. To date, some 20 nif genes have been identified with the nitrogen fixation process. Repression of nif gene expression by ammonium, the primary product of nitrogen fixation, is an efficient and effective way of shutting down N2 fixation when its end product is not needed. In addition, in some systems, covalent modification of nitrogenase reductase leads to its inactivation. Inactivation occurs when Arg101 of nitrogenase reductase receives an ADP-ribosyl group donated by NAD+.

This reaction provides an important interface between nitrogen metabolism and cellular pathways of carbon and energy metabolism because α-ketoglutarate is a citric acid cycle intermediate.

Glutamine synthetase (GS)
Glutamine synthetase (GS) is an enzyme that plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine. 19

Glutamine Synthetase Is a Central Control Point in Nitrogen Metabolism
Glutamine is the amino group donor in the formation of many biosynthetic products as well as being a storage form of ammonia. The control of glutamine synthetase is therefore vital for regulating nitrogen metabolism. Mammalian glutamine synthetases are activated by α-ketoglutarate, the product of glutamate’s oxidative deamination. This control presumably helps prevent the accumulation of the ammonia produced by that reaction. Bacterial glutamine synthetase has a much more elaborate control system. The enzyme, which consists of 12 identical 468-residue subunits arranged at the corners of a hexagonal prism, is regulated by several allosteric effectors as well as by covalent modification. Several aspects of its control system bear note. Nine allosteric feedback inhibitors, each with its own binding site, control the activity of bacterial glutamine synthetase in a cumulative manner. Six of these effectors, namely histidine, tryptophan, carbamoyl phosphate (as synthesized by carbamoyl phosphate synthetase), glucosamine-6-phosphate, AMP, and CTP, are all end products of pathways leading from glutamine. The other three, alanine, serine, and glycine, reflect the cell’s nitrogen level. 18

X-Ray structure of glutamine synthetase from the bacterium Salmonella typhimurium. 
The enzyme consists of 12 identical subunits, here drawn in ribbon form, arranged with D6 symmetry (the symmetry of a hexagonal prism). 
(a) View along the sixfold axis of symmetry showing only the six subunits of the upper ring in different colors, with the lower right subunit colored in rainbow order from its N-terminus (blue) to its C-terminus (red). The subunits of the lower ring are roughly directly below those of the upper ring. A pair of Mn2+ ions (purple spheres) that occupy the positions of the Mg2+ ions required for enzymatic activity are bound in each active site. The ADP bound to each active site is drawn in stick form with C green, N blue, O red, and P orange. 
(b) View along one of the protein’s twofold axes (rotated 90° about the horizontal axis with respect to Part a) showing only the eight subunits nearest the viewer. The sixfold axis is vertical in this view.

E. coli glutamine synthetase is covalently modified by adenylylation (addition of an AMP group) of a specific Tyr residue. The enzyme’s susceptibility to cumulative feedback inhibition increases, and its activity therefore decreases, with its degree of adenylylation. The level of adenylylation is controlled by a complex metabolic cascade that is conceptually similar to that controlling glycogen phosphorylase. Both adenylylation and deadenylylation of glutamine synthetase are catalyzed by adenylyltransferase in complex with a tetrameric regulatory protein, PII. This complex deadenylylates glutamine synthetase when PII is uridylylated (also at a Tyr residue) and adenylylates glutamine synthetase when PII lacks UMP residues. The level of PII uridylylation, in turn, depends on the relative levels of two enzymatic activities located on the same protein: a uridylyltransferase that uridylylates PII and a uridylyl-removing enzyme that hydrolytically excises the attached UMP groups of PII. The uridylyltransferase is activated by α-ketoglutarate and ATP and inhibited by glutamine and Pi, whereas the uridylyl-removing enzyme is insensitive to those metabolites. This intricate metabolic cascade therefore renders the activity of E. coli glutamine synthetase extremely responsive to the cell’s nitrogen requirements.
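The direction of the cascade described above is easy to lose track of, so here is a small Python sketch of its logic. It is a deliberately crude illustration: metabolite levels are reduced to high/low booleans, the uridylyltransferase is treated as a strict on/off switch, and all class and function names are hypothetical. The real system responds gradually, not in two states.

```python
from dataclasses import dataclass


@dataclass
class CellStatus:
    alpha_ketoglutarate_high: bool  # activator of the uridylyltransferase
    atp_high: bool                  # activator of the uridylyltransferase
    glutamine_high: bool            # inhibitor of the uridylyltransferase
    pi_high: bool                   # inhibitor of the uridylyltransferase


def pii_is_uridylylated(cell):
    # Assumption: the transferase wins only when both activators are high and
    # no inhibitor is high; otherwise the metabolite-insensitive
    # uridylyl-removing enzyme strips the UMP groups from PII.
    activated = cell.alpha_ketoglutarate_high and cell.atp_high
    inhibited = cell.glutamine_high or cell.pi_high
    return activated and not inhibited


def gs_state(cell):
    # Uridylylated PII: the adenylyltransferase complex deadenylylates GS.
    # PII without UMP: the same complex adenylylates GS, lowering its activity.
    if pii_is_uridylylated(cell):
        return "deadenylylated (more active)"
    return "adenylylated (less active)"


nitrogen_poor = CellStatus(True, True, False, False)
nitrogen_rich = CellStatus(False, True, True, False)
print(gs_state(nitrogen_poor))
print(gs_state(nitrogen_rich))
```

In this toy model a cell with plenty of α-ketoglutarate and little glutamine keeps glutamine synthetase in its more active form, while a glutamine-rich cell switches it toward the adenylylated, less active form.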

Glutamine synthetase (GS) is essential for ammonium assimilation and the biosynthesis of glutamine. 14 All organisms contain the enzymes glutamate dehydrogenase and glutamine synthetase, which convert ammonia to glutamate and glutamine, respectively. 15 The Last Universal Common Ancestor (LUCA) accessed nitrogen via nitrogenase and via glutamine synthetase. 16

What a monumental admission. Glutamine synthetase (GS) had to emerge and be fully operational BEFORE life and self-replication began. We will see shortly what that means.

Amino and amide groups from these two compounds can then be transferred to other carbon backbones by transamination and transamidation reactions to make amino acids. Interestingly, glutamine is the universal donor of amine groups for the formation of many other amino acids as well as many biosynthetic products. Glutamine is also a key metabolite for ammonia storage. All amino acids, with the exception of proline, have a primary amino group (NH2) and a carboxylic acid (COOH) group. They are distinguished from one another primarily by their side chains, appendages to the central carbon atom.

Glutamine is a major nitrogen donor in the biosynthesis of many organic nitrogen compounds such as purines, pyrimidines, and other amino acids, and GS activity is tightly regulated. We require a constant supply of nitrogen to build the bases in nucleic acids and the amino acids in proteins. The amide-N of glutamine provides the nitrogen atom in these biosyntheses. Glutamine is the most abundant amino acid in humans. The nitrogen gas in the air and the nitrogen in nitrates and nitrites, although abundant, are not reactive enough for this use. Ammonia is the preferred source of nitrogen for these reactions. Unfortunately, ammonia is very toxic and cannot be stored or transported safely. Instead, ammonia is attached to the amino acid glutamate, forming glutamine. Because it is a natural amino acid, normally used to build proteins, glutamine is easily transported throughout the body in large amounts. Ammonia may then be liberated only when needed. Glutamine synthetase connects a molecule of ammonia to the amino acid glutamate. A molecule of ATP (adenosine triphosphate) is used to power the process, to ensure that the reaction is performed only in the proper direction and not in reverse, carelessly liberating poisonous ammonia. The bacterial enzyme is a highly regulated allosteric enzyme (allosteric regulation is the regulation of an enzyme by the binding of an effector molecule at a site other than the enzyme's active site).

Covalent Regulation of Glutamine Synthetase
" Glutamine synthetase is one of the most heavily regulated enzymes because it reacts with ammonia, and ammonia is toxic, so we need to regulate ammonia levels very very tightly and there's a lot of ways we do that "

Glutamate, Glutamine Biosynthesis
Glutamate and glutamine are both made from the TCA cycle

So in order to have the substrate which Glutamine synthetase (GS) processes, we need the product of the TCA cycle.

The Citric acid cycle, or Krebs (TCA) cycle

A molecular Computer
Glutamine synthetase has been likened to a molecular computer. With its 12 interacting subunits, arranged in two rings of six, it senses the amounts of the amino acids and nucleotides ultimately constructed from the ammonia in glutamine. Glutamine synthetase weighs the concentrations of each, computes whether there is an overall deficit or excess, and turns on or off based on the result. 12

Our cells are continually faced with a changing environment. 13 Think about what you eat. Some days you might eat a lot of protein, other days you might eat a lot of carbohydrate. Sometimes you may eat nothing but chocolate. Your body must be able to respond to these different foods, producing the proper enzymes for capturing the nutrients in each. The same is doubly true for small organisms like bacteria, which do not have as many options in choosing their diet. They must eat whatever food happens to be close by, and then mobilize the enzymes needed to use it.

The enzyme glutamine synthetase is a key enzyme controlling the use of nitrogen inside cells. Glutamine, as well as being used to build proteins, delivers nitrogen atoms to enzymes that build nitrogen-rich molecules, such as DNA bases and amino acids. So, glutamine synthetase, the enzyme that builds glutamine, must be carefully controlled. When nitrogen is needed, it must be turned on so that the cell does not starve. But when the cell has enough nitrogen, it needs to be turned off to avoid a glut.

Glutamine synthetase acts like a tiny molecular computer, monitoring the amounts of nitrogen-rich molecules. It watches levels of amino acids like glycine, alanine, histidine and tryptophan, and levels of nucleotides like AMP and CTP. If too much of one of these molecules is made, glutamine synthetase senses this and slows production slightly. But as levels of all of these nucleotides and amino acids rise, together they slow glutamine synthetase more and more. Eventually, the enzyme grinds to a halt when the supply meets the demand.
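The behaviour described above, where each end product slows the enzyme a little and all of them together slow it a lot, can be illustrated with a toy calculation. The sketch below assumes one common way of modelling cumulative feedback inhibition: every inhibitor independently removes a fraction of whatever activity is left, so the residual fractions multiply. The 30% maximal inhibition per inhibitor, the concentrations, the half-saturation constant and the function name are all invented for illustration and are not measured values.

```python
def residual_activity(levels, max_inhibition, half_saturation=1.0):
    """Fraction of full enzyme activity left under cumulative feedback inhibition."""
    activity = 1.0
    for name, concentration in levels.items():
        # Simple hyperbolic binding: occupancy rises from 0 toward 1.
        occupancy = concentration / (half_saturation + concentration)
        # Each inhibitor scales down the activity that is still left.
        activity *= 1.0 - max_inhibition[name] * occupancy
    return activity


# Inhibitors named in the text; the 0.3 ceiling is a hypothetical value.
INHIBITORS = ["glycine", "alanine", "histidine", "tryptophan", "AMP", "CTP"]
MAX_INHIBITION = {name: 0.3 for name in INHIBITORS}

no_inhibitor = residual_activity({}, MAX_INHIBITION)
one = residual_activity({"glycine": 10.0}, MAX_INHIBITION)
all_six = residual_activity({name: 10.0 for name in INHIBITORS}, MAX_INHIBITION)
print(no_inhibitor, one, all_six)
```

With these made-up numbers a single inhibitor only trims the activity, while all six together leave a small fraction of it. Note that a purely multiplicative model never reaches exactly zero, so it captures the "more and more" part of the description better than the complete halt.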

Communication Between Many Active Sites
The glutamine synthetase molecule is composed of twelve identical subunits, each of which has an active site for the production of glutamine. When performing its reaction, the active site binds to glutamate and ammonia, and also to an ATP molecule that powers the reaction. But the active sites also bind weakly to other amino acids and nucleotides, partially blocking the action of the enzyme. All of the many sites communicate with one another, and as the concentrations of competing molecules rise, more and more of the sites are blocked, eventually shutting down the whole enzyme. The cell has a more direct approach when it wants to shut down the enzyme. At a key tyrosine next to the active site, colored yellow here and shown by the arrow, an AMP group can be attached to the protein (adenylylation), blocking its action.

We make several versions of glutamine synthetase in our own cells. Most of our cells make a version similar to the bacterial one, but with eight subunits instead of twelve. Like the bacterial enzyme, it is controlled by the nitrogen-rich compounds down the synthetic pipeline. We also make a second glutamine synthetase in our brain. There, glutamate is used as a neurotransmitter, and glutamine synthetase is used when the glutamate is recycled after a nerve impulse is delivered. In the brain, glutamine synthetase is in constant action, so a highly-regulated version is not appropriate. Instead, the alternate form is active all the time, continually performing its essential duty.

Two Doors
Each of the twelve active sites of glutamine synthetase has two metal ions, either magnesium or manganese, bound at the center of a tunnel. The substrates enter from two sides of the tunnel: ATP enters on the exposed faces on the top and bottom of the enzyme (ATP is easily seen in the upper picture on the previous page) and glutamate and ammonia squeeze through an opening between the upper ring of subunits and the lower ring. This structure contains an ADP molecule bound in the ATP site, two manganese ions (which bind tighter than magnesium, but make the enzyme slightly slower), and an inhibitor that is about the same size and shape as glutamine.

Nitrogen is found everywhere on Earth, forming about three-fourths of the air. Nitrogen gas, however, is chemically inert and of little use to us. Our primary source of nitrogen is the ammonia in amino acids and nucleotides, obtained by eating other living things. But small amounts of ammonia are lost from the biosphere over time, locked up in minerals and buried out of reach. To replenish the global supply of biological nitrogen, nitrogen gas is converted into ammonia in the process of nitrogen fixation. Today, this is accomplished in three ways: about 15% is formed geologically, by lightning and ultraviolet radiation; 25% is produced industrially and distributed as fertilizer; and the remaining 60% is produced by a small class of bacteria and algae. These "diazotrophic" microorganisms fix nitrogen using nitrogenases, enzymes that rip apart the two tightly bound atoms in nitrogen gas and add hydrogen atoms to them, forming ammonia. Nitrogenases contain dozens of reactive iron atoms, as well as rarer metals such as molybdenum and vanadium. These unusual metal ions are required to apply the chemical tension that wrenches apart the stable nitrogen molecule. However, they are extremely sensitive to oxygen.

Leguminous plants, like peas and beans, have worked out a solution to this problem. In a classic example of symbiotic cooperation, legume roots build a nodule custom-made for bacteria, filled with leghemoglobin, a protein similar to the hemoglobin that carries oxygen in our blood. Leghemoglobin soaks up any oxygen that ventures near. In return for this safe haven, the bacteria release some of their fixed nitrogen for use by the plant. This abundant supply of ammonia carries a heavy price, however. Nitrogen fixation is very expensive, requiring about 16 ATP molecules per nitrogen molecule split into ammonia.
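To put the quoted cost in per-ammonia terms, here is a short arithmetic sketch in Python. The figure of 16 ATP per N2 comes from the text; the only other input is that one N2 molecule yields two molecules of ammonia. The constant and function names are hypothetical.

```python
ATP_PER_N2 = 16   # from the text: about 16 ATP per N2 split into ammonia
NH3_PER_N2 = 2    # N2 holds two nitrogen atoms, each ending up in one NH3


def atp_for_ammonia(n_ammonia):
    """Approximate ATP spent by nitrogenase to produce n_ammonia molecules of NH3."""
    return n_ammonia * ATP_PER_N2 / NH3_PER_N2


atp_per_nh3 = atp_for_ammonia(1)
print(atp_per_nh3)
```

So each fixed ammonia molecule costs roughly 8 ATP before it has even been incorporated into an organic compound.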

Glutamine synthetase (GS) is found in all organisms. In addition to its importance for NH4+ assimilation in bacteria, it has a central role in amino acid metabolism in mammals, converting free NH4+, which is toxic, to glutamine for transport in the blood. It is a huge enzyme, which in E. coli has 5628 amino acids. It catalyzes a reaction that introduces reduced nitrogen into cellular metabolism, and it is among the most complex regulatory enzymes known. It is regulated allosterically (with at least eight different modulators), by reversible covalent modification, and by the association of other regulatory proteins. It catalyzes glutamine synthesis from glutamate and ammonia at the expenditure of ATP, 9 by ATP-dependent amidation of the γ-carboxyl group of glutamate. The reaction proceeds via a γ-glutamyl-phosphate intermediate. Glutamine synthetase (GS) activity depends on the presence of divalent cations such as Mg2+.

In bacteria and plants, glutamate is produced from glutamine in a reaction catalyzed by glutamate synthase: α-ketoglutarate, an intermediate of the citric acid cycle, undergoes reductive amination with glutamine as nitrogen donor.

Glutamine synthetase (GS) catalyzes the ATP-dependent amidation of the γ-carboxyl group of glutamate to form glutamine

An example of an energetically unfavorable biosynthetic reaction driven by ATP hydrolysis. b
(A) Schematic illustration of the formation of A–B in the condensation reaction described in the text. 
(B) The biosynthesis of the common amino acid glutamine from glutamic acid and ammonia. Glutamic acid is first converted to a high-energy phosphorylated intermediate (corresponding to the compound B–O–PO3 described in the text), which then reacts with ammonia (corresponding to A–H) to form glutamine. In this example, both steps occur on the surface of the same enzyme, glutamine synthetase. The high-energy bonds are shaded red. Here, the symbol Pi = HPO4^2–, and a yellow “circled P” = PO3^2–. 11

What Regulatory Mechanisms Act on Escherichia coli Glutamine Synthetase?
Glutamine plays a pivotal role in nitrogen metabolism by donating its amide nitrogen to the biosynthesis of many important organic N compounds. Consistent with its metabolic importance, in prokaryotic cells such as E. coli, GS is regulated at three different levels:

1. Its activity is regulated allosterically by feedback inhibition.
2. GS is interconverted between active and inactive forms by covalent modification.
3. Cellular amounts of GS are carefully controlled at the level of gene expression and protein synthesis.

The activity of glutamine synthetase is regulated in virtually all organisms, which is not surprising given its central metabolic role as an entry point for reduced nitrogen. In enteric bacteria such as E. coli, the regulation is unusually complex. The enzyme has 12 identical subunits and is regulated both allosterically and by covalent modification. Alanine, glycine, and at least six end products of glutamine metabolism are allosteric inhibitors of the enzyme. Each inhibitor alone produces only partial inhibition, but the effects of multiple inhibitors are more than additive, and all eight together virtually shut down the enzyme. This control mechanism provides a constant adjustment of glutamine levels to match immediate metabolic requirements.

Glutamine Synthetase Is Allosterically Regulated
Nine distinct feedback inhibitors (Gly, Ala, Ser, His, Trp, CTP, AMP, carbamoyl-P, and glucosamine-6-P) act on GS. Gly, Ala, and Ser are key indicators of amino acid metabolism in the cell; each of the other six compounds represents an end product of a biosynthetic pathway dependent on Gln (see figure below).

The allosteric regulation of glutamine synthetase activity by feedback inhibition.

Evolution of the glutamine synthetase gene, one of the oldest existing and functioning genes 2
December 14, 1992
We performed molecular phylogenetic analyses of glutamine synthetase (GS) genes in order to investigate their evolutionary history. We suggest that GS genes are one of the oldest existing and functioning genes in the history of gene evolution and that GSI genes should also exist in eukaryotes.

The third ammonium-assimilating enzyme, carbamoyl-phosphate synthetase I, is a mitochondrial enzyme that participates in the urea cycle.

The amino acid and nucleotide biosynthetic pathways make repeated use of the biological cofactors

- pyridoxal phosphate
- tetrahydrofolate
- S-adenosylmethionine

Pyridoxal phosphate is required for transamination reactions involving glutamate and for other amino acid transformations. One-carbon transfers require S-adenosylmethionine and tetrahydrofolate. Glutamine amidotransferases catalyze reactions that incorporate nitrogen derived from glutamine.

a Guanosine-5'-triphosphate (GTP) is a purine nucleoside triphosphate. It is one of the building blocks needed for the synthesis of RNA during the transcription process. Its structure is similar to that of the guanine nucleobase, the only difference being that nucleotides like GTP have a ribose sugar and three phosphates, with the nucleobase attached to the 1' and the triphosphate moiety attached to the 5' carbons of the ribose.

It also has the role of a source of energy or an activator of substrates in metabolic reactions, like that of ATP, but more specific. It is used as a source of energy for protein synthesis and gluconeogenesis. 3

Hydrolysis usually means the cleavage of chemical bonds by the addition of water. When a carbohydrate is broken into its component sugar molecules by hydrolysis, the process is called saccharification. 10

11. Alberts, Molecular Biology of the Cell, 6th ed., page 66
12. Goodsell, Our Molecular Nature, page 31
17. Lehninger, Principles of Biochemistry, page 860
18. Fundamentals of Biochemistry, 6th ed., page 749

Last edited by Admin on Tue Aug 07, 2018 2:38 pm; edited 6 times in total


Glutamine synthetase (GS), an incredible molecular super-computer which defies naturalistic explanations

I have written extensively about the importance of nitrogen for making and sustaining life on earth, and about how it is fixed by nitrogenase, one of the most complex enzymes known, which transforms nitrogen gas into ammonia. Nitrogen gas forms about 78 percent of the air. Its triple bond makes it chemically inert, and splitting it requires enormous amounts of energy. For that reason, nitrogenase is called a molecular sledgehammer. In nature, apart from this enzyme, it takes something like the force of lightning to split dinitrogen, which illustrates the forces required. Our primary source of nitrogen is the ammonia in amino acids, the building blocks of proteins, and in nucleotides, the building blocks of RNA and DNA, the information storage devices inside our cells. About 60% of ammonia is produced by a small class of bacteria, such as cyanobacteria, and algae. These "diazotrophic" microorganisms fix nitrogen using nitrogenases, enzymes that rip apart the two tightly bound atoms in nitrogen gas and add hydrogen atoms to them, forming ammonia. Nitrogenases contain dozens of reactive iron atoms, as well as rarer metals such as molybdenum. These unusual metal ions are required to apply the chemical tension that wrenches apart the stable nitrogen molecule. However, they are extremely sensitive to oxygen. I wrote previously about the highly complex biosynthesis processes involved, which require complex mechanisms for importing iron, sulfur, and molybdenum into the cytoplasm of the cell, and about the enormously complex multistep synthesis that produces the active centers of nitrogenase, the iron-sulfur and iron-sulfur-molybdenum clusters.

Overview of the Nitrogenase enzyme complex

Biosynthesis of the Cofactors of Nitrogenase

Molybdenum, essential for life

Iron Uptake and Homeostasis in Cells

Once ammonium is made, it has to be introduced into the process for further intracellular biosynthesis. This is where glutamine synthetase (GS) comes into play. It is essential for ammonium assimilation and the biosynthesis of glutamine. 14 All organisms contain the enzymes glutamate dehydrogenase and glutamine synthetase, which convert ammonia to glutamate and glutamine, respectively. 15 The Last Universal Common Ancestor (LUCA) accessed nitrogen via nitrogenase and via glutamine synthetase.

My comment: This is a monumental admission. Glutamine synthetase (GS) had to emerge and be fully operational BEFORE life and cellular self-replication began. This is amazing. We will see shortly what that means, once we understand what kind of enzyme this is and what it is capable of.

Glutamine is a major nitrogen donor in the biosynthesis of many organic nitrogen compounds such as purines, pyrimidines, and other amino acids. We require a constant supply of nitrogen to build the bases in nucleic acids and the amino acids in proteins. Ammonia is very toxic and cannot be stored or transported safely. Instead, ammonia is attached to the amino acid glutamate, forming glutamine. Because it is a natural amino acid, normally used to build proteins, glutamine is easily transported throughout the body in large amounts. Ammonia may then be liberated only when needed. Glutamine synthetase connects a molecule of ammonia to the amino acid glutamate. A molecule of ATP (adenosine triphosphate) is used to power the process, to ensure that the reaction is performed only in the proper direction and not in reverse, carelessly liberating poisonous ammonia. The bacterial enzyme is a highly regulated enzyme.

Covalent Regulation of Glutamine Synthetase
" Glutamine synthetase is one of the most heavily regulated enzymes because it reacts with ammonia and we need to regulate ammonia levels very very tightly and there's a lot of ways we do that "

Glutamate, Glutamine Biosynthesis
Glutamate and glutamine are both made from the TCA cycle

So in order to have the substrate which Glutamine synthetase (GS) processes, we need the product of the TCA cycle. The origin of the TCA cycle is an unsolved origin-of-life problem, since a multitude of different enzymes are required to work together to produce ATP and amino acids.

The Citric acid cycle, or Krebs (TCA) cycle

A molecular Computer
Glutamine synthetase has been likened or compared to a molecular computer. With its 12 interacting subunits, arranged in two rings of six, it senses the amounts of the amino acids and nucleotides ultimately constructed from the ammonia in glutamine. Glutamine synthetase weighs the concentrations of each, computes whether there is an overall deficit or excess, and turns on or off based on the result.  

Our cells are continually faced with a changing environment. Think about what you eat. Some days you might eat a lot of protein, other days you might eat a lot of carbohydrate. Sometimes you may eat nothing but chocolate. Your body must be able to respond to these different foods, producing the proper enzymes for capturing the nutrients in each. The same is doubly true for small organisms like bacteria, which do not have as many options in choosing their diet. They must eat whatever food happens to be close by, and then mobilize the enzymes needed to use it.

The enzyme glutamine synthetase is a key enzyme controlling the use of nitrogen inside cells. Glutamine, as well as being used to build proteins, delivers nitrogen atoms to enzymes that build nitrogen-rich molecules, such as DNA bases and amino acids. So, glutamine synthetase, the enzyme that builds glutamine, must be carefully controlled. When nitrogen is needed, it must be turned on so that the cell does not starve. But when the cell has enough nitrogen, it needs to be turned off to avoid a glut.

Glutamine synthetase acts like a tiny molecular computer, monitoring the amounts of nitrogen-rich molecules. It watches levels of amino acids like glycine, alanine, histidine and tryptophan, and levels of nucleotides like AMP and CTP. If too much of one of these molecules is made, glutamine synthetase senses this and slows production slightly. But as levels of all of these nucleotides and amino acids rise, together they slow glutamine synthetase more and more. Eventually, the enzyme grinds to a halt when the supply meets the demand.
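The cumulative behaviour described above can be sketched numerically. The following is only a minimal illustration, assuming each end product acts independently and inhibits only partially, so that the remaining activities multiply. The function names, the half-saturation constants and the maximum inhibition fractions are hypothetical values chosen for illustration, not measured data.

```python
from math import prod


def residual_activity(levels, half_saturation, max_inhibition):
    """Return the fraction of full activity left (1.0 = fully active).

    Assumption: every inhibitor binds independently and can remove at most
    max_inhibition[name] of the activity, so the remaining fractions multiply.
    """
    remaining = []
    for name, level in levels.items():
        # Simple saturation curve: 0 when absent, approaching 1 when abundant.
        occupancy = level / (half_saturation[name] + level)
        remaining.append(1.0 - max_inhibition[name] * occupancy)
    return prod(remaining)


# End products named in the text; all parameters below are hypothetical.
END_PRODUCTS = ["glycine", "alanine", "histidine", "tryptophan", "AMP", "CTP"]
HALF_SATURATION = {name: 1.0 for name in END_PRODUCTS}  # arbitrary units
MAX_INHIBITION = {name: 0.3 for name in END_PRODUCTS}   # at most 30% each


def activity_at(level, names=END_PRODUCTS):
    """Activity when each listed end product is present at the same level."""
    levels = {name: level for name in names}
    return residual_activity(levels, HALF_SATURATION, MAX_INHIBITION)


if __name__ == "__main__":
    for level in (0.0, 0.5, 2.0, 10.0):
        print(level, round(activity_at(level), 3))
```

In this toy model a single abundant end product only dents the activity, while all six together suppress it much more strongly. The model never reaches exactly zero, so it approximates the "grinds to a halt" behaviour rather than reproducing it.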

Communication Between Many Active Sites
The glutamine synthetase molecule is composed of twelve identical subunits, each of which has an active site for the production of glutamine. When performing its reaction, the active site binds to glutamate and ammonia, and also to an ATP molecule that powers the reaction. But the active sites also bind weakly to other amino acids and nucleotides, partially blocking the action of the enzyme. All of the many sites communicate with one another, and as the concentrations of competing molecules rise, more and more of the sites are blocked, eventually shutting down the whole enzyme. The cell has a more direct approach when it wants to shut down the enzyme. At a key tyrosine next to the active site, an AMP group can be attached to the protein (adenylylation), completely blocking its action.

We make several versions of glutamine synthetase in our own cells. Most of our cells make a version similar to the bacterial one, but with eight subunits instead of twelve. Like the bacterial enzyme, it is controlled by the nitrogen-rich compounds down the synthetic pipeline. We also make a second glutamine synthetase in our brain. There, glutamate is used as a neurotransmitter, and glutamine synthetase is used when the glutamate is recycled after a nerve impulse is delivered. In the brain, glutamine synthetase is in constant action, so a highly-regulated version is not appropriate. Instead, the alternate form is active all the time, continually performing its essential duty.

Glutamine synthetase is a life-essential, ultracomplex molecular computer, which had to emerge fully operational. It is also huge in molecular size: it requires over 5500 amino acids and, in the case of prokaryotes, twelve subunits, which take part in several regulatory functions. The TCA cycle has no function without glutamine synthetase, and vice versa. I would say, unless we infer an intelligent creator, its origin is a bit mysterious....


28 How Do Organisms Synthesize Amino Acids? on Mon Apr 09, 2018 12:33 pm


How Do Organisms Synthesize Amino Acids?

In 1943, Gordon, Martin, and Synge used partition chromatography to separate and study constituents of proteins (Gordon, Martin, & Synge 1943), a major breakthrough that contributed to the rapid identification of the twenty amino acids used in proteins by all living organisms. After this initial burst of discovery, two additional amino acids, which are not used by all organisms, were added to the list: selenocysteine (Bock 2000) and pyrrolysine (Srinivasan et al. 2002). 2 Aside from their role in composing proteins, amino acids have many biologically important functions. They are also energy metabolites, and many of them are essential nutrients. Amino acids can often function as chemical messengers in communication between cells. For example, Arvid Carlsson discovered in 1957 that the amine 3-hydroxytyramine (dopamine) was not only a precursor for the synthesis of adrenaline from tyrosine, but is also a key neurotransmitter. Certain amino acids — such as citrulline and ornithine, which are intermediates in urea biosynthesis — are important intermediaries in various pathways involving nitrogenous metabolism. Although other amino acids are important in several pathways, S-adenosylmethionine acts as a universal methylating agent.

The pathways for the biosynthesis of these molecules are extremely ancient, going back to the last common ancestor of all living things. 4 Many of the intermediates in energy-yielding pathways play a role in biosynthesis as well. These common intermediates allow efficient interplay between energy-yielding (catabolic) and energy-requiring biosynthetic (anabolic) pathways. Thus, cells are able to balance the degradation of compounds for energy mobilization and the synthesis of starting materials for macromolecular construction.

Metabolic processes inside the cell require a delicate and finely orchestrated balance between anabolic and catabolic processes. The right dosage or production of the various basic building blocks is essential and must be adjusted and calibrated to the cell's and the organism's needs. If there are too many amino acids, the basic building blocks of proteins, in the cell, they must be degraded. Proteins which have malfunctioned must be degraded as well, and their turnover is a regulated process requiring complex enzyme systems and an ATP energy supply. Proteasome enzymes, the protein grinders, had to emerge before life began to do this job.

Proteasome Garbage Grinders, evidence of luck, evolution, or design?

The proteasome also requires ATP energy to function.

Question: Have you ever seen a garbage recycling factory emerge randomly, by a lucky accident?

Each of the 20 amino acids used in life requires specific complex degradation enzymes and coenzymes, which are all there, keen to do their job. Did prebiotic molecules lying around on early earth suddenly see the need to join together to form these enzymes, foreseeing that they would be required for amino acid degradation, one day?

Not only that. Once amino acids are degraded, there is a complex factory, keenly waiting for them for reuse: The machinery is all there to do the job: The carbon atoms of degraded amino acids are converted into pyruvate, acetyl CoA, acetoacetate, or an intermediate of the citric acid cycle.

Imagine: If these processes were not fully set up, it would result in accumulation for example of phenylalanine, and subsequently, if there were advanced animals with brains, the result would be mental retardation. I would not be here, writing these lines, and you would be unable to read, and eventually understand my write up.

But luckily, you can, and you can also recognize that these ultrasophisticated processes could not be the result of a lucky accident. Thank God, these processes exist, we have discovered them, and with them the hidden message between the lines. God is telling us: Hey, I made all this. Do you not want to know who I am?

Amino Acid Precursors and Biosynthesis Pathways
In the study of metabolism, a series of biochemical reactions for compound synthesis or degradation is called a pathway. Amino acid synthesis can occur in a variety of ways. For example, amino acids can be synthesized from precursor molecules by simple steps. Alanine, aspartate, and glutamate are synthesized from keto acids called pyruvate, oxaloacetate, and alpha-ketoglutarate, respectively, after a transamination reaction step. a Similarly, asparagine and glutamine are synthesized from aspartate and glutamate, respectively, by an amidation reaction step. The synthesis of other amino acids requires more steps; between one and thirteen biochemical reactions are necessary to produce the different amino acids from their precursors in central metabolism (Figure below).
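The single-step relationships named in this paragraph can be written down as a small lookup table. This is only a sketch of the five examples given in the text; the table name and the helper function are hypothetical, and amino acids that need longer pathways are deliberately left out.

```python
# Single-step relationships named in the text: product -> (precursor, reaction type).
SINGLE_STEP = {
    "alanine": ("pyruvate", "transamination"),
    "aspartate": ("oxaloacetate", "transamination"),
    "glutamate": ("alpha-ketoglutarate", "transamination"),
    "asparagine": ("aspartate", "amidation"),
    "glutamine": ("glutamate", "amidation"),
}


def route_from_central_metabolism(amino_acid):
    """Walk back through SINGLE_STEP until a compound with no listed precursor is reached.

    Returns the steps in forward order as (substrate, reaction, product) tuples.
    An amino acid that is not in the table yields an empty list.
    """
    steps = []
    current = amino_acid
    while current in SINGLE_STEP:
        precursor, reaction = SINGLE_STEP[current]
        steps.append((precursor, reaction, current))
        current = precursor
    return list(reversed(steps))


if __name__ == "__main__":
    for substrate, reaction, product in route_from_central_metabolism("glutamine"):
        print(f"{substrate} --{reaction}--> {product}")
```

Asking for glutamine walks back two steps, through glutamate to alpha-ketoglutarate, which is the keto acid from the citric acid cycle named in the text.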

Amino acid metabolism in context
Numerous metabolism pathways are depicted: 

central metabolism (in black), 
pentose phosphate metabolism (in brown), 
nitrogen metabolism (in magenta), 
and various amino acid metabolism pathways (all other colors).

Nodes (dots) represent metabolites, and lines represent enzymes and intermediaries. The nitrogen metabolism pathway overlaps with the biosynthesis of arginine and proline, with glutamate as the shared precursor. Histidine biosynthesis branches off the pentose phosphate metabolism. Lysine can be synthesized through different pathways, the aminoadipate (AAA) pathway or the diaminopimelate (DAP) pathway (shown in dark blue). There are gene homologies between different biosynthetic pathways. In the dark blue pathways, shaded rectangles represent homologies between enzymes. Similarly, the AAA pathway contains enzymes that share homologies with the branched chain amino acid (BCAA) pathways, whereas the DAP pathway contains homologies with the arginine biosynthetic pathway. In various pathways, homologous enzymes are denoted by shaded rectangles. Different shaded colors indicate different pairs of homologous enzymes.

Numerous metabolism pathways and many other connecting pathways, including amino acid metabolism pathways, are depicted as interconnected lines and nodes. The nodes are shown as colored dots that represent different metabolites. The colored lines represent different enzymes and intermediaries in the pathways. The different colors are used to differentiate the pathways, and the pathways branch off of one another at nodes. The central metabolism pathway, which includes glycolysis and the citric acid cycle, is shown on the left in black. The pentose phosphate pathway, which is shown in brown, branches off of the central metabolism pathway during glycolysis. The aminoadipate pathway, which is shown in dark blue, is labeled "AAA" and the diaminopimelate pathway, which is shown in dark purple, is labeled "DAP." Both the AAA and DAP pathways branch off of the central metabolism pathway at later steps during glycolysis or during the citric acid cycle. The nitrogen metabolism pathway is shown in magenta at the far left. Some of the nodes are labeled to indicate key metabolites.

What makes an amino acid essential?
Not all organisms are capable of synthesizing all the amino acids, and many are synthesized by pathways that are present only in certain plants and bacteria. Mammals, for example, must obtain eight of twenty amino acids from their diets. This requirement leads to a convention that divides amino acids into two categories: essential and nonessential (given a certain metabolism). Because of particular structural features, essential amino acids cannot be synthesized by mammalian enzymes. Nonessential amino acids, therefore, can be synthesized by nearly all organisms.

Nature magazine goes on and claims:
The loss of the ability to synthesize essential amino acids likely emerged very early in evolution, because this dependence on other organisms for the source of amino acids is common among all eukaryotes, not just those of mammals.

This fact goes against a naturalistic explanation. There would be no survival advantage (rather the opposite) if animals higher up in the taxonomy stopped producing amino acids and came to depend on food ingestion to obtain them. How could a slow, gradual transition to this state of affairs occur? It seems to make more sense that the interdependence was set up right from the beginning.

How do certain amino acids become essential for a given organism? Studies in ecology and evolution give some clues. Organisms evolve under environmental constraints, which are dynamic over time. If an amino acid is available for uptake, the selective pressure to keep intact the genes responsible for that pathway might be lowered, because they would not be constantly expressing these biosynthetic genes. Without the selective pressure, the biosynthetic routes might be lost or the gene could allow mutations that would lead to a diversification of the enzyme's function. Following this logic, amino acids that are essential for certain organisms might not be essential for other organisms subjected to different selection pressures. 

Might be, could, might not be..... is there anything besides baseless speculation here?

The relative uses of amino acid biosynthetic pathways vary widely among species because different synthesis pathways fulfill unique metabolic needs in different organisms. Although some pathways are present in certain organisms, they are absent in others. Therefore, experimental results about amino acid metabolism that are achieved with model organisms may not always have relevance for the majority of other organisms.

Amino acids are made from intermediates of the Citric Acid Cycle and other major pathways
The pathways for the biosynthesis of amino acids are diverse. However, they have an important common feature: their carbon skeletons come from intermediates of glycolysis, the pentose phosphate pathway, or the citric acid cycle. On the basis of these starting materials, amino acids can be grouped into six biosynthetic families 1 ( see below )

Major metabolic precursors are shaded blue. Amino acids that give rise to other amino acids are shaded yellow. Essential amino acids are in boldface type.
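Since the figure itself is not reproduced here, the six families can be sketched as a lookup table. The grouping below follows the one commonly given in biochemistry textbooks and is an assumption rather than a transcription of the missing figure; assignments differ between organisms (lysine made through the AAA pathway, for example, starts from alpha-ketoglutarate, and isoleucine also draws carbon from pyruvate). The names `FAMILIES` and `family_of` are hypothetical.

```python
# Common textbook grouping of the 20 amino acids by the metabolic precursor
# that supplies the carbon skeleton (assumed, not taken from the missing figure).
FAMILIES = {
    "alpha-ketoglutarate": ["glutamate", "glutamine", "proline", "arginine"],
    "3-phosphoglycerate": ["serine", "glycine", "cysteine"],
    "oxaloacetate": ["aspartate", "asparagine", "methionine",
                     "threonine", "lysine", "isoleucine"],
    "pyruvate": ["alanine", "valine", "leucine"],
    "phosphoenolpyruvate + erythrose 4-phosphate": ["phenylalanine", "tyrosine",
                                                    "tryptophan"],
    "ribose 5-phosphate": ["histidine"],
}


def family_of(amino_acid):
    """Return the precursor family of an amino acid, or None if it is not listed."""
    for precursor, members in FAMILIES.items():
        if amino_acid in members:
            return precursor
    return None


if __name__ == "__main__":
    for precursor, members in FAMILIES.items():
        print(f"{precursor}: {', '.join(members)}")
```

The precursors come from glycolysis (3-phosphoglycerate, phosphoenolpyruvate, pyruvate), the pentose phosphate pathway (erythrose 4-phosphate, ribose 5-phosphate) and the citric acid cycle (alpha-ketoglutarate, oxaloacetate), which matches the common feature stated above.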

Organisms show substantial differences in their capacity to synthesize the 20 amino acids common to proteins. Typically, plants and microorganisms can form all of their nitrogenous metabolites, including all of the amino acids, from inorganic forms of N such as ammonium (NH4+) and nitrate (NO3-). In these organisms, the α-amino group for all amino acids is derived from glutamate, usually via transamination of the corresponding α-keto acid analog of the amino acid. In many cases, amino acid biosynthesis is thus a matter of synthesizing the appropriate α-keto acid carbon skeleton, followed by transamination with Glu. The amino acids can be classified according to the source of intermediates for the α-keto acid biosynthesis.

Several classes of reactions play special roles in the biosynthesis of amino acids and nucleotides
(1) transamination reactions and other rearrangements promoted by enzymes containing pyridoxal phosphate (PLP) ;
(2) transfer of one-carbon groups, with either tetrahydrofolate (usually at the —CHO and —CH2OH oxidation levels) or S-adenosylmethionine (at the —CH3 oxidation level) as cofactor; and
(3) transfer of amino groups derived from the amide nitrogen of glutamine.

Pyridoxal phosphate (PLP) - Vitamin B6
Pyridoxal phosphate (PLP)-dependent enzymes are unrivaled in the diversity of reactions that they catalyze. 7  Vitamin B6 was first identified as pyridoxine, a catalytically inactive form, in 1938 while the catalytically active aldehyde (pyridoxal) and amine (pyridoxamine) forms and their phosphorylated derivatives (pyridoxal 5’-phosphate and pyridoxamine 5’-phosphate) were discovered in the early 1940’s. They all act as vitamin B6, although pyridoxal 5’-phosphate is the enzymatically active form. 9

PLP is considered the only active form of vitamin B6. It has a role in several enzymatic processes, including amino acid and fatty acid metabolism. 39

Following its identification as one of the active vitamers of vitamin B6, pyridoxal 5'-phosphate (PLP) has been the subject of extensive research directed toward understanding its unequaled catalytic versatility. PLP-dependent enzymes are principally involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars.

All aminotransferases have the same prosthetic group (co-factor) and the same reaction mechanism. The prosthetic group is pyridoxal phosphate (PLP) b, the coenzyme form of pyridoxine, or vitamin B6.  

The reaction involves three sequential steps:

(i) formation of a tetrahedral intermediate with the active site lysine and the amino substrate bonded to the PLP co-factor;
(ii) non-direct proton transfer between the amino substrate and the lysine residue; and
(iii) formation of the external aldimine after the dissociation of the lysine residue. The overall reaction is exothermic (−12.0 kcal/mol), the second step being rate-limiting, with an activation energy of 12.6 kcal/mol.
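To get a feeling for what a barrier of 12.6 kcal/mol means, it can be converted into a rough rate constant with the Eyring equation from transition state theory, k = (kB·T/h)·exp(−ΔG‡/RT). This is only a back-of-the-envelope sketch: it assumes that the quoted activation energy can be treated as a free energy of activation, that the temperature is 298.15 K, and that the transmission coefficient is 1. None of these assumptions comes from the cited study, and the function name is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23    # J/K
PLANCK = 6.62607015e-34     # J*s
GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol, temperature_kelvin=298.15):
    """Estimate a first-order rate constant (per second) from a free-energy barrier.

    Assumes a transmission coefficient of 1 and that the barrier is a
    Gibbs free energy of activation.
    """
    prefactor = BOLTZMANN * temperature_kelvin / PLANCK
    exponent = -barrier_kcal_per_mol / (GAS_CONSTANT * temperature_kelvin)
    return prefactor * math.exp(exponent)


if __name__ == "__main__":
    print(f"{eyring_rate(12.6):.3g} per second")
```

Under these assumptions the rate-limiting step comes out on the order of 10^3 turnovers per second. Each additional 1.4 kcal/mol of barrier would slow it roughly tenfold.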

The cofactor in all cases functions to stabilize negative charge development at Cα in the transition state that is formed after condensation of the amino acid substrate with PLP to form a Schiff base (referred to as the external aldimine c).

The majority of known structures are of Fold Type I (aspartate aminotransferase family) enzymes, a group that includes many of the best-characterized PLP enzymes. They invariably function as homodimers or higher-order oligomers, with two active sites per dimer. The active sites lie on the dimer interface, and each monomer contributes essential residues to both active sites. 

The breadth of reaction specificity enabled by PLP is illustrated to the right, using serine as an example substrate. The first and common step for all PLP-dependent enzyme catalyzed reactions is a Schiff base exchange reaction (transimination). All known PLP enzymes exist in their resting state as a Schiff base (internal aldimine) with an active site lysine residue. The incoming, amine-containing substrate displaces the lysine ε-amino group from the internal aldimine, in the process forming a new aldimine with the substrate (external aldimine).

The external aldimine is the common central intermediate for all PLP catalyzed reactions, enzymatic and nonenzymatic. Divergence in reaction specificity occurs from this point. The great majority of pyridoxal phosphate catalyzed reactions depend on the formation of a carbanionic intermediate.

From the external aldimine intermediate, carbanions formed by heterolytic cleavage of any one of the bonds to Cα (except for the C-N bond) can be stabilized. Loss of CO2 gives a carbanion that is commonly reprotonated on Cα to give the corresponding amine as the product. Less commonly, for example with dialkylglycine decarboxylase, the resulting carbanion is reprotonated on C4' of the coenzyme to give oxidized substrate and the reduced, amino form (PMP) of the coenzyme. 9

Proton abstraction is the most common forward step that external aldimines undergo since racemization, transamination, and β-elimination, three common reaction types, all require it. Retro-aldol cleavage of serine, central to one-carbon metabolism, is initiated by abstraction of a proton from the β-hydroxyl group followed by Cα-Cβ cleavage. Other known reaction types include β-decarboxylation of aspartate, β-elimination and replacement, γ-elimination and replacement, α/γ-elimination, cyclopropyl ring opening, radical-based 1,2-amino migrations, and others. This extraordinarily wide range of reaction types makes PLP enzymes extraordinarily useful to cells. The enzyme commission has more than 140 EC numbers assigned to PLP enzymes, and free-living prokaryotes devote ~1.5% of their open reading frames to them.

The commonly accepted mechanism for stabilization of the resulting carbanion is resonance delocalization within the extended conjugated pi system. This is illustrated to the right where the three most significant resonance forms are shown. The rightmost resonance structure is referred to as the “quinonoid” since its structure resembles that of a quinone. It has strong absorption at ~500 nm and is sometimes but not always spectroscopically observable in enzyme catalyzed reactions. This quinonoid resonance structure is commonly considered the major species responsible for the catalytic power of PLP since the electrons from Cα are neutralized by the protonated pyridine nitrogen. This simple view of PLP catalyzed reactions may not be wholly accurate.

Molecular origin of Pyridoxal phosphate enzymes
The pyridoxal-5'-phosphate (PLP)-dependent or vitamin B6-dependent enzymes that catalyze manifold reactions in the metabolism of amino acids belong to no fewer than four evolutionarily independent protein families. 6

It is remarkable that the authors of this science paper fail to recognize that pyridoxal phosphate (vitamin B6) enzymes had to be fully operational in LUCA, when life began, and that their origin could therefore not be the result of evolution.

The multiple evolutionary origin and the essential mechanistic role of PLP in these enzymes argue for the cofactor having arrived on the evolutionary scene before the emergence of the respective apoenzymes and having played a dominant role in the molecular evolution of the B6 enzyme families.

Why would natural occurrences on a prebiotic earth have produced co-enzymes without the respective apo-enzymes to interact with? Did they emerge without any function at all, just waiting for the respective proteins to arrive later on the scene, and then eagerly look out for them to start the molecular interaction?

It has now become clear that the majority of organisms capable of producing this vitamin do so via a different route, involving precursors from glycolysis and the pentose phosphate pathway. 40

The biosynthesis of vitamin B6 involves two branches with seven enzymatic steps. In one branch, the sequential action of the enzymes GapA, PdxB and PdxF results in the conversion of erythrose 4-phosphate into 4-phosphohydroxy-L-threonine. The latter then undergoes oxidation and decarboxylation by PdxA to form 3-hydroxy-1-aminoacetone phosphate.

In the other branch, DXP (deoxyxylulose 5-phosphate) is derived from GAP (glyceraldehyde 3-phosphate) and pyruvate by the action of DXP synthase. The products of the two branches, i.e. 3-hydroxy-1-aminoacetone phosphate and DXP, are then condensed by PdxJ to form PNP (pyridoxine 5'-phosphate), which must undergo oxidation, catalysed by PdxH, to form the cofactor vitamer PLP.
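The two converging branches just described can be laid out as a small reaction table. This is a sketch built only from the compounds and enzymes named in the two paragraphs above; the data layout and the helper `can_make` are hypothetical, and cofactors and side products are ignored.

```python
# Each entry: (substrates, enzymes acting in sequence, product), as named in the text.
REACTIONS = [
    (["erythrose 4-phosphate"], ["GapA", "PdxB", "PdxF"],
     "4-phosphohydroxy-L-threonine"),
    (["4-phosphohydroxy-L-threonine"], ["PdxA"],
     "3-hydroxy-1-aminoacetone phosphate"),
    (["glyceraldehyde 3-phosphate", "pyruvate"], ["DXP synthase"],
     "deoxyxylulose 5-phosphate"),
    (["3-hydroxy-1-aminoacetone phosphate", "deoxyxylulose 5-phosphate"], ["PdxJ"],
     "pyridoxine 5'-phosphate"),
    (["pyridoxine 5'-phosphate"], ["PdxH"], "pyridoxal 5'-phosphate"),
]


def can_make(target, available):
    """Return True if the target can be reached from the available compounds."""
    pool = set(available)
    changed = True
    while changed:
        changed = False
        for substrates, _enzymes, product in REACTIONS:
            # A reaction fires only when all of its substrates are in the pool.
            if product not in pool and all(s in pool for s in substrates):
                pool.add(product)
                changed = True
    return target in pool


if __name__ == "__main__":
    enzyme_count = sum(len(enzymes) for _, enzymes, _ in REACTIONS)
    print("enzymatic steps:", enzyme_count)
    starting = ["erythrose 4-phosphate", "glyceraldehyde 3-phosphate", "pyruvate"]
    print("PLP reachable:", can_make("pyridoxal 5'-phosphate", starting))
```

Counting the enzymes in the table gives the seven steps mentioned in the text. Removing any one starting compound makes PLP unreachable in this sketch, because the PdxJ condensation needs the products of both branches.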

The occurrence of two distinct and mutually exclusive pathways for the de novo biosynthesis of vitamin B6 poses an attractive challenge regarding the rationale for the evolution of two independent pathways for the same molecule.

Diverse functionality of vitamin B6 and its involvement in bodily functions. 
The inner ring shows three of the vitamin B6 vitamers, where the chemical entity at the 4-position can be an aldehyde, an alcohol or an amine. R1 can either be a hydrogen or a phosphate group, thereby representing the vitamers shown or their phosphorylated derivatives respectively. The second and third rings indicate biochemical and physiological functions respectively in humans.

The three pathways of vitamin B6 biosynthesis.
Pase*, the apparently unspecific phosphatases involved in dephosphorylating the phosphorylated B6 vitamers; Tase*, transaminase. 40

Tetrahydrofolate (THF, H4 folate) and Vitamin B12
One-carbon Metabolism: Basic Concepts
There is a group of biochemical reactions that have a special set of enzymes and coenzymes.   They are involved in amino acid metabolism and also play roles in nucleotide metabolism.   This group of reactions is referred to as one-carbon metabolism because what they have in common is the transfer of one-carbon groups.
One-carbon metabolism exists because one-carbon groups are too volatile and need to be attached to something while being processed. 42

Essentially, there are three ways of moving groups of atoms containing a single carbon atom using the following molecules:

  1. Tetrahydrofolate (THF) as a cofactor in enzymatic reactions.
  2. S-adenosylmethionine (SAM) as a methyl (-CH3) donor.
  3. Vitamin B12 (Cobalamin) as a co-enzyme in methylation and rearrangement reactions.

TETRAHYDROFOLATE (THF)  is the most versatile one-carbon donor in biosynthetic reactions.   THF is composed of three types of groups.   THF is derived from the vitamin folic acid (folate).   Folate is made by plants and microorganisms. The enzyme dihydrofolate reductase converts dihydrofolate into tetrahydrofolate, which is the active form that carries 1-carbon groups in a variety of reactions.  

All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. 38

Tetrahydrofolate (H4 folate) has fundamental importance for the biosynthesis of purines, pyrimidines, and several amino acids. The folate derivative 5,10-methylene-tetrahydrofolate is essential for the synthesis of dTMP from dUMP and is therefore crucial for DNA replication and cell division. Tetrahydrofolate is an essential substrate in the biosynthesis of the amino acid glycine. The enzyme dihydrofolate reductase replenishes tetrahydrofolate from dihydrofolate for the above-mentioned biosynthetic processes. 36

Folate is necessary for the production and maintenance of new cells, for DNA synthesis and RNA synthesis through methylation, and for preventing changes to DNA, and, thus, for preventing cancer. It is especially important during periods of frequent cell division and growth, such as infancy and pregnancy. Folate is needed to carry one-carbon groups for methylation reactions and nucleic acid synthesis (the most notable one being thymine, but also purine bases).  It gets this carbon atom by sequestering formaldehyde produced in other processes. Thus, folate deficiency hinders DNA synthesis and cell division, affecting hematopoietic cells and neoplasms the most because of their greater frequency of cell division. 37
In the form of a series of tetrahydrofolate (THF) compounds, folate derivatives are substrates in a number of single-carbon-transfer reactions, and are also involved in the synthesis of dTMP (2′-deoxythymidine-5′-phosphate) from dUMP (2′-deoxyuridine-5′-phosphate). Folate is a substrate for an important reaction that involves vitamin B12, and it is necessary for the synthesis of DNA and so required by all dividing cells.

Tetrahydrofolate THF can be imagined as an arm that transfers single carbons in different reduced states from one molecule to another. 

The importance of folate compounds in metabolism has been established for over 50 years. Folate derivatives participate in a myriad of biosynthetic reactions involving transfers of groups containing a single carbon atom. For example, these functional units are essential components in the metabolism of the amino acids glycine, serine, methionine, and histidine, and the biosynthesis of purines and pyrimidines. 21  Tetrahydrofolic acid is a cofactor in many reactions, especially in the synthesis (or anabolism) of amino acids and nucleic acids.  It gets this carbon atom by sequestering formaldehyde f produced in other processes. 10

Tetrahydrofolate acts as a donor or acceptor of one-carbon units in biosynthetic and degradative processes and has an essential role in the biosynthesis of purines, thymidylate, pantothenate, RNA and amino acids such as methionine, as well as in the glycine-to-serine conversion. 18

One-carbon transfer means moving a carbon atom from one molecule to another. 11 The folate derivative 5,10-methylene-tetrahydrofolate is essential for the synthesis of dTMP d from dUMP. 14

The two essential precursors of folate biosynthesis are 4-aminobenzoate (a product of shikimate biosynthesis pathway) and GTP e .  

The thymidylate cycle, a part of the folate biosynthesis pathway, plays an important role in the generation of the amino acid glycine and of dTMP. It is made up of 11 enzymatic steps.

Some of the carbon atoms of purines are acquired from derivatives of N10-formyltetrahydrofolate. The methyl group of thymine, a pyrimidine, comes from N5,N10-methylenetetrahydrofolate. This tetrahydrofolate derivative can also donate a one-carbon unit in an alternative synthesis of glycine that starts with carbon dioxide (CO2) and ammonium (NH4+), a reaction catalyzed by glycine synthase (called the glycine cleavage enzyme when it operates in the reverse direction). 25

Tetrahydrofolate, which is synthesized in bacteria, consists of substituted pterin (6-methylpterin), p-aminobenzoate, and glutamate moieties.

Chemical structure of tetrahydrofolate (THF), monoglutamyl form. 
The red arrowhead marks the oxidatively labile C9–N10 bond. A polyglutamyl tail can be attached via the γ-carboxyl group of the glutamate moiety. 19

Here is another image of the same molecule:

The nitrogen atoms to which one-carbon groups are attached in tetrahydrofolate are shown in blue. The one-carbon group undergoing transfer, in any of three oxidation states, is bonded to N-5 or N-10 or both.

Most forms of tetrahydrofolate are interconvertible and serve as donors of one-carbon units in a variety of metabolic reactions. The primary source of one-carbon units for tetrahydrofolate is the carbon removed in the conversion of serine to glycine, producing N5,N10-methylenetetrahydrofolate.

The oxidized form, folate, is a vitamin for mammals; it is converted in two steps to tetrahydrofolate by the enzyme dihydrofolate reductase. The one-carbon group undergoing transfer, in any of three oxidation states, is bonded to N-5 or N-10 or both. The most reduced form of the cofactor carries a methyl group, a more oxidized form carries a methylene group, and the most oxidized forms carry a methenyl, formyl, or formimino group (see figure below). 43
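The three oxidation levels named above can be captured in a small ordered table. This is a minimal sketch of the classification as stated in the text; the level labels and the function names are hypothetical.

```python
# One-carbon groups carried by tetrahydrofolate, ordered from most reduced
# to most oxidized, as grouped in the text.
OXIDATION_LEVELS = [
    ("most reduced", ["methyl"]),
    ("intermediate", ["methylene"]),
    ("most oxidized", ["methenyl", "formyl", "formimino"]),
]


def oxidation_rank(group):
    """Return 0 for the most reduced level, 2 for the most oxidized level."""
    for rank, (_label, groups) in enumerate(OXIDATION_LEVELS):
        if group in groups:
            return rank
    raise ValueError(f"unknown one-carbon group: {group}")


def same_level(group_a, group_b):
    """True if two one-carbon groups sit at the same oxidation level."""
    return oxidation_rank(group_a) == oxidation_rank(group_b)


if __name__ == "__main__":
    for label, groups in OXIDATION_LEVELS:
        print(f"{label}: {', '.join(groups)}")
```

Groups at the same level, such as methenyl and formyl, have the same rank in this table, whereas moving from methylene to methyl means stepping down to a more reduced level.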

Conversions of one-carbon units on tetrahydrofolate.
The different molecular species are grouped according to oxidation state, with the most reduced at the top and most oxidized at the bottom. All species within a single shaded box are at the same oxidation state. The conversion of N5,N10-methylenetetrahydrofolate to N5-methyltetrahydrofolate is effectively irreversible. The enzymatic transfer of formyl groups, as in purine synthesis and in the formation of formylmethionine in bacteria, generally uses N10-formyltetrahydrofolate rather than N5-formyltetrahydrofolate. The latter species is significantly more stable and therefore a weaker donor of formyl groups. N5-Formyltetrahydrofolate is a minor byproduct of the cyclohydrolase reaction, and can also form spontaneously. Conversion of N5-formyltetrahydrofolate to N5,N10-methenyltetrahydrofolate requires ATP, because of an otherwise unfavorable equilibrium. Note that N5-formiminotetrahydrofolate is derived from histidine in a catabolic pathway.

Although tetrahydrofolate can carry a methyl group at N-5, the transfer potential of this methyl group is insufficient for most biosynthetic reactions. S-Adenosylmethionine (adoMet) is the preferred cofactor for biological methyl group transfers. It is synthesized from ATP and methionine by the action of methionine adenosyl transferase (step 1, figure below).

Synthesis of methionine and S-adenosylmethionine in an activated methyl cycle. 
The steps are described in the text. In the methionine synthase reaction (step 4 ), the methyl group is transferred to cobalamin to form methylcobalamin, which in turn is the methyl donor in the formation of methionine. S-Adenosylmethionine, which has a positively charged sulfur (and is thus a sulfonium ion), is a powerful methylating agent in several biosynthetic reactions. The methyl group acceptor (step 2 ) is designated R. 

This reaction is unusual in that the nucleophilic sulfur atom of methionine attacks the 5' carbon of the ribose moiety of ATP rather than one of the phosphorus atoms. Triphosphate is released and is cleaved to Pi and PPi on the enzyme, and the PPi is cleaved by inorganic pyrophosphatase; thus three bonds, including two bonds of high-energy phosphate groups, are broken in this reaction. The only other known reaction in which triphosphate is displaced from ATP occurs in the synthesis of coenzyme B12. S-Adenosylmethionine is a potent alkylating agent by virtue of its destabilizing sulfonium ion. The methyl group is subject to attack by nucleophiles and is about 1,000 times more reactive than the methyl group of N5-methyltetrahydrofolate.

Transfer of the methyl group from S-adenosylmethionine to an acceptor yields S-adenosylhomocysteine (figure above, step 2), which is subsequently broken down to homocysteine and adenosine (step 3). Methionine is regenerated by transfer of a methyl group to homocysteine in a reaction catalyzed by methionine synthase (step 4), and methionine is reconverted to S-adenosylmethionine to complete an activated methyl cycle. One form of methionine synthase common in bacteria uses N5-methyltetrahydrofolate as a methyl donor. Another form of the enzyme, present in some bacteria and in mammals, also uses N5-methyltetrahydrofolate, but the methyl group is first transferred to cobalamin, derived from coenzyme B12, to form methylcobalamin as the methyl donor in methionine formation. This reaction and the rearrangement of L-methylmalonyl-CoA to succinyl-CoA are the only known coenzyme B12-dependent reactions in mammals.
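The four numbered steps of the activated methyl cycle can be written down as a small lookup table and walked in order. The following is only an illustrative sketch: the names `Step`, `ACTIVATED_METHYL_CYCLE` and `walk_cycle` are made up for this example, and the notes merely restate what the text above says about each step.

```python
from typing import NamedTuple


class Step(NamedTuple):
    number: int
    substrate: str
    product: str
    note: str


# Steps 1-4 as described in the text; the notes are paraphrases, not new data.
ACTIVATED_METHYL_CYCLE = [
    Step(1, "methionine", "S-adenosylmethionine",
         "methionine adenosyl transferase; ATP consumed, Pi and PPi released"),
    Step(2, "S-adenosylmethionine", "S-adenosylhomocysteine",
         "methyl group transferred to an acceptor R"),
    Step(3, "S-adenosylhomocysteine", "homocysteine",
         "adenosine released"),
    Step(4, "homocysteine", "methionine",
         "methionine synthase; methyl group from N5-methyltetrahydrofolate"),
]


def walk_cycle(start: str, steps=ACTIVATED_METHYL_CYCLE) -> list[str]:
    """Follow the steps from `start` until the cycle closes on itself."""
    by_substrate = {step.substrate: step for step in steps}
    visited = [start]
    current = start
    while True:
        current = by_substrate[current].product
        visited.append(current)
        if current == start:
            return visited


if __name__ == "__main__":
    print(" -> ".join(walk_cycle("methionine")))
```

Starting from methionine, the walk passes through S-adenosylmethionine, S-adenosylhomocysteine and homocysteine before returning to methionine, which is what makes the pathway a cycle rather than a linear route.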

Folic acid, a B vitamin found in green plants, fresh fruits, yeast, and liver, takes its name from folium, Latin for “leaf.” Folic acid is a pterin (the 2-amino-4-oxo derivative of pteridine). Mammals cannot synthesize pterins and thus cannot make folates; they derive folates from their diet or from microorganisms in their intestines.  Folates are acceptors and donors of one-carbon units for all oxidation levels of carbon except CO2 (for which biotin is the relevant carrier). The active form is tetrahydrofolate (THF). THF is formed through two successive reductions of folate by dihydrofolate reductase. 41

The two-stage reduction of folate to THF. Both reactions are catalyzed by dihydrofolate reductase.

One-carbon units in three different oxidation states may be bound to THF at the N5 or N10 nitrogens (table below).

*Calculated by assigning valence bond electrons to the more electronegative atom and then counting the charge on the quasi ion. A carbon assigned four valence electrons would have an oxidation number of 0. The carbon in N5-methyl-THF is assigned six electrons from the three C–H bonds and thus has an oxidation number of −2. †Note: All vacant bonds in the structures shown are to atoms more electronegative than C.
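The bond-counting rule in the footnote above can be checked with a few lines of Python. This is a rough sketch, not part of the cited textbook: the function name `carbon_oxidation_number` and the bond counts in `ONE_CARBON_UNITS` are an illustrative reading of the usual structures (a double bond counts as two bonds), and bonds to other carbon atoms are ignored because none occur in these one-carbon units.

```python
def carbon_oxidation_number(bonds_to_h: int, bonds_to_n_or_o: int) -> int:
    """Oxidation number of a carbon from its four bonds.

    Each bond to hydrogen (less electronegative) assigns one extra electron
    to carbon (-1); each bond to N or O (more electronegative) removes one (+1).
    """
    if bonds_to_h + bonds_to_n_or_o != 4:
        raise ValueError("the carbon is expected to make four bonds in total")
    return bonds_to_n_or_o - bonds_to_h


# (bonds to H, bonds to N or O); assumed bond counts for illustration only.
ONE_CARBON_UNITS = {
    "N5-methyl-THF": (3, 1),          # N-CH3
    "N5,N10-methylene-THF": (2, 2),   # N-CH2-N
    "N5,N10-methenyl-THF": (1, 3),    # N-CH=N(+)
    "N10-formyl-THF": (1, 3),         # N-CH=O
    "N5-formimino-THF": (1, 3),       # N-CH=NH
}

OXIDATION_NUMBERS = {
    name: carbon_oxidation_number(h, x)
    for name, (h, x) in ONE_CARBON_UNITS.items()
}

if __name__ == "__main__":
    for name, value in OXIDATION_NUMBERS.items():
        print(f"{name}: {value:+d}")
```

Under these assumptions the methyl carbon comes out at −2, the methylene carbon at 0, and the methenyl, formyl and formimino carbons at +2, which matches the three oxidation levels named in the text.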

The one-carbon unit carried by THF can come from formate (HCOO−), the α-carbon of glycine, the β-carbon of serine, or the 3-position carbon in the imidazole ring of histidine. NADPH-dependent reactions interconvert the oxidation states of the various THF-bound one-carbon units.

The conversion of serine to glycine is a prominent means of generating one-carbon derivatives of THF, which are important for the biosynthesis of purines and of the C-5 methyl group of thymine (a pyrimidine), as well as of the amino acid methionine. Glycine itself contributes to both purine and heme synthesis. Glycine can also be synthesized by a reversal of the glycine oxidase reaction (Figure b).

Biosynthesis of glycine from serine 
(a) via serine hydroxymethyltransferase and 
(b) via glycine oxidase.

Here, glycine is formed when N5,N10-methylene-THF condenses with ammonium (NH4+) and CO2. Via this route, the β-carbon of serine becomes part of glycine.

Folate is necessary for the production and maintenance of new cells, for DNA synthesis and RNA synthesis through methylation, and for preventing changes to DNA and, thus, for preventing cancer. It is especially important during periods of frequent cell division and growth, such as infancy and pregnancy. Folate is needed to carry one-carbon groups for methylation reactions and nucleic acid synthesis (the most notable one being thymine, but also purine bases). Thus, folate deficiency hinders DNA synthesis and cell division, affecting hematopoietic cells and neoplasms the most because of their greater frequency of cell division. RNA transcription, and subsequent protein synthesis, are less affected by folate deficiency, as the mRNA can be recycled and used again (as opposed to DNA synthesis, where a new genomic copy must be created). 34

In the form of a series of tetrahydrofolate (THF) compounds, folate derivatives are substrates in a number of single-carbon-transfer reactions, and also are involved in the synthesis of dTMP (2′-deoxythymidine-5′-phosphate) from dUMP (2′-deoxyuridine-5′-phosphate). It is a substrate for an important reaction that involves vitamin B12 and it is necessary for the synthesis of DNA, and so required for all dividing cells 34 

Folates are essential in all living systems with the exception of methanogenic bacteria, where they are replaced by methanopterin derivatives. The various C1 moieties carried by THF-type coenzymes serve as building blocks for the biosynthesis of purines, pyrimidines, and methionine. Methenyltetrahydrofolate serves as an optical transponder in DNA photolyases that are involved in the photochemically driven repair of photodamaged DNA in numerous organisms, albeit not in mammals. 

Folate, distinct forms of which are known as folic acid, folacin, and vitamin B9, is one of the B vitamins. 12 Folic acid is important to several biological functions. Folates are among the most complex pterin coenzymes. Tetrahydrofolate (THF) is the basic molecule of the folate family. It is synthesized in micro-organisms, including bacteria and lower eukaryotes, and in plants, but not in animals. THF is synthesized from GTP, chorismate and glutamate. 16 The folate pathway is central to any study related to DNA methylation, dTMP synthesis or purine synthesis. 17

Folate is a designation for cofactors that consist of three moieties:

p-aminobenzoic acid (pABA), synthesized by the chorismate pathway,
pterin and
glutamates.

Folate is an essential vitamin (vitamin B9) used by all cells.

Although the number of glutamates attached to PABA can vary depending on the source, in the cellular plasma, the monoglutamated and the tetra-reduced pterin ring (tetrahydrofolate) are the most predominant forms. 

a Transamination is a chemical reaction that transfers an amino group to a ketoacid to form new amino acids. This pathway is responsible for the deamination of most amino acids. This is one of the major degradation pathways which convert essential amino acids to nonessential amino acids (amino acids that can be synthesized de novo by the organism). 3
Transamination in biochemistry is accomplished by enzymes called transaminases or aminotransferases. α-ketoglutarate acts as the predominant amino-group acceptor and produces glutamate as the new amino acid.

b Pyridoxal phosphate (PLP, pyridoxal 5'-phosphate, P5P), the active form of vitamin B6, is a coenzyme in a variety of enzymatic reactions. The Enzyme commission has cataloged more than 140 PLP-dependent activities, corresponding to ~4% of all classified activities. The versatility of PLP arises from its ability to covalently bind the substrate, and then to act as an electrophilic catalyst, thereby stabilizing different types of carbanionic reaction intermediates. 5  PLP acts as a coenzyme in all transamination reactions, and in certain decarboxylation, deamination, and racemization reactions of amino acids

c In organic chemistry, an aldimine is an imine that is an analog of an aldehyde.[1] As such, aldimines have the general formula R–CH=N–R'. Aldimines are similar to ketimines, which are analogs of ketones.
An important subset of aldimines are the Schiff bases, in which the substituent on the nitrogen atom (R') is an alkyl or aryl group (i.e. not a hydrogen atom) 8

d Thymidine monophosphate (TMP), also known as thymidylic acid (conjugate base thymidylate), deoxythymidine monophosphate (dTMP), or deoxythymidylic acid (conjugate base deoxythymidylate), is a nucleotide that is used as a monomer in DNA. It is an ester of phosphoric acid with the nucleoside thymidine. dTMP consists of a phosphate group, the pentose sugar deoxyribose, and the nucleobase thymine. Unlike the other deoxyribonucleotides, thymidine monophosphate often does not contain the "deoxy" prefix in its name; nevertheless, its symbol often includes a "d" ("dTMP") 13

e Guanosine-5'-triphosphate (GTP) is a purine nucleoside triphosphate. It is one of the building blocks needed for the synthesis of RNA during the transcription process. Its structure is similar to that of the guanine nucleobase, the only difference being that nucleotides like GTP have a ribose sugar and three phosphates, with the nucleobase attached to the 1' and the triphosphate moiety attached to the 5' carbons of the ribose. 15

f Formaldehyde (systematic name methanal) is a naturally occurring organic compound with the formula CH2O (H-CHO). It is the simplest of the aldehydes (R-CHO). The common name of this substance comes from its similarity and relation to formic acid. 23 Formaldehyde was the first polyatomic organic molecule detected in the interstellar medium. Since its initial detection in 1969, it has been observed in many regions of the galaxy. Because of the widespread interest in interstellar formaldehyde, it has recently been extensively studied, yielding new extragalactic sources.
"The poisonous chemical formaldehyde may have helped create the organic compounds present in the universe that gave rise to life, new research suggests." 24

g In the chemical sciences, methylation denotes the addition of a methyl group on a substrate, or the substitution of an atom (or group) by a methyl group. Methylation is a form of alkylation, with a methyl group, rather than a larger carbon chain, replacing a hydrogen atom. These terms are commonly used in chemistry, biochemistry, soil science, and the biological sciences. 26 A methyl group is an alkyl derived from methane, containing one carbon atom bonded to three hydrogen atoms — CH3. 27 Such hydrocarbon groups occur in many organic compounds. It is a very stable group in most molecules. While the methyl group is usually part of a larger molecule, it can be found on its own in any of three forms: anion, cation or radical. The anion has eight valence electrons, the radical seven and the cation six.
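The electron counts quoted for the three free methyl species follow from simple bookkeeping, shown here as a minimal sketch. The names `methyl_valence_electrons` and `METHYL_FORMS` are hypothetical and exist only for this illustration.

```python
CARBON_VALENCE_ELECTRONS = 4
HYDROGEN_VALENCE_ELECTRONS = 1


def methyl_valence_electrons(charge: int) -> int:
    """Total valence electrons of a free CH3 species with the given net charge."""
    neutral = CARBON_VALENCE_ELECTRONS + 3 * HYDROGEN_VALENCE_ELECTRONS
    # A negative charge means one extra electron, a positive charge one fewer.
    return neutral - charge


METHYL_FORMS = {"anion": -1, "radical": 0, "cation": +1}

if __name__ == "__main__":
    for form, charge in METHYL_FORMS.items():
        print(form, methyl_valence_electrons(charge))
```

The neutral radical has 4 + 3 = 7 valence electrons; adding one electron gives the anion (8) and removing one gives the cation (6).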


1. Berg, Biochemistry, 5th edition, page 973
4. Stryer, Biochemistry, 5th edition, page 713
25. Stryer, Biochemistry, 8th edition, page 724
33. Garrett, Biochemistry, 6th edition, page 1063
36. Folate recycling
41. Garrett, Biochemistry, 6th edition, page 930
43. Garrett, Biochemistry, 6th edition, page 690

Last edited by Admin on Fri Aug 03, 2018 8:50 am; edited 83 times in total


29 The folate biosynthesis pathway on Fri Apr 13, 2018 8:37 am


The folate biosynthesis pathway

The pathway leading to the formation of tetrahydrofolate (THF, FH4) begins when folic acid (F) is reduced to dihydrofolate (DHF, FH2), which is then reduced to THF. Dihydrofolate reductase catalyses both reductions. NADPH, which is derived from vitamin B3, is a necessary cofactor for both steps of the synthesis. Thus, hydride ions are transferred from NADPH to the C6 position of the pteridine ring to reduce folic acid to THF.

Folate biosynthesis - Reference pathway

Tetrahydrofolate (H4 folate) MTHFR ( in red ) in the Folate cycle

Folate biosynthesis in plants and microorganisms starts with the synthesis of the pterin ring, which is catalyzed by GTP cyclohydrolase I (GTPCHI). This reaction is followed by five further reactions, catalyzed by five distinct enzymes, that convert GTP into 7,8-dihydrofolate, which is reduced by dihydrofolate reductase to produce 5,6,7,8-tetrahydrofolate (Figure B, below). 20

Folate derivatives and biosynthesis. 
(A) Chemical moieties of folates. 
(B) Overview of the folate pathway in microorganisms.

pABA, which is attached to the pterin moiety by dihydropteroate synthase (the fourth step of the pathway), is produced in two enzymatic steps from chorismate and links the folate and shikimate pathways.

The first reaction of the folate pathway is catalyzed by GTPCHI, a homodecamer with D5 symmetry, and involves extensive rearrangement, including a ring opening, an Amadori rearrangement and finally a ring closure. In this process, a molecule of GTP yields dihydroneopterin triphosphate, which is the substrate for dihydroneopterin aldolase (DHNA). DHNA is a homo-octameric enzyme that has a catalytic mechanism similar to the class I aldolases. It does not require a Schiff base, even though this feature is characteristic of this enzyme class. In addition, this enzyme is also able to catalyze the epimerization of 7,8-dihydroneopterin (DHNP) to yield 7,8-dihydromonapterin (DHMP).

The next step of the pathway is catalyzed by 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase (HPPK), which transfers the pyrophosphate from ATP to DHMP, producing 6-hydroxymethyl-7,8-dihydropterin pyrophosphate (DHPPP). DHPPP is the substrate of dihydropteroate synthase (DHPS), an enzyme that condenses this molecule with pABA to yield 7,8-dihydropteroate. Mono/dihydrofolate synthase and the bifunctional folylpoly-γ-glutamate synthetase (DHFS/FPGS) add one or more glutamates, respectively, to 7,8-dihydropteroate, producing dihydrofolate and its derivatives.

The last step of the pathway is catalyzed by dihydrofolate reductase (DHFR), which together with DHPS is the most studied enzyme of the folate pathway. DHFR has a Rossmannoid fold and catalyzes the reduction of dihydrofolate to tetrahydrofolate using NADPH as a cofactor. The last two steps of the folate pathway are found in both prokaryotic and eukaryotic organisms and have been used as antimicrobial or human disease targets, respectively. Figure B above shows all the steps of the folate pathway.

The proteins used in the folate biosynthesis pathway

GTP cyclohydrolase I (GTPCH)
Dihydroneopterin aldolase (DHNA)
6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase  (HPPK)
Dihydropteroate synthase (DHPS)
Dihydrofolate synthetase/folylpolyglutamate synthetase (DHFS/FPGS)
Dihydrofolate reductase (DHFR)
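One way to keep the order of these six enzymes straight is to list each step with its main substrate and product and check that the steps chain together. The sketch below does this under stated simplifications: co-substrates (ATP, pABA, glutamate, NADPH) are left out, dihydroneopterin triphosphate is treated as the DHNA substrate as in the description above, and the names `Reaction`, `FOLATE_PATHWAY`, `is_linear_chain` and `intermediates` are invented for the example.

```python
from typing import NamedTuple


class Reaction(NamedTuple):
    enzyme: str
    substrate: str
    product: str


# Main pterin-derived substrate and product of each step, as named in the text.
FOLATE_PATHWAY = [
    Reaction("GTPCHI", "GTP", "dihydroneopterin triphosphate"),
    Reaction("DHNA", "dihydroneopterin triphosphate",
             "6-hydroxymethyl-7,8-dihydropterin"),
    Reaction("HPPK", "6-hydroxymethyl-7,8-dihydropterin",
             "6-hydroxymethyl-7,8-dihydropterin pyrophosphate"),
    Reaction("DHPS", "6-hydroxymethyl-7,8-dihydropterin pyrophosphate",
             "7,8-dihydropteroate"),
    Reaction("DHFS/FPGS", "7,8-dihydropteroate", "7,8-dihydrofolate"),
    Reaction("DHFR", "7,8-dihydrofolate", "5,6,7,8-tetrahydrofolate"),
]


def is_linear_chain(reactions) -> bool:
    """True if every product is the substrate of the following reaction."""
    return all(a.product == b.substrate for a, b in zip(reactions, reactions[1:]))


def intermediates(reactions) -> list[str]:
    """Starting compound followed by the product of every step, in order."""
    return [reactions[0].substrate] + [r.product for r in reactions]


if __name__ == "__main__":
    assert is_linear_chain(FOLATE_PATHWAY)
    for reaction in FOLATE_PATHWAY:
        print(f"{reaction.enzyme}: {reaction.substrate} -> {reaction.product}")
```

Six reactions give seven compounds from GTP to tetrahydrofolate; the check only confirms that the listed order is self-consistent, not that the simplified chemistry is complete.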

GTP cyclohydrolase I
GTPCHI was first described in 1976 and is responsible for the rate-limiting step of the folate pathway, being regulated at both the transcriptional and the substrate/product levels. The reaction catalyzed by GTPCHI is considered the most complex of the pathway (Figure A below); it produces DHNP triphosphate and formic acid from GTP. GTPCHI first opens the guanine imidazole ring; cleavage at the N9-C8 and the N7-C8 bonds then produces the N-formyl pyrimidine intermediate and releases C8 as formate. The ribose moiety subsequently undergoes an Amadori rearrangement, and a condensation reaction closes the dihydropyrazine ring to give the pteridine ring moiety of DHNP triphosphate (Figure A).

GTP cyclohydrolase I reaction

GTPCHI is conserved in bacteria, protozoa, fungi, plants, and vertebrates; however, its product is a substrate in more than one pathway among different organisms. In bacteria, fungi and plants, this enzyme catalyzes the first step of de novo folate biosynthesis, while in vertebrates it is associated with the biosynthesis of tetrahydrobiopterin, a key cofactor for the enzymes that produce nitric oxide, melanin and neurotransmitters. However, GTPCHI was only established as a metalloenzyme when the structure of human GTPCHI was determined. A zinc ion binds near a histidine located within the active site. Later, a careful analysis of the EcGTPCHI active site also revealed electron density for this metal, which is coordinated by one histidine and two cysteines. The zinc ion has further been shown to be essential for opening the imidazole ring by attacking the C8 of GTP.

GTP cyclohydrolase I
(A) Proposed mechanism of catalysis of GTPCHI.
(B) Decameric structure of Escherichia coli GTPCHI. Each protomer is represented by a different color.
(C) Structure of an EcGTPCHI protomer. 
(D) Active site of EcGTPCHI. The interface of two protomers A (electrostatic surface) and B (ribbons and residues involved in the coordination of the essential zinc ion) are represented. 
(E) Dimeric structure of Neisseria gonorrhoeae GTPCH-IB, which also has the same tunneling fold of GTPCHI. EcGTPCHI: E. coli GTPCHI; GTPCHI: GTP cyclohydrolase I.

Dihydroneopterin aldolase
DHNA  catalyzes the reversible conversion of DHNP to 6-hydroxymethyl-7,8-dihydropterin (HP) and glycolaldehyde ( Figure below )

Dihydroneopterin aldolase. 
(A) Possible catalytic reactions for dihydroneopterin aldolase proposed by Czekster et al. 
(B) Octameric structure of SaDHNA. The bent black line indicates the interaction surface between the two tetramers that form the octameric structure of DHNA.
(C) Active site formation of dihydroneopterin aldolase by two adjacent protomers.
(D) First series of compounds identified against SaDHNA based on a high-throughput x-ray crystallography campaign by Sanders et al.
(E) Optimization of the 8-aminopurine analog that rendered compounds with high affinity to SaDHNA. SaDHNA: Staphylococcus aureus dihydroneopterin aldolase.

DHNA is also reported to catalyze the epimerization at the 2-carbon to produce DHMP. The DHNA protomer contains a four-stranded β-sheet and four α-helices, two of them long and two shorter, a feature not frequently described in other structures (Figure B & C). There are four active sites per DHNA tetramer; they are situated on the external face of the β-barrel, at the interface between two protomers of the tetramer, with contributions of amino acids from two adjacent molecules (Figure C above). Since the enzymatic catalysis does not involve cofactors, the active site is located in a narrow, deep and highly negatively charged pocket whose amino acid residues are crucial for recognition of the pterin ring of the substrate. A lysine, which acts as a general base, and a tyrosine, which has a key role in the protonation of the enol intermediate, are essential for the catalytic mechanism and are conserved in almost all DHNA sequences.

6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase
HPPK is an ATP-binding enzyme which transfers a pyrophosphoryl group to 6-hydroxymethyl-7,8-dihydropterin (HP) and produces 6-hydroxymethyl-7,8-dihydropterin pyrophosphate (HPPP) and AMP (Figure A below).

(A) Catalytic mechanism of HPPK proposed by Blaszczyk et al. 
(B) Structure of EcHPPK in complex with 6-hydroxymethyl-7,8-dihydropterin and an analog of ATP (AMPCPP) indicating the position of the three essential loops for the activity of the enzyme. 
(C) Schematic representation of the dynamic movements of the essential Loop1 (yellow), -2 (pink) and -3 (green) during the binding of substrate and catalysis of HPPK.
(D) Best inhibitors identified against EcHPPK and SaHPPK using different approaches. AMPCPP: Methyleneadenosine 5’-triphosphate; HPPK: 6-Hydroxymethyl-7,8-dihydropterin pyrophosphokinase; EcHPPK: Escherichia coli HPPK; SaHPPK: Staphylococcus aureus HPPK.

The structure of HPPK from E. coli (EcHPPK) revealed that this enzyme is monomeric and has a fold composed of a β-sandwich, or a three-layered αβα fold, similar to that of the ribosomal S6 protein and nucleoside diphosphate kinase (Figure B above).

Dihydropteroate synthase
This enzyme catalyzes the condensation of pABA and HPPP to produce dihydropteroate through the formation of a carbon–nitrogen bond (Figure A below)

Dihydropteroate synthase. 
(A) Catalytic mechanism proposed for dihydropteroate synthase based on the structure of Bacillus anthracis and Yersinia pestis DHPS by Yun et al. 
(B) Examples of classical dihydropteroate synthase inhibitors: prontosil and sulfamethoxazole. 
(C) Dimeric structure of Staphylococcus aureus dihydropteroate synthase.
(D) Superposition of S. aureus (green) and Y. pestis dihydropteroate synthase (salmon) indicating that sulfamethoxazole binds in the p-aminobenzoic acid binding site.
(E) MANIC and the series of compounds identified by Zhao et al. [88] through a structure-based design campaign against B. anthracis DHPS.

Dihydrofolate synthetase/folylpolyglutamate synthetase

Dihydrofolate synthetase/folylpolyglutamate synthetase.
(A) Overall structure of monomeric Mycobacterium tuberculosis dihydrofolate synthetase/folylpolyglutamate synthetase indicating the two domains. 
(B) Surface representation of the two possible binding sites for Escherichia coli dihydrofolate synthetase/folylpolyglutamate synthetase. The ATP binding site bound with ADP and the folate binding site bound with phosphorylated dihydropteroate are shown in blue and green, respectively (PDB: 1W78). 
(C) Surface representation of open (blue) and closed (green) active site conformations of E. coli folylpolyglutamate synthetase/dihydrofolate synthetase; the cavity is marked by a red dashed circle.

Dihydrofolate reductase
Dihydrofolate reductase, or DHFR, is an enzyme that reduces dihydrofolic acid to tetrahydrofolic acid, using NADPH as electron donor. Tetrahydrofolic acid can then be converted to the kinds of tetrahydrofolate cofactors used in one-carbon transfer chemistry. Tetrahydrofolate is a methyl group shuttle required for the de novo synthesis of purines, thymidylic acid, and certain amino acids. Found in all organisms, DHFR has a critical role in regulating the amount of tetrahydrofolate in the cell. Tetrahydrofolate and its derivatives are essential for purine and thymidylate synthesis, which are important for cell proliferation and cell growth. 22

Dihydrofolate reductase. 
(A) Examples of classical dihydrofolate reductase inhibitors. 
(B) Structure of Escherichia coli dihydrofolate reductase in complex with methotrexate. 
(C) Superposition of three different conformational states of E. coli dihydrofolate reductase based on the conformation of the Met20 loop.
(D) Superposition of open (orange) and closed (green) conformation of M. tuberculosis DHFR. The black arrow indicates that in the open position, the nicotinamide group is out of the active site and disordered. The green and orange arrows indicate different distances between the loop and adenosine binding subdomain, respectively in the closed and open conformations of M. tuberculosis DHFR.
(E) Superposition between the open and closed conformations of M. tuberculosis DHFR in complex with trimethoprim, indicating that the loss of the π-interaction between the nicotinamide ring of NADPH and the pyrimidine ring of trimethoprim possibly causes a change in the conformation of the ligand in the active site and also a decrease in affinity.

S-adenosylmethionine (adoMet or SAM)
S-Adenosylmethionine (SAM) is the most prolific donor of one-carbon groups in biosynthetic reactions. Its formation is catalyzed by methionine adenosyltransferase. It costs three high-energy phosphate bonds to make SAM. 34 All known DNA methylases use S-adenosylmethionine as a methyl group donor. 36

The particularly important feature of SAM is that it donates methyl groups (we call them "active ~CH3") to a large number of acceptors, including DNA, RNA, phospholipids, and many proteins. Donation of these methyl groups is part of a small cycle.

The step in which methionine is re-generated requires two important vitamin derivatives:

Methyl-cobalamin, a derivative of vitamin B12, which donates the methyl group to homocysteine
N5-methyl-tetrahydrofolate, which donates its methyl group to cobalamin and allows the reaction to continue

Dietary or other deficiencies of either vitamin can cause serious problems. Notice that all three of the one-carbon carriers are involved in this cycle. SAM is also used to make cysteine.

S-adenosylmethionine (SAM or AdoMet) is a conjugate of nucleotide adenosine and amino acid methionine, two ubiquitous biological compounds that almost certainly were present in the common ancestor of living cells and may have been found in the prebiotic environment on Earth, predating the origin of Life itself 35  SAM is an essential metabolic intermediate in every studied cellular life form, and each cellular organism has several SAM-utilizing enzymes. One relatively well-understood biological role of SAM is to donate methyl groups for covalent modification of different substrates – from as simple as oxidized arsenic, chloride, bromide, and iodine ions [2-4], to as complex as rRNA, tRNA, and essential proteins, whose methylation status can serve as a regulatory signal for maturation and control interactions with other macromolecules. 

S-Adenosylmethionine is a common cofactor involved in methyl group transfers, transsulfuration, and aminopropylation 32 

SAM is a substrate of methyltransferases in a variety of methyl-donor reactions, such as the formation of phosphatidylcholine from phosphatidylethanolamine, DNA methylation, and methylation of Arg and Lys residues in the regulation of DNA-histone interactions in chromatin. 33

S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. 28 Reactions using AdoMet are among the most abundant processes taking place in any cell. 29 The routes in which the AdoMet-consuming reactions are involved allow the synthesis of a large variety of compounds, as well as the control of cell function (i.e., epigenetic modifications). This wide use of AdoMet derives from the variety of groups that this molecule is able to donate, with methyl group donation being the main consumer of the compound.

S-adenosylmethionine (adoMet or SAM) synthesis
It is important to note the role of methionine itself in methylation reactions. The enzyme S-adenosylmethionine synthase catalyzes the reaction of methionine with ATP to form S-adenosylmethionine, or SAM (Figure below).

The synthesis of S-adenosylmethionine (SAM) and its fates.

S-adenosylmethionine synthetases, also called methionine adenosyltransferases (MAT), are the only enzymes known to synthesize AdoMet, in a rather unusual reaction that occurs in two steps. As a methyl donor, SAM allows DNA methylation. Once DNA is methylated, the affected genes are switched off; S-adenosylmethionine can therefore be considered to control gene expression. 31

S-adenosylmethionine synthase 2, tetramer 

Methionine adenosyltransferases (MAT) are the family of enzymes that synthesize the main biological methyl donor, S-adenosylmethionine. 30 Methionine is a non-polar amino acid characterized by the presence of a methyl group attached to a sulfur atom located in its side chain. In addition to its role in protein synthesis, large amounts of this amino acid are used for the synthesis of S-adenosylmethionine (AdoMet) by methionine adenosyltransferases (MAT) in a reaction that is the rate-limiting step of the methionine cycle ( see figure below )

The mammalian methionine cycle and related pathways
The figure shows a scheme of the hepatic methionine cycle and some of the related pathways. Methionine is converted to S-adenosylmethionine (AdoMet) by methionine adenosyltransferases (MAT); this compound can be used by a multitude of enzymes such as methyltransferases (MTases), SAM radical proteins  and AdoMet decarboxylase ( AdoMetDC ). Polyamine synthesis occurs with methylthioadenosine (MTA) production, a compound that can be reused for methionine synthesis by the methionine salvage pathway. On the other hand, the action of MTases renders methylated products and S-adenosylhomocysteine (AdoHcy) that can be hydrolyzed by AdoHcy hydrolase (SAHH) to adenosine and homocysteine (Hcy). This reaction is reversible and favors AdoHcy synthesis. Hcy can be metabolized through the trans-sulfuration pathway by the consecutive action of cystathionine β synthase (CBS) and cystathionine γ lyase (CγL) rendering cysteine for glutathione synthesis, among other purposes. In addition, Hcy can also serve in resynthesis of methionine by two reactions catalyzed by methionine synthase (MS) and betaine homocysteine methyltransferase (BHMT). Some of these steps can be modulated by metabolites synthesized in these pathways and dashed lines indicate the most relevant.

S-adenosylmethionine synthetase, also known as methionine adenosyltransferase (MAT), catalyzes the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion is formed.

MATs from various organisms contain ∼400-amino acid polypeptide chains. We have recently found that the protein sequences comprise two categories, the extensively studied eucaryal-bacterial type (encoded by a catalytic subunit denoted α) and the archaeal type (encoded by a subunit that we denote γ). The sequences of the two classes are widely diverged; e.g., there is only 22% identity.
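A figure such as "22% identity" is obtained by aligning two sequences and counting the aligned positions that carry the same residue. The sketch below shows one common way of doing that count. It is an illustration only: the function `percent_identity` is hypothetical, the two toy strings are made up and are not MAT sequences, and the choice to ignore gap columns in the denominator is just one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Made-up, pre-aligned toy sequences (NOT real MAT sequences).
TOY_A = "MKT-AYIAKQR"
TOY_B = "MRTLAY-GKHR"

if __name__ == "__main__":
    print(f"{percent_identity(TOY_A, TOY_B):.1f}% identity")
```

In the toy pair, nine columns are gap-free and six of them match. Real comparisons depend on the alignment method and on which denominator convention is chosen, so published identity values should be read together with the method used.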

This is, amongst many others, one more indication that common ancestry of the three domains of life is a failed hypothesis.

Tetrahydrofolate can carry a methyl group on its N-5 atom, but its transfer potential is not sufficiently high for most biosynthetic methylations. Rather, the activated methyl donor is usually S-adenosylmethionine, which is synthesized by the transfer of an adenosyl group from ATP to the sulfur atom of methionine.

a Transamination is a chemical reaction that transfers an amino group to a ketoacid to form new amino acids. This pathway is responsible for the deamination of most amino acids. This is one of the major degradation pathways which convert essential amino acids to nonessential amino acids (amino acids that can be synthesized de novo by the organism). 3
Transamination in biochemistry is accomplished by enzymes called transaminases or aminotransferases. α-ketoglutarate acts as the predominant amino-group acceptor and produces glutamate as the new amino acid.

b Pyridoxal phosphate (PLP, pyridoxal 5'-phosphate, P5P), the active form of vitamin B6, is a coenzyme in a variety of enzymatic reactions. The Enzyme Commission has cataloged more than 140 PLP-dependent activities, corresponding to ~4% of all classified activities. The versatility of PLP arises from its ability to covalently bind the substrate, and then to act as an electrophilic catalyst, thereby stabilizing different types of carbanionic reaction intermediates. 5 PLP acts as a coenzyme in all transamination reactions, and in certain decarboxylation, deamination, and racemization reactions of amino acids.

c In organic chemistry, an aldimine is an imine that is an analog of an aldehyde. As such, aldimines have the general formula R–CH=N–R'. Aldimines are similar to ketimines, which are analogs of ketones.
An important subset of aldimines are the Schiff bases, in which the substituent on the nitrogen atom (R') is an alkyl or aryl group (i.e. not a hydrogen atom) 8

d Thymidine monophosphate (TMP), also known as thymidylic acid (conjugate base thymidylate), deoxythymidine monophosphate (dTMP), or deoxythymidylic acid (conjugate base deoxythymidylate), is a nucleotide that is used as a monomer in DNA. It is an ester of phosphoric acid with the nucleoside thymidine. dTMP consists of a phosphate group, the pentose sugar deoxyribose, and the nucleobase thymine. Unlike the other deoxyribonucleotides, thymidine monophosphate often does not contain the "deoxy" prefix in its name; nevertheless, its symbol often includes a "d" ("dTMP") 13

e Guanosine-5'-triphosphate (GTP) is a purine nucleoside triphosphate. It is one of the building blocks needed for the synthesis of RNA during the transcription process. Its structure is similar to that of the guanine nucleobase, the only difference being that nucleotides like GTP have a ribose sugar and three phosphates, with the nucleobase attached to the 1' and the triphosphate moiety attached to the 5' carbons of the ribose. 15

f Formaldehyde (systematic name methanal) is a naturally occurring organic compound with the formula CH2O (H-CHO). It is the simplest of the aldehydes (R-CHO). The common name of this substance comes from its similarity and relation to formic acid. 23 Formaldehyde was the first polyatomic organic molecule detected in the interstellar medium. Since its initial detection in 1969, it has been observed in many regions of the galaxy. Because of the widespread interest in interstellar formaldehyde, it has recently been extensively studied, yielding new extragalactic sources.
"The poisonous chemical formaldehyde may have helped create the organic compounds present in the universe that gave rise to life, new research suggests." 24

g In the chemical sciences, methylation denotes the addition of a methyl group on a substrate, or the substitution of an atom (or group) by a methyl group. Methylation is a form of alkylation, with a methyl group, rather than a larger carbon chain, replacing a hydrogen atom. These terms are commonly used in chemistry, biochemistry, soil science, and the biological sciences. 26 A methyl group is an alkyl derived from methane, containing one carbon atom bonded to three hydrogen atoms — CH3. 27 Such hydrocarbon groups occur in many organic compounds. It is a very stable group in most molecules. While the methyl group is usually part of a larger molecule, it can be found on its own in any of three forms: anion, cation or radical. The anion has eight valence electrons, the radical seven and the cation six.
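
The electron counts given above for the three forms of the methyl group can be checked with simple arithmetic. This is a minimal sketch, assuming the usual valence-electron numbers (4 for carbon, 1 for hydrogen); the function and variable names are only illustrative.

```python
# Valence electrons contributed by each neutral atom.
VALENCE_ELECTRONS = {"C": 4, "H": 1}


def valence_electron_count(atoms: dict, charge: int) -> int:
    """Total valence electrons of a species: sum over atoms, minus the net charge."""
    neutral_total = sum(VALENCE_ELECTRONS[element] * count for element, count in atoms.items())
    return neutral_total - charge  # a negative charge adds electrons


METHYL = {"C": 1, "H": 3}

if __name__ == "__main__":
    for name, charge in [("anion", -1), ("radical", 0), ("cation", +1)]:
        print(f"methyl {name}: {valence_electron_count(METHYL, charge)} valence electrons")
```

The result reproduces the eight, seven, and six electrons stated in the text.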


1. Berg, Biochemistry, 5th ed., page 973
4. Stryer, Biochemistry, 5th ed., page 713
25. Stryer, Biochemistry, 8th ed., page 724
33. Garrett, Biochemistry, 6th ed., page 1063
36. Lehninger, Principles of Biochemistry, 5th ed., page 292

Last edited by Admin on Fri Aug 03, 2018 8:40 am; edited 28 times in total



In Narnia, we are transported to a fantasy land. I sometimes feel that God is taking my hand and taking me for a walk to see what he has done to make life possible, into a reality more bewildering than anyone's wildest dreams. As I move forward in my investigation, I step into new territory, and what I discover leaves me speechless.

The ability to transfer just ONE SINGLE CARBON atom is absolutely essential for the metabolism of the amino acids glycine, serine, methionine, and histidine, and for the biosynthesis of purines and pyrimidines - which constitute DNA molecules, the information carriers of cells.

And in order for biological cells to achieve this transfer, they require tetrahydrofolate cofactors, consisting of three moieties. Folates are among the most complex pterin coenzymes. The folate pathway is central to any study related to DNA methylation, dTMP synthesis or purine synthesis, and as such, to the origin of life itself, since without amino acids, and DNA - no life.

Annexed below, you can see the Folate biosynthesis pathway - at each branch point, there is a ramification of a web of complex enzymes which work in a coordinated, orchestrated, and interconnected way together to produce just this Tetrahydrofolate cofactor. To make things even more complex, the two essential precursors of folate biosynthesis are 4-aminobenzoate (a product of shikimate biosynthesis pathway) and GTP. To give you an idea about the complexity of the shikimate metabolic pathway, you can have a look here:

The central pathway uses six extremely complex enzymes, which I describe in detail in the article below. Moral of the story: These metabolic networks, enzymes, and co-factors are what life depends upon, and could hardly be explained by any causal mechanism other than a super-intelligent creator.

Cheating in secular science papers: claims of evolutionary mechanisms being in place before life was fully set up and self-replicating.
The deceptive narrative of secular science papers of origins is evident between the lines when carefully analyzed.

Anyone who follows my posts over a certain period of time will observe that I post periodically about my findings in molecular biology. This follows a logic, since I am writing a book on the intracellular world.

Origins - what cause explains best our existence, and why?
Molecular biochemistry, biology, the origin of life and biodiversity, systematically analyzed from an epistemologically universal perspective

Slowly, I am unraveling the unexpected, awe-inspiring mechanisms that cells use to produce life. Basically, all basic building blocks used in life are synthesized inside the Cell by extremely complex, irreducible, interdependent molecular machines and factories. Among these building blocks are:

amino acids,
fatty acids,
and nucleotides.

Now I am investigating in depth how Cells synthesize amino acids. Over the last month, that has led me to examine at length the bewildering, unfathomably complex machinery that transforms dinitrogen in the atmosphere (78% of the air we breathe is dinitrogen) into ammonia and nitrate, which are incorporated into cells to make amino acids, which in turn are used to make proteins, the molecular workhorses of the Cell. After nitrogenase has transformed dinitrogen into ammonia by an enormously energy-consuming process, 3 enzymes, amongst them glutamate dehydrogenase, a veritable molecular supercomputer, convert the inorganic ammonium ion into the α-amino nitrogen of amino acids.

What Is the Metabolic Fate of Ammonium?

The key entry point is the amino acid glutamate. Glutamate dehydrogenase (GDH) catalyzes the reductive amination of α-ketoglutarate to yield glutamate. Alpha-ketoglutarate (AKG) is a nitrogen scavenger and a source of glutamate and glutamine - which are the nitrogen donors in a wide range of biosynthetic reactions.

The amino acid and nucleotide biosynthetic pathways make repeated use of the biological cofactor pyridoxal phosphate.

Now comes the key point of this post.

What is the origin of Pyridoxal phosphate enzymes?

The authors of the following science paper claim that:

The pyridoxal-5'-phosphate (PLP)-dependent or vitamin B6-dependent enzymes that catalyze manifold reactions in the metabolism of amino acids belong to no fewer than four evolutionarily independent protein families.

My comment: It is remarkable that the authors make no distinction and do not recognize that pyridoxal phosphate (vitamin B6) and the respective enzyme families it interacts with had to be fully operational in the Last Universal Common Ancestor (LUCA), or in other words, when life began, and could therefore not be the result of evolution. The fact that B6 interacts with four independent protein families challenges naturalistic explanations even further. Convergent emergence means that the same enzymatic reaction would have had to emerge independently and separately four times. That is extremely unlikely without guidance, direction, and teleology, that is, goal-orientedness.

Further, the authors go on to claim:
"The multiple evolutionary origins and the essential mechanistic role of PLP in these enzymes argue for the cofactor having arrived on the evolutionary scene before the emergence of the respective apoenzymes and having played a dominant role in the molecular evolution of the B6 enzyme families."

Why would natural occurrences on a prebiotic earth have produced co-factors (in the lock-and-key comparison, the key alone) without the respective apo-enzymes (the lock) to interact with? Did they emerge without any function at all, just waiting for the respective proteins to arrive later on the scene, and then eagerly seek them out and start the molecular interaction?

Recycling, and the orchestration of anabolism and catabolism, evidence of natural forces, or design?

Recycling, that is, the organized decomposition of used material into basic building blocks, its separation, and its organized re-use, is an activity performed exclusively by intelligence, namely by us humans, who have figured out the know-how. And the more we practice it, the more sustainable and less destructive our activities are for the planet we live on.

In biological cells, recycling is a highly orchestrated, complex, and coordinated process. It is called catabolism. In anabolism, metabolic networks construct molecules from smaller units, while in catabolism, a set of metabolic pathways breaks down molecules into smaller units that are either oxidized to release energy or used in other anabolic reactions. Interestingly, anabolism and catabolism occur simultaneously in the cell. The conflicting demands of concomitant catabolism and anabolism are managed by cells in two ways. First, the cell maintains tight and separate regulation of both catabolism and anabolism, so metabolic needs are served in an immediate and orderly fashion. Second, competing metabolic pathways are often localized within different cellular compartments. Isolating opposing activities within distinct compartments, such as separate organelles, avoids interference between them.

Question: How could unguided, random, non-goal-oriented processes on early earth have brought such a system into being? Regulation, order, management, organized separation, and compartmentalization are known to be brought about exclusively by intelligence. No exception is known.

A rather limited collection of simple precursor molecules is sufficient to provide for the biosynthesis of virtually any cellular constituent, be it protein, nucleic acid, lipid, or polysaccharide. Certain of the central pathways of intermediary metabolism, such as the citric acid cycle, and many metabolites of other pathways have dual purposes: they serve in both catabolism and anabolism. Remarkably, because they run in opposite metabolic directions, such pathways must be independently regulated.

Question: How could such regulation have emerged in a slow, gradual, stepwise, trial-and-error fashion, given that regulation is essential? Would the independent regulation of both directions not have had to emerge simultaneously, considering that the reverse pathway is slightly different and differently adjusted in order to work properly?

If catabolism and anabolism passed along the same set of metabolic tracks, equilibrium considerations would dictate that slowing the traffic in one direction by inhibiting a particular enzymatic reaction would necessarily slow traffic in the opposite direction. Independent regulation of anabolism and catabolism can be accomplished only if these two contrasting processes move along different routes or, in the case of shared pathways, the rate-limiting steps serving as the points of regulation are catalyzed by enzymes that are unique to each opposing sequence.
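
The reasoning in the paragraph above can be illustrated with a deliberately simplified toy model. This is only a sketch under stated assumptions: simple mass-action rates, made-up rate constants and concentrations, and hypothetical function names. It is not a model of any real pathway. When one enzyme carries the traffic in both directions, inhibiting it scales both fluxes by the same factor. When each direction has its own enzyme, one direction can be slowed without touching the other.

```python
def shared_enzyme_fluxes(active_fraction, s=1.0, p=1.0, kf=2.0, kr=1.0):
    """One enzyme catalyzes S <-> P; its active fraction scales both directions."""
    forward = active_fraction * kf * s
    reverse = active_fraction * kr * p
    return forward, reverse


def separate_enzyme_fluxes(active_forward, active_reverse, s=1.0, p=1.0, kf=2.0, kr=1.0):
    """Each direction has its own enzyme, so each can be inhibited on its own."""
    forward = active_forward * kf * s
    reverse = active_reverse * kr * p
    return forward, reverse


if __name__ == "__main__":
    print("shared enzyme, uninhibited: ", shared_enzyme_fluxes(1.0))
    print("shared enzyme, 50% inhibited:", shared_enzyme_fluxes(0.5))
    print("separate enzymes, only the forward one 50% inhibited:",
          separate_enzyme_fluxes(0.5, 1.0))
```

In the shared case the ratio of forward to reverse flux stays the same however strongly the enzyme is inhibited, which is the point the textbook passage makes about equilibrium considerations.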

It is evident that implementing such a system that works both ways requires foresight, the setting of specific goals, and teleology, which is precisely what naturalism must avoid in order to be true.

The spatial compartmentalization of metabolic pathways within cells provides important advantages, one of which is isolating competing pathways from one another. Cells and organisms also exhibit temporal compartmentalization of their metabolic pathways. That is, metabolic pathways may be turned on and off in a time-dependent and/or cyclic fashion. For example, the metabolism of many organisms—microbes, animals, and plants—is regulated in synchrony with the 24-hour cycle of day and night, a pattern called circadian rhythmicity and often referred to as the biological clock.

Question: How could such a circadian rhythm have emerged in a random manner? It could not be explained by evolution since: 

" The 24-hour circadian clock found in human cells is the same as that found in algae and dates back millions of years to early life on Earth, and is linked to DNA and gene activity "

and: " Because light and/or varying nutrient availability represent key signals regarding the transitory nature of the environment, and organisms have evolved and adapted to exploit the information in such signals. ".

Remarkably, there is not only a 24h circadian clock, but also a 7-day Circadian Clock, which coincides with God's setup of six days, and the Sabbath, the rest on the seventh day:

The seven-day Circadian Rhythms: "Nature's Intricate Clockwork"

Metabolic networks are extraordinarily complex. The Human Metabolome Database provides data on more than 40,000 metabolites in cells and fluids (blood, urine, and so on) of the human body. The metabolomes of plants are even more complex, with estimates suggesting hundreds of thousands of different metabolites across the plant kingdom. Metabolomic assays must be able to resolve and discriminate this extraordinary array of small molecules.

Atheists can tell me as much and as often as they want that I lack credulity towards their narrative that such complex molecular networks could have emerged randomly on early Earth. I will return the favor: It requires a lot of faith/credulity to believe that such complexity could be the product of random chemical interactions on a prebiotic earth.

Last edited by Admin on Thu Jul 26, 2018 5:00 pm; edited 1 time in total


31 Proteins, the complex nanomachines of the Cell on Tue Sep 11, 2018 1:42 pm


Proteins, the complex nanomachines of the Cell

In the same way as machine parts must be precisely shaped and fine-tuned to interact with other parts in order to perform a useful function, the correct 3D form of proteins is essential for function. That form, and how proteins fold, depends on the specified complex amino acid sequence and its chemical structure, and is defined by the genetic code sequence. Proteins carry out a myriad of different tasks inside Cells. That's why they are called "the workhorses of the Cell".

Proteins are polymers found in all cells and play critical roles in nearly all life processes. The word protein comes from the Greek proteios (meaning of the first rank), which aptly describes their importance. Proteins account for about 50% of the organic material in a typical animal’s body.

Each amino acid contains a unique side chain, or R group, that has its own particular chemical properties. For example, aliphatic and aromatic amino acids are relatively nonpolar, which means they are less likely to associate with water. These hydrophobic (meaning water-fearing) amino acids are often buried within the interior of a folded protein. In contrast, the polar amino acids are hydrophilic (water-loving) and are more likely to be on the surface of a protein, where they can favorably interact with the surrounding water. The chemical properties of the amino acids and their sequences in a polypeptide are critical factors that determine the unique structure of that polypeptide.
Following gene transcription and mRNA translation, the end result is a polypeptide with a defined amino acid sequence. This sequence is the primary structure of a polypeptide. The Figure shows the primary structure of an enzyme called lysozyme, a relatively small protein containing 129 amino acids.

An example of a protein’s primary structure.
This is the amino acid sequence of the enzyme lysozyme, which contains 129 amino acids in its primary structure. As you may have noticed, the first amino acid is not methionine; instead, it is lysine. The first methionine residue in this polypeptide sequence is removed after or during translation. The removal of the first methionine occurs in many (but not all) proteins.
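
The distinction between hydrophobic and hydrophilic amino acids described above can be put into numbers with a hydropathy scale. The sketch below uses the published Kyte-Doolittle index, which is only one of several such scales. Treating every positive value as "hydrophobic" is a simplification made here for illustration (on this particular scale, the aromatic residues tryptophan and tyrosine come out slightly negative). The function names and the two short test segments are hypothetical and are not taken from lysozyme.

```python
# Kyte-Doolittle hydropathy index; higher values mean more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def classify_residue(residue: str) -> str:
    """Label a one-letter residue by the sign of its hydropathy value (a simplification)."""
    return "hydrophobic" if KYTE_DOOLITTLE[residue.upper()] > 0 else "hydrophilic"


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a peptide segment given in one-letter code."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in sequence.upper()) / len(sequence)


if __name__ == "__main__":
    # Hypothetical segments: one rich in nonpolar residues, one rich in charged/polar ones.
    for label, segment in [("core-like", "IVLFA"), ("surface-like", "KDESN")]:
        print(label, segment, round(mean_hydropathy(segment), 2))
```

A segment with a clearly positive average would be expected to sit in the interior of a folded protein more often than one with a clearly negative average.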

The individual polypeptides are called subunits of the protein, each of which has its own tertiary structure. The association of multiple subunits is the quaternary structure of a protein.

Amino acids are joined together by a dehydration reaction that links the carboxyl group of one amino acid to the amino group of another (Figure a below):

The chemistry of polypeptide formation.
Polypeptides are polymers of amino acids. They are formed by linking amino acids via dehydration reactions to make peptide bonds. Every polypeptide has an amino end, or N-terminus, and a carboxyl end, or C-terminus.
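
Because every peptide bond is formed by a dehydration reaction, a chain of n amino acids contains n - 1 peptide bonds and has lost n - 1 water molecules. The following minimal sketch works this out. The masses are approximate average molar masses of the free amino acids in g/mol, only a handful of residues are included, and the names are illustrative.

```python
WATER_MASS = 18.015  # g/mol, released once per peptide bond

# Approximate average molar masses of free amino acids (g/mol); small subset only.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
    "K": 146.19,  # lysine
}


def peptide_bond_count(n_residues: int) -> int:
    """A linear chain of n residues has n - 1 peptide bonds."""
    return max(n_residues - 1, 0)


def polypeptide_mass(sequence: str) -> float:
    """Sum of free amino acid masses minus one water per peptide bond."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = sum(FREE_AMINO_ACID_MASS[res] for res in sequence.upper())
    return total - peptide_bond_count(len(sequence)) * WATER_MASS


if __name__ == "__main__":
    print("Gly-Ala dipeptide:", round(polypeptide_mass("GA"), 3), "g/mol")
    print("peptide bonds in a 129-residue chain:", peptide_bond_count(129))
```

For the 129-residue lysozyme mentioned above, this means 128 peptide bonds and 128 water molecules released during synthesis.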

The primary structure of a typical polypeptide may be a few hundred or even a couple of thousand amino acids in length. Within a living cell, a newly made polypeptide is not usually found in a long linear state for a significant length of time. Rather, to become a functional unit, most polypeptides quickly adopt a compact three-dimensional structure. The folding process begins while the polypeptide is still being translated. The progression from the primary structure of a polypeptide to the three-dimensional structure of a protein is dictated by the amino acid sequence within the polypeptide. In particular, the chemical properties of the amino acid side chains play a central role in determining the folding pattern of a protein. In addition, the folding of some polypeptides is aided by chaperones—proteins that bind to polypeptides and facilitate their proper folding. This folding process of polypeptides is governed by the primary structure and occurs in multiple stages (Figure below ). 

The first stage involves the formation of a regular, repeating shape known as a secondary structure. The two types of secondary structures are the α helix and the β sheet (Figure b above). A single polypeptide may have some regions that fold into an α helix and other regions that fold into a β sheet. Because of the geometry of secondary structures, certain amino acids, such as glutamic acid, alanine, and methionine, are good candidates to form an α helix. Other amino acids, such as valine, isoleucine, and tyrosine, are more likely to be found in a β-sheet conformation. Secondary structures within polypeptides are primarily stabilized by the formation of hydrogen bonds between atoms that are located in the polypeptide backbone. In addition, some regions do not form a repeating secondary structure. Such regions have shapes that look very irregular in their structure because they do not follow a repeating folding pattern. The short regions of secondary structure within a polypeptide are folded relative to each other to make the tertiary structure of a polypeptide. As shown in Figure c, α-helical regions and β-sheet regions are connected by irregularly shaped segments to determine the tertiary structure of the polypeptide.

The folding of a polypeptide into its secondary and then tertiary conformation can usually occur spontaneously because it is a thermodynamically favorable process. The structure is determined by various interactions, including the tendency of hydrophobic amino acids to avoid water, ionic interactions among charged amino acids, hydrogen bonding among amino acids in the folded polypeptide, and weak bonding known as van der Waals interactions. A protein is a functional unit that can be composed of one or more polypeptides. Some proteins are composed of a single polypeptide. Many proteins, however, are composed of two or more polypeptides that associate with each other to make a functional protein with a quaternary structure (Figure d). 

Five factors are critical for protein folding and stability
1. Hydrogen bonds
The large number of weak hydrogen bonds within a polypeptide and between polypeptides adds up to a collectively strong force that promotes protein folding and stability. As we have already learned, hydrogen bonding is a critical determinant of protein secondary structure and also is important in tertiary and quaternary structure.
2. Ionic bonds and other polar interactions
Some amino acid side chains are positively or negatively charged. Positively charged side chains may bind to negatively charged side chains via ionic bonds. Similarly, uncharged polar side chains in a protein may bind to ionic amino acids. Ionic bonds and polar interactions are particularly important in tertiary and quaternary structure.
3. Hydrophobic effect
Some amino acid side chains are nonpolar. These amino acids tend to exclude water. As a protein folds, the hydrophobic amino acids are likely to be found in the center of the protein, minimizing contact with water. As mentioned, some proteins have stretches of nonpolar amino acids that anchor them in the hydrophobic portion of membranes. The hydrophobic effect plays a major role in tertiary and quaternary structures.
4. van der Waals forces
Atoms within molecules have weak attractions for each other if they are an optimal distance apart. This optimal distance is called the van der Waals radius, and the weak attraction is the van der Waals force. If two atoms are very close together, their electron clouds will repel each other. If they are far apart, the van der Waals force will diminish. The van der Waals forces are particularly important for tertiary structure.
5. Disulfide bridges
The side chain of the amino acid cysteine contains a sulfhydryl group (—SH), which can react with a sulfhydryl group in another cysteine side chain. The result is a disulfide bridge or bond, which links the two amino acid side chains together (—S—S—). Disulfide bonds are covalent bonds that can occur within a polypeptide or between different polypeptides. Though other forces are usually more important in protein folding, the covalent nature of disulfide bonds can help to stabilize the tertiary structure of a protein.

Factors that influence protein folding and stability

The first four factors just described are also important in the ability of different proteins to interact with each other. Many cellular processes involve steps in which two or more different proteins interact with each other. For this to occur, the surface of one protein must bind to the surface of the other. Such binding is usually very specific. The surface of one protein precisely fits into the surface of another. Such protein-protein interactions are critically important so that cellular processes can occur in a series of defined steps. In addition, protein-protein interactions are important in building cellular structures that provide shape and organization to cells.

The Defining Concept of Biochemistry Is “Molecular Recognition Through Structural Complementarity”
Structural complementarity is the means of recognition in biomolecular interactions. The complicated and highly organized patterns of life depend on the ability of biomolecules to recognize and interact with one another in very specific ways. Such interactions are fundamental to metabolism, growth, replication, and other vital processes. The interaction of one molecule with another, a protein with a metabolite, for example, can be most precise if the structure of one is complementary to the structure of the other, as in two connecting pieces of a puzzle or, in the more popular analogy for macromolecules and their ligands, a lock and its key. This principle of structural complementarity is the very essence of biomolecular recognition. Structural complementarity is the significant clue to understanding the functional properties of biological systems. Biological systems, from the macromolecular level to the cellular level, operate via specific molecular recognition mechanisms based on structural complementarity: A protein recognizes its specific metabolite, an antibody recognizes its antigen, a strand of DNA recognizes its complementary strand, sperm recognize an egg. All these interactions involve structural complementarity between molecules. 1

It's remarkable how the author does not avoid teleology in his explanation. Recognizing something depends on volition, which biomolecules definitely lack. The principle of structural complementarity extends to ALL of molecular biology and is a core reason why non-intelligent causal mechanisms are far too unspecific to account for the origin of proteins. It is also the very core of Behe's argument of irreducible complexity: Most proteins depend on multiple interlocked and interdependent subunits, fine-tuned and adapted to each other in structure and form, which work together in a coordinated manner, provoking conformational changes and a vast array of different reactions. One subunit has no function in the absence of the other, in the same manner as a lock has no function without the key. Recognition of the other subunit(s) must be pre-visualized, thought of, invented, and implemented accordingly. And all that depends on intelligence.

Proteins contain functional domains within their structures
Modern research into the functions of proteins has revealed that many proteins have a modular design. This means that portions within proteins, called modules, motifs, or domains, have distinct structures and functions. These units of amino acid sequences have been duplicated during evolution so that the same kind of domain may be found in several different proteins. When the same domain is found in different proteins, the domain has the same three-dimensional shape and performs a function that is characteristic of that domain. As an example, Figure below shows a member of a family of related proteins that are known to play critical roles in regulating how certain genes are turned on and off in living cells. 

The domain structure of a STAT protein

This protein bears the cumbersome name of signal transducer and activator of transcription (STAT) protein. Each domain of this protein is involved in a distinct biological function, a common occurrence in proteins with multiple domains. For example, one of the domains is labeled the SH2 domain (Figure above). Many different proteins contain this domain. It allows such proteins to recognize other proteins in a very specific way. The function of SH2 domains is to bind to tyrosine amino acids to which phosphate groups have been added by cellular enzymes. When an amino acid receives a phosphate group in this way, it is said to be phosphorylated (as is the protein in which the tyrosine exists). As might be predicted, proteins that contain SH2 domains all bind to phosphorylated tyrosines in the proteins they recognize. As a second example, a STAT protein has another domain called a DNA-binding domain. This portion of the protein has a structure that specifically binds to DNA. Overall, the domain structure of proteins enables them to have multiple, discrete regions, each with its own structure and purpose in the functioning of the protein.

Cellular Proteins are primarily responsible for the characteristics of living cells and an organism’s traits
Why is the genetic material largely devoted to storing the information to make proteins? To a great extent, the characteristics of a cell depend on the types of proteins that it makes. In turn, the traits of multicellular organisms are determined by the properties of their cells. Proteins perform a variety of functions critical to the life of cells and to the morphology and function of organisms. Some proteins are important in determining the shape and structure of a given cell. For example, the protein tubulin assembles into large cytoskeletal structures known as microtubules, which provide eukaryotic cells with internal structure and organization. Some proteins are inserted into the cell membrane and aid in the transport of ions and small molecules across the membrane. An example is a sodium channel that transports sodium ions into nerve cells. Another interesting category of proteins are those that function as biological motors, such as myosin, which is involved in the contractile properties of muscle cells. Within multicellular organisms, certain proteins function in cell signaling and cell surface recognition. For example, proteins, such as the hormone insulin, are secreted by endocrine cells and bind to the insulin receptor proteins found within the plasma membrane of target cells.

Many proteins are enzymes, which function to accelerate chemical reactions within the cell. Some enzymes assist in the breakdown of molecules or macromolecules into smaller units. These are known as catabolic enzymes and are important in utilizing cellular energy. In contrast, anabolic enzymes function in the synthesis of molecules and macromolecules. Throughout the cell, the synthesis of molecules and macromolecules relies on enzymes and accessory proteins. Ultimately, then, the construction of a cell greatly depends on its anabolic enzymes because these are required to synthesize all cellular macromolecules.

The impossible task to synthesize proteins on a prebiotic earth without external direction
Eliminative inductions argue for the truth of a proposition by arguing that competitors to that proposition are false. There was not sufficient nitrogen fixation on a prebiotic earth. The sufficiency of ammonia has also been brought into question. The origin of the sorting of right-handed DNA and left-handed amino acids on a prebiotic earth has been unsolved for decades. A recent science paper reported that the set of amino acids selected for use in life appears to be near ideal. Why the particular 20 amino acids were selected to be encoded by the Genetic Code remains a puzzle. This is nothing short of astounding. Why were they selected from amongst over 500 different ones known? Amino acid synthesis requires essential regulation. How could that have been achieved without evolution? Regulation requires a regulator - or - intelligence.

Lifeless matter has no teleological goal to regulate things. How the amino acids would and could have been bonded together in the correct manner without the Ribosome is another unsolved question. The probability is far higher that polymers would disintegrate, rather than the opposite. How could natural processes have foresight, which seems to be absolutely required, to "know" which amino acid sequences would provoke which forces, and how they would fold the protein structure so that it becomes functional for specific purposes within the cell? Let's consider that, in order to have a minimal functional living cell, at least 561 proteins and protein complexes would have to be fully set up, working, and interacting together to form a functional whole with all life-essential functions.

Many proteins require "help" proteins to fold correctly, including some that were essential for life to begin. How could natural nonintelligent mechanisms foresee the necessity of chaperones in order to reach a specific goal and result, namely functional proteins to make living organisms? Nonliving matter has no natural "drive", purpose, or goal to become living. The making of proteins to create life, however, is a multistep process of many parallel-acting complex metabolic pathways and production-line-like processes to make proteins and other life-essential products like lipids, carbohydrates etc. The right folding of proteins is just one of several essential processes needed to get a functional protein. But a functional protein on its own has no function unless correctly embedded through the right order of assembly at the right place.

Last but not least, this is probably one of the most screaming problems: for biological cells to make proteins, and to direct and insert them at the right place where they are needed, at least 25 unimaginably complex biosynthesis and production-line-like manufacturing steps are required. Each step requires extremely complex molecular machines composed of numerous subunits and co-factors, which in turn require their own processing procedures, which makes their origin an irreducible catch-22 problem.

Medical Research Council Unit for the Study of Molecular Biology, Cavendish Laboratory, Cambridge

The nature of protein synthesis: 
The basic dilemma of protein synthesis has been realized by many people, but it has been particularly aptly expressed by Dr A. L. Dounce (1956): My interest in templates, and the conviction of their necessity, originated from a question asked me on my Ph.D. oral examination by Professor J. B. Sumner. He enquired how I thought proteins might be synthesized. I gave what seemed the obvious answer, namely, that enzymes must be responsible. Professor Sumner then asked me the chemical nature of enzymes, and when I answered that enzymes were proteins or contained proteins as essential components, he asked whether these enzyme proteins were synthesized by other enzymes and so on ad infinitum. The dilemma remained in my mind, causing me to look for possible solutions that would be acceptable, at least from the standpoint of logic. The dilemma, of course, involves the specificity of the protein molecule, which doubtless depends to a considerable degree on the sequence of amino acids in the peptide chains of the protein. The problem is to find a reasonably simple mechanism that could account for specific sequences without demanding the presence of an ever-increasing number of new specific enzymes for the synthesis of each new protein molecule. It is thus clear that the synthesis of proteins must be radically different from the synthesis of polysaccharides, lipids, co-enzymes and other small molecules; that it must be relatively simple, and to a considerable extent uniform throughout Nature; that it must be highly specific, making few mistakes; and that in all probability it must be controlled at not too many removes by the genetic material of the organism.

Proteins: how they provide striking evidence of design
Proteins are evidence of intelligent design par excellence. Instructional/specified complex information is required to get the right amino acid sequence, which is essential to get functionality in a vast sequence space (amongst trillions of possible sequences, rare are the ones that provide function), and every protein is irreducibly complex in the sense that a minimal number of amino acids is required for each protein to exercise specific tasks. This constitutes an insurmountable hurdle for origin of life scenarios based on naturalistic hypotheses, since unguided random events are too unspecific to get functional sequences in a viable timespan. Another true smack-down is the fact that single proteins or enzymes by themselves confer no survival advantage at all, and have no function on their own. There is no reason why random RNA strands would become self-replicating. And even IF that were the case, there would be no utility for them unless at least 50 different precisely arranged and correctly interlinked enzymes and proteins, each with its specific function, formed a web of complex, just-right metabolic pathways, and were encapsulated in a complex membrane with gates and pores, in a precisely finely tuned and balanced homeostatic ambiance. Energy production and supply to each protein would also have to be fully set up right from the start.

The argument from the specified complexity of proteins
1. The number and sequence of amino acids in proteins, such as enzymes, are crucial.
2. Only specially-shaped forms (left-handed configurations) of each amino acid are functional in proteins.
3. Amino acids can be joined only by peptide bonds to form proteins.
4. To link together, each amino acid first must be activated by a specific enzyme.
5. Multiple special enzymes are required to bind messenger RNA to ribosomes before protein synthesis can begin or end.
6. Out of many details, even these few have specified complexity without which the proteins would not bear function.
7. An irreducibly complex system cannot be produced by an unguided yet orderly and sequentially correct aggregation of parts without external direction. Any precursor to an irreducibly complex system is by definition nonfunctional. An irreducibly complex molecular machine would have to arise as an integrated unit, fully functional from the beginning. It is almost universally conceded that such a sudden event would be irreconcilable with self-assembly without involving guiding intelligence.
8. Therefore, the origin of the minimal set of proteins to produce the first living organisms is best explained by the guiding hand of an intelligent creator.

Few of the many possible polypeptide chains will be useful to cells
Paul Davies:
‘Making a protein simply by injecting energy is rather like exploding a stick of dynamite under a pile of bricks and expecting it to form a house. You may liberate enough energy to raise the bricks, but without coupling the energy to the bricks in a controlled and ordered way, there is little hope of producing anything other than a chaotic mess.’ It is one thing to produce bricks; it is an entirely different thing to organize the building of a house or factory. If you had to, you could build a house using stones that you found lying around, in all the shapes and sizes in which they came due to natural causes. However, the organization of the building requires something that is not contained in the stones. It requires the intelligence of the architect and the skill of the builder. It is the same with the building blocks of life. Blind chance just will not do the job of putting them together in a specific way. Organic chemist and molecular biologist A.G. Cairns-Smith puts it this way: ‘Blind chance… is very limited… he can produce exceedingly easily the equivalent of letters and small words, but he becomes very quickly incompetent as the amount of organization increases. Very soon indeed long waiting periods and massive material resources become irrelevant.’

Bruce Alberts, Molecular Biology of the Cell:

Since each of the 20 amino acids is chemically distinct and each can, in principle, occur at any position in a protein chain, there are 20 x 20 x 20 x 20 = 160,000 different possible polypeptide chains four amino acids long, or 20^n different possible polypeptide chains n amino acids long. For a typical protein length of about 300 amino acids, a cell could theoretically make more than 10^390 different polypeptide chains. This is such an enormous number that to produce just one molecule of each kind would require many more atoms than exist in the universe. Only a very small fraction of this vast set of conceivable polypeptide chains would adopt a single, stable three-dimensional conformation, by some estimates less than one in a billion. And yet the vast majority of proteins present in cells adopt unique and stable conformations. How is this possible?
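
The arithmetic in the Alberts passage can be checked with a few lines of Python. This is only a sketch of the counting argument; the function names are illustrative and are not taken from the book.

```python
import math

def sequence_space(length, alphabet_size=20):
    """Number of distinct chains of `length` residues, each residue
    chosen freely from `alphabet_size` amino acids."""
    return alphabet_size ** length

def log10_sequence_space(length, alphabet_size=20):
    """Base-10 exponent of the sequence-space size (easier to read than the full integer)."""
    return length * math.log10(alphabet_size)

print(sequence_space(4))                      # 160000
print(round(log10_sequence_space(300), 1))    # 390.3, i.e. 20^300 is a bit more than 10^390
```

Working with the base-10 exponent rather than the raw number keeps the 300-residue case readable.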

The complexity of living organisms is staggering, and it is quite sobering to note that we currently lack even the tiniest hint of what the function might be for more than 10,000 of the proteins that have thus far been identified in the human genome. There are certainly enormous challenges ahead for the next generation of cell biologists, with no shortage of fascinating mysteries to solve.

Now comes Alberts' striking explanation of how the right sequence arose:

The answer lies in natural selection. A protein with an unpredictably variable structure and biochemical activity is unlikely to help the survival of a cell that contains it. Such proteins would therefore have been eliminated by natural selection through the enormously long trial-and-error process that underlies biological evolution. Because evolution has selected for protein function in living organisms, the amino acid sequence of most present-day proteins is such that a single conformation is extremely stable. In addition, this conformation has its chemical properties finely tuned to enable the protein to perform a particular catalytic or structural function in the cell. Proteins are so precisely built that the change of even a few atoms in one amino acid can sometimes disrupt the structure of the whole molecule so severely that all function is lost.

Proteins are not rigid lumps of material. They often have precisely engineered moving parts whose mechanical actions are coupled to chemical events. It is this coupling of chemistry and movement that gives proteins the extraordinary capabilities that underlie the dynamic processes in living cells.

It seems that natural selection is the key answer to any phenomenon in biology where there is no scientific evidence to support an empirical claim. Much has been written about the fact that natural selection cannot produce instructional codified information. Alberts' short explanation is a prime example of how secular scientists make, without hesitation, "just so" assertions without being able to provide a shred of evidence, just in order to maintain a paradigm on which the scientific establishment relies, where evolution is THE answer to almost every biochemical phenomenon. The fact is that precision, coded information, stability, interdependence and irreducible complexity are products of, and require, intelligent foresight, action, and guidance. Alberts also seems to forget that natural selection cannot occur before the first living cell began self-replicating. Several hundred proteins had to be already in place and fully operating in order to make even the simplest living organism start to live.
How rare, or common, are the functional sequences of amino acids among all the possible sequences of amino acids in a chain of any given length? 

Douglas Axe answered this question in 2004. Axe was able to make a careful estimate of the ratio of (a) the number of 150-amino-acid sequences that can perform that particular function to (b) the whole set of possible amino acid sequences of this length. Axe estimated this ratio to be 1 to 10^77. 3

This was a staggering number, and it suggested that a random process would have great difficulty generating a protein with that particular function by chance. But I didn't want to know just the likelihood of finding a protein with a particular function within a space of combinatorial possibilities. I wanted to know the odds of finding any functional protein whatsoever within such a space. That number would make it possible to evaluate chance-based origin-of-life scenarios, to assess the probability that a single protein—any working protein—would have arisen by chance on the early earth. 

Fortunately, Axe's work provided this number as well.17 Axe knew that in nature proteins perform many specific functions. He also knew that in order to perform these functions their amino-acid chains must first fold into stable three-dimensional structures. Thus, before he estimated the frequency of sequences performing a specific (beta-lactamase) function, he first performed experiments that enabled him to estimate the frequency of sequences that will produce stable folds. On the basis of his experimental results, he calculated the ratio of (a) the number of 150-amino-acid sequences capable of folding into stable "function-ready" structures to (b) the whole set of possible amino-acid sequences of that length. He determined that ratio to be 1 to 10^74. 

In other words, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every 10^74 attempts. 

When one considers that Robert Sauer was working on a shorter protein of 100 amino acids, Axe's number might seem a bit less prohibitively improbable. Nevertheless, it still represents a startlingly small probability. In conversations with me, Axe has compared the odds of producing a functional protein sequence of modest (150-amino-acid) length at random to the odds of finding a single marked atom out of all the atoms in our galaxy via a blind and undirected search. Believe it or not, the odds of finding the marked atom in our galaxy are markedly better (about a billion times better) than those of finding a functional protein among all the sequences of corresponding length.

It is not surprising that various studies on evolving proteins have failed to show a viable mechanism. One study concluded that 10^63 attempts would be required to evolve or produce randomly a relatively short protein.

"The estimated number of sequences capable of adopting the λ repressor fold is still an exceedingly small fraction, about one in 10^63, of the total number of possible 92-residue sequences." 1

Another study found that 10^64 to 10^77 attempts are required. So something like 10^70 attempts are required. To be a realistic scenario, someone has to assume billions of years are available, and that for that entire time the Earth is covered with self-replicating RNA molecules, constantly churning out mutations and new protein experiments.

The protein that enables a firefly to glow, and also reproduce (as its illuminated abdomen also serves as a visible mating call), is a protein made up of a chain of 1,000 amino acids. The full range of possible proteins that can be coded with such a chain is 17 times the number of atoms in the visible universe. This number also represents the odds against the RANDOM coding of such a protein. Yet, DNA effortlessly assembles that protein, in the exactly correct, and absolutely necessary sequence and number of amino acids for the humble firefly. What are we to say of the 25,000 individual, highly specialized, absolutely necessary, and exactly correctly coded proteins in the human body?

For a short protein molecule of 150 amino acids, the probability of building a chain in which all linkages are peptide linkages would be roughly 1 chance in 10^45.

Paul Davies:
How did stupid atoms spontaneously write their own software…? Nobody knows …… there is no known law of physics able to create information from nothing.

William Dembski: 
We also know from broad and repeated experience that intelligent agents can and do produce information-rich systems: we have positive experience-based knowledge of a cause that is sufficient to generate new instructing complex information, namely, intelligence. The design inference does not constitute an argument from ignorance. Instead, it constitutes an "inference to the best explanation" based upon our best available knowledge. It asserts the superior explanatory power of a proposed cause based upon its proven (its known) causal adequacy and based upon a lack of demonstrated efficacy among the competing proposed causes. The problem is that nature has too many options and without design couldn’t sort them all out. Natural mechanisms are too unspecific to determine any particular outcome. Mutation and natural selection or luck/chance/probability could theoretically form a new complex morphological feature like a leg or a limb with the right size and form, and find the right body location to grow it, but they could also produce all kinds of other new body forms, and grow and attach them anywhere on the body, most of which would have no biological advantage or would most probably be deleterious to the organism. Natural mechanisms have no constraints; they could produce any kind of novelty. It is, however, that kind of freedom that makes it extremely unlikely that mere natural developments provide new specific evolutionary arrangements that are advantageous to the organism. Nature would have to arrange an almost infinite number of trials and errors before getting a new positive arrangement. Since that would be a highly unlikely event, design is a better explanation.

A. I. Oparin
Even the simplest of these substances [proteins] represent extremely complex compounds, containing many thousands of atoms of carbon, hydrogen, oxygen, and nitrogen arranged in absolutely definite patterns, which are specific for each separate substance. To the student of protein structure the spontaneous formation of such an atomic arrangement in the protein molecule would seem as improbable as would the accidental origin of the text of Virgil’s “Aeneid” from scattered letter type.
The individual macromolecules are complex
But the complex interaction of biological macromolecules is only one aspect of the problem facing the origin of life. What compounds the enigma is that the individual macromolecular components are themselves complex, in the sense that their sequences - of ribonucleotides in the case of RNA, or amino acids for proteins - are very specific. The linear amino acid sequence of a protein is specific because it must (a) be able to fold into a discrete 3-dimensional structure, and (b) have the right amino acids in the right positions in the linear sequence so that, when folded, they are in exactly the right positions in relation to each other to form the active site(s) of the protein. (And similar considerations apply to RNAs.) Sequences which meet these criteria are exceedingly rare compared with the astronomical number of possible sequences of a suitable length. For example, Douglas Axe has estimated that only 1 in about 10^74 possible sequences will have a biological function. So it is totally unrealistic to think that such sequences could have arisen by random lucky self-assembly alone. How much less a suite of mutually dependent macromolecules? If the components themselves were not so improbable then it might be realistic to think that a complex combination of components could arise by chance, but the extreme improbability of the individual components is such that they are very unlikely to arise individually, and hence there is no chance whatever of an interdependent system. 5

Where even just two macromolecules are required to perform a function, then it would be necessary for both components to arise together: Because natural non-intelligence does not have foresight: if one component arises alone it will not be retained for potential future usefulness (when the second component is available), but would simply degrade. And, it should be noted, if the probability of getting one component is 1 in 10^74 then the probability of getting two together is 1 in 10^148 (not 1 in 2x10^74); and so on for multi-component systems. This is why the obligatory mutual dependence of many macromolecules in even basic biological systems completely defies any hope of a natural non-guided origin. So, in summary, the crux of the problem is that even a basic biological replicating system requires (a) several macromolecules with complementary functions with (b) each having a highly improbable sequence. And this combination of complexities presents an insurmountable challenge to a naturalistic origin of life. 
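
The "1 in 10^148, not 1 in 2x10^74" point is the ordinary product rule for independent events. The sketch below uses exact fractions to show the difference; it assumes, as the paragraph above does, that the components arise independently and that each has the quoted probability of 1 in 10^74. The function name is illustrative.

```python
from fractions import Fraction

def joint_probability(p_single, n_components):
    """Probability that n independent components, each with probability
    p_single, all arise together (product rule; independence is assumed)."""
    return p_single ** n_components

p_one = Fraction(1, 10**74)          # figure quoted in the text above
p_two = joint_probability(p_one, 2)

print(p_two == Fraction(1, 10**148))      # True: exponents add
print(p_two == Fraction(1, 2 * 10**74))   # False: the odds do not merely double
```

Exact fractions are used instead of floats so that the comparison is not affected by rounding at such small magnitudes.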

Proteins need to interact together. They need interface compatibility. So this adds to the insurmountable problem of making them by chance. Let's say chance, after 10^78 trials and errors, got a functional protein made. That function has imho no value if the protein cannot interact properly with other proteins. Then random events made another 10^78 trial-and-error attempts and got the second protein made. How many attempts would be required to make them compatible, so that they are able to interact in a functional way? So let's suppose that they got the right interface compatibility and interact properly. So what? This interaction alone would have a long way to go to get, for example, all of the at least 20 protein complexes needed to get DNA replication done, and to make them all interact together in a functional way. It's the same as making a piston by trial and error. At the end you eventually have a device that is piston-like. If, however, you do not know exactly what size the piston has to be to fit into the motor block, nothing is achieved. Nothing will go. So FORESIGHT and INTELLIGENCE in order to PROJECT the whole device is ESSENTIAL.

The proteins in living cells are made of just certain kinds of amino acids, those that are “alpha” (short) and “left-handed.” Miller’s “primordial soup” contained many long (beta, gamma, delta) amino acids and equal numbers of both right-and left-handed forms. Problem: just one long or right-handed amino acid inserted into a chain of short, left-handed amino acids would prevent the coiling and folding necessary for proper protein function. What Miller actually produced was a seething brew of potent poisons that would absolutely destroy any hope for the chemical evolution of life. 1

Amino Acids Used by Life Are Finely Tuned to Explore "Chemistry Space" 4
A paper in Nature's journal Scientific Reports, "Extraordinarily Adaptive Properties of the Genetically Encoded Amino Acids,"  has found that the twenty amino acids used by life are finely tuned to explore "chemistry space" and allow for maximal chemical reactions. Considering that this is a technical paper, they give an uncommonly lucid and concise explanation of what they did:

We drew 10^8 random sets of 20 amino acids from our library of 1913 structures and compared their coverage of three chemical properties: size, charge, and hydrophobicity, to the standard amino acid alphabet. We measured how often the random sets demonstrated better coverage of chemistry space in one or more, two or more, or all three properties. In doing so, we found that better sets were extremely rare. In fact, when examining all three properties simultaneously, we detected only six sets with better coverage out of the 10^8 possibilities tested. That's quite striking: out of 100 million different sets of twenty amino acids that they measured, only six are better able to explore "chemistry space" than the twenty amino acids that life uses. That suggests that life's set of amino acids is finely tuned to one part in 16 million. 3
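
The "one part in 16 million" figure is simply the quoted counts divided out: 100 million random sets tested, six of them with better coverage. A minimal check, with illustrative variable names:

```python
random_sets_tested = 10**8   # 100 million random sets, as quoted above
better_sets_found = 6        # sets with better coverage on all three properties

one_in = random_sets_tested / better_sets_found
print(f"about 1 in {one_in:,.0f}")   # about 1 in 16,666,667
```

So the ratio is closer to 1 in 16.7 million, which the text rounds to 16 million.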

Of course, they only looked at three factors -- size, charge, and hydrophobicity. When we consider other properties of amino acids, perhaps our set will turn out to be the best: While these three dimensions of property space are sufficient to demonstrate the adaptive advantage of the encoded amino acids, they are necessarily reductive and cannot capture all of the structural and energetic information contained in the 'better coverage' sets. They attribute this fine-tuning to natural selection, as their approach is to compare chance and selection as possible explanations of life's set of amino acids: This is consistent with the hypothesis that natural selection influenced the composition of the encoded amino acid alphabet, contributing one more clue to the much deeper and wider debate regarding the roles of chance versus predictability in the evolution of life.
But selection just means it is optimized and not random. They are only comparing two possible models -- selection and chance. They don't consider the fact that intelligent design is another cause that's capable of optimizing features. The question is: Which cause -- natural selection or intelligent design -- optimized this trait? 

To do so, you'd have to consider the complexity required to incorporate a new amino acid into life's genetic code. That in turn would require lots of steps: a new codon to encode that amino acid, and new enzymes and RNAs to help process that amino acid during translation. In other words, incorporating a new amino acid into life's genetic code is a multimutation feature. 

The biochemical language of the genetic code uses short strings of three nucleotides (called codons) to symbolize commands -- including start commands, stop commands, and codons that signify each of the 20 amino acids used in life. After the information in DNA is transcribed into mRNA, a series of codons in the mRNA molecule instructs the ribosome which amino acids are to be strung in which order to build a protein. Translation works by using another type of RNA molecule called transfer RNA (tRNA). During translation, tRNA molecules ferry needed amino acids to the ribosome so the protein chain can be assembled.

Each tRNA molecule is linked to a single amino acid on one end, and at the other end exposes three nucleotides (called an anti-codon). At the ribosome, small free-floating pieces of tRNA bind to the mRNA. When the anti-codon on a tRNA molecule binds to matching codons on the mRNA molecule at the ribosome, the amino acids are broken off the tRNA and linked up to build a protein.
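
The codon-by-codon reading described above can be sketched as a simple table lookup. This is a toy illustration only: the table below is a small subset of the standard genetic code, the function name is made up for this example, and the lookup stands in for the tRNA and ribosome machinery rather than modelling it.

```python
# Illustrative subset of the standard genetic code (mRNA codons).
CODON_TABLE = {
    "AUG": "Met",   # also serves as the start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}

def translate(mrna):
    """Read an mRNA string three letters at a time and return the list of
    amino acids, stopping at the first stop codon.
    Codons missing from the toy table raise KeyError."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein

print(translate("AUGUUUGGCGCUAAAUAA"))   # ['Met', 'Phe', 'Gly', 'Ala', 'Lys']
```

In the cell, each entry of such a table corresponds to a tRNA carrying the matching amino acid.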

For the genetic code to be translated properly, each tRNA molecule must be attached to the proper amino acid that corresponds to its anticodon as specified by the genetic code. If this critical step does not occur, then the language of the genetic code breaks down, and there is no way to convert the information in DNA into properly ordered proteins. So how do tRNA molecules become attached to the right amino acid?

Cells use special proteins called aminoacyl tRNA synthetase (aaRS) enzymes to attach tRNA molecules to the "proper" amino acid under the language of the genetic code. Most cells use 20 different aaRS enzymes, one for each amino acid used in life. These aaRS enzymes are key to ensuring that the genetic code is correctly interpreted in the cell.

Yet these aaRS enzymes themselves are encoded by the genes in the DNA. This forms the essence of a "chicken-egg problem": aaRS enzymes themselves are necessary to perform the very task that constructs them.

How could such an integrated, language-based system arise in a step-by-step fashion? If any component is missing, the genetic information cannot be converted into proteins, and the message is lost. The RNA world is unsatisfactory because it provides no explanation for how the key step of the genetic code -- linking amino acids to the correct tRNA -- could have arisen.

Amino acids link together when the amino group of one amino acid bonds to the carboxyl group of another. Notice that water is a by-product of the reaction (called a condensation reaction). 

Stephen Meyer, Signature of the cell: 
According to neo-Darwinian theory, new genetic information arises first as random mutations occur in the DNA of existing organisms. When mutations arise that confer a survival advantage on the organisms that possess them, the resulting genetic changes are passed on by natural selection to the next generation. As these changes accumulate, the features of a population begin to change over time. Nevertheless, natural selection can "select" only what random mutations first produce. And for the evolutionary process to produce new forms of life, random mutations must first have produced new genetic information for building novel proteins. That, for the mathematicians, physicists, and engineers at Wistar, was the problem. Why?

The skeptics at Wistar argued that it is extremely difficult to assemble a new gene or protein by chance because of the sheer number of possible base or amino-acid sequences. For every combination of amino acids that produces a functional protein there exists a vast number of other possible combinations that do not. And as the length of the required protein grows, the number of possible amino-acid sequence combinations of that length grows exponentially, so that the odds of finding a functional sequence—that is, a working protein—diminish precipitously. 

To see this, consider the following. Whereas there are four ways to combine the letters A and B to make a two-letter combination (AB, BA, AA, and BB), there are eight ways to make three-letter combinations (AAA, AAB, ABB, ABA, BAA, BBA, BAB, BBB), and sixteen ways to make four-letter combinations, and so on. The number of combinations grows geometrically, 2^2, 2^3, 2^4, and so on. And this growth becomes more pronounced when the set of letters is larger. For protein chains, there are 20^2, or 400, ways to make a two-amino-acid combination, since each position could be any one of 20 different alphabetic characters. Similarly, there are 20^3, or 8,000, ways to make a three-amino-acid sequence, and 20^4, or 160,000, ways to make a sequence four amino acids long, and so on. As the number of possible combinations rises, the odds of finding a correct sequence diminish correspondingly. But most functional proteins are made of hundreds of amino acids. Therefore, even a relatively short protein of, say, 150 amino acids represents one sequence among an astronomically large number of other possible sequence combinations (approximately 10^195).
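
The letter example can be enumerated directly, and the same counting rule gives the protein figures. The snippet below is a small sketch; `count_sequences` is an illustrative helper, not something from Meyer's book.

```python
import math
from itertools import product

def count_sequences(alphabet_size, length):
    """Number of ordered sequences of the given length over the alphabet."""
    return alphabet_size ** length

# Enumerate the two-letter combinations of A and B explicitly.
print(["".join(p) for p in product("AB", repeat=2)])   # ['AA', 'AB', 'BA', 'BB']

# Two-letter alphabet versus the 20 amino acids, for lengths 2, 3 and 4.
print([count_sequences(2, n) for n in (2, 3, 4)])      # [4, 8, 16]
print([count_sequences(20, n) for n in (2, 3, 4)])     # [400, 8000, 160000]

# Base-10 exponent for a chain of 150 amino acids.
print(round(150 * math.log10(20)))                     # 195
```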

Consider the way this combinatorial problem might play itself out in the case of proteins in a hypothetical prebiotic soup. To construct even one short protein molecule of 150 amino acids by chance within the prebiotic soup there are several combinatorial problems (probabilistic hurdles) to overcome. First, all amino acids must form a chemical bond known as a peptide bond when joining with other amino acids in the protein chain (see Fig. 9.1). If the amino acids do not link up with one another via a peptide bond, the resulting molecule will not fold into a protein. In nature many other types of chemical bonds are possible between amino acids. In fact, when amino-acid mixtures are allowed to react in a test tube, they form peptide and nonpeptide bonds with roughly equal probability. Thus, with each amino-acid addition, the probability of it forming a peptide bond is roughly 1/2. Once four amino acids have become linked, the likelihood that they are joined exclusively by peptide bonds is roughly 1/2 × 1/2 × 1/2 × 1/2 = 1/16, or (1/2)^4. The probability of building a chain of 150 amino acids in which all linkages are peptide linkages is (1/2)^149, or roughly 1 chance in 10^45.

Second, in nature every amino acid found in proteins (with one exception) has a distinct mirror image of itself; there is one left-handed version, or L-form, and one right-handed version, or D-form. These mirror-image forms are called optical isomers (see Fig. 9.2). Functioning proteins tolerate only left-handed amino acids, yet in abiotic amino-acid production the right-handed and left-handed isomers are produced with roughly equal frequency. Taking this into consideration further compounds the improbability of attaining a biologically functioning protein. The probability of attaining, at random, only L-amino acids in a hypothetical peptide chain 150 amino acids long is (1/2)^150, or again roughly 1 chance in 10^45. Starting from mixtures of D-forms and L-forms, the probability of building a 150-amino-acid chain at random in which all bonds are peptide bonds and all amino acids are L-form is, therefore, roughly 1 chance in 10^90.

Functioning proteins have a third independent requirement, the most important of all: their amino acids, like letters in a meaningful sentence, must link up in functionally specified sequential arrangements. In some cases, changing even one amino acid at a given site results in the loss of protein function. Moreover, because there are 20 biologically occurring amino acids, the probability of getting a specific amino acid at a given site is small: 1/20. (Actually the probability is even lower because, in nature, there are also many nonprotein-forming amino acids.) On the assumption that each site in a protein chain requires a particular amino acid, the probability of attaining a particular protein 150 amino acids long would be (1/20)^150, or roughly 1 chance in 10^195.
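
The three figures quoted above can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python, assuming (as the quoted passage does) that every bond, every handedness and every residue choice is independent and equally likely; the helper name log10_probability is hypothetical and introduced here purely for illustration.

```python
import math

def log10_probability(p_per_step: float, steps: int) -> float:
    """Return log10(p_per_step ** steps), computed in log space to avoid underflow."""
    return steps * math.log10(p_per_step)

# Assumptions taken from the quoted passage: a chain of 150 amino acids
# (149 linkages), 1/2 per peptide bond, 1/2 per L-form, 1/20 per specific residue.
peptide_bonds = log10_probability(1 / 2, 149)   # about -44.9
l_forms = log10_probability(1 / 2, 150)         # about -45.2
sequence = log10_probability(1 / 20, 150)       # about -195.2

print(f"all peptide bonds:        10^{peptide_bonds:.1f}")
print(f"all L-form:               10^{l_forms:.1f}")
print(f"peptide bonds and L-form: 10^{peptide_bonds + l_forms:.1f}")
print(f"one specific sequence:    10^{sequence:.1f}")
```

Working with logarithms keeps the numbers representable; the rounded exponents agree with the 10^45, 10^90 and 10^195 quoted in the text.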

Problems with Making Mutation the Basis for Macroevolution

Dembski, The Design of Life, general notes, page 11:
If the proportion of gene sequences that are biologically useful were large, there might be reason to think that point or chromosome mutations could be helpful in achieving the novel biological structures required by macroevolution. But all the evidence points to biologically useful gene sequences being exceedingly rare. It’s therefore highly unlikely that point and chromosome mutations can transform a duplicated gene into a novel functional gene. Genetic sequence space (i.e., the set of all possible genetic sequences) is functionally sparse (i.e., the overwhelming majority of genetic sequences don’t, and indeed can’t, do anything biologically useful or significant). As a consequence, navigating genetic sequence space by undirected means is no help getting from one island of functionality (i.e., one region of biologically useful or significant genetic sequences) to the next.

For instance, there is no evidence that conventional evolutionary mechanisms, such as natural selection, can evolve a gene in one region of genetic sequence space with one set of functions into a gene in a far distant region of genetic sequence space with another set of functions (distance here being measured in terms of sequence similarity). In the language of mathematical biology, genetic sequence space gives no indication of being highly interconnected by functional pathways that continuously connect genes with one function to genes with another (which would be required if natural selection, say, were to assist a duplicated gene in transforming into a novel gene). But there are still more problems with trying to make mutation the basis for macroevolution. For point and chromosome mutations to account for macroevolutionary change, it is not enough for individual genes to be transformed into novel genes that exhibit novel functions. Rather it is required that a whole suite of novel genes be produced through the coordinated transformation of existing genes. This is because for new biological structures to evolve (as required by macroevolution), many genes will have to change.

But it has not been demonstrated that mutations can produce the highly coordinated protein parts required for many biological structures. These are the structures that macroevolution would need to produce. Till now, however, there is no evidence for the coordinated macromutations required for macroevolution. The closest evidence cited in textbooks includes increased immunity to malaria associated with the mutation for sickle-cell anemia and the resistance to antibiotics by mutant strains of bacteria.
In no such case, however, do we see a coordinated set of mutations that lead to complex novel structures. Sickle-cell anemia, for instance, is induced by a single point mutation that leads to a single change in an amino acid (a valine is substituted for a glutamic acid in a hemoglobin molecule). Point mutations like this might enable organisms to stabilize and maintain themselves in the face of severe environmental pressure. In most instances, however, novel traits induced by such mutations do not continue to benefit the organisms when the environmental pressure is removed. Apart from environmental pressure, such mutations can even be deleterious. For instance, when an individual is heterozygous for sickle-cell anemia, the mutation provides an advantage for surviving the threat of malaria. It does so, however, at the expense of inflicting on homozygous individuals an anemia that impairs the transport of oxygen to the body’s cells. Indeed, sickle-cell anemia is often lethal.

To observe the origin of new species, one would expect to find it most readily in bacteria. That’s because bacteria can be easily mutated with chemicals and radiation in the laboratory, they take up very little space, and they have very short generation times. Indeed, thousands of mutations, billions of organisms, and thousands of generations can be studied by a single scientist. Yet, bacteriologists have never witnessed the origin of a new species. (Some new plant species have been observed to originate through hybridization, but the combining of two species to make a third is the opposite of the Darwinian process of splitting one species into two.) For mutations to contribute to evolution, they must benefit the organism. If a mutation harms the organism, it will tend to be eliminated, rather than favored, by natural selection. The only beneficial mutations that have ever been observed, in bacteria or in any other kind of organism, have been biochemical. That is, they affect only single molecules (such as the target molecule for streptomycin). There are no known beneficial mutations affecting morphology, or shape. All known morphological mutations are either neutral (i.e., they don’t have any noticeable effect on the organism’s fitness), or they are harmful—and the bigger their effect the more harmful they are. Yet, Darwinian evolution (i.e., the origin of new species, new organs, and new body plans) clearly requires changes in shape. So, there is no evidence for (and indeed a lot of evidence against) a role for mutations as providing raw materials for Darwinian evolution.

The specific genetic changes that give rise to the evolutionary origins of novel protein-protein interactions have rarely been documented in detail. 6
Although numerous investigators assume that the global features of genetic networks are moulded by natural selection, there has been no formal demonstration of the adaptive origin of any genetic network. 7 The mechanisms by which genetic networks become established evolutionarily are far from clear.

Many physicists, engineers and computer scientists, and some cell and developmental biologists, are convinced that biological networks exhibit properties that could only be products of natural selection; however, the matter has rarely been examined in the context of well-established evolutionary principles.

Alon states that it is “…wondrous that the solutions found by evolution have much in common with good engineering,”

2) B. Alberts, Molecular Biology of the Cell.

More readings:
Comprehensive experimental fitness landscape and evolutionary network for small RNA
Our study suggests that replaying the “tape of life” at the very origin of life might lead to quite different results.


Last edited by Admin on Mon Sep 17, 2018 2:41 pm; edited 4 times in total


Peptide bonding of amino acids to form proteins and its origins

The crucial feature of amino acids that allows them to polymerize to form peptides and proteins is the existence of their two identifying chemical groups: the amino (-NH3+) and carboxyl (-COO-) groups, as shown in the figure below:

Anatomy of an amino acid. Except for proline and its derivatives, all of the amino acids commonly found in proteins possess this type of structure.

The amino and carboxyl groups of amino acids can react in a head-to-tail fashion, eliminating a water molecule and forming a covalent amide linkage, which, in the case of peptides and proteins, is typically referred to as a peptide bond. The equilibrium for this reaction in aqueous solution favors peptide bond hydrolysis. For this reason, biological systems, as well as peptide chemists in the laboratory, must couple peptide bond formation in an indirect manner or with energy input. Repetition of the reaction shown in the figure above produces polypeptides and proteins. The remarkable properties of proteins all depend in one way or another on the unique properties and chemical diversity of the 20 common amino acids found in proteins.

The amino acid sequences of polypeptides determine the structure and function of Proteins
The Figure below shows the 20 different amino acids that may be found within polypeptides.

The amino acids that are incorporated into polypeptides during translation. 
Parts (a) through (e) show the 20 standard amino acids, and part (f) shows two amino acids that are occasionally incorporated into polypeptides by the use of stop codons.
The structures of amino acid side chains can also be covalently modified after a polypeptide is made, a phenomenon called post-translational modification.

Primary Structure: Amino Acids Are Linked by Peptide Bonds to Form Polypeptide Chains
Proteins are linear polymers formed by linking the α-carboxyl group of one amino acid to the α-amino group of another amino acid. This type of linkage is called a peptide bond or an amide bond. The formation of a dipeptide from two amino acids is accompanied by the loss of a water molecule. The equilibrium of this reaction lies on the side of hydrolysis rather than synthesis under most conditions. Hence, the biosynthesis of peptide bonds requires an input of free energy. Nonetheless, peptide bonds are quite stable kinetically because the rate of hydrolysis is extremely slow; the lifetime of a peptide bond in aqueous solution in the absence of a catalyst approaches 1000 years.

A series of amino acids joined by peptide bonds form a polypeptide chain, and each amino acid unit in a polypeptide is called a residue. A polypeptide chain has directionality because its ends are different: an α-amino group is present at one end and an α-carboxyl group at the other. The amino end is taken to be the beginning of a polypeptide chain; by convention, the sequence of amino acids in a polypeptide chain is written starting with the amino-terminal residue. 12

Question: By the convention of whom ??!!

Thus, in the polypeptide Tyr-Gly-Gly-Phe-Leu (YGGFL), tyrosine is the amino-terminal (N-terminal) residue and leucine is the carboxyl-terminal (C-terminal) residue. Leu-Phe-Gly-Gly-Tyr (LFGGY) is a different polypeptide, with different chemical properties.

Amino acid sequences have direction. 
This illustration of the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL) shows the sequence from the amino terminus to the carboxyl terminus. This pentapeptide, Leu-enkephalin, is an opioid peptide that modulates the perception of pain. The reverse pentapeptide, Leu-Phe-Gly-Gly-Tyr (LFGGY), is a different molecule and has no such effects.
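
The point that a sequence and its reverse are different molecules can be made concrete with a small script. This is just an illustrative sketch in Python; the dictionary THREE_TO_ONE and the helper names one_letter and reverse_peptide are hypothetical, and only the four residue types used in the example are included.

```python
# One-letter codes for the residues in the example (standard abbreviations).
THREE_TO_ONE = {"Tyr": "Y", "Gly": "G", "Phe": "F", "Leu": "L"}

def one_letter(peptide: str) -> str:
    """Convert 'Tyr-Gly-Gly-Phe-Leu', written N- to C-terminus, into 'YGGFL'."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))

def reverse_peptide(peptide: str) -> str:
    """Return the same residues written in the opposite order."""
    return "-".join(reversed(peptide.split("-")))

leu_enkephalin = "Tyr-Gly-Gly-Phe-Leu"
reverse_sequence = reverse_peptide(leu_enkephalin)

print(one_letter(leu_enkephalin))    # YGGFL
print(one_letter(reverse_sequence))  # LFGGY
print(one_letter(leu_enkephalin) == one_letter(reverse_sequence))  # False
```

Because the written order encodes which end carries the free amino group, the two strings describe different molecules even though they contain the same residues.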

A polypeptide chain consists of a regularly repeating part, called the main chain or backbone, and a variable part, comprising the distinctive side chains (Figure below).

Components of a polypeptide chain. 
A polypeptide chain consists of a constant backbone (shown in black) and variable side chains (shown in green).

The polypeptide backbone is rich in hydrogen-bonding potential. Each residue contains a carbonyl group (C=O), which is a good hydrogen-bond acceptor, and, with the exception of proline, an NH group, which is a good hydrogen-bond donor. These groups interact with each other and with functional groups from side chains to stabilize particular structures.

Peptidyl transferase catalyzes peptide-bond synthesis
A molecule called the Peptidyl Transferase Center (PTC) is considered by some as having an essential role in the emergence of life, since its catalytic ability to join amino acids together is crucial for protein synthesis and thus for the first transition from an RNA world to a ribonucleoprotein world, as seen in modern organisms.

All known cellular organisms have the PTC conserved and the process of reading the information contained in the messenger RNA, in general, is similar in all life forms. Would the common ancestor of all life forms be a part of the largest subunit of the ribosomal RNA? When thinking about LUCA as a molecule, and more specifically, as the large subunit of the ribosome or even more specifically as the PTC, there is an extensive modification of the junction point from which all living organisms came to be. Here the nature of LUCA is changed since it places the common point of origin in a time where the RNA was the information-carrying molecule and the cellular systems were still starting to mature. 10

The ribosome accelerates peptide bond formation by lowering the activation entropy of the reaction due to positioning the two substrates, ordering water in the active site, and providing an electrostatic network that stabilizes the reaction intermediates. Proton transfer during the reaction appears to be promoted by a concerted proton shuttle mechanism that involves ribose hydroxyl groups on the tRNA substrate. 11

Positioning, ordering, providing, stabilizing, and promoting a concerted shuttle mechanism are all tasks which we can easily attribute to the action of an intelligence, but which could hardly emerge without external direction, by random unguided events.

Protein synthesis in the cell is performed on ribosomes, large ribonucleoprotein particles that consist of three RNA molecules and more than 50 proteins. Ribosomes are composed of two subunits, the larger of which has a sedimentation coefficient of 50S in prokaryotes (the 50S subunit) and the smaller of which sediments at 30S (the 30S subunit); together they form 70S ribosomes. The ribosome is a molecular machine that selects its substrates, aminoacyl-tRNAs (aa-tRNAs), rapidly and accurately and catalyzes the synthesis of peptides from amino acids. The 30S subunit contains the decoding site, where base-pairing interactions between the mRNA codon and the tRNA anticodon determine the selection of the cognate aa-tRNA.

The large ribosomal subunit contains the site of catalysis, the peptidyl transferase (PT) center, which is responsible for making peptide bonds during protein elongation and for the hydrolysis of peptidyl-tRNA (pept-tRNA) during the termination of protein synthesis. The ribosome has three tRNA binding sites: A, P, and E sites (figure below).

Schematic of Peptide Bond Formation on the Ribosome
The α-amino group of aminoacyl-tRNA in the A site (red) attacks the carbonyl carbon of the pept-tRNA in the P site (blue) to produce a new, one amino acid longer pept-tRNA in the A site and a deacylated tRNA in the P site. The 50S subunit, where the PT center is located, is shown in light gray and the 30S subunit in dark gray. A, P, and E sites of the ribosome are indicated.

During the elongation cycle of protein synthesis, aa-tRNA is delivered to the A site of the ribosome in a ternary complex with elongation factor Tu (EF-Tu) and GTP. Following GTP hydrolysis and release from EF-Tu, aa-tRNA accommodates in the A site of the Peptidyl Transferase Center (PT center) and reacts with pept-tRNA bound to the P site, yielding deacylated tRNA in the P site and A site pept-tRNA that is extended by one amino acid residue. The subsequent movement of tRNAs and mRNA through the ribosome (translocation) is catalyzed by another elongation factor (EF-G in bacteria). During translocation, pept-tRNA and deacylated tRNA move to the P and E sites, respectively; a new codon is exposed in the A site for the interaction with the next aa-tRNA, and the deacylated tRNA is released from the E site.

The movement of aa-tRNA into the A site is a multistep process that requires structural rearrangements of the ribosome, EF-Tu, and aa-tRNA.

Structure of the Active Site of the Peptidyl Transferase Center (PTC)
50S subunits are composed of two rRNA molecules, 23S rRNA and 5S rRNA, and more than 30 proteins (Figure A below).

Structure of the Peptidyl Transferase Center
(A) Crystal structure of the 50S subunit from H. marismortui with a transition state analog (red) bound to the active site. Ribosomal proteins are blue, the 23S rRNA backbone is brown, the 5S rRNA backbone is olive, and rRNA bases are pale green.
(B) Substrate binding to the active site. Base pairs formed between cytosine residues of the tRNA analogs in the A site (yellow) and P site (orange) with 23S rRNA bases (pale green) are indicated. The α-amino group of the A site substrate (blue) is positioned for the attack on the carbonyl carbon of the ester linking the peptide moiety of the P site substrate (green). Inner shell nucleotides are omitted for clarity.

The Mechanism of Peptide Bond Formation
The combined evidence supports the idea that peptide bond formation on the ribosome is driven by a favorable entropy change. The A and P site substrates are precisely aligned in the active center by interactions of the tRNA CCA sequences and of the nucleophilic α-amino group with residues of 23S rRNA in the active site. The most favorable catalytic pathway involves a six-membered transition state (Figure below) in which proton shuttling occurs via the 2'-OH of A76 of the P site tRNA. The reaction does not involve chemical catalysis by ribosomal groups but may be modulated by conformational changes at the active site which can be induced by protonation.

Concerted Proton Shuttle Mechanism of Peptide Bond Formation
Pept-tRNA (P site) and aminoacyl-tRNA (A site) are blue and red, respectively, ribosome residues are pale green, and ordered water molecules are gray. The attack of the α-NH2 group on the ester carbonyl carbon results in a six-membered transition state in which the 2'-OH group of the P site A76 ribose moiety donates its proton to the adjacent leaving 3' oxygen and simultaneously receives a proton from the amino group. Ribosomal residues are not involved in chemical catalysis but are part of the H bond network that stabilizes the transition state.

In addition to placing the reactive groups into close proximity and precise orientation relative to each other, the ribosome appears to work by providing an electrostatic environment that reduces the free energy of forming the highly polar transition state, shielding the reaction against bulk water, helping the proton shuttle forming the leaving group or a combination of these effects. With this preorganized network, the ribosome avoids the extensive solvent reorganization that is inevitable in the corresponding reaction in solution, resulting in significantly more favorable entropy of activation of the reaction on the ribosome.

With both the P site and the A site occupied by aminoacyl-tRNA, the stage is set for the formation of a peptide bond: the formylmethionine molecule linked to the initiator tRNA will be transferred to the amino group of the amino acid in the A site. The formation of the peptide bond, one of the most important reactions in life, is a thermodynamically spontaneous reaction catalyzed by a site on the 23S rRNA of the 50S subunit called the peptidyl transferase center. This catalytic center is located deep in the 50S subunit near the tunnel that allows the nascent peptide to leave the ribosome. The ribosome, which enhances the rate of peptide bond synthesis by a factor of 10^7 over the uncatalyzed reaction, derives much of its catalytic power from catalysis by proximity and orientation. The ribosome positions and orients the two substrates so that they are situated to take advantage of the inherent reactivity of an amine group (on the aminoacyl-tRNA in the A site) with an ester (on the initiator tRNA in the P site). The amino group of the aminoacyl-tRNA in the A site, in its unprotonated state, makes a nucleophilic attack on the ester linkage between the initiator tRNA and the formylmethionine molecule in the P site (Figure A below). 
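
To give the quoted 10^7 rate enhancement an energetic scale, one can convert it into an equivalent lowering of the activation free energy. The sketch below is a rough illustration in Python, assuming simple transition-state theory (the rate ratio equals exp(ΔΔG‡/RT)) and a temperature of 298.15 K; both the temperature and the function name barrier_reduction_kcal are assumptions introduced here, not values from the quoted text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def barrier_reduction_kcal(rate_enhancement: float, temperature_k: float = 298.15) -> float:
    """Activation free energy lowering (kcal/mol) equivalent to a given rate
    enhancement, assuming k_cat / k_uncat = exp(ddG / (R * T))."""
    return R_KCAL * temperature_k * math.log(rate_enhancement)

# The factor of 10^7 is from the text; the temperature is an assumed value.
ddg = barrier_reduction_kcal(1e7)
print(f"equivalent barrier lowering: about {ddg:.1f} kcal/mol")
```

Under these assumptions the factor of 10^7 corresponds to roughly 9.5 kcal/mol; the figure says nothing about how that lowering is divided between entropy and enthalpy.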

Peptide-bond formation.
(A) The amino group of the aminoacyl-tRNA attacks the carbonyl group of the ester linkage of the peptidyl-tRNA. 
(B) An eight-membered transition state is formed. Note: Not all atoms are shown and some bond lengths are exaggerated for clarity.
(C) This transition state collapses to form the peptide bond and release the deacylated tRNA.

The nature of the transition state that follows the attack is not established and several models are plausible. One model proposes roles for the 2' OH of the adenosine of the tRNA in the P site and a molecule of water at the peptidyl transferase center (Figure B above). The nucleophilic attack of the α-amino group generates an eight-membered transition state in which three protons are shuttled about in a concerted manner. The proton of the attacking amino group hydrogen bonds to the 2' oxygen of ribose of the tRNA. The hydrogen of 2' OH, in turn, interacts with the oxygen of the water molecule at the center, which then donates a proton to the carbonyl oxygen. A collapse of the transition state with the formation of the peptide bond allows protonation of the 3' OH of the now empty tRNA in the P site (Figure C above). The stage is now set for translocation and formation of the next peptide bond.

The formation of a peptide bond is followed by the GTP-driven translocation of tRNAs and mRNA
With the formation of the peptide bond, the peptide chain is now attached to the tRNA whose anticodon is in the A site on the 30S subunit. The two subunits rotate with respect to one another, and this structural change places the CCA end of the same tRNA and its peptide in the P site of the large subunit (Figure below).

Mechanism of protein synthesis. 
The cycle begins with peptidyl-tRNA in the P site.
(1) An aminoacyl-tRNA binds in the A site. 
(2) With both sites occupied, a new peptide bond is formed.
(3) The tRNAs and the mRNA are translocated through the action of elongation factor G, which moves the deacylated tRNA to the E site. 
(4) Once there, the tRNA is free to dissociate to complete the cycle.

Another aminoacyl-tRNA arrives and binds at the A site (1). Again, peptide bond synthesis occurs (2). However, protein synthesis cannot continue without the translocation of the mRNA and the tRNAs within the ribosome. Elongation factor G (EF-G, also called translocase) catalyzes the movement of mRNA, at the expense of GTP hydrolysis, by a distance of three nucleotides. Now, the next codon is positioned in the A site for interaction with the incoming aminoacyl-tRNA (3). The peptidyl-tRNA moves out of the A site into the P site on the 30S subunit and at the same time, the deacylated tRNA moves out of the P site into the E site and is subsequently released from the ribosome (4). The movement of the peptidyl-tRNA into the P site shifts the mRNA by one codon, exposing the next codon to be translated in the A site.

The three-dimensional structure of the ribosome undergoes significant change during translocation, and evidence suggests that translocation may result from properties of the ribosome itself. However, EF-G accelerates the process. A possible mechanism for accelerating the translocation process is shown in Figure below.

Translocation mechanism. 
In the GTP form, EF-G binds to the A site on the 50S subunit. This binding stimulates GTP hydrolysis, inducing a conformational change in EF-G that forces the tRNAs and mRNA to move through the ribosome by a distance corresponding to one codon.

Question: How did unguided random processes select and finely tune the forces to move the tRNAs and mRNA by the right distance of one codon?

EF-G in the GTP form binds to the ribosome near the A site, interacting with the 23S rRNA of the 50S subunit. The binding of EF-G to the ribosome stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that displaces the peptidyl-tRNA in the A site to the P site, which carries the mRNA and the deacylated tRNA with it. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. Note that the peptide chain remains in the P site on the 50S subunit throughout this cycle, growing into the exit tunnel. This cycle is repeated, with mRNA translation taking place in the 5' to 3' direction, as new aminoacyl-tRNAs move into the A site, allowing the polypeptide to be elongated until a stop signal is found.
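
The bookkeeping of the elongation cycle (binding in the A site, peptide bond formation, translocation by one codon, release from the E site) can be followed in a toy model. The sketch below is a deliberately simplified Python illustration, not a biochemical simulation: the function name translate, the six-entry codon table (a small subset of the standard genetic code) and the example mRNA are assumptions chosen for the demonstration, and EF-Tu, EF-G, GTP and release factors are not modeled.

```python
from typing import Dict, List, Optional

# Small subset of the standard genetic code; None marks a stop codon.
CODON_TABLE: Dict[str, Optional[str]] = {
    "AUG": "Met", "UAU": "Tyr", "GGU": "Gly",
    "UUC": "Phe", "CUG": "Leu", "UAA": None,
}

def translate(mrna: str) -> List[str]:
    """Read an mRNA 5' to 3', one codon per cycle, and return the peptide
    from N-terminus to C-terminus. Codons missing from the table raise KeyError."""
    peptide: List[str] = []
    sites: Dict[str, Optional[str]] = {"E": None, "P": None, "A": None}
    for start in range(0, len(mrna) - 2, 3):    # each translocation moves 3 nucleotides
        residue = CODON_TABLE[mrna[start:start + 3]]
        if residue is None:                     # stop signal: elongation ends
            break
        sites["A"] = residue                    # (1) aminoacyl-tRNA binds in the A site
        peptide.append(residue)                 # (2) peptide bond: chain grows at its C-terminus
        sites["E"], sites["P"], sites["A"] = sites["P"], sites["A"], None  # (3) translocation
        sites["E"] = None                       # (4) deacylated tRNA leaves the E site
    return peptide

print("-".join(translate("AUGUAUGGUGGUUUCCUGUAA")))  # Met-Tyr-Gly-Gly-Phe-Leu
```

Each pass through the loop corresponds to one turn of the cycle described above, and the fixed step of three nucleotides is what keeps the reading frame.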

The direction of translation has important consequences. Transcription also is in the 5' to 3' direction. If the direction of translation were opposite that of transcription, only fully synthesized mRNA could be translated. In contrast, because the directions are the same, mRNA can be translated while it is being synthesized.

Question: How could natural, unguided, random processes select the right direction to be translated? Trial and error?

In bacteria, almost no time is lost between transcription and translation. The 5' end of mRNA interacts with ribosomes very soon after it is made, well before the 3' end of the mRNA molecule is finished. An important feature of bacterial gene expression is that translation and transcription are closely coupled in space and time.

There is a huge gap that has to be filled between "modern" polypeptide formation through ribosomes, mRNA, and tRNAs, and supposed primordial amino acid chain formation without this advanced machinery. How could the gap be closed? Not only are prebiotic mechanisms unlikely, but the transition would require the emergence of all the complex machinery and afterward a transition from one mechanism to the other. Tamura admits that fact clearly: the ultimate route to the ribosome remains unclear. It takes a big leap of faith to believe that this could be possible under any circumstances.

Mystery of Life's Origin 4
Experimental evidence indicates that if there are bonding preferences between amino acids, they are not the ones found in natural organisms. There are three basic requirements for a biologically functional protein.

First: It must have a specific sequence of amino acids. At best prebiotic experiments have produced only random polymers. And many of the amino acids included are not found in living organisms.

Second: An amino acid with a given chemical formula may in its structure be either “right-handed” (D-amino acids) or “left-handed” (L-amino acids). Living organisms incorporate only L-amino acids. However, in prebiotic experiments where amino acids are formed, approximately equal numbers of D- and L-amino acids are found. This is an “intractable problem” for chemical evolution (p. vi).

Third: In some amino acids there are more positions than one on the molecule where the amino and carboxyl groups may join to form a peptide bond. In natural proteins only alpha-peptide bonds (designating the location of the bond) are found. In proteinoids, however, beta, gamma and epsilon peptide bonds largely predominate. This is just the opposite of what one would expect if bonding preferences played a role in prebiotic evolution.

Studies of peptide bond formation in the absence of modern biological machinery can give insight into the mechanism employed by the ribosome’s active site, as well as yield important information on the prebiotic route to the first peptides in the origin of life. The formation of a peptide bond (reaction R1 shown below) is a condensation reaction, eliminating a water molecule for each peptide bond formed, and thus faces both thermodynamic and kinetic constraints in bulk aqueous solution.

Amino acids are joined together through a dehydration reaction, where a water molecule is formed and removed to form a covalent bond called a peptide bond. A structure resulting from a bunch of these bonds repeating over and over is called a polypeptide. Like DNA molecules, polypeptides have a direction: they’ve got an amino group at one end (the N-terminus) and a carboxyl group at the other (the C-terminus).

In modern biology, the condensation reactions necessary in the formation of peptide bonds are facilitated catalytically by the large subunit of the ribosome.

Fazale Rana, The Cell's Design: The chemical reactions that form the bonds that join amino acids together in polypeptide chains are catalyzed or assisted by ribosomes. The ribosome, mRNA, and tRNA molecules work cooperatively to produce proteins. Using an assembly-line process, protein manufacturing machinery forms the polypeptide chains (that constitute proteins) one amino acid at a time. This protein synthetic apparatus joins together three to five amino acids per second. Ribosomes, in conjunction with mRNA and tRNAs, assemble the cell's smallest proteins, about one hundred to two hundred amino acids in length, in less than one minute. The processing of proteins in the lumen (posttranslational modification) is quite extensive. Posttranslational modifications include (1) formation and reshuffling of disulfide bonds (these bonds form between the side chains of cysteine amino acid residues within a protein, stabilizing its three-dimensional structure).

Amino Acids Are Added to the C-terminal End of a Growing Polypeptide Chain
Each amino acid is first coupled to a specific tRNA molecule; the next step is the mechanism that joins these amino acids together to form proteins. The fundamental reaction of protein synthesis is the formation of a peptide bond between the carboxyl group at the end of a growing polypeptide chain and a free amino group on an incoming amino acid. Consequently, a protein is synthesized stepwise from its N-terminal end to its C-terminal end. Throughout the entire process, the growing carboxyl end of the polypeptide chain remains activated by its covalent attachment to a tRNA molecule (forming a peptidyl-tRNA). Each addition disrupts this high-energy covalent linkage, but immediately replaces it with an identical linkage on the most recently added amino acid.

The incorporation of an amino acid into a protein. A polypeptide chain grows by the stepwise addition of amino acids to its C-terminal end. The formation of each peptide bond is energetically favorable because the growing C-terminus has been activated by the covalent attachment of a tRNA molecule. The peptidyl-tRNA linkage that activates the growing end is regenerated during each addition. The amino acid side chains have been abbreviated as R1, R2, R3, and R4; as a reference point, all of the atoms in the second amino acid in the polypeptide chain are shaded gray. The figure shows the addition of the fourth amino acid (red) to the growing chain.

Peptide Bond Formation: RNA's Big Bang

The genetic code may have been established gradually (Wong, 1975). 5

Observe the hedging: "may have", "by some means", "might have", "proposed the idea", "would have".

The second law of thermodynamics indicates that peptide bond formation does not occur spontaneously. Therefore, energy must be added into the system by some means and amino acids must be "activated." Modern biological systems use the energy of the ATP hydrolysis for coupling many reactions (Lipmann, 1941). However, during the prebiotic stage, the light from the sun, geothermal energy, pressure in the thermal vent, or other similar sources may have been used in the process of activating the molecules of a system. The development of prebiotic precursors of biomolecules might have occurred in interstellar space, and were subsequently transferred to Earth by comets, asteroids, or meteorites (Oró, 1961; Chyba et al., 1990; Chyba & Sagan, 1992). Reactions on clay (Paecht-Horowitz et al., 1970) and/or dry mixtures of amino acids (Fox & Harada, 1958) may have facilitated the condensation of activated amino acids, thereby forming peptide bonds. Iron sulfate is known to cause unusual reducing reactions, especially with H2S. Wächtershäuser (1992) previously proposed the idea of an "iron-sulfur world" where low-molecular weight constituents may have originated autotrophic metabolism. In such circumstances, amino acids would have been converted into simple peptides (Huber & Wächtershäuser, 1998). In fact, it has been demonstrated that the peptide containing a thioester at the carboxyl-terminal undergoes nucleophilic attack by the side chain of the Cys residue at the amino terminal of another peptide. Moreover, the formed thioester ligation product readily undergoes a rapid intramolecular reaction at the α-amino group of the Cys to yield a product with a native peptide bond. This series of events is called "native chemical ligation" and is important in the general application of protein chemistry (Dawson et al., 1994). These possibilities should be further considered in terms of the very early mechanisms responsible for peptide bond formation. 
However, because we must consider the modern ribosome, we cannot avoid consideration of RNA in the evolution of biological systems.

It's remarkable how mainstream scientific papers give their naturalistic proposals a positive connotation without providing compelling evidence for their assertions.

The Emergence of Information-Rich Biopolymers 1
Given an ocean full of small molecules of the types likely to be produced on a prebiological earth with the types of processes postulated by origin of life enthusiasts, we must next approach the question of polymerization. This question poses a two edged sword: we must first demonstrate that macromolecule synthesis is possible under prebiological conditions, then we must construct a rationale for generating macromolecules rich in the information necessary for usefulness in a developing precell. We shall deal with these separately.

The synthesis of proteins and nucleic acids from small molecule precursors represents one of the most difficult challenges to the model of prebiological evolution. There are many different problems confronted by any proposal. Polymerization is a reaction in which water is a product. Thus it will only be favored in the absence of water. The presence of precursors in an ocean of water favors depolymerization of any molecules that might be formed. Careful experiments done in an aqueous solution with very high concentrations of amino acids demonstrate the impossibility of significant polymerization in this environment.

Polymer formation in aqueous environments would most likely have been necessary on early Earth because the liquid ocean would have been the reservoir of amino acid precursors needed for protein synthesis. 3

A thermodynamic analysis of a mixture of protein and amino acids in an ocean containing a 1 molar solution of each amino acid (100,000,000 times higher concentration than we inferred to be present in the prebiological ocean) indicates the concentration of a protein containing just 100 peptide bonds (101 amino acids) at equilibrium would be 10^-338 molar. Just to make this number meaningful, our universe may have a volume somewhere in the neighborhood of 10^85 liters. At 10^-338 molar, we would need an ocean with a volume equal to 10^229 universes (a 1 followed by 229 zeros) just to find a single molecule of any protein with 100 peptide bonds. So we must look elsewhere for a mechanism to produce polymers. It will not happen in the ocean.
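The arithmetic in the paragraph above can be checked with a short sketch. It takes the two figures quoted there (10^-338 molar and 10^85 liters per universe) as given and works in base-10 logarithms, because 10^-338 is too small to represent as an ordinary floating-point number. The function name is hypothetical, chosen only for this illustration.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def log10_universes_per_molecule(log10_molar, log10_universe_liters):
    """Return log10 of the number of universe-volumes of solution needed
    to contain, on average, one molecule at the given molarity."""
    # molecules per liter = molarity * Avogadro's number (added in log space)
    log10_molecules_per_liter = log10_molar + math.log10(AVOGADRO)
    # liters needed to hold one molecule is the reciprocal
    log10_liters_needed = -log10_molecules_per_liter
    # express that volume in units of one universe
    return log10_liters_needed - log10_universe_liters

# Figures quoted in the text: 10^-338 molar, 10^85 liters per universe
result = log10_universes_per_molecule(-338, 85)
print(f"about 10^{result:.1f} universe volumes per molecule")
```

Under these inputs the exponent comes out slightly above 229, which agrees with the 10^229 universes stated in the text.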

Sidney Fox, an amino acid chemist, and one of my professors in graduate school, recognized the problem and set about constructing an alternative. Since water is unfavorable to peptide bond formation, the absence of water must favor the reaction. Fox attempted to melt pure crystalline amino acids in order to promote peptide bond formation by driving off water from the mix. He discovered to his dismay that most amino acids broke down to a tarry degradation product long before they melted. After many tries, he discovered two of the 20 amino acids, aspartic and glutamic acid, would melt to a liquid at about 200 °C. He further discovered that if he were to dissolve the other amino acids in the molten aspartic and glutamic acids, he could produce a melt containing up to 50% of the remaining 18 amino acids. It was no surprise then that the amber liquid, after cooking for a few hours, contained polymers of amino acids with some of the properties of proteins. He subsequently named the product proteinoids. The polymerized material can be poured into an aqueous solution, resulting in the formation of spherules of protein-like material which Fox has likened to cells. Fox has claimed nearly every conceivable property for his product, including that he had bridged the macromolecule to cell transition. He even went so far as to demonstrate a piece of lava rock could substitute for the test tube in proteinoid synthesis and claimed the process took place on the primitive earth on the flanks of volcanoes. However, his critics, as well as his own students, have stripped his credibility. Note the following problems:

1) Proteinoids are not proteins; they contain many non-peptide bonds and unnatural cross-linkages.

2) The peptide bonds they do contain are beta bonds, whereas all biological peptide bonds are alpha.

3) His starting materials are purified amino acids bearing no resemblance to the materials available in the "dilute soup." If one were to try the experiment with condensed "prebiological soup," tar would be the only product.

4) The ratio of 50% Glu and Asp necessary for success in these experiments bears no resemblance to the vastly higher ratio of Gly and Ala found in nearly all primitive earth synthesis experiments.

5) There is no evidence of information contained in the molecules.

All of his claims have failed the tests of rationality when examined carefully. As promising as his approach seemed in theory, the reality is catastrophic to the hopes of paleobiogeochemists.

A number of other approaches have been tried. The most optimistic of these is the use of clays. Clays are very thin, very highly ordered arrays of complex aluminum silicates with numerous other cations. In this environment, the basic amino groups tend to order and polymers of several dozen amino acids have been produced. While these studies have generated enthusiastic interest on the part of prebiological evolutionists, their relevance is quickly dampened by several factors.

1) While ordered amino acids joined by peptide bonds result, the product contains no meaningful information.

2) The clays exhibit a preference for basic amino acids.

3) No polymerization of amino acids results if free amino acids are used.

4) Pure activated amino acids attached to adenine must be used in order to drive the reaction toward polymerization. Adenylated amino acids are not exactly the most likely substrate to be floating about the prebiological ocean.

5) The resultant polymers are three dimensional rather than linear, as is required for biopolymers.

At least one optimistic scientist (Cairns-Smith, 1982) believes that the clay particles themselves formed the substance of the first organisms! In reality, the best one can hope for from such a scenario is a racemic polymer of proteinous and non-proteinous amino acids with no relevance to living systems.

A final chapter has recently been opened with the discovery of autocatalytic RNA molecules. These were originally received with great excitement by the prebiological evolutionists because they gave hope of alleviating the need to make proteins in the first cell. These so-called "ribozymes" proved incapable of rising to the occasion, however, for not only are the molecules themselves very limited in what they have been shown capable of doing, but the production of the precursors of RNA by any prebiological mechanism considered thus far is a problem at least as difficult as the one ribozymes purport to solve:

1) While ribose can be produced under simulated prebiological conditions via the formose reaction, it is a rare sugar in formaldehyde polymers (the prebiological mechanism believed to have given rise to sugars). In addition, the presence of nitrogenous substances such as amino acids in the reaction mixture would prevent sugar synthesis (Shapiro, 1988). Cairns-Smith (1993) has summarized the situation as follows: "Sugars are particularly trying. While it is true that they form from formaldehyde solutions, these solutions have to be far more concentrated than would have been likely in primordial oceans. And the reaction is quite spoilt in practice by just about every possible sugar being made at the same time - and much else besides. Furthermore the conditions that form sugars also go on to destroy them. Sugars quickly make their own special kind of tar - caramel - and they make still more complicated mixtures if amino acids are around."

2) When produced and condensed with a nucleotide base, a mixture of optical isomers results, only one of which is relevant to prebiological studies.

3) Polymerization of nucleotides is inhibited by the incorporation of such an enantiomorph.

4) While only 3'-5' polymers occur in biological systems, 5'-5' and 2'-5' polymers are favored in prebiological type synthetic reactions (Joyce and Orgel, 1993, but see Usher et al. for an interesting sidelight).

5) None of the 5 bases present in DNA/RNA are produced during HCN oligomerization in dilute solutions (the prebiological mechanism believed to give rise to nucleotide bases). And many other non-coding bases would compete during polymerization at higher concentrations of HCN.

In addition to the problems of synthesis of the precursors and the polymerization reactions, the whole scheme is dependent on the ability to synthesize an RNA molecule which is capable of making a copy of itself, a feat that so far has eluded strenuous efforts. The molecule must also perform some function vital to initiating life force. So far all of this talk of an "RNA World" remains wishful thinking best categorized as fiction. The most devastating indictment of the scheme however, is that it offers no clue as to how one gets from such a scheme to the DNA-RNA-Protein mechanism of all living cells. The fact that otherwise rational scientists would exhibit such rampant enthusiasm for this scheme so quickly reveals how little faith they have in all other scenarios for the origin of life, including the ones discussed above.

Guanosine triphosphate (GTP) is a high-energy nucleotide (not to be confused with a nucleoside) found in the cytoplasm or polymerised into RNA, where it supplies the guanine base. 17 Its properties are a result of its complex three-dimensional structure and the variety of different chemical groups it comprises.
Guanosine-5'-triphosphate (GTP) is a purine nucleoside triphosphate. It is one of the building blocks needed for the synthesis of RNA during the transcription process. Its structure is similar to that of the guanine nucleobase, the only difference being that nucleotides like GTP have a ribose sugar and three phosphates, with the nucleobase attached to the 1' and the triphosphate moiety attached to the 5' carbons of the ribose. It also has the role of a source of energy or an activator of substrates in metabolic reactions, like that of ATP, but more specific. It is used as a source of energy for protein synthesis and gluconeogenesis. GTP is essential to signal transduction, in particular with G-proteins, in second-messenger mechanisms where it is converted to guanosine diphosphate (GDP) through the action of GTPases. 16

Guanosine is a purine nucleoside comprising guanine attached to a ribose (ribofuranose) ring via a β-N9-glycosidic bond. Guanosine can be phosphorylated to become guanosine monophosphate (GMP), cyclic guanosine monophosphate (cGMP), guanosine diphosphate (GDP), and guanosine triphosphate (GTP). These forms play important roles in various biochemical processes such as synthesis of nucleic acids and proteins, photosynthesis, muscle contraction, and intracellular signal transduction (cGMP). 18

For the synthesis of purines, the following enzymes are required:

phosphoribosylamine-glycine ligase,
phosphoribosylglycinamide formyltransferase,
phosphoribosylformylglycinamidine synthase,
phosphoribosylformylglycinamidine cyclo-ligase, 20

Guanine is one of the four main nucleobases found in the nucleic acids DNA and RNA. 
For scientists attempting to understand how the building blocks of RNA originated on Earth, guanine -- the G in the four-letter code of life -- has proven to be a particular challenge. While the other three bases of RNA -- adenine (A), cytosine (C) and uracil (U) -- could be created by heating a simple precursor compound in the presence of certain naturally occurring catalysts, guanine had not been observed as a product of the same reactions.

How could and would random events attach a phosphate group to the right position of a ribose molecule to provide the necessary chemical activity? And how would non-guided random events be able to attach the nucleic bases to the ribose?  The coupling of a ribose with a nucleotide is the first step to form RNA, and even those engrossed in prebiotic research have difficulty envisioning that process, especially for purines and pyrimidines.”
The sugar found in the backbone of both DNA and RNA, ribose, has been particularly problematic, as the most prebiotically plausible chemical reaction schemes have typically yielded only a small amount of ribose mixed with a diverse assortment of other sugar molecules. 16

Glycosidic bond
The formation of nucleosides in abiotic conditions is a major hurdle in origin-of-life studies. The formamido pyrimidine-based syntheses are highly regioselective, moderately stereoselective, multi-step, only apply to purines and afford a mixture of furanosides and pyranosides. The prebiotic worth of these syntheses is inversely proportional to the procedural complexities involved, requiring numerous concentration, purification and supplementation steps, designed to specifically overcome intermediate reaction bottlenecks. 21

Guanosine monophosphate (GMP)

b CCA, a terminal sequence required for the function of all tRNAs, is added to the 3' ends of tRNA molecules for which this terminal sequence is not encoded in the DNA. The enzyme that catalyzes the addition of CCA is atypical for an RNA polymerase in that it does not use a DNA template. A third type of processing is the modification of bases and ribose units of ribosomal RNAs. 6 CCA is added by the CCA-adding enzyme (Figure below).

Transfer RNA precursor processing. 
The conversion of a yeast tRNA precursor into a mature tRNA requires the removal of a 14-nucleotide intron (yellow), the cleavage of a 5' leader (green), and the removal of UU and the attachment of CCA at the 3' end (red). In addition, several bases are modified.

Eukaryotic tRNAs are also heavily modified on base and ribose moieties; these modifications are important for function. In contrast with prokaryotic tRNAs, many eukaryotic pre-tRNAs are also spliced by an endonuclease and a ligase to remove an intron.

tRNA nucleotidyltransferase adds the invariant CCA terminus to the tRNA 3'-end, a central step in tRNA maturation. 7

Protein synthesis takes place in cytosolic ribosomes, mitochondria (mitoribosomes), and in plants, the plastids (chloroplast ribosomes). Each of these compartments requires a complete set of functional tRNAs to carry out protein synthesis. The production of mature tRNAs requires processing and modification steps such as the addition of a 3’-terminal cytidine-cytidine-adenosine (CCA). Since no plant tRNA genes encode this particular sequence, a tRNA nucleotidyltransferase must add this sequence post-transcriptionally and therefore is present in all three compartments. 8

c  EF-G (elongation factor G, historically known as translocase) is involved in protein translation. As a GTPase, EF-G catalyzes the movement (translocation) of transfer RNA (tRNA) and messenger RNA (mRNA) through the ribosome. EF-G is made up of 704 amino acids that form 5 domains, labeled Domain I through Domain V. 9

d The joining of an amino acid to a tRNA molecule to form an aminoacyl-tRNA is catalyzed by a specific enzyme called an aminoacyl-tRNA synthetase. 14 An aminoacyl-tRNA synthetase (aaRS or ARS), also called tRNA-ligase, is an enzyme that attaches the appropriate amino acid onto its tRNA. It does so by catalyzing the esterification of a specific cognate amino acid or its precursor to one of all its compatible cognate tRNAs to form an aminoacyl-tRNA. In humans, the 20 different types of aa-tRNA are made by the 20 different aminoacyl-tRNA synthetases, one for each amino acid of the genetic code. This is sometimes called "charging" or "loading" the tRNA with the amino acid. Once the tRNA is charged, a ribosome can transfer the amino acid from the tRNA onto a growing peptide, according to the genetic code. Aminoacyl tRNA therefore plays an important role in RNA translation, the expression of genes to create proteins. 13

e A ternary complex is a protein complex containing three different molecules that are bound together. 15

6. Stryer, Biochemistry, 8th edition, page 870
12. Stryer, Biochemistry, 8th edition, page 36


Last edited by Admin on Mon Sep 17, 2018 2:44 pm; edited 2 times in total



Forces Stabilizing Proteins - essential for their correct folding

Proteins are the most complex molecules in life and are involved in basically all biochemical processes. Human cells contain about 19,000 genes, which through splicing can produce over 6 million different proteins. Properly folded proteins are absolutely essential for a cell’s viability. In the context of the extremely crowded cellular environment, the folding of polypeptide chains into precise functional structures is a daunting task. 7

Fundamentals of biochemistry, fourth ed., starting at page 163: 
Studies of protein stability and renaturation suggest that protein folding is directed largely by the residues that occupy the interior of the folded protein. But how does a protein fold to its native conformation? One might guess that this process occurs through the protein’s random exploration of all the conformations available to it until it eventually stumbles onto the correct one. A simple calculation demonstrates that this cannot possibly be the case: If the protein could explore a new conformation every 10^-13 s (the rate at which single bonds reorient), the time t, in seconds, required for the protein to explore all possible conformations available to it until it discovers a functional fold, for a small protein of 100 residues, would be t = 10^87 s, which is immensely greater than the supposed age of the universe (~13.7 billion years ≈ 4.3 x 10^17 s). 5

Clearly, a random search would not have enough time to find functional folds.
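The textbook estimate can be reproduced with a few lines of arithmetic. This is only a sketch: the quoted passage gives the step time (10^-13 s) and the result (10^87 s), so the figure of 10^100 conformations for a 100-residue protein is an assumption inferred from those two numbers, and the function name is hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, in seconds

def log10_search_time_s(log10_conformations, log10_seconds_per_step):
    """log10 of the time needed to visit every conformation once."""
    return log10_conformations + log10_seconds_per_step

# Assumed: 10^100 conformations, one new conformation every 10^-13 s
log10_t = log10_search_time_s(100, -13)

# Age of the universe as quoted in the text: ~13.7 billion years
age_s = 13.7e9 * SECONDS_PER_YEAR

# How many orders of magnitude the search time exceeds that age
log10_ratio = log10_t - math.log10(age_s)

print(f"exhaustive search: 10^{log10_t} s")
print(f"age of universe:   {age_s:.1e} s")
print(f"ratio:             about 10^{log10_ratio:.0f}")
```

With these inputs the exhaustive search exceeds the quoted age of the universe by roughly 69 orders of magnitude, which is the gap the textbook passage describes.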

Experiments have shown that many proteins fold to their native conformations in less than a few seconds. This is because proteins fold to their native conformations via directed pathways rather than stumbling on them through random conformational searches.

It is remarkable how our textbook implicitly assumes design. Directed pathways mean that someone intelligent and conscious set up the right amino acid sequence by informing and instructing the DNA blueprint how to arrange the sequence in the right order to result in function-bearing folding. Foresight is required. As the calculations demonstrate, there would not be enough time since the beginning of the universe to try all possible foldings randomly until the right, functional one appeared. That is the case for a small peptide with 100 residues. What, then, about all the proteins required to kick-start life?

Experimental observations indicate that protein folding begins with the formation of local segments of secondary structure. This early stage of protein folding is extremely rapid, with much of the native secondary structure in small proteins appearing within 5 ms of the initiation of folding. Since native proteins contain compact hydrophobic cores, it is likely that the driving force in protein folding is what has been termed a hydrophobic collapse. The collapsed state is known as a molten globule, a species that has much of the secondary structure of the native protein but little of its tertiary structure. Theoretical studies suggest that helices and sheets form in part because they are particularly compact ways of folding a polypeptide chain.

Over the next 5 to 1000 ms, the secondary structure becomes stabilized and tertiary structure begins to form. During this intermediate stage, the nativelike elements are thought to take the form of subdomains that are not yet properly docked to form domains. In the final stage of folding, which for small, single-domain proteins occurs over the next few seconds, the protein undergoes a series of complex rearrangements in which it attains its relatively stable internal side chain packing and hydrogen bonding while it expels the remaining water molecules from its hydrophobic core.

In multidomain and multisubunit proteins, the respective units then assemble in a similar manner, with a few slight conformational adjustments required to produce the protein’s native tertiary or quaternary structure. Thus, proteins appear to fold in a hierarchical manner, with small local elements of structure forming and then coalescing to yield larger elements, which coalesce with other such elements to form yet larger elements, etc.

A folding protein must proceed from a high-energy, high-entropy state to a low-energy, low-entropy state. An unfolded polypeptide has many possible conformations (high entropy). As it folds into an ever-decreasing number of possible conformations, both its entropy and its free energy decrease.
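The statement above, that free energy falls even though the chain's entropy also falls, can be made explicit with the standard Gibbs relation. The split of the entropy term into a chain part and a solvent part is an assumption made here for illustration only; the symbols do not come from the quoted textbook passage.

```latex
\Delta G_{\text{fold}} = \Delta H_{\text{fold}} - T\left(\Delta S_{\text{chain}} + \Delta S_{\text{solvent}}\right)

\Delta S_{\text{chain}} < 0
\quad\Longrightarrow\quad
\Delta G_{\text{fold}} < 0 \ \text{only if}\
\Delta H_{\text{fold}} - T\,\Delta S_{\text{solvent}} < T\,\Delta S_{\text{chain}}
```

Because the right-hand side is negative, the loss of conformational entropy has to be more than offset by favorable enthalpy (bonding interactions) and by an entropy gain of the released water, which is consistent with the hydrophobic collapse described earlier.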

Evidently, proteins have evolved to have efficient folding pathways as well as stable native conformations.

This is an outrageous claim, based on no empirical scientific evidence, but just on the current, predominant naturalistic framework, which fills gaps of understanding with evolution. There is no reason to claim "evidently". Evidently evolution, based on what evidence? The problems with such an assertion are manifold. First of all, there was no evolution before self-replication of the first living organisms began. The life-essential proteome had to emerge either randomly or through design. As demonstrated above, the sequence space of possible amino acid combinations for a protein with just 100 amino acid residues is too large, and it would take too much time for a functional protein to emerge randomly. Goal-oriented, purposeful design by an intelligent creative agency is the only rational explanation of origins. Secondly, each protein has a minimum size needed to confer function; a protein may consist of just one subunit or of various subunits forming complex three-dimensional structures, with several allosteric sites, pockets, and cavities to host cofactors and their respective binding sites, all essential for function.

The function of proteins depends on the right sequence of amino acids, lined up into polypeptide chains of various sizes, ranging from short sequences, like extracellular hemoglobin with 140 amino acids, to monstrously large proteins, like titin, which is found in sarcomeres and is responsible for the passive elasticity of striated muscles. With lengths that vary from ~27,000 to ~33,000 amino acids (depending on the splice isoform), titin is the largest known protein. Proteins, in order to become functional, must fold into very specific 3D shapes, which happens right when they come out of the ribosome, where they are synthesized. Specific protein shape and conformation depend on the interactions between the amino acid side chains, but also on the solvent (water or lipid bilayer), the concentration of salts, the pH, the temperature, and the possible presence of cofactors and of molecular chaperones. 6

For a protein to function it must fold into a resting state which is a complex three-dimensional structure. If a protein fails to fold into its functional structure then it is not only without function but it can become toxic to the cell. As proteins fold, they test a variety of conformations before reaching their final form, which is unique and compact. Folded proteins are stabilized by thousands of noncovalent bonds between amino acids. A relatively small protein of only 100 amino acids can take some 10^100 different configurations. If it tried these shapes at the rate of 100 billion a second, it would take longer than the age of the universe to find the correct one. Just how these molecules do the job in nanoseconds, nobody knows. 3

Nobody knows, but since there was no evolution to produce the first proteome for the first living cells, there are two options: blind, unguided, random lucky events on early earth, or the creative act of a super intelligent agent.

The 3D shape depends on chemical forces. The chains are folded into a uniquely defined configuration, in which they are held by hydrogen bonds between the peptide nitrogen and oxygen atoms, which make a large contribution to protein stability. Other forces play an essential role as well: hydrophobic interactions, disulfide bonds, charge-charge (electrostatic) interactions on the surface of proteins, pure ionic interactions, and van der Waals forces, as well as the surroundings of proteins and their milieu. Chemical forces between a protein and its immediate environment contribute to protein shape and stability.

Van der Waals forces and interactions deserve a special note: their origins are ultimately quantum mechanical. (At least the induced dipole portions: how the electrons move with respect to each other and an external electrical field is driven by quantum mechanics.) 2

Remarkably, a quantum transition model fits the folding curves of 15 different proteins and even explains the difference in folding and unfolding rates of the same proteins. That means the shape could change by quantum transition: the protein could ‘jump’ from one shape to another without necessarily forming the shapes in between. Impressive stuff! 3

The 20 amino acids used in life are classified as acidic amino acids, basic amino acids with a net positive charge, polar but uncharged amino acids, and nonpolar or hydrophobic amino acids. This variety is critically important for the processes that drive protein chains to “fold,” that is, to form their natural (and functional) structures.

How could natural processes have foresight, which seems to be absolutely required, to "know" which amino acid sequences would provoke which forces, and how they would fold the protein structure so that it becomes functional for specific purposes within the cell? Consider that a minimal functional living cell requires at least 561 proteins and protein complexes to be fully set up, working, and interacting together to form a functional whole with all life-essential functions. 4

A minimal estimate for the gene content of the last universal common ancestor, 19 December 2005
A truly minimal estimate of the gene content of the last universal common ancestor, obtained by three different tree construction methods and the inclusion or not of eukaryotes (in total, there are 669 ortholog families distributed in 561 functional annotation descriptions, including 52 which remain uncharacterized)

So there had to be at least 561 proteins of various sizes, set up correctly in order to fold into functional 3D shapes, governed by chemical bond forces interacting between the individual amino acids. So not only is the right amino acid sequence essential but also as a side-effect the right dosage of the bond forces in between the amino acids.  

The problem becomes even more acute when we consider that many, if not most, proteins are built up from primary, secondary, tertiary, and quaternary structures. A quaternary structure refers to two or more polypeptide chains held together by intermolecular interactions to form a multi-subunit complex. And there are proteins which require cofactors, often made of trace metals like iron, molybdenum, etc. These cofactors require a pocket of the right size, binding residues at the right place, and a tunnel through which the cofactors can pass during biosynthesis, to be inserted at the precise location inside the protein.

Evolution of the correct protein foldings
Explaining the origin of correct protein folds is paramount to explaining the origin of life and biodiversity. It is therefore of significant interest to see how secular science papers explain their origin. Let's have a look at the following:

Protein folding as an evolutionary process  7 December 2008
To provide a description that is consistent with other natural processes, protein folding is formulated from the principle of increasing entropy. It then becomes evident that protein folding is an evolutionary process among many others. During the course of folding protein structural hierarchy builds up in succession by diminishing energy density gradients in the quest for a stationary state determined by surrounding density-in-energy. Evolution toward more probable states, eventually attaining the stationary state, naturally selects steeply ascending paths on the entropy landscape that correspond to steeply descending paths on the free energy landscape. 1

The problem with this explanation is massive; it is untenable, naive, and superficial. This explanation does not take into consideration that the right energy density at the right place must be precisely fine-tuned. Not just any kind of energy density anywhere within the protein structure will do; at each place on the polypeptide "ladder" there must be a force emanating from a given amino acid that interacts with the right strength with an adjacent or nearby amino acid, and that sums with other forces to confer the right fold. Trial and error, or natural selection, is too unspecific and random to produce a functional result.

5. Fundamentals of biochemistry, fourth ed., page 163

Last edited by Admin on Wed Sep 12, 2018 4:46 pm; edited 7 times in total


34 Chaperones on Wed Sep 12, 2018 9:31 am



In cells, many proteins require the assistance of molecular chaperones for their folding. 17 Chaperones are found in all types of cells and in every cellular compartment. They bind to target proteins to facilitate proper folding, prevent or reverse improper associations, and protect them from accidental degradation. 13 They are also involved in many macromolecular assembly processes, including the assembly of nucleosomes, protein transport in bacteria, assembly of bacterial pili, binding of transcription factors, and ribosome assembly in eukaryotes. A subset of molecular chaperones has even been implicated in signal transduction. This follows upon the discovery that steroid hormone receptors, which are cytoplasmic proteins, combine not only with their respective hormones but also require chaperones in order to form functioning recycling complexes.

Molecular chaperones are key players in the maintenance of proper protein folding and overall proteostasis. Chaperones are required by newly synthesized proteins both to ensure accurate folding and to prevent aggregation. Indeed, chaperones function both cotranslationally and in times of cellular stress. 14

Molecular chaperones help guide the folding of most proteins
Proteins fold in the presence of extremely high concentrations of other proteins with which they can potentially interact. Molecular chaperones are essential proteins that bind to unfolded and partially folded polypeptide chains to prevent the improper association of exposed hydrophobic segments that might lead to non-native folding as well as polypeptide aggregation and precipitation. This is especially important for multidomain and multisubunit proteins, whose components must fold fully before they can properly associate with each other. Molecular chaperones also induce misfolded proteins to refold to their native conformations. Many molecular chaperones were first described as heat shock proteins (Hsp) because their rate of synthesis is increased at elevated temperatures. Presumably, the additional chaperones are required to recover heat-denatured proteins or to prevent misfolding under conditions of environmental stress. 12

There are several classes of molecular chaperones in both prokaryotes and eukaryotes, including the following: 

1. The Hsp70 family of proteins are highly conserved 70-kD proteins in both prokaryotes and eukaryotes. In association with the cochaperone protein Hsp40, they facilitate the folding of newly synthesized proteins and reverse the denaturation and aggregation of proteins. Hsp70 proteins also function to unfold proteins in preparation for their transport through membranes and to subsequently refold them.
2. Trigger factor is a ribosome-associated chaperone in prokaryotes that prevents the aggregation of polypeptides as they emerge from the ribosome. Trigger factor and Hsp70 are the first chaperones a newly made prokaryotic protein encounters. Subsequently, many partially folded proteins are handed off to other chaperones to complete the folding process. Eukaryotes lack trigger factor but contain other small chaperones that have similar functions.
3. The chaperonins form large, multisubunit, cagelike assemblies in both prokaryotes and eukaryotes. They bind improperly folded proteins and induce them to refold inside an internal cavity.
4. The Hsp90 proteins are eukaryotic proteins that mainly facilitate the late stages of folding of proteins involved in cellular signaling. Hsp90 proteins are among the most abundant proteins in eukaryotes, accounting for up to 6% of cellular protein under stressful conditions that destabilize proteins.

All of these molecular chaperones operate by binding to an unfolded or aggregated polypeptide’s solvent-exposed hydrophobic surface and subsequently releasing it, often repeatedly, in a manner that facilitates its proper folding.

The Hsp70 family
In bacteria and eukaryotic cells, the classical Hsp70s have a central role in the cytosolic chaperone network. They interact with a multitude of nascent and newly synthesized polypeptides but have no direct affinity for the ribosome. Hsp70 chaperones function with cochaperones of the Hsp40 family (also known as DnaJ proteins or J proteins) and nucleotide exchange factors (NEFs). 23

Organization of chaperone pathways in the cytosol.
In bacteria (a), archaea (b), and eukarya (c), ribosome-bound chaperones [trigger factor (TF) in bacteria, nascent-chain-associated complex (NAC) in archaea and eukarya] aid folding cotranslationally by binding to hydrophobic segments on the emerging nascent chains. For longer nascent chains, members of the heat shock protein (Hsp)70 family (DnaK in bacteria and Hsp70 in eukarya), together with Hsp40s and nucleotide exchange factors (NEFs), mediate co- and posttranslational folding. In archaea lacking the Hsp70 system, prefoldin (PFD) assists in folding downstream of NAC. Partially folded substrates may be transferred to the chaperonins [GroEL-GroES in bacteria, thermosome in archaea, and tailless complex polypeptide-1 (TCP-1) ring complex (TRiC)/chaperonin-containing TCP-1 (CCT) in eukarya]. The Hsp90 system also receives its substrates from heat shock cognate 70 (Hsc70) and mediates their folding with additional cofactors. The insert in panel c shows the ribosome-binding chaperone system, the ribosome-associated complex (RAC), in fungi. RAC consists of Ssz1 (a specialized Hsp70) and zuotin (Hsp40) and assists nascent chain folding together with another Hsp70 isoform, Ssb. Percentages indicate the approximate protein flux through the various chaperones. 

Heat-shock proteins (Hsp70s) are essential chaperones required for key cellular functions. They partner with structurally diverse Hsp40s (J proteins), generating distinct chaperone networks in various cellular compartments that perform myriad housekeeping and stress-associated functions in all organisms. Hsp70 - J protein machines play an important role in fine-tuning cellular protein quality control. 

Hsp70 chaperone machineries have pivotal roles in an array of fundamental biological processes through their facilitation of protein folding, disaggregation, and remodeling. The obligate J-protein co-chaperones of Hsp70s drive much of this remarkable multifunctionality, with most Hsp70s having multiple J-protein partners. J-protein-driven versatility is substantially due to precise localization within the cell and the specificity of substrate protein binding. However, this relatively simple view belies the intricacy of J-protein function. Examples are emerging of J-protein interactions with Hsp70s and other chaperones, as well as integration into broader cellular networks. These interactions fine-tune, in critical ways, the ability of Hsp70s to participate in diverse cellular processes. 21

How did these proteins get finely tuned to exercise their functions? Trial and error? Evolution? How could it be by evolution, if intermediate stages would confer no function? 

J-Proteins Function in Many Cellular Processes. 
J-proteins are found in major cellular compartments: endoplasmic reticulum (ER, blue), cytosol and/or nucleus (Cyt/Nuc, purple), and mitochondria (Mito, orange). An overview is given of the cellular processes in which J-protein–Hsp70 chaperone systems function, with the adjacent small circles indicating the compartments in which the process occurs. Those processes listed on the left are, in general, carried out by J-proteins referred to as ‘general binders’, those that interact with a wide range of substrates. The ER and mitochondrial process of ‘Protein translocation across membranes’ is driven by J-proteins that are localized to a specific site of action where many different substrates emerge from the membrane. Remodeling of protein complexes generally involves ‘specific binders’, which interact with one or a few substrates and function in a specific cellular process, as discussed in the main text.  

Hsp70 Machinery Fundamentals. 
The two-domain architecture of Hsp70s [N-terminal nucleotide-binding domain (NBD) and C-terminal substrate-binding domain (SBD)] is key to their function, as is the interaction of J-protein and nucleotide exchange factor (NEF) co-chaperones, which, as indicated, stimulate ATPase activity and exchange of nucleotide, respectively. This figure also illustrates four functional keys to the interaction of Hsp70 with substrate: 
(i) Dramatic differences between the ATP- and ADP-bound conformations are key to substrate interaction on a biologically relevant timescale. When ATP is bound, the SBD is docked onto the NBD (left). When ADP is bound, the two domains do not interact but are tethered to each other only by a linker (right). The SBD contains two subdomains: one containing the substrate-binding cleft, the other a lid. Both subdomains, as well as the linker, interact with the NBD in the docked, ATP state. The lid is restrained by this interaction, giving the substrate easy, rapid access to the cleft. In the ADP state, when closed, the lid limits access of substrate to the cleft, but once bound stabilizes it. 
(ii) The two conformational states are not static, stable states, but are dynamic, with intermediates, such as the one shown in brackets with the lid undocked, but linker and cleft subdomain docked (center). 
(iii) The ATPase activity of the NBD serves as a switch between conformations. ATPase stimulation is the core activity of J-proteins. The J-domain (J) forms a finger-like structure that interacts at the interface between the NBD and SBD. Substrate interaction in the peptide-binding cleft also stimulates Hsp70 ATPase activity, with coordinated timing of stimulation with the J-domain likely being key to the productive binding of substrate (center and right). 
(iv) NEFs are also key to regulating the cycle by stimulating release of ADP, allowing binding of ATP, which is typically more abundant (bottom).
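The four points above can be walked through as a small state machine. This is only a minimal sketch, not a kinetic model: the class name `Hsp70`, its method names, and the `run_cycle` helper are hypothetical labels chosen for illustration. It also assumes, as a simplification, that the substrate leaves as soon as ATP rebinds after NEF-stimulated ADP release.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hsp70:
    """Toy model of one Hsp70 molecule (illustrative names, no kinetics)."""
    nucleotide: str = "ATP"          # "ATP" or "ADP"
    substrate: Optional[str] = None  # label of the bound polypeptide, if any

    @property
    def lid_open(self) -> bool:
        # ATP state: SBD docked onto the NBD, lid restrained, cleft accessible.
        return self.nucleotide == "ATP"

    def bind_substrate(self, name: str) -> None:
        # Substrate has easy, rapid access to the cleft only in the ATP state.
        if not self.lid_open:
            raise RuntimeError("cleft is not readily accessible in the ADP state")
        self.substrate = name

    def j_protein_stimulates_hydrolysis(self) -> None:
        # J-domain and bound substrate together stimulate the ATPase;
        # hydrolysis switches the conformation and the lid closes.
        if self.nucleotide != "ATP" or self.substrate is None:
            raise RuntimeError("needs bound ATP and a bound substrate")
        self.nucleotide = "ADP"

    def nef_exchanges_nucleotide(self) -> Optional[str]:
        # NEF stimulates ADP release; ATP rebinds and (assumed here)
        # the substrate is released as the lid reopens.
        if self.nucleotide != "ADP":
            raise RuntimeError("NEF acts on the ADP-bound state")
        released, self.substrate = self.substrate, None
        self.nucleotide = "ATP"
        return released


def run_cycle(chaperone: Hsp70, substrate: str) -> list:
    """Run one bind/hydrolyze/exchange cycle; record (nucleotide, lid_open, substrate)."""
    trace = []
    chaperone.bind_substrate(substrate)
    trace.append((chaperone.nucleotide, chaperone.lid_open, chaperone.substrate))
    chaperone.j_protein_stimulates_hydrolysis()
    trace.append((chaperone.nucleotide, chaperone.lid_open, chaperone.substrate))
    chaperone.nef_exchanges_nucleotide()
    trace.append((chaperone.nucleotide, chaperone.lid_open, chaperone.substrate))
    return trace
```

Calling `run_cycle(Hsp70(), "nascent_chain")` steps through the open ATP state with substrate bound, the closed ADP state holding the substrate, and the return to the open ATP state with the cleft empty. The intermediates described in point (ii) are deliberately left out.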

Hsp70 molecular chaperones function in a variety of critical cellular processes, including protein folding, translocation of proteins across membranes and assembly/disassembly of protein complexes. Hsp70 systems consist of a core Hsp70 protein and its co-chaperones: J-protein and nucleotide release factor (NRF). These co-chaperones regulate the cycle of interaction with protein substrate by stimulating the ATPase activity of Hsp70 (J-protein) and promoting nucleotide exchange (NRF). Compartments within the eukaryotic cell often contain multiple Hsp70s, J-proteins and NRFs. The capability of these systems to carry out diverse cellular functions results either from specialization of an Hsp70 or from the interaction of a multifunctional Hsp70 with an array of specialized J-proteins. Hsp70 chaperone systems are found in all compartments of the eukaryotic cell, where they function in many vital processes. Such a universal mechanism allows Hsp70 chaperones to perform many important functions, e.g., folding the polypeptide chain of newly synthesized proteins, modulating the interactions between proteins by affecting their conformation, or promoting polypeptide transport through cellular membranes. Hsp70 systems also facilitate the refolding of polypeptides that lost their native conformation under stress conditions, and, if refolding is unfeasible, redirect the polypeptides to intracellular proteolytic systems. 20

The structure of Hsp70-J domain. 
Hsp70 structure has been preserved in evolution. The ATPase domain (yellow) contains a deep slot with the ATP binding site. The substrate binding domain consists of a β-sheet slot (blue), which interacts with a group of hydrophobic amino acids and is covered by an α-helical lid (green). A flexible linker (red) allows the domains to interact allosterically (Figure by PDB id: 2KHO). Each functional J-domain contains the HPD tripeptide, which is necessary to stimulate the ATPase activity of a partner Hsp70.

The fundamental mitochondrial Hsp70 system is closely related to bacterial chaperones: Hsp70 DnaK, J-protein DnaJ and nucleotide exchange factor GrpE. Bacterial and yeast proteins reveal a significant degree of similarity in terms of their structure and sequence. The Ssc1/Mdj1/Mge1 protein system is located in the mitochondrial matrix, where it functions similarly to its bacterial equivalent (DnaK/DnaJ/GrpE). This system is responsible for the maintenance and propagation of mitochondrial DNA. The homologous bacterial proteins (DnaK/DnaJ/GrpE) modify the stability of the protein complex involved in the initiation of DNA replication. 

This fact may indicate that the Ssc1/Mdj1/Mge1 system has evolved under the influence of a selection pressure caused by the specificity of mitochondrial function. 

Mitochondria have their own genome (mtDNA) encoding a limited number of mitochondrial proteins, mitochondrial tRNA and subunits of mitochondrial ribosomes. Maintenance and replication of mtDNA is a prerequisite for the proper functioning of both the mitochondria themselves and the eukaryotic cell. These processes are managed by a complex of proteins associated with mtDNA, known as the mitochondrial nucleoid. The components of the Ssc1/Mdj1/Mge1 system occur in the nucleoid complex, as do many other mitochondrial proteins whose main activity is not evidently related to DNA metabolism. So far, little is known about the role of these proteins in the maintenance and replication of mtDNA. However, the results of research on Mdj1 are unambiguous. Both deletion of the gene encoding Mdj1 and substitution of the HPD sequence in the J-domain lead to a rapid loss of mtDNA, even under optimal conditions in yeast culture, in which the mitochondrial DNA polymerase is fully active. Moreover, most of Mdj1 is located in the nucleoid complex, probably through a direct interaction of Mdj1 with mtDNA. Although the molecular mechanism of Mdj1 in the maintenance and replication of mtDNA is not yet known, the results suggest that Mdj1 has to be located in the vicinity of the nucleoid complex and that it has to interact with its partner Hsp70 Ssc1. 

It is evident that mitochondrial DNA depends on the Ssc1/Mdj1/Mge1 system. That calls into question the claim that this system evolved under selection pressures, since, as described above, without it the rapid loss of mtDNA would mean non-function and death of the mitochondria. It can be safely concluded that this is an irreducible system that could not have evolved gradually, but had to emerge fully functional right from the beginning. 

One of the key functions of mitochondria is the synthesis of iron-sulfur centers (Fe/S), which are prosthetic groups (cofactors) of numerous cellular proteins. Fe/S centers occur, among others, in proteins involved in mitochondrial oxidative phosphorylation and in many other proteins located in all compartments of the eukaryotic cell. In bacteria, Fe/S transfer is promoted by a specialized Hsp70 system consisting of the Hsp70 HscA and the J-protein HscB. The system is present in all species of bacteria harboring the ISC pathway. 

Mitochondria have inherited most of the protein components of the ISC pathway from their bacterial ancestors.

Or they were co-created in parallel by the same intelligent designer. 

Evolution of Hsp70 proteins
The functional specificities of most plant J proteins in fundamental chaperone functions are conserved across long evolutionary timescales. 19 Phylogenetic analyses revealed five distinct Hsp70-groups. Our analyses suggest an independent evolution of the heat-inducible cytosol-type hsp70s in Paramecium and in its close relative Tetrahymena, as well as within higher eukaryotes. This result indicates convergent evolution during hsp70 subfunctionalization and implies that heat-inducibility evolved several times during the course of eukaryotic evolution. 18 

Convergent evolution is a common ad hoc assertion, invoked when the scientific evidence demonstrates the same function in a family of proteins but indicates unrelated gene sequences, rather than divergence from common ancestral genes, resulting in the same protein function.

The copy number variation among J-proteins can be explained by specific gene duplications that occurred along the branches of the tree of life. 21 Even in cases as well characterized as that of these three J-proteins, the mechanisms behind duplicate retention remain unclear. The origin of structural divergence among J-proteins is even more difficult to explain. Yet, gene duplications are likely also behind the more extreme differences, because they create a situation in which one copy can maintain its original functions, leaving the other ‘free’ to evolve. 

More examples of guesswork dressed up as science

Is gene duplication a viable explanation for the origination of biological information and complexity?
Although the process of gene duplication and subsequent random mutation has certainly contributed to the size and diversity of the genome, it is by itself insufficient to explain the origination of the highly complex information pertinent to the essential functioning of living organisms.

GroEL/ES Chaperonins
The chaperonins are essential for protein folding in all domains of life. The bacterial chaperonin GroEL and its cofactor GroES constitute the paradigmatic molecular machine of protein folding. Larger proteins and proteins with complex fold topologies often fail to reach their functional structure and instead aggregate. Such proteins tend to populate kinetically trapped non-native conformations during folding that expose hydrophobic amino acid residues to the solvent and thus are prone to form nonfunctional aggregates in a concentration-dependent manner. In the highly crowded cellular environment, aggregation is exacerbated as a result of excluded volume effects, which increase the effective concentrations of macromolecules. Indeed, it has become clear over the past 25 years that cells have evolved essential machineries, referred to as molecular chaperones, that assist protein folding primarily by preventing inappropriate interactions between non-native polypeptides that would otherwise lead to aggregation. 

Since chaperones had to exist to fold life-essential proteins, and an ur- or proto-chaperonin supposedly existed at LUCA, it could not have evolution as its origin. Also, if a polypeptide amino acid sequence were to emerge randomly on the prebiotic earth, why should there be a goal of proper folding bearing life-essential function? Molecules by themselves have no need, drive, or urge to become alive.

GroEL is a large double-ring cylinder with ATPase activity that binds non-native substrate protein (SP) via hydrophobic residues exposed towards the ring center. Binding of the lid-shaped GroES to GroEL displaces the bound protein into an enlarged chamber, allowing folding to occur unimpaired by aggregation. GroES and SP undergo cycles of binding and release, regulated allosterically by the GroEL ATPase. Recent structural and functional studies are providing insights into how the physical environment of the chaperonin cage actively promotes protein folding. 22

The GroEL/ES Chaperonin forms closed chambers in which proteins fold 
The chaperonins in E. coli consist of two types of subunits named GroEL and GroES. The X-ray structure of a GroEL–GroES–(ADP) complex reveals fourteen identical 549-residue GroEL subunits arranged in two stacked rings of seven subunits each. 

X-Ray structure of the GroEL–GroES–(ADP) complex. 
(a) A space-filling drawing as viewed perpendicularly to the complex’s sevenfold axis with the GroES ring orange, the cis ring of GroEL green and the trans ring of GroEL red with one subunit of each ring shaded more brightly. The dimensions of the complex are indicated. Note the different conformations of the two GroEL rings. The ADPs, whose binding sites are in the base of each cis ring GroEL subunit, are not seen because they are surrounded by protein.
(b) As in Part a but viewed along the sevenfold axis. (c) As in Part a but with the two GroEL subunits closest to the viewer in both the cis and trans rings removed to expose the interior of the complex. The level of fog increases with the distance from the viewer. Note the much larger size of the cavity formed by the cis ring and GroES in comparison to that of the trans ring.

This complex is capped at one end by a domelike heptameric ring of 97-residue GroES subunits to form a bullet-shaped complex with C7 symmetry. The two GroEL rings each enclose a central chamber in which partially folded proteins fold to their native conformations. A barrier in the center of the complex (Fig.c) prevents a folding protein from passing between the two GroEL chambers. The GroEL ring that contacts the GroES heptamer is called the cis ring; the opposing GroEL ring is known as the trans ring.

ATP binding and hydrolysis drive the conformational changes in GroEL/ES 
Each GroEL subunit has a binding pocket for ATP that catalyzes the hydrolysis of its bound ATP to ADP + Pi. When the cis ring subunits hydrolyze their bound ATP molecules and release the product Pi, the protein undergoes a conformational change that widens and elongates the cis inner cavity so as to more than double its volume from 85,000 Å³ to 175,000 Å³. (In the structure shown in the figure above, the cis ring has already hydrolyzed its seven molecules of ATP to ADP.) The expanded cavity can enclose a partially folded substrate protein. All seven subunits of the GroEL ring act in concert; that is, they are mechanically linked such that they change their conformations simultaneously. The cis and trans GroEL rings undergo conformational changes in a reciprocating fashion, with events in one ring influencing events in the other ring. The entire GroEL/ES chaperonin complex functions as follows:

Reaction cycle of the GroEL/ES chaperonin

1. One GroEL ring that has bound 7 ATP also binds an improperly folded substrate protein, which associates with hydrophobic patches that line the inner wall of the GroEL chamber. The GroES cap then binds to the GroEL ring like a lid on a pot, inducing a conformational change in the resulting cis ring that buries the hydrophobic patches, thereby depriving the substrate protein of its binding sites. This releases the substrate protein into the now enlarged and closed cavity, where it commences folding. The cavity, which is now lined only with hydrophilic groups, provides the substrate protein with an isolated microenvironment that prevents it from nonspecifically aggregating with other misfolded proteins. Moreover, the conformational change that buries GroEL’s hydrophobic patches stretches and thereby partially unfolds the improperly folded substrate protein before it is released. This rescues the substrate protein from a local energy minimum in which it had become trapped, thereby permitting it to continue its conformational journey down the folding funnel toward its native state (the state of lowest free energy).

2. Within ~10 s (the time the substrate protein has to fold), the cis ring catalyzes the hydrolysis of its 7 bound ATPs to ADP + Pi and the Pi is released. The absence of ATP’s phosphate group weakens the interactions that bind GroES to GroEL.

3. A second molecule of improperly folded substrate protein binds to the trans ring followed by 7 ATP. Conformational linkages between the cis and trans rings prevent the binding of both substrate protein and ATP to the trans ring until the ATP in the cis ring has been hydrolyzed.

4. The binding of substrate protein and ATP to the trans ring conformationally induces the cis ring to release its bound GroES, 7 ADP, and the presumably now better-folded substrate protein. This leaves ATP and substrate protein bound only to the trans ring of GroEL, which now becomes the cis ring as it binds GroES.
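The four steps above, with the two rings trading the cis and trans roles, can be sketched as a simple loop. This is a bookkeeping illustration only, with no timing or folding physics. The function and variable names are hypothetical, and it assumes that when no further substrate is waiting, ATP binding to the trans ring alone is enough to trigger the final release. The cavity volumes are the figures quoted in the text.

```python
CIS_CAVITY_BEFORE_A3 = 85_000   # cubic angstroms, before the conformational change
CIS_CAVITY_AFTER_A3 = 175_000   # cubic angstroms, after the conformational change
ATP_PER_RING = 7                # one ATP per GroEL subunit in a ring


def cavity_expansion():
    """Ratio of enlarged to original cis cavity volume."""
    return CIS_CAVITY_AFTER_A3 / CIS_CAVITY_BEFORE_A3


def empty_ring():
    return {"atp": 0, "adp": 0, "substrate": None, "groes": False}


def groel_cycle(substrates):
    """Pass substrates through alternating rings.

    Returns (substrates in order of release, total ATP hydrolyzed).
    """
    queue = list(substrates)
    released, atp_hydrolyzed = [], 0
    if not queue:
        return released, atp_hydrolyzed

    rings = [empty_ring(), empty_ring()]
    cis, trans = 0, 1
    # Step 1: a ring with 7 ATP binds substrate, then GroES caps it (cis ring).
    rings[cis].update(atp=ATP_PER_RING, substrate=queue.pop(0), groes=True)

    while True:
        # Step 2: the cis ring hydrolyzes its 7 ATP to ADP and releases Pi.
        rings[cis].update(atp=0, adp=ATP_PER_RING)
        atp_hydrolyzed += ATP_PER_RING
        # Step 3: only now can the trans ring bind substrate and 7 ATP.
        incoming = queue.pop(0) if queue else None
        rings[trans].update(atp=ATP_PER_RING, substrate=incoming)
        # Step 4: the cis ring releases GroES, 7 ADP and its substrate.
        released.append(rings[cis]["substrate"])
        rings[cis] = empty_ring()
        if incoming is None:
            break
        # The former trans ring binds GroES and becomes the new cis ring.
        cis, trans = trans, cis
        rings[cis]["groes"] = True

    return released, atp_hydrolyzed
```

In this sketch each substrate costs one ring's worth of ATP, so `groel_cycle(["A", "B", "C"])` releases the three substrates in order after 21 hydrolysis events, and `cavity_expansion()` gives a ratio slightly above 2, in line with "more than double".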

Most proteins probably do not fold correctly during their synthesis and require a special class of proteins called molecular chaperones to do so. Molecular chaperones are useful for cells because there are many different folding paths available to an unfolded or partially folded protein. Without chaperones, some of these pathways would not lead to the correctly folded (and most stable) form because the protein would become “kinetically trapped” in structures that are off-pathway. Some of these off-pathway configurations would aggregate and be left as irreversible dead ends of nonfunctional (and potentially dangerous) structures.

Molecular chaperones specifically recognize incorrect, off-pathway configurations by their exposure of hydrophobic surfaces, which in correctly folded proteins are typically buried in the interior. The binding of these exposed hydrophobic surfaces to each other is what causes off-pathway conformations to irreversibly aggregate. In some cases of inherited human diseases, aggregates do form and can cause severe symptoms and even death. Chaperones prevent this from happening in normal proteins by binding to the exposed hydrophobic surfaces using hydrophobic surfaces of their own. There are several types of chaperones; once bound to an incorrectly folded protein, they ultimately release it in a way that gives the protein another chance to fold correctly.

Protein "dressing room" has electronic walls 
Properly folded proteins are essential to all of life.  When a polypeptide, or chain of amino acids, emerges from the ribosome translation factory on its way to becoming a protein, it looks like a useless, shapeless piece of string.  It cannot perform its function till folded into a precise, compact shape particular for its job.  Some short polypeptides will spontaneously fold into their “native” state, ready for work, but many of the bigger ones need help.  Fortunately, the cell provides a private dressing room called the GroEL-GroES chaperonin that not only gives them privacy, away from the bustle of colliding molecules in the cytoplasm but actually helps them get correctly folded.  This chaperone or “helper” machine thus not only gets the actor ready for the stage faster but prevents misfolding that could clutter the cell with useless or harmful aggregates of protein. 8

The inside walls of the GroEL barrel and the inside walls of the GroES lid contain protrusions that generate electrostatic and hydrophobic forces on the interior space.  When the unfolded protein enters, therefore, it is subjected to gentle pressures that coax it to fold.  These forces are nonspecific enough to work on hundreds of different substrates that use this general-purpose machine.   The forces change during the entry of the nascent protein.  The interior is not barrel shaped when the actor approaches the door; the GroES lid, with the help of the energy molecule ATP, guides the protein in, and then the barrel pops into its shape, providing a safe haven for folding.  The electronic walls turn on to provide that gentle nudge to get the polypeptide over its energy barriers and into the right folding pathway.  When the protein has properly completed its folding after about 10 seconds, the door opens and the protein pops out, ready for operation.

How finely tuned is this machine?  The authors did some experiments on mutating the chaperone to make the barrel looser and tighter.  They found that volume changes as small as 2-5% slowed down the folding considerably. The barrel volume needs to be within certain narrow limits, yet general enough to handle a variety of small, medium and large proteins. 9

The GroEL/GroES nano-cage allows a single protein molecule to fold in isolation.  This reaction has been compared to spontaneous folding at infinite dilution.  However, recent experimental and theoretical studies indicate that the physical environment of the chaperonin cage can alter the folding energy landscape, resulting in accelerated folding for some proteins.  By performing an extensive mutational analysis of GroEL, we have identified three structural features of the chaperonin cage as major contributors to this capacity: 

1. geometric confinement exerted on the folding protein inside the limited volume of the cage; 
2. a mildly hydrophobic, interactive surface at the bottom of the cage; 
3. clusters of negatively charged amino acid residues exposed on the cavity wall.

These features in combination provide a physical environment that has been optimized to catalyze the structural annealing of proteins with kinetically complex folding pathways. Thus, the chaperonin system and its mutant versions may prove to be useful tools in understanding how proteins navigate their energy landscape of folding.

What we see here are molecular machines working with precision, efficiency, control – and design. Protein folding is a complex affair, wherein several domains of the polypeptide fold sequentially or simultaneously following an energy landscape (like a pinball negotiating obstacles) that leads to the completed protein. Some domains fold into a helix or sheet, or several, which then combine into larger structures.  Even then, after the protein exits the chaperone, there can be subsequent modifications: multiple proteins, for instance, might be joined into complexes, with metal ions inserted (as in hemoglobin or chlorophyll), and these proteins usually become part of networks.  Add to that now the exciting discovery that the walls of the chaperone barrel are interactive, coaxing the proteins to fold properly. At every stage, there is coordinated, synchronized, elegant design. Think about how these molecules operate in the blind.  They do not have eyes and brains telling them where to go – yet they succeed.  There is no analog in human technology; the closest, perhaps, is computer programming, but in life, at scales smaller than most of us can imagine, nano-factories operate with physical entities moving through space and time.  How fortunate we are to see these marvels unfold.  Our ancestors might have wondered at the mysteries of biological life, but could they in their wildest dreams have imagined the city-like organization at work at the molecular level?

What Chaperone Proteins Know 
Here's a riddle for you: Proteins are used to make proteins, so if we assume a purely naturalistic origin of life, where did the first proteins come from? 10

If a cell is a factory, proteins are the factory workers. Proteins conduct most of the necessary functions in a cell. Proteins are made up of amino acid building blocks. A chain of amino acids must fold into the appropriate three-dimensional structure so that the protein can function properly. Within cells are proteins known as chaperones that help fold the amino acid chain into its proper three-dimensional structure. If the amino acid chain folds improperly, then this could wreak havoc on the cell and potentially the entire organism. The chaperone works to prevent folding defects and is a key player in the final steps of protein synthesis.

However, as important as chaperones are, there are still many questions as to how exactly they work. For example, do the chaperones fold the amino acid chain while it is still being constructed (during translation), or is the amino acid chain first put together, and then the folding begins? Or is it some combination of both? Studies indicate that it is indeed a combination of both. There are two different kinds of chaperone proteins within the cell, one for translation and one for post-translation. With these two different kinds of chaperones, where and how does regulation happen to prevent misfolds?

Recent research on bacterial cells sheds light on the chaperones' important function. One chaperone in particular, Trigger Factor, plays a key role in correcting misfolds that may occur early on in the translational process. Trigger Factor can slow down improper amino acid folding, and it can even unfold amino acid chains that have already folded up incorrectly.

Here are some of the neat features of Trigger Factor:

Trigger Factor actually constrains protein folding more than the ribosome does. It doesn't just "get in the way" like the ribosome. It also regulates the folding.
Trigger Factor's function is specific to the particular region of the amino acid chain. It does not just perform one function no matter what the composition of the amino acid chain. It changes based on the region of the chain it is working with.
Trigger Factor also changes its activity based on where the protein is in the translation process.
Trigger Factor's process depends on how the amino acid chain is bound to the ribosome, and can even unfold parts of the chain that were misfolded in the translation process.
An additional factor that regulates when amino acid chains fold into proteins is its distance from the ribosome (the place where the amino acid chain is made). The closer the chain is to the ribosome, the less room it has to fold into a three-dimensional protein. Trigger Factor works with this spatial hindrance, making an interesting and complex regulation system.

Trigger Factor is only called into the game once the amino acid chain is a certain length (around 100 amino acids long) and when the chain has certain features, such as hydrophobicity. As the authors state it, Trigger Factor keeps the protein from folding into its three-dimensional structure until the amino acid chain has all of the information it needs to fold properly:

In summary, we show that the ribosome and TF each uniquely affect the folding landscape of nascent polypeptides to prevent or reverse early misfolds as long as important folding information is still missing and the nascent chain is not released from the ribosome.
So we have a protein that is able to perform various functions that inhibit or slow protein folding until the amino acid chain has the right chemical information for folding to occur.
This does not solve the riddle about proteins being made from proteins (otherwise known as the chicken-and-the-egg problem). It actually adds another twist to the riddle: How does one protein know how much information a completely different protein needs to fold into a three-dimensional structure? How does a protein evolve the ability to "know" how to respond to specific translational circumstances as Trigger Factor does?

Cells utilize several types of chaperones
Many molecular chaperones are called heat-shock proteins (designated hsp), because they are synthesized in dramatically increased amounts after a brief exposure of cells to an elevated temperature (for example, 42°C for cells that normally live at 37°C). This reflects the operation of a feedback system that responds to an increase in misfolded proteins (such as those produced by elevated temperatures) by boosting the synthesis of the chaperones that help these proteins refold. There are several major families of molecular chaperones, including the hsp60 and hsp70 proteins. Different members of these families function in different organelles. Thus, mitochondria contain their own hsp60 and hsp70 molecules that are distinct from those that function in the cytosol; and a special hsp70 (called BiP) helps to fold proteins in the endoplasmic reticulum. The hsp60 and hsp70 proteins each work with their own small set of associated proteins when they help other proteins to fold. These hsps share an affinity for the exposed hydrophobic patches on incompletely folded proteins, and they hydrolyze ATP, often binding and releasing their protein substrate with each cycle of ATP hydrolysis. In other respects, the two types of hsp proteins function differently. The hsp70 machinery acts early in the life of many proteins (often before the protein leaves the ribosome), with each monomer of hsp70 binding to a string of about four or five hydrophobic amino acids (see figure below). 7

The hsp70 family of molecular chaperones. These proteins act early, recognizing a small stretch of hydrophobic amino acids on a protein’s surface. Aided by a set of smaller hsp40 proteins (not shown), ATP-bound hsp70 molecules grasp their target protein and then hydrolyze ATP to ADP, undergoing conformational changes that cause the hsp70 molecules to associate even more tightly with the target. After the hsp40 dissociates, the rapid rebinding of ATP induces the dissociation of the hsp70 protein after ADP release. Repeated cycles of hsp binding and release help the target protein to refold.
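The idea that hsp70 recognizes a short string of about four or five hydrophobic amino acids can be illustrated with a small scan over a sequence. This is only a rough sketch: the set of residues counted as hydrophobic, the minimum run length, and the toy sequence are all assumptions chosen for illustration, and real hsp70 binding is certainly not this simple.

```python
# Illustrative only: the residue set and the minimum run length are assumptions,
# not values taken from the text or from any real binding-site predictor.
HYDROPHOBIC = set("AVLIMFW")  # one-letter codes treated here as hydrophobic


def hydrophobic_stretches(sequence, min_len=4):
    """Return (start, end) index pairs of runs of at least min_len hydrophobic residues.

    Indices are 0-based and end is exclusive, like Python slices.
    """
    stretches = []
    run_start = None
    for i, residue in enumerate(sequence.upper()):
        if residue in HYDROPHOBIC:
            if run_start is None:
                run_start = i  # a new hydrophobic run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                stretches.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(sequence) - run_start >= min_len:
        stretches.append((run_start, len(sequence)))
    return stretches


toy_sequence = "MKTLLVIAGSDEKRAVLIFWQ"  # hypothetical sequence, made up for the demo
print(hydrophobic_stretches(toy_sequence))
```

Each reported stretch marks a place where, in this toy picture, an hsp70 monomer could grab the chain.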

On binding ATP, hsp70 releases the protein into solution, allowing it a chance to re-fold. In contrast, hsp60-like proteins form a large barrel-shaped structure that acts after a protein has been fully synthesized. This type of chaperone, sometimes called a chaperonin, forms an “isolation chamber” for the folding process (Figure below).

The structure and function of the hsp60 family of molecular chaperones. (A) A misfolded protein is initially captured by hydrophobic interactions with the exposed surface of the opening. The initial binding often helps to unfold a misfolded protein. The subsequent binding of ATP and a cap releases the substrate protein into an enclosed space, where it has a new opportunity to fold. After about 10 seconds, ATP hydrolysis occurs, weakening the binding of the cap. Subsequent binding of additional ATP molecules ejects the cap, and the protein is released. As indicated, only half of the symmetric barrel operates on a client protein at any one time. This type of molecular chaperone is also known as a chaperonin; it is designated as hsp60 in mitochondria, TCP1 in the cytosol of vertebrate cells, and GroEL in bacteria. (B) The structure of GroEL bound to its GroES cap, as determined by x-ray crystallography. On the left is shown the outside of the barrel-like structure and on the right a cross-section through its center.

To enter a chamber, a substrate protein is first captured via the hydrophobic entrance to the chamber. The protein is then released into the interior of the chamber, which is lined with hydrophilic surfaces, and the chamber is sealed with a lid, a step requiring ATP. Here, the substrate is allowed to fold into its final conformation in isolation, where there are no other proteins with which to aggregate. When ATP is hydrolyzed, the lid pops off, and the substrate protein, whether folded or not, is released from the chamber.  Chaperones often need many cycles of ATP hydrolysis to fold a single polypeptide chain correctly. This energy is used to perform mechanical movements of the hsp60 and hsp70 “machines,” converting them from binding forms to releasing forms. Just as we saw for transcription, splicing, and translation, the expenditure of free energy can be used by cells to improve the accuracy of a biological process. In the case of protein folding, ATP hydrolysis allows chaperones to recognize a wide variety of misfolded structures, to halt any further misfolding, and to recommence the folding of a protein in an orderly way.

Question: Could this recognition mechanism have arisen by natural means? If so, why would chance, evolution, or any other proposed natural mechanism produce a device that performs production control, repair, and optimization so accurately? Is that not a goal-driven outcome that has to be invented and programmed? Or does mindless matter have goal-driven purposes? That is hard to fathom.

Although our discussion focuses on only two types of chaperones, the cell has a variety of others. The enormous diversity of proteins in cells presumably requires a wide range of chaperones with versatile surveillance and correction capabilities.

Exposed Hydrophobic Regions Provide Critical Signals for Protein Quality Control
If radioactive amino acids are added to cells for a brief period, the newly synthesized proteins can be followed as they mature into their final functional forms. This type of experiment demonstrates that the hsp70 proteins act first, beginning when a protein is still being synthesized on a ribosome, and the hsp60-like proteins act only later to help fold completed proteins. We have seen that the cell distinguishes misfolded proteins, which require additional rounds of ATP-catalyzed refolding, from those with correct structures through the recognition of hydrophobic surfaces. Usually, if a protein has a sizable exposed patch of hydrophobic amino acids on its surface, it is abnormal: it has either failed to fold correctly after leaving the ribosome, suffered an accident that partly unfolded it at a later time, or failed to find its normal partner subunit in a larger protein complex. Such a protein is not merely useless to the cell, it can be dangerous. Proteins that rapidly fold correctly on their own do not display such patterns and generally bypass the chaperones. For the others, the chaperones can carry out “protein repair” by giving them additional chances to fold while, at the same time, preventing their aggregation.

The processes that monitor protein quality following protein synthesis. A newly synthesized protein sometimes folds correctly and assembles on its own with its partner proteins, in which case the quality control mechanisms leave it alone. Incompletely folded proteins are helped to properly fold by molecular chaperones: first by a family of hsp70 proteins, and then, in some cases, by hsp60-like proteins. For both types of chaperones, the substrate proteins are recognized by an abnormally exposed patch of hydrophobic amino acids on their surface. These “protein-rescue” processes compete with another mechanism that, upon recognizing an abnormally exposed hydrophobic patch, marks the protein for destruction by the proteasome. The combined activity of all of these processes is needed to prevent massive protein aggregation in a cell, which can occur when many hydrophobic regions on proteins clump together nonspecifically.

The figure above outlines all of the quality-control choices that a cell makes for a difficult-to-fold, newly synthesized protein. As indicated, when attempts to refold a protein fail, an additional mechanism is called into play that completely destroys the protein by proteolysis. This proteolytic pathway begins with the recognition of an abnormal hydrophobic patch on a protein’s surface, and it ends with the delivery of the entire protein to a protein-destruction machine, a complex protease known as the proteasome. As described next, this process depends on an elaborate protein-marking system that also carries out other central functions in the cell by destroying selected normal proteins.
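The quality-control choices described above can be written down as a small decision flow. This is a deliberately simplified sketch: the text says the rescue and destruction pathways compete, whereas the code below just tries them in sequence, and the cap on rescue attempts and the `refold` callback are assumptions introduced only for illustration.

```python
def quality_control(folded_on_its_own, refold, max_attempts=3):
    """Walk a toy protein through the quality-control choices described above.

    folded_on_its_own: True if the protein folded correctly without help.
    refold: hypothetical callback; takes the chaperone name and returns True
            if that rescue attempt produced a correctly folded protein.
    max_attempts: assumed cap on rescue attempts (not a value from the text).
    Returns the list of steps the protein went through.
    """
    if folded_on_its_own:
        return ["left alone"]
    history = []
    for attempt in range(max_attempts):
        # hsp70 acts first; later attempts go to the hsp60-like chaperonin.
        chaperone = "hsp70" if attempt == 0 else "hsp60-like"
        history.append(chaperone)
        if refold(chaperone):
            history.append("folded")
            return history
    # All rescue attempts failed: the protein is handed to the proteasome.
    history.append("proteasome")
    return history


# Example: a protein that only folds inside the chaperonin chamber.
print(quality_control(False, lambda chaperone: chaperone == "hsp60-like"))
```

The point of the sketch is only the order of the choices: no intervention for proteins that fold on their own, chaperone rescue for those with exposed hydrophobic patches, and destruction as the last resort.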

Chaperone-assisted protein folding 
Protein folding is the process by which newly synthesized polypeptide chains acquire the three-dimensional structures necessary for biological function. For many years, protein folding was believed to occur spontaneously, on the basis of the pioneering experiments of Christian Anfinsen, who showed in the late 1950s that purified proteins can fold on their own after removal from denaturant. Anfinsen had discovered the fundamental principle that the linear amino acid sequence holds all the information necessary to specify a protein’s three-dimensional structure. But it soon became apparent that test-tube folding experiments work mostly for small, single-domain proteins, often only in conditions far removed from those encountered in a cell. Large proteins frequently fail to reach native state under these experimental conditions, forming nonfunctional aggregates instead. 5

That raises interesting questions: How could nonintelligent natural mechanisms foresee the necessity of chaperones in order to reach a specific goal, namely functional proteins that make up living organisms? Non-living matter has no natural "drive", purpose, or goal to become living. The making of proteins to create life, however, is a multistep process involving many complex metabolic pathways acting in parallel and production-line-like processes that make proteins and other life-essential products such as lipids and carbohydrates. The right folding of proteins is just one of several essential processes needed to get a functional protein. But a functional protein on its own has no function unless it is correctly embedded, through the right order of assembly, at the right place.

Despite these problems, protein folding was of little interest to cell biologists until the mid- and late 1980s, when the chaperone story began to unfold. As a result, we now know that in cells, many (perhaps most) proteins require molecular chaperones and metabolic energy to fold efficiently and at a biologically relevant rate. Here I describe, from a personal perspective, the developments leading to this new view.

GroEL binds its substrates in a loosely folded, ‘molten globule’–like conformation, exposing hydrophobic surfaces. As proteins in such states tend to aggregate, their binding by GroEL explained how aggregation is prevented. We also obtained evidence that at least partial folding occurred in association with GroEL and that this process was dependent on the presence of GroES, suggesting an encapsulation mechanism. The GroEL complex consists of two stacked, heptameric rings. The new images revealed that GroEL binds the unfolded protein in the ring center. GroES, a heptameric ring of ~10 kDa subunits, binds like a lid over the central GroEL cavity, causing major conformational changes in the interacting GroEL subunits. The GroEL-GroES complex is asymmetrical and highly dynamic, with GroES binding and unbinding in a mechanism regulated by the GroEL ATPase. GroEL and GroES essentially function as a folding cage. The creator's solution to the problem of protein folding in the crowded cellular environment seemed extremely impressive in its simplicity and elegance: a single protein molecule folding in a macromolecular cage would be unable to aggregate. However, our evidence to support this model was still indirect, and many researchers had difficulty accepting the idea of a proteinaceous folding cage. The crystal structure of GroEL, solved in 1994 by the late Paul Sigler in collaboration with Art Horwich, seemed to support the folding-in-solution model: the central cavity of GroEL was simply not wide enough for even a relatively small protein such as rhodanese to fit. However, this interpretation of the structure failed to take into account that GroEL cooperates with GroES. Interestingly, a month before the crystal structure was published, Helen Saibil and her colleagues had shown by advanced electron cryomicroscopy that GroES binding causes a dramatic conformational change in GroEL, resulting in the formation of a large space capped by GroES.
ATP-dependent GroES binding results in protein displacement and encapsulation in the central cavity (the so-called cis complex). The cage opens again after ~10 s in a reaction timed by the allosterically regulated GroEL ATPase: when the seven ATP molecules in the GroES-bound GroEL ring have been hydrolyzed, ATP binds to the trans ring, triggering the signal that causes GroES to unbind. Using quantitative proteomics in collaboration with Matthias Mann, we showed in 2005 that at least 250 different proteins interact with GroEL upon synthesis (~10% of cytosolic proteins).

How could natural mechanisms have achieved that?

Of these, 60–80 proteins are absolutely GroEL-GroES-dependent, including a number of essential proteins. They are generally below ~60 kDa in size and can be accommodated by the chaperonin cage. Interestingly, many GroEL-dependent proteins have complex fold topologies comprising a mixture of α-helices and β-sheets, such as the TIM barrel, and are known for their tendency to populate kinetically trapped states during folding. We suggested that the confining environment of the chaperonin cage not only prevents aggregation but also can smooth rugged folding energy landscapes, allowing folding to occur within a biologically relevant time frame.

Cells contain at least one other type of ATP-dependent chaperone, Hsp70, which also interacts with newly synthesized proteins. But what was the relationship between the Hsp70 and chaperonin systems, and did they cooperate in protein folding? There was evidence that Hsp70 binds hydrophobic peptides and can associate with nascent polypeptide chains emerging from ribosomes, that is, at a stage when the polypeptide is structurally incomplete and not yet capable of folding. Taking this into consideration, we envisioned a coherent pathway in which Hsp70 would interact with the (growing) polypeptide chain, preventing premature misfolding and aggregation (the negative principle), and then GroEL-GroES would mediate folding of the completed protein to the native state (the positive principle).

7. Molecular biology of the Cell, 6th ed. page 355
12. Fundamentals of biochemistry, fourth ed., page 166


Last edited by Admin on Sat Sep 15, 2018 7:23 am; edited 9 times in total



Chaperone machines for protein folding, unfolding and disaggregation  1

a. Overview of unliganded (apo) GroEL (left) and the GroEL–GroES complex  (right). The overall shapes are shown as blue surfaces, with three subunits colored by domain in red, green and yellow in apo GroEL. One subunit of GroEL and one of GroES (cyan) are highlighted in the GroEL–GroES complex.
b. Conformation of a GroEL subunit in the apo form (left) and the GroES-bound form (right), with GroEL key sites indicated (GroES is not shown).
c. Cartoons of complexes with folding proteins. Hydrophobic surfaces and residues are shown in yellow and polar residues in green.
d. Cut-open view of the cryo-electron microscopy structure of GroEL in complex with bacteriophage capsid protein, with a non-native gp23 bound to both rings. The pink density in the folding chamber corresponds to newly folded gp23, and the yellow density in the open ring is part of a non-native gp23 subunit. The corresponding atomic structures are shown embedded in the electron microscopy density map, except for the non-native substrate, which is unknown and only partially visualized owing to disorder. The open ring with its hydrophobic lining is the acceptor state for non-native polypeptides, and binding to multiple sites may facilitate unfolding. ATP and GroES binding to the chaperonin create a protected chamber with a hydrophilic lining that allows the encapsulated protein to fold.

Such assemblies, called chaperonins, also exist in other cellular compartments and are essential components, mediating protein folding under both heat shock and normal conditions. Ever since 1987, we've been studying these fascinating molecules both in vivo and in vitro, with particular emphasis on the Hsp60 homolog in E. coli known as GroEL.  We and others found early on that a chaperonin-mediated folding reaction can be reconstituted in a test tube, and that has enabled structural and functional studies that have begun to explain how chaperonins work. In particular, a combination of crystallographic studies, with the late Paul Sigler's group here at Yale, and functional studies, using dynamic studies of a variety of mutant chaperonins, have begun to reveal how these chaperonins work.  The schematic diagram below summarizes our current view of the chaperonin-mediated protein folding pathway. 2

(a,b) GroEL alone.
(c) GroEL-unfolded rhodanese.
(d) GroEL-unfolded rhodanese-GroES. 
(e) GroEL-GroES complexes

End-on views are shown in a, c and d and side views in b and e. In e, GroES sits like a lid on the GroEL cavity, causing a conformational change in the outer domains of the interacting GroEL subunits.

Hsp70 chaperones: Cellular functions and molecular mechanism 
Hsp70 proteins are central components of the cellular network of molecular chaperones and folding catalysts. They assist a large variety of protein folding processes in the cell by transient association of their substrate binding domain with short hydrophobic peptide segments within their substrate proteins. The substrate binding and release cycle is driven by the switching of Hsp70 between the low-affinity ATP bound state and the high-affinity ADP bound state. Thus, ATP binding and hydrolysis are essential in vitro and in vivo for the chaperone activity of Hsp70 proteins. This ATPase cycle is controlled by co-chaperones of the family of J-domain proteins, which target Hsp70s to their substrates, and by nucleotide exchange factors, which determine the lifetime of the Hsp70-substrate complex. Additional co-chaperones fine-tune this chaperone cycle. For specific tasks the Hsp70 cycle is coupled to the action of other chaperones, such as Hsp90 and Hsp100. 3


a. In the ADP-bound or nucleotide-free state, the nucleotide-binding domain (green; Protein Data Bank (PDB) code: 3HSC) of heat shock protein 70 (HSP70) is connected by a flexible linker to the substrate-binding domain (blue; PDB code: 1DKZ), with the lid domain (red) locking a peptide substrate (yellow) into the binding pocket. A side view of the substrate domain is shown on the right. A cartoon depicting the two-domain complex is shown below. The bound nucleotide is shown in space-filling format.
b. In the ATP-bound state, the lid opens, and both the lid and the substrate-binding domain dock to the nucleotide-binding domain (PDB code: 4B9Q). The corresponding cartoon of this conformation is shown below. When ATP binds, the cleft closes, triggering a change on the outside of the nucleotide-binding domain that creates a binding site for the linker region. Linker binding causes the substrate-binding domain and the lid domain to bind different sites on the nucleotide-binding domain, resulting in a widely opened substrate-binding site that enables rapid exchange of polypeptide substrates. After hydrolysis, the domains separate and the lid closes over the bound substrate. Such binding and release of extended regions of polypeptide chain are thought to unfold and stabilize non-native proteins either for correct folding or degradation.
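The Hsp70 binding and release cycle described above (a low-affinity ATP-bound state with the lid open, a high-affinity ADP-bound state with the lid closed) can be sketched as a two-state machine. The class and method names below are hypothetical and introduced only for illustration; the model leaves out kinetics, co-chaperone details, and everything else that makes the real machine work.

```python
from enum import Enum


class Nucleotide(Enum):
    ATP = "ATP"
    ADP = "ADP"


class Hsp70:
    """Toy two-state model of the Hsp70 cycle (names are illustrative assumptions)."""

    def __init__(self):
        self.nucleotide = Nucleotide.ATP  # start in the open, low-affinity state
        self.substrate = None

    @property
    def affinity(self):
        # ATP-bound: low affinity, lid open; ADP-bound: high affinity, lid closed.
        return "low" if self.nucleotide is Nucleotide.ATP else "high"

    def bind_and_hydrolyze(self, substrate):
        """A J-domain protein delivers a substrate; ATP hydrolysis locks it in."""
        if self.nucleotide is not Nucleotide.ATP:
            raise RuntimeError("Hsp70 must be ATP-bound to accept a substrate")
        self.substrate = substrate
        self.nucleotide = Nucleotide.ADP

    def exchange_nucleotide(self):
        """A nucleotide exchange factor swaps ADP for ATP; the substrate is released."""
        if self.nucleotide is not Nucleotide.ADP:
            raise RuntimeError("nothing to exchange in the ATP-bound state")
        released, self.substrate = self.substrate, None
        self.nucleotide = Nucleotide.ATP
        return released


chaperone = Hsp70()
chaperone.bind_and_hydrolyze("nascent chain")
print(chaperone.affinity)               # state while the substrate is held
print(chaperone.exchange_nucleotide())  # substrate handed back for another folding attempt
```

Repeating the two calls in a loop corresponds to the repeated cycles of binding and release that give the substrate several chances to fold.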

1 and 2. The nascent chain is stabilized in a folding-competent state during translation by the Hsp70 chaperone system (DnaK, DnaJ)  
3. These chaperones bind hydrophobic segments exposed by the extended chain that will later be buried within the folded structure. Upon completion of translation, the protein is unable to fold using the Hsp70 chaperone system and must be transferred into the central cavity of GroEL. This step requires GrpE, the nucleotide exchange factor of DnaK 
4. After binding of the protein in a molten globule-like conformation into the open ring of GroEL 
5. the protein is encapsulated by GroES in the folding cage 
6. Folded protein emerges from the cage as GroES unbinds 

The model was later extended to include the cooperation of DnaK with the ribosome-bound chaperone trigger factor and the finding that the Hsp70 system mediates the folding of proteins that do not require the physical environment of the chaperonin cage.

Substrate binding to GroEL (upon transfer from the upstream chaperone Hsp70) may result in local unfolding. 
ATP binding then triggers a conformational rearrangement of the GroEL apical domains. This is followed by the binding of GroES (forming the cis complex) and substrate encapsulation for folding. At the same time, ADP and GroES dissociate from the opposite (trans) GroEL ring, allowing the release of substrate that had been enclosed in the former cis complex (omitted for simplicity). The substrate remains encapsulated, free to fold, for the time needed to hydrolyze the seven ATP molecules in the newly formed cis complex (~10 s). Binding of ATP and GroES to the trans ring causes the opening of the cis complex.
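The numbers given for the GroEL-GroES cycle (seven ATP molecules hydrolyzed per cis complex, about 10 seconds of encapsulation, and the two rings taking turns) allow a quick back-of-the-envelope sketch. How many cycles a given protein needs is not stated above, so the cycle count is simply an input, and the function and ring names are illustrative assumptions.

```python
ATP_PER_CYCLE = 7     # one ATP per subunit of the heptameric cis ring (from the text)
CYCLE_SECONDS = 10.0  # approximate encapsulation time per cycle (from the text)


def folding_cost(cycles):
    """Return (ATP molecules hydrolyzed, seconds spent encapsulated) for a number of cycles."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return cycles * ATP_PER_CYCLE, cycles * CYCLE_SECONDS


def cis_ring_schedule(cycles):
    """Return which ring is GroES-capped (cis) in each cycle; the rings alternate."""
    return ["ring A" if i % 2 == 0 else "ring B" for i in range(cycles)]


atp, seconds = folding_cost(3)  # a hypothetical substrate needing three rounds
print(atp, seconds)
print(cis_ring_schedule(3))
```

In this simple picture, a substrate that needs three rounds of encapsulation costs 21 ATP molecules and spends roughly half a minute inside the cage.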

Biochemists trap a chaperone machine in action 6
Molecular chaperones have emerged as exciting new potential drug targets, because scientists want to learn how to stop cancer cells, for example, from using chaperones to enable their uncontrolled growth. Now a team of biochemists at the University of Massachusetts Amherst led by Lila Gierasch have deciphered key steps in the mechanism of the Hsp70 molecular machine by "trapping" this chaperone in action, providing a dynamic snapshot of its mechanism.

She and colleagues describe this work in the current issue of Cell. Gierasch's research on Hsp70 chaperones is supported by a long-running grant to her lab from NIH's National Institute for General Medical Sciences.

Molecular chaperones like the Hsp70s facilitate the origami-like folding of proteins, made in the cell's nanofactories or ribosomes, from where they emerge unstructured like noodles. Proteins only function when folded into their proper structures, but the process is so difficult under cellular conditions that molecular chaperone helpers are needed.

The newly discovered information about chaperone action is important because all rapidly dividing cells use a lot of Hsp70, Gierasch points out. "The saying is that cancer cells are addicted to Hsp70 because they rely on this chaperone for explosive new cell growth. Cancer shifts our body's production of Hsp70 into high gear. If we can figure out a way to take that away from cancer cells, maybe we can stop the out-of-control tumor growth. To find a molecular way to inhibit Hsp70, you've got to know how it works and what it needs to function, so you can identify its vulnerabilities."

Chaperone proteins in cells, from bacteria to humans, act like midwives or bodyguards, protecting newborn proteins from misfolding and existing proteins against loss of structure caused by stress such as heat or a fever. In fact, the heat shock protein (Hsp) group includes a variety of chaperones active in both these situations.

As Gierasch explains, "New proteins emerge into a challenging environment. It's very crowded in the cell and it would be easy for them to get their sticky amino acid chains tangled and clumped together. Chaperones bind to them and help to avoid this aggregation, which is implicated in many pathologies such as neurodegenerative diseases. This role of chaperones has also heightened interest in using them therapeutically."

However, chaperones must not bind too tightly or a protein can't move on to do its job. To avoid this, chaperones rapidly cycle between tight and loose binding states, determined by whether ATP or ADP is bound. In the loose state, a protein client is free to fold or to be picked up by another chaperone that will help it fold to do its cellular work. In effect, Gierasch says, Hsp70s create a "holding pattern" to keep the protein substrate viable and ready for use, but also protected.

She and colleagues knew the Hsp70's structure in both tight and loose binding affinity states, but not what happened between, which is essential to understanding the mechanism of chaperone action. Using the analogy of a high jump, they had a snapshot of the takeoff and landing, but not the top of the jump. "Knowing the endpoints doesn't tell us how it works. There is a shape change in there that we wanted to see," Gierasch says.

To address this, she and her colleagues, postdoctoral fellows Anastasia Zhuravleva and Eugenia Clerico, obtained "fingerprints" of the structure of Hsp70 in different states by using state-of-the-art nuclear magnetic resonance (NMR) methods that allowed them to map how chemical environments of individual amino acids of the protein change in different sample conditions. Working with an Hsp70 known as DnaK from E. coli bacteria, Zhuravleva and Clerico assigned its NMR spectra. In other words, they determined which peaks came from which amino acids in this large molecule.

The UMass Amherst team then mutated the Hsp70 so that cycling between tight and loose binding states stopped. As Gierasch explains, "Anastasia and Eugenia were able to stop the cycle part-way through the high jump, so to speak, and obtain the molecular fingerprint of a transient intermediate." She calls this accomplishment "brilliant."

Now that the researchers have a picture of this critical allosteric state, that is, one in which events at one site control events in another, Gierasch says many insights emerge. For example, it appears nature uses this energetically tense state to "tune" alternate versions of Hsp70 to perform different cellular functions. "Tuning means there may be evolutionary changes that let the chaperone work with its partners optimally," she notes.

"And if you want to make a drug that controls the amount of Hsp70 available to a cell, our work points the way toward figuring out how to tickle the molecule so you can control its shape and its ability to bind to its client. We're not done, but we made a big leap," Gierasch adds. "We now have an idea of what the Hsp70 structure is when it is doing its job, which is extraordinarily important."

How chaperones emerged based on naturalistic explanations:
In the beginning, you didn't need chaperones because every protein folded rapidly on its own. Some of these primitive proteins might have been a bit slow to fold so the evolution of the first chaperones was advantageous because it enhanced the rate of folding for these proteins. The chaperones weren't absolutely necessary for survival but they conferred a selective advantage on those cells that had them. Once chaperones were present, new proteins could evolve that would otherwise have been too slow to fold in the absence of chaperones. Over time, cells accumulated more and more of these slowly folding proteins so that today no cell can survive without chaperones. 11

The speed at which a protein evolves depends on how stable a protein’s folded structure is, how well it avoids aggregation, and how well-chaperoned it is. 16

The structural analysis of the chaperonin fold presented here suggests that chaperonins emerged very early from a PRX-like gene insertion into the as yet unknown ancestral origin of the equatorial domain. A so-called Ur-Chaperonin was likely present in LUCA and evolved from a purely oxidative stress protein into a constitutively active chaperonin involved in the folding of a set of proteins with ancient folds. The initial functional interaction between PRX- and TRX-like proteins may have been responsible for the persistence of effective interactions between Ur-Chaperonin and TRX-like proteins, resulting in subsequent co-evolution. This may explain the prevalence of TRX-like proteins amongst chaperonin co-factors, for example, the phosducin-like proteins. 15

(Supposed) evolution of the chaperonin fold.
a. Schematic diagram of the (supposed) evolutionary origin of the chaperonin fold. Under high oxidative stress, PRX oligomerizes and acts like a chaperone. The stabilization of this chaperone-active conformation (possibly due to random mutations, interaction with ATP, or merging with the equatorial domain, the order of which is unknown) would have led to the earliest chaperonin: the Ur-Chaperonin. Abbreviations: TRX thioredoxin, PRX peroxiredoxin, LMW low molecular weight, HMW high molecular weight.
b. Schematic diagram of evolution of the Group-I and Group-II chaperonins in the different domains. Note: The mitochondria and chloroplast chaperonins are Group-I type because of their symbiotic relationship to the eukaryotic cell. 

Chaperones are necessary as well for the minimal proteome set for life to kick-start, which makes the hypothesis of an unguided, random naturalistic origin even more unrealistic. 

The general function of chaperones in assisting protein folding is significant in facilitating the structural evolution of proteins. 1

Science papers often explain biological function in language that sounds suspiciously teleological. Above is a nice example. Did the early Earth have the foresight to see that chaperone proteins would be necessary to assist protein folding? And, even more, that energy would be required and would have to be provided at the right place, in the right dosage?

Hsp90 chaperones
Hsp90 is a highly abundant and ubiquitous molecular chaperone which plays an essential role in many cellular processes including cell cycle control, cell survival, hormone, and other signaling pathways. It is important for the cell’s response to stress and is a key player in maintaining cellular homeostasis. Hsp90 has a critical ATPase activity (a), and ATP binding and hydrolysis are known to modulate the conformational dynamics of the protein. Hsp90 is required for the correct maturation and activation of a number of key cellular proteins and protein complexes. The proteins on which it acts are collectively called “clients” and many of them play important roles in signal transduction pathways. 2

The heat shock protein 90 (HSP90) chaperone machinery is a key regulator of proteostasis under both physiological and stress conditions in eukaryotic cells. As HSP90 has several hundred protein substrates (or 'clients'), it is involved in many cellular processes beyond protein folding, which include DNA repair, development, the immune response and neurodegenerative disease. A large number of co-chaperones interact with HSP90 and regulate the ATPase-associated conformational changes of the HSP90 dimer that occur during the processing of clients. Heat shock protein 90 (HSP90) is a molecular chaperone that is conserved from bacteria to humans and facilitates the maturation of substrates (or clients) that are involved in many different cellular pathways. In higher eukaryotes, HSP90 is also present in mitochondria, chloroplasts and the endoplasmic reticulum. 4

The function, structure and conformational cycle of HSP90. 
a  Schematic representation of the ways in which heat shock protein 90 (HSP90) can affect ‘clients’ (that is, substrates). HSP90 can facilitate protein folding (top panel), the assembly of multiprotein complexes (middle panel) or the binding of a ligand to its target or receptor (bottom panel). 
b  Schematic representation of the domain organization of the HSP90 dimer. Each HSP90 monomer is composed of an amino-terminal domain (NTD) that is connected to a middle domain (MD) by a flexible linker region. A Met-Glu-Glu-Val-Asp (MEEVD) motif is present in the carboxy-terminal domain (CTD). 
c  The HSP90 conformational cycle and the action of co‑chaperones in different parts of the cycle. HSP90 remains in different states for different amounts of time. HSP90 transitions from the open ATP-bound state to the intermediate state after ATP binding and closure of the ‘lid’, which is followed by an interaction between the NTDs of the monomers (closed 1 state) and twisting of the HSP90 monomers (closed 2 state). Co‑chaperones associate with specific conformational states of HSP90. HSC70/HSP90‑organizing protein (HOP; Sti1 in yeast) binds to HSP90 and stabilizes the open conformation, thereby inhibiting the ATPase activity of HSP90. Peptidyl-prolyl cis–trans isomerases (PPIases) are thought to bind all conformational states of HSP90. CDC37 binds to HSP90 early in the cycle and is important for the recruitment of client kinases. Activator of HSP90 ATPase homologue 1 (AHA1) promotes the formation of the closed 1 state and accelerates the chaperone cycle. p23 (Sba1 in yeast) stabilizes the closed 2 state and regulates the progression of the reaction cycle by reducing the ATPase activity of HSP90. Pi , inorganic phosphate.

An HSP90 monomer consists of three highly conserved domains (FIG. 1b): the amino-terminal domain (NTD), which mediates binding to ATP; the middle domain (MD), which is important for ATP hydrolysis and the binding of HSP90 to clients; and the carboxy-terminal domain (CTD), which is responsible for HSP90 dimerization. The CTD also contains a C-terminal Met-Glu-Glu-Val-Asp (MEEVD) motif, which is important for the interaction with co-chaperones. In eukaryotes, a charged linker contains unique regulatory sites, thereby acting as a “rheostat” that finely tunes the HSP90 chaperone machine.
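The conformational cycle in the figure legend above can be read as a small state machine. The following Python sketch is only an illustration under stated assumptions: the state names come from the legend, the return from closed 2 to open is assumed to happen after ATP hydrolysis (the legend only implies this by calling it a cycle), and the co-chaperone mapping is simplified. CDC37 is left out because the legend says only that it binds "early in the cycle". The names `CYCLE`, `next_state` and `cochaperones_for` are invented for this example.

```python
# Conformational states in the order given in the figure legend.
CYCLE = ["open", "intermediate", "closed 1", "closed 2"]

# Simplified mapping of co-chaperones to the states they stabilize or promote.
# Assumption: each co-chaperone is listed only for the state named in the legend.
COCHAPERONE_STATES = {
    "HOP": {"open"},         # stabilizes the open conformation, inhibits ATPase
    "AHA1": {"closed 1"},    # promotes formation of the closed 1 state
    "p23": {"closed 2"},     # stabilizes the closed 2 state
    "PPIase": set(CYCLE),    # thought to bind all conformational states
}

def next_state(state):
    """Return the state that follows `state`; closed 2 wraps back to open."""
    i = CYCLE.index(state)
    return CYCLE[(i + 1) % len(CYCLE)]

def cochaperones_for(state):
    """List the co-chaperones associated with a given state, sorted by name."""
    return sorted(name for name, states in COCHAPERONE_STATES.items() if state in states)

state = "open"
for _ in range(len(CYCLE)):
    print(state, "->", next_state(state), "| co-chaperones:", cochaperones_for(state))
    state = next_state(state)
```

The sketch leaves out dwell times, although the legend notes that HSP90 remains in different states for different amounts of time.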

a ATPases are a class of enzymes that catalyze the decomposition of ATP into ADP and a free phosphate ion. This dephosphorylation reaction releases energy, which the enzyme (in most cases) harnesses to drive other chemical reactions that would not otherwise occur. This process is widely used in all known forms of life. 3




The interdependent and irreducible structures required to make proteins

To make proteins, and to direct and insert them into the right place where they are needed, at least 25 extremely complex biosynthesis and production-line-like manufacturing steps are required. Each step requires highly specific, sophisticated molecular machine complexes and holo-enzymes composed of numerous subunits and co-factors, which themselves must be manufactured by the very process described below; this makes their origin an irreducible catch-22 problem. A complex, sophisticated sequence of events must be fully operational, and the molecular machinery fully in place, to make the final product, the workhorses of cells: proteins. The sequence is described in the 25 points below. In order for evolution to work, the robot-like working machinery and assembly line must already be in place and fully operational. The minimal set of molecular machines needed to start the process of protein production cannot be explained through evolution.


The production of a protein by a eukaryotic cell. 
The final level of each protein in a eukaryotic cell depends upon the efficiency of each step depicted. 3 

Paul Davies, the origin of life, page 59
Proteins are a godsend to DNA because they can be used both as building material, to make things like cell walls, and as enzymes, to supervise and accelerate chemical reactions. Enzymes are chemical catalysts that ‘oil the wheels’ of the biological machine. Without them, metabolism would grind to a halt, and there would be no energy available for the business of life. Not surprisingly, therefore, a large part of the DNA databank is used for storing instructions on how to make proteins. Here is how those instructions get implemented. Remember that proteins are long-chain molecules made from lots of amino acids strung together to form polypeptides. Each different sequence of amino acids yields a different protein. The DNA has a wish list of all the proteins the organism needs. This information is stored by recording the particular amino acid sequence that specifies each and every protein on the list. It does so using DNA's four-letter alphabet A, G, C and T; the exact sequence of letters spells out the amino acid recipe, protein by protein – typically a few hundred base pairs for each. To turn this dry list of amino acids into assembled, functioning proteins, DNA enlists the help of a closely related molecule known as RNA (for ribonucleic acid). RNA is also made from four bases, A, G, C and U. Here U stands for uracil; it is similar to T and serves the same purpose alphabetically. RNA comes in several varieties; the one of interest to us here is known as messenger RNA, or mRNA for short. Its job is to read off the protein recipes from DNA and convey them to tiny factories in the cell where the proteins are made. These mini-factories are called ribosomes and are complicated machines built from RNA and proteins of various sorts. Ribosomes come with a slot into which the mRNA feeds, after the fashion of a punched tape of the sort used by old-fashioned computers.

The mRNA ‘tape’ chugs through the ribosome, which then carries out its instructions bit by bit, hooking amino acids together, one by one in the specified sequence, until an entire protein has been constructed. Earthlife makes proteins from 20 different varieties of amino acids, and the mRNA records which one comes after which so the ribosome can put them together in the right order. It is quite fascinating to see how the ribosome goes about joining the amino acids up into a chain. Naturally, the amino acids don't obligingly come along in the right order, ready to be hooked on to the end of the chain. So how does the ribosome ensure that the mRNA gets its specified amino acid at each step? The answer lies with another set of RNA molecules, called transfer RNA, or tRNA for short. Each particular tRNA molecule brings along to the ribosome factory one and only one sort of amino acid stuck to its end, to present it to the production line. At each step in the assembly of the protein, the trick is to get the right tRNA, with the right amino acid attached, to give up its cargo and transfer it to the end of the growing protein chain, while rejecting any of the remaining 19 alternatives that may be on offer. This is accomplished as follows. The mRNA (remember, this carries the instructions) exposes a bit of information (i.e. a set of ‘letters’) that says ‘add amino acid such-and-such now’. The instructions are implemented correctly because only the targeted tRNA molecule, carrying the designated amino acid, will recognize the exposed bit of mRNA from its shape and chemical properties, and bind to it. The other tRNA molecules – the ones that are carrying the ‘wrong’ amino acids – won't fit properly into the binding site. Having thus seduced the right tRNA molecule to berth at the production line, the next step is for the ribosome to persuade the newly arrived amino acid cargo to attach itself to the end of the protein chain. The chain is waiting in the ribosome, dangling from the end of the previously selected tRNA molecule.

At this point the latter molecule lets go and quits the ribosome, passing the entire chain on to the newly arrived tRNA, where it links on to the amino acid it has brought with it. The chain thus grows by adding amino acids to the head rather than the tail. If you didn't follow all of this on the first read through, don't worry, it isn't essential for understanding what follows. I just thought it was sufficiently amazing to be worth relating in some detail. When the protein synthesis is complete, the ribosome receives a ‘stop’ signal from the mRNA ‘tape’ and the chain cuts loose. The protein is now assembled, but it doesn't remain strung out like a snake. Instead, it rolls up into a knobbly ball, rather like a piece of elastic that's stretched and allowed to snap back. This folding process may take some seconds, and it is still something of a mystery as to how the protein attains the appropriate final shape. To work properly, the three-dimensional form of the protein has to be correct, with the bumps and cavities in all the right places, and the right atoms facing outwards. Ultimately it is the particular amino acid sequence along the chain that determines the final three-dimensional conformation, and therefore the physical and chemical properties, of the protein. This whole remarkable sequence of events is repeated in thousands of ribosomes scattered throughout the cell, producing tens of thousands of different proteins. It is worth repeating that, in spite of the appearance of purpose, the participating molecules are completely mindless. Collectively they may display systematic cooperation, as if to a plan, but individually they just career about. The molecular traffic within the cell is essentially chaotic, driven by chemical attraction and repulsion and continually agitated by thermal energy. Yet out of this blind chaos order emerges spontaneously.
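The codon-by-codon reading that Davies describes can be illustrated with a minimal Python sketch. The codon table is only a small excerpt of the standard genetic code, limited to the codons used in the example. The function name `translate` and the example mRNA string are made up for illustration. The sketch models only the lookup of the amino acid for each codon, not the tRNA chemistry or the folding step.

```python
# Excerpt of the standard genetic code (mRNA codon -> amino acid).
# Only a handful of the 64 codons are listed; this is an illustration,
# not a complete table.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}

def translate(mrna):
    """Read the mRNA three letters at a time and return the amino acid chain.

    Reading stops at the first stop codon. Codons missing from the excerpt
    above raise a KeyError.
    """
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "STOP":
            break
        chain.append(amino_acid)
    return chain

peptide = translate("AUGUUUGGCGCUAAAUAA")
print("-".join(peptide))  # Met-Phe-Gly-Ala-Lys
```

The dictionary lookup stands in for the matching of a tRNA to the exposed codon, and the stop-codon check mirrors the ‘stop’ signal that releases the chain.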

3. Molecular biology of the Cell, 6th ed. page 361


37 The gene regulatory network on Tue Sep 18, 2018 3:14 pm


Gene Regulation

Nucleosomes function and design

Chromosome condensation and compaction are nothing short of awe-inspiring, amazing evidence of setup by a supreme intelligence.

Gene expression is a complex multilevel process that involves various functional units, from nucleosomes to fully formed chromatin fibers accompanied by a host of various chromatin binding enzymes. 1

Eukaryotic genes are flanked by nucleosome-free regions
Studies over the last 10 years or so have revealed that many eukaryotic genes show a common pattern of nucleosome organization (Figure below).

Nucleosome arrangements in the vicinity of a eukaryotic structural gene.

For active genes or those genes that can be activated, the core promoter is found at a nucleosome-free region (NFR), which is a site that is missing nucleosomes. The NFR is typically 150 bp in length. Although the NFR may be required for transcription, it is not, by itself, sufficient for gene activation. At any given time in the life of a eukaryotic cell, many genes that contain an NFR are not being actively transcribed. The NFR is flanked by two nucleosomes that are termed the –1 and +1 nucleosomes. These nucleosomes often contain histone variants that promote transcription. The end of many eukaryotic genes is followed by another NFR. This arrangement at the end of genes may be important for transcriptional termination.

Transcriptional activation involves changes in nucleosome locations, composition, and histone modifications
A key role of certain activators is to recruit ATP-dependent chromatin-remodeling complexes and histone-modifying enzymes to the promoter region of eukaryotic genes. Though the order of recruitment may differ among specific activators, this appears to be critical for transcriptional initiation and elongation. In the scenario shown in the figure below, an activator binds to an enhancer in the NFR.

A simplified model for the transcriptional activation of a eukaryotic structural gene.

The activator then recruits chromatin-remodeling complexes and histone-modifying enzymes to this region. The chromatin-remodeling complex may shift nucleosomes or temporarily evict nucleosomes from the promoter region. Nucleosomes containing certain histone variants are thought to be more easily removed from the DNA than those containing the standard histones. Histone-modifying enzymes, such as histone acetyltransferase, covalently modify histone proteins and may affect nucleosome contact with the DNA. The actions of chromatin remodeling complexes and histone-modifying enzymes facilitate the binding of general transcription factors and RNA polymerase II to the core promoter, thereby allowing the formation of a preinitiation complex (see Figure above, step 2). Further changes in chromatin structure are necessary for elongation to occur. RNA polymerase II cannot transcribe DNA that is tightly wrapped in nucleosomes. For transcription to occur, histones are evicted, partially displaced, or destabilized so RNA polymerase II can pass. Evicted histones are then reassembled by chaperone proteins and placed back on the DNA behind the moving RNA polymerase II (see Figure above). These histones may be deacetylated—have their acetyl groups removed—so they bind more tightly to the DNA.

Different types of molecular changes underlie epigenetic gene regulation
The molecular mechanisms that promote epigenetic gene regulation in eukaryotes are the subject of a great deal of recent research. The most common types of molecular changes that underlie epigenetic gene regulation are DNA methylation, chromatin remodeling, covalent histone modification, and the localization of histone variants (Table below).

These types of changes can also be involved in transient (nonepigenetic) gene regulation. In some cases, epigenetic changes stimulate the transcription of a given gene, and in other cases, they repress gene transcription.

Epigenetic gene regulation may occur as a programmed developmental change
Many epigenetic modifications that regulate gene expression are programmed changes that occur at specific stages of development ( Table above ).

- Genomic imprinting of the Igf2 gene occurs during gametogenesis: the maternal allele is silenced, whereas the paternal allele is active.
- X-chromosome inactivation occurs during embryogenesis in female mammals. In early embryonic cells, one of the X chromosomes of a female is inactivated and forms a Barr body, whereas the other remains active. This pattern is maintained as the cells divide and persists in the adult organism.
- Similarly, the differentiation of specific cell types, such as muscle cells and neurons, involves epigenetic modifications. During embryonic development, certain genes undergo epigenetic changes that affect their expression throughout the rest of development. For example, in an embryonic cell that is destined to divide and become a group of muscle cells, a large number of genes that should not be expressed in muscle cells undergo epigenetic modifications that prevent their expression; such changes persist through adulthood.

Epigenetic gene regulation may be caused by environmental agents
An exciting discovery in the field of epigenetics is that a wide range of environmental agents have epigenetic effects. Many recent studies have suggested that environmentally induced changes in an organism’s characteristics are sometimes rooted in epigenetic changes that alter gene regulation. A striking example is found in honeybees (Apis mellifera). Female honeybees are of two types: queen bees and worker bees (Figure below).

Epigenetics and the environment. 
Female honeybees that are fed royal jelly throughout the entire larval stage and into adulthood develop into queen bees. The larger queen bee is marked with a blue disk labeled 68. By comparison, larvae that do not continue to receive this diet become smaller worker bees. These differences in development may be caused by epigenetic changes that affect gene regulation.

Queens are larger, live for years, and produce up to 2,000 eggs each day. By comparison, the smaller worker bees are sterile, typically live a few weeks, and engage in specialized tasks, which include cleaning and constructing comb cells, nurturing larvae, guarding the hive entrance, and foraging for pollen and nectar. The striking differences between queen and worker bees are largely caused by differences in their diet. Certain worker bees, called nurse bees, produce royal jelly in glands in their mouths. All female larvae are initially fed royal jelly, but those that are bathed in royal jelly throughout their entire larval development and feed on it into adulthood become queens. In contrast, female larvae that are switched at an early stage of development to a diet of pollen and nectar become worker bees. In 2008, a study  indicated that DNA methylation may play a role in controlling the developmental pathways that result in queen and worker bee morphologies. Bee larvae were fed a diet that should produce worker bees. These larvae were injected with a substance that inhibits DNA methyltransferase, the enzyme that methylates DNA. The result was that most of them became queen bees with fully developed ovaries! Although other factors may contribute to the development of queens, these results are consistent with the hypothesis that royal jelly may contain a substance that inhibits DNA methylation. Such inhibition is thought to allow the expression of genes that contribute to the development of the traits observed in queen bees. Another topic of great interest to many geneticists is how environmental toxins can cause epigenetic changes. In humans, exposure to tobacco smoke has been shown to alter DNA methylation and the covalent histone modifications of specific genes in lung cells. These alterations are associated with changes in gene regulation that may cause normal cells to become cancerous.

Regulation of RNA modification and translation in eukaryotes
Eukaryotic gene expression is commonly regulated at the levels of RNA modification and translation. These added levels of regulation provide important benefits to eukaryotic species. First, by regulating RNA modification, eukaryotes can produce more than one mRNA transcript from a single gene. This allows a gene to encode two or more polypeptides, thereby increasing the complexity of eukaryotic proteomes. A second issue is timing. Regulation of transcription in eukaryotes takes a fair amount of time before its effects are observed at the cellular level. Before a change in transcription shows up as a change in protein level,

(1) the chromatin must be converted to an open conformation,
(2) the gene must be transcribed,
(3) the RNA must be modified and exported from the nucleus, and
(4) the protein must be made via translation.

All four steps take time, on the order of several minutes. One way to achieve faster regulation is to control steps that occur after an RNA transcript is made. In eukaryotes, regulation of translation provides a faster way to regulate the levels of gene products, namely, proteins. A small RNA molecule or RNA-binding protein can bind to an mRNA and affect the ability of the mRNA to be translated into a polypeptide.

Alternative splicing of pre-mRNAs increases protein diversity
In eukaryotes, a pre-mRNA transcript is modified before it becomes a mature mRNA. When a pre-mRNA has multiple introns and exons, splicing may occur in more than one way, resulting in the production of two or more different polypeptides. Such alternative splicing is a form of gene regulation that allows an organism to use the same gene to make different proteins at different stages of development, in different cell types, and/or in response to a change in the environmental conditions. Alternative splicing is an important form of gene regulation in complex eukaryotes such as animals and plants. An advantage of alternative splicing is that two or more different polypeptides can be derived from a single gene, thereby increasing the size of the proteome while minimizing the size of the genome. Let’s consider an example of alternative splicing for a pre-mRNA that encodes a protein known as α-tropomyosin, which functions in the regulation of cell contraction in animals. It is located along the thin filaments found in smooth muscle cells, such as those in the uterus and small intestine, and in striated muscle cells that are found in cardiac and skeletal muscle. α-Tropomyosin is also synthesized in many types of nonmuscle cells but in lower amounts. Within a multicellular organism, different types of cells must regulate their contractibility in subtly different ways. One way this may be accomplished is by the production of different forms of α-tropomyosin. The figure below shows the intron-exon structure of the rat α-tropomyosin pre-mRNA and two alternative ways that the pre-mRNA can be spliced.

Alternative splicing of the rat α-tropomyosin pre-mRNA. 
The top part of this figure depicts the structure of the rat α-tropomyosin pre-mRNA. Exons are red or green, and introns are yellow. The lower part of the figure describes the final mRNA products in smooth and striated muscle cells after alternative splicing. Note: Exon 8 is found in the final mRNA of smooth and striated muscle cells, but not in the mRNA of some other cell types.

The pre-mRNA contains 14 exons, 6 of which are constitutive exons (shown in red), which are always found in the mature mRNA from all cell types. Presumably, constitutive exons encode polypeptide segments of the α-tropomyosin protein that are necessary for its general structure and function. By comparison, alternative exons (shown in green) are not always found in the mRNA after splicing has occurred. The polypeptide sequences encoded by alternative exons may subtly change the function of α-tropomyosin to meet the needs of the cell type in which it is found. For example, Figure above shows the predominant splicing products found in smooth muscle cells and striated muscle cells. Exon 2 encodes a segment of the α-tropomyosin protein that alters its function to make it suitable for smooth muscle cells. By comparison, the α-tropomyosin mRNA found in striated muscle cells does not include exon 2. Instead, this mRNA contains exon 3, which is more suitable for that cell type.
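The exon 2 versus exon 3 choice described above can be sketched as a toy model in Python. Everything except that switch is an assumption made for illustration: the real pre-mRNA has 14 exons, whereas this sketch uses four, with invented sequences and a hypothetical constitutive set. The names `PRE_MRNA_EXONS`, `CONSTITUTIVE`, `ALTERNATIVE_BY_CELL_TYPE` and `splice` are not from the source.

```python
# Toy pre-mRNA: an ordered list of (exon_number, sequence) pairs.
# Sequences are invented; introns are simply not represented.
PRE_MRNA_EXONS = [
    (1, "AUGGCU"),
    (2, "GAAGAA"),
    (3, "CUGCUG"),
    (4, "AAAUAA"),
]

# Hypothetical constitutive exons: retained in every cell type.
CONSTITUTIVE = {1, 4}

# Alternative exons per cell type. Only this exon 2 / exon 3 switch
# is taken from the text.
ALTERNATIVE_BY_CELL_TYPE = {
    "smooth muscle": {2},
    "striated muscle": {3},
}

def splice(cell_type):
    """Join the constitutive exons plus the alternative exons used in this cell type."""
    keep = CONSTITUTIVE | ALTERNATIVE_BY_CELL_TYPE[cell_type]
    return "".join(seq for number, seq in PRE_MRNA_EXONS if number in keep)

for cell_type in ALTERNATIVE_BY_CELL_TYPE:
    print(cell_type, "->", splice(cell_type))
```

Both cell types start from the same list of exons, and only the set of retained exons differs. This is the sense in which one gene yields more than one polypeptide.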

RNA interference blocks the expression of mRNA
Let’s now turn our attention to regulatory mechanisms that affect translation. MicroRNAs (miRNAs) and short-interfering RNAs (siRNAs) are RNA molecules that are processed to a small size, typically 22 nucleotides in length, and silence the expression of pre-existing mRNAs. The precursors of miRNAs are encoded by genes and usually form a hairpin structure. In most cases, miRNAs are partially complementary to certain cellular mRNAs and inhibit their translation. By comparison, short-interfering RNAs are derived from two RNA molecules that come together to form a double-stranded region. For example, a cellular RNA may bind to an RNA that is transcribed from a viral genome. siRNAs are often a very close match to specific mRNAs and cause those mRNAs to be degraded. Insight into the mechanism of miRNA inhibition came from the research of American biologists Andrew Fire and Craig Mello, who discovered the mechanism of action of miRNA (Figure below).

Mechanism of action of microRNA (miRNA). 
Note: Pre-siRNAs are also acted upon by dicer, but siRNAs are derived from two RNA molecules that form a double-stranded region rather than one RNA molecule that forms a hairpin.

A pre-miRNA is first synthesized as a single-stranded molecule that folds back on itself to form a hairpin structure. (A pre-siRNA would be composed of two RNA molecules that come together to form a double-stranded region.) The double-stranded region is trimmed to a 22-bp sequence by an enzyme called dicer. The 22-bp sequence becomes part of a complex called the RNA-induced silencing complex (RISC), which also includes several proteins. One of the RNA strands is then degraded. The miRNA or siRNA in the complex binds to a target mRNA with a complementary sequence. Upon binding, two different things may happen. When the miRNA and mRNA are not a perfect match or are only partially complementary, translation is inhibited. Alternatively, when an siRNA and mRNA are a perfect match or highly complementary, the mRNA is cut into pieces and then degraded. Both miRNA and siRNA have the same effect—the expression of the mRNA is silenced. Fire and Mello called this RNA interference (RNAi), because the miRNA interferes with the proper expression of an mRNA. Since this study, researchers have discovered that genes encoding miRNAs are widely found in animals and plants. In humans, for example, researchers estimate that over 1,000 different genes encode miRNAs. RNAi represents an important mechanism of gene regulation that results in mRNA silencing. In 2006, Fire and Mello were awarded the Nobel Prize in Physiology or Medicine for their studies of RNAi.
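The two outcomes described above, translation inhibited for a partial match and mRNA degraded for a perfect or near-perfect match, can be sketched as a simple decision rule in Python. This is a sketch under stated assumptions. The numeric thresholds `high` and `low` are hypothetical, since the text gives no cut-off values. Only Watson-Crick pairs are counted, so G-U wobble pairing is ignored. The short 8-nucleotide sequences are invented stand-ins for the 22-nucleotide RNAs.

```python
# Watson-Crick base pairs for RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def complementarity(small_rna, target_site):
    """Fraction of positions where the small RNA can pair with the target site.

    The strands pair antiparallel, so the target site is read in reverse.
    """
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must have equal length")
    pairs = sum(PAIR[a] == b for a, b in zip(small_rna, reversed(target_site)))
    return pairs / len(small_rna)

def risc_outcome(small_rna, target_site, high=0.9, low=0.5):
    """Classify the effect on the mRNA; `high` and `low` are hypothetical cut-offs."""
    score = complementarity(small_rna, target_site)
    if score >= high:
        return "mRNA cleaved and degraded"
    if score >= low:
        return "translation inhibited"
    return "no silencing"

guide = "UGAGGUAG"
print(risc_outcome(guide, "CUACCUCA"))  # perfect match
print(risc_outcome(guide, "CUACCUGU"))  # two mismatches
```

With these inputs the first target pairs at all 8 positions and the second at 6 of 8, so the two calls fall on opposite sides of the assumed `high` threshold.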

The prevention of iron toxicity in mammals involves the regulation of translation
Another way to regulate translation involves RNA-binding proteins that directly affect the initiation of translation. The regulation of iron absorption provides a well-studied example. Although iron is a vital cofactor for many cellular enzymes, it is toxic at high levels. To prevent toxicity, mammalian cells synthesize a protein called ferritin, which forms a hollow, spherical complex that stores excess iron. The mRNA that encodes ferritin is controlled by an RNA-binding protein known as the iron regulatory protein (IRP). When the iron level in the cytosol is low and more ferritin is not needed, IRP binds to a regulatory element within the ferritin mRNA known as the iron regulatory element (IRE). The IRE is located between the 5′ cap, where the ribosome binds, and the start codon where translation begins. Due to base pairing, it forms a stem-loop structure. The binding of IRP to the IRE inhibits translation of the ferritin mRNA (Figure a below).

Translational regulation of ferritin mRNA by the iron regulatory protein (IRP).

However, when iron is abundant in the cytosol, the iron binds directly to IRP, which changes its conformation and prevents it from binding to the IRE. Under these conditions, the ferritin mRNA is translated to make more ferritin protein (Figure b above). Why is translational regulation of ferritin mRNA an advantage over transcriptional regulation of the ferritin gene? This mechanism of translational control allows cells to rapidly respond to changes in their environment. When cells are confronted with high levels of iron, they can quickly make more ferritin protein to prevent the toxic buildup of iron. This mechanism is faster than transcriptional regulation, which would require the activation of the ferritin gene and the transcription of ferritin mRNA prior to the synthesis of more ferritin protein.
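The IRP/IRE switch described above reduces to simple boolean logic, which the following Python sketch makes explicit. It treats the iron level as a plain high/low flag, which is a simplification, and the function names `irp_binds_ire` and `ferritin_mrna_translated` are invented for this example.

```python
def irp_binds_ire(iron_is_abundant):
    """IRP binds the IRE only when it has no iron bound, i.e. when cytosolic iron is low."""
    return not iron_is_abundant

def ferritin_mrna_translated(iron_is_abundant):
    """Translation of ferritin mRNA proceeds only when IRP is not bound to the IRE."""
    return not irp_binds_ire(iron_is_abundant)

for iron_is_abundant in (False, True):
    print(
        "iron abundant:", iron_is_abundant,
        "| IRP bound to IRE:", irp_binds_ire(iron_is_abundant),
        "| ferritin made:", ferritin_mrna_translated(iron_is_abundant),
    )
```

Nothing in this logic involves transcription. The ferritin mRNA already exists, which is why the response can be faster than activating the ferritin gene.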


Last edited by Admin on Thu Nov 15, 2018 2:46 pm; edited 7 times in total


38 How the histone Code directs gene expression on Wed Oct 03, 2018 1:57 pm


The gene regulatory network

Control of Gene Expression and gene regulatory networks  point to intelligent design

Gene Regulatory Networks Controlling Body Plan Development

A walk through the epigenetic landscape which regulates Gene expression points to the requirement of intelligent setup and design

Epigenetics refers to heritable changes in gene expression that occur without alteration in DNA sequence. These changes may arise spontaneously, be induced by environmental factors, or occur as a consequence of specific mutations. There are two primary and interconnected epigenetic mechanisms: DNA methylation and covalent modification of histones. 42 In addition, it has become apparent that non-coding RNA is also intimately involved in this process. The different mechanisms that control epigenetic changes do not stand alone, and there is a clear interconnection and interdependency between:
- DNA methylation 
- Histone modification and incorporation of histone variants
- Chromatin remodeling in Eukaryotic Cells  
- Non-coding RNA-mediated epigenetic regulation

These mechanisms, together with action of transcription factors and ATP-dependent chromatin remodeling, will establish unique epigenetic states resulting in alterations of gene expression that for example can determine cellular diversity with virtually no differences in DNA sequences. Epigenetic modifications are central to many cellular processes and essential to many organism functions.  If these modifications occur improperly, they can lead to major adverse health effects such as cancer or congenital diseases.

From the first cell division to the complex integration of signaling pathways in differentiated tissues, the multicellular organism must precisely regulate transcription to ensure correct gene expression. Gene activation and repression are mostly regulated through changes in chromatin structure imparted by DNA methylation, chromatin remodeling and histone modifications. 7

Interdependence means that if one of the three mechanisms is not present, the others cannot function properly. And if the modifications do not occur as they should, cancer and other diseases are the consequence. This demonstrates that a gradual, stepwise increase in complexity was not possible. These mechanisms are interlocked and had to emerge fully developed and properly functioning right from the beginning.

DNA is not just a store of codified information but also a physical polymer. It is packaged, stored, labeled, and read in a controlled way. Also, consider that every cell in a body has the exact same sequence of DNA. Part of the process by which stem cells differentiate into specific cell types (like muscle, nerve, or blood cells) is selectively, permanently, turning off genes. A neuron doesn’t need to make hemoglobin, and a fat cell doesn’t need to make light-sensing receptor proteins. Epigenetic modifications like methylation turn these genes off and establish a fine-tuned genetic program for the cell. 40

DNA packaging amounts to an amazing feat, often requiring several meters of DNA to be compacted into the confines of a 2–10 micron nucleus. This high level of compaction presents a potential problem, as the underlying DNA must remain accessible to the vast protein machineries that utilize it for critical biological functions. Thus, a fundamental question being addressed by many labs has been how these diverse genomic functions, such as gene transcription, DNA repair, replication, and recombination, occur at the appropriate place and time to promote cellular growth, differentiation, and proper organismal development. 35

The histone Code is awe-inspiring and has a central role to explain the making of multicellular organisms.

The regulation of transcription is a vital process in all living organisms. It is orchestrated by transcription factors and other proteins working in concert to finely tune the amount of messenger RNA being produced through a variety of mechanisms. Transcription factor proteins bind to specific gene target sequences, which signal to the transcription machinery, the RNA polymerase holo-enzyme complex, to begin the transcription of a specific gene sequence, and where to stop.

Transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. But how do transcription factors "know" when and where either to express or to repress a gene (that is, to silence it so that it is not expressed)? That is, how do they find the right place in the genome, and how do they recognize the right time to bind and give the signal "It's time to start transcribing this gene section into messenger RNA"? Rather than "know", we need to ask: how is the information encoded in the cell that directs, or orchestrates, the binding of transcription factors (TFs) at the right place in the genome, and also at the right time?

Here, in Eukaryotic Cells, the Histone Code comes into play.

Gene expression is a broad, complex process that must be regulated by a series of mechanisms.

Genes can't control an organism on their own; rather, they must interact with and respond to the organism's environment. Some genes are constitutive, or always "on," regardless of environmental conditions. Such genes are among the most important elements of a cell's genome, and they control the ability of DNA to replicate, express itself, and repair itself. These genes also control protein synthesis and much of an organism's central metabolism. In contrast, regulated genes are needed only occasionally — but how do these genes get turned "on" and "off"? What specific molecules control when they are expressed? 24

Regulation of gene expression includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products (protein or RNA), and is informally termed gene regulation. 14 Sophisticated programs of gene expression are widely observed in biology, for example to:

Trigger developmental pathways
- Developmental pathways are networks of genes that act coordinately to establish the body plan. Disruptions of genes in one pathway can have effects in related pathways and may result in serious dysmorphogenesis or cancer. 15
Respond to environmental stimuli
- The ability to change over time in response to the environment is fundamental and is determined by the organism's genetics. A living organism must be able to respond appropriately to external or environmental stimuli. A response can take many forms, from the contraction of a unicellular organism in response to external chemicals to complex reactions involving all the senses of multicellular organisms. An example of responding to stimuli is a bacterium forming an endospore to protect itself when exposed to harsh, unfavorable conditions. 15
Adapt to new food sources
- All organisms need to adapt to their habitat to be able to survive. 16

Cognitive chemistry is a special form of chemistry where the system is able to recognize and respond to the dynamic environmental signals and even get adapted to its environmental conditions. The system is highly connected with its environmental factors through molecular sensors and logic gates. The system is made of a cluster of precisely coded molecules (that through their electrochemical interactions) are highly connected to each other as well as their environmental factors. In fact, a cognitive chemistry system has the capacity of endogenous cognition. 8

Question: Do these not belong to the basic characteristics of life, and are, therefore, essential? How could one have emerged individually without the others?
Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein. Often, one gene regulator controls another, and so on, in a gene regulatory network. Gene regulation is essential for all lifeforms, including viruses as it increases the versatility and adaptability of an organism by allowing the cell to express protein when needed. In multicellular organisms, gene regulation drives cellular differentiation and morphogenesis in the embryo, leading to the creation of different cell types that possess different gene expression profiles from the same genome sequence. The initiating event leading to a change in gene expression includes activation or deactivation of receptors.

Any step of gene expression may be modulated, from the DNA-RNA transcription step to post-translational modification of a protein. The following is a list of stages at which gene expression is regulated; the most extensively used point is transcription initiation:

- Chromatin domains
- Transcription
- Post-transcriptional modification
- RNA transport
- Translation
- mRNA degradation

In eukaryotes, the accessibility of large regions of DNA can depend on its chromatin structure, which can be altered as a result of histone modifications directed by DNA methylation, ncRNA, or DNA-binding protein. Hence these modifications may up or down-regulate the expression of a gene. Some of these modifications that regulate gene expression are inheritable and are referred to as epigenetic regulation.

DNA methylation plays an important regulatory role in eukaryotic genomes. Alterations in methylation can affect transcription and phenotypic variation. DNA methylation has long been considered a key regulator of gene expression. 17

Linking DNA methylation and histone modification: patterns and paradigms 18
Both DNA methylation and histone modification are involved in establishing patterns of gene repression during development. Certain forms of histone methylation cause local formation of heterochromatin, which is readily reversible, whereas DNA methylation leads to stable long-term repression. It has recently become apparent that DNA methylation and histone modification pathways can be dependent on one another, and that this crosstalk can be mediated by biochemical interactions between SET domain histone methyltransferases and DNA methyltransferases. Relationships between DNA methylation and histone modification have implications for understanding normal development as well as somatic cell reprogramming and tumorigenesis. 

The presence of active histone marks is necessary for TF binding.

Nucleosomes function and design

A large number of complexes have been biochemically and functionally identified to serve roles as coactivators and corepressors of specific transcriptional programs. Many coactivator and corepressor proteins are components of multisubunit coregulator complexes that exhibit an ever-expanding diversity of enzymatic activities that can be divided into two generic classes 44


The coactivator matrix. 
Sequence-specific activators, exemplified by nuclear receptors, bind to cis-active elements in promoters and enhancers of target genes and activate transcription in a signal (ligand)-dependent manner. Transcriptional activation requires the actions of many multisubunit coactivator complexes that are recruited in a parallel and/or sequential manner. Enzymatic activities associated with specific components of coactivator complexes result in nucleosome remodeling and covalent modifications of histone tails, such as H3K4 methylation, H3K9 and H3K14 acetylation, H4K20 acetylation, and phosphorylation of the linker histone H1b. 5

A crucial aspect of development, homeostasis and prevention of disease is the strict maintenance of patterns of gene repression. Gene repression is largely achieved by the combinatorial action of various enzymatic complexes — known as co-repressor complexes — that are recruited to DNA by transcription factors and often act through enzymatic modification of histone protein tails.  Here, we consider specific strategies that underlie repression events — for example, those mediated by the nuclear receptor co-repressor (NCoR, also known as NCOR1) and silencing mediator of retinoic acid and thyroid hormone receptor (SMRT, also known as NCOR2) co-repressor complexes — and discuss emerging themes in gene repression.

Combinatorial action, recruitment, strategies, and mediation are all events and actions whose setup depends on intelligence.

Co-activators and co-repressors have a major role in altering chromatin structure through the modification of core histone amino-terminal tails.

The corepressor matrix. 
Sequence-specific repressors, exemplified by unliganded or antagonist-bound nuclear receptors, actively repress transcription by recruiting corepressor complexes to cis-active elements in promoters and enhancers of target genes. These factors act in a combinatorial manner to antagonize actions of coactivator complexes (e.g., through histone deacetylase activity, phosphatase activity, and corepressor-associated nucleosome remodeling activities), and by mediating covalent modifications (e.g., methylation of histones H3K9, H3K27, and H4K20) that serve as marks for recruitment of additional factors involved in transcriptional repression. NCoR and SMRT nucleate a core corepressor complex that contains HDAC3, TBL1, TBLR1, and GPS2, with additional weakly interacting factors (e.g., Sin3 complexes) forming a functional holo-complex.

Coactivator/corepressors and sequence-specific transcriptional factors constitute distinct axes on a matrix for many potential combinatorial interactions that are used in a context-dependent manner.

Dynamic exchange of enzymatic activities in coregulatory complexes
For many cofactors present in “limiting” concentrations, there is evidence that genome-wide patterns of competition for their recruitment to specific DNA-binding factors is an important quantitative determinant of overall programs of gene activation. An additional example supporting the concept of coactivator competition as a regulatory strategy has been provided in the Wnt pathway, where evidence from genetic screens indicates that, at high levels, non-TCF/LEF factors, including homeodomain factors, can compete with TCF/LEF for nuclear beta-catenin, hence dictating a differential transcriptional outcome. The signal-dependent interactions of coactivators and corepressors with sequence-specific transcription factors can be controlled at several levels, including cofactor expression, post-translational modifications of cofactors and their targets, and in the case of nuclear receptors, ligand binding.

The assembly of coactivator complexes is itself a dynamic and cell-specific process, with signal transduction pathways regulating the composition of specific coactivator complex components. 

Signal-dependent activator/coactivator cycles and epigenetic control
Coactivators and corepressors required for regulated actions of DNA-binding transcription factors reveal a network of sequentially exchanged cofactor complexes that execute a series of enzymatic modifications required for regulated gene expression. These coregulator complexes possess “sensing” activities required for interpretation of multiple signaling pathways. In this review, we examine recent progress in understanding the functional consequences of “molecular sensor” and “molecular adaptor” actions of corepressor/coactivator complexes in integrating signal-dependent programs of transcriptional responses at the molecular level. This strategy imposes a temporal order for modifying programs of transcriptional regulation in response to the cellular milieu, which is used to mediate developmental/homeostatic and pathological events. 1

The initially defined example of signal-dependent, temporal-specific factor exchange was provided by study of the HO locus in budding yeast, with ordered recruitment of 

- SWI5 and SBF
- SWI/SNF complex
- SAGA complex
- Ash1 repressor 

This ordered exchange not only defines the sequence in recruitment of enzymatic machinery necessary to achieve activation of specific transcription units, but also provides a temporally changing complement of potential “sensors” for responding to changes in the signaling milieu of the cells, and hence the opportunity to modify the transcriptional outcome. Increasing evidence indicates that active exchange cycles of sequence-specific transcription factors and associated coregulators are required for sustained transcriptional responses to signaling inputs in metazoan organisms. 5

The genome has the ability to respond in a precise and coordinated manner to cellular signals. It achieves this through the concerted actions of transcription factors and the chromatin platform, which are targets of the signaling pathways. 3

The gene expression profile of a cell determines its phenotype and function, and transcription factors play a key role in defining these gene expression profiles in response to cellular signals. However, control of gene expression is exerted by mechanisms involving the interaction of transcriptional complexes with not only the DNA sequence itself but also the chromatin proteins associated with the DNA. Access to the DNA by transcription factors is controlled by the packaging of DNA within the nucleus as chromatin. The specific composition of chromatin can dictate gene expression patterns in a cell by regulating the relative accessibility provided to transcription factors and the transcriptional machinery. Whereas the tight packaging of nucleosomes into heterochromatin inhibits transcription factor access and transcription, euchromatin, with its relaxed chromatin and nucleosome positioning, is more conducive to transcriptional activation because transcription factors can access it more easily. Chromatin composition is dynamic and is maintained or modified through the concerted actions of transcription factors and chromatin modifiers responding to cellular signaling cascades. Not surprisingly then, mutations to transcription factors and the molecules that comprise and modify the chromatin landscape commonly underlie the altered gene expression profiles that are characteristic of cancer cells. The basis of the epigenetic landscape of a cell encompasses

- DNA methylation
- histone modifications
- histone variants
- nucleosome positioning

The expression of genes subject to strict regulation can be a highly dynamic, cyclical process that sequentially achieves and then limits transcription. Transcription is dependent upon cis-acting elements (DNA and nucleosomes) that either interact with or are modified by trans-acting factors. Induced local structural changes to chromatin encompassing regulatory elements of gene promoters include alteration of the positional phasing of nucleosomes, substitution by variant histones, post-translational modification of nucleosomes, changes in the methylation of CpG dinucleotides and breaks in the sugar-phosphate backbone of DNA. 4

Transcriptional activation is a complex, multistage process implemented by hundreds of proteins. Many transcriptional proteins are organized into coactivator complexes, which participate in transcription regulation at numerous genes and are a driver of this process. Transcriptional machinery includes hundreds of transcription factors that function coordinately to provide for the progression of the multistage transcriptional process. 6 It is currently considered that transcriptional activation is initiated by a limited set of coactivator complexes, while different research groups have shown that this process is cyclic and that coactivator complexes do not bind to promoters for a long time but replace each other in its course.

Chromatin modifications as signals for dynamic transcriptional modulation
Although information about a wide variety of histone modifications is accumulating at a rapid rate, the relationship between the regulated transcriptional cycle and different modifications and their composite readout is not yet clear. Clearly, this is an interdependent cycle, where histone-modifying enzymes are unable to access their substrates unless they are targeted, and the same enzyme will not modify all histones in all genes at the same time. The fine-tuning of the temporal order of coactivator recruitment dictated by combinations of promoter-specific transcriptional factors is central to their actions as periodic molecular sensors of a constantly diverging signaling network. Within this context, protein domains in these factors that are able to recognize specific modifications, including

- bromodomains
- chromodomains
- RING fingers
- PHD fingers
- F-boxes
- SANT domains
- recognition sequences for SUMO ligases
- protein kinases
- protein phosphatases 

play critical roles in the targeting process. Each of these protein modules can contribute to both the recognition of specific histone modifications as well as to their settings at given locations. An example is provided by the identification of a WD-40 domain protein, WDR5, as a factor that recruits a complex containing methyltransferases to diMe K4-H3. It has been proposed that chromatin-binding domains could play a central role in helping to establish and maintain either periodicity in transcriptional states or long-term transcriptional states when it is needed. For example, the bromodomain of BRG1 binds the H4 tail when acetylated at K8, and the double bromodomain of TAFII250 binds the H3 tail acetylated at both K9 and K14. These dynamic, “histone code”-driven interactions can represent the sequential order of step-to-step transitions during transcriptional initiation.

The existence of a specific order of the actions of the histone/factor-modifying complexes implies a signaling pathway for mediating gene activation/repression events, and for temporal-specific “sensors” responding to additional signaling pathways activated/extinguished during the periodic time intervals of coregulator exchange. 

These events would seem to depend on a “feed-forward” system, by which marks that cause a preferential recruitment of one complex must be altered to permit sequential recruitment of the next complex in the cascade, causing the alterations in promoter complex and histone marks that elicit the next cohort of cofactor recruitment. This exchange also requires a strategy for rapid cofactor complex clearance, and probably for their degradation and/or relocation. The implication of these events is that a constantly changing array of histone modifications and coactivator complexes combinatorially serves as the platform for recruitment of the next cofactor complex, based on actions of each preceding complex. These would involve changes in the DNA-binding factor/histone modifications, changes in factor/ core machinery, and altered enzymatic actions, as well as allosteric effects of DNA-binding sites, that together dictate the choice of the next group of cofactor complexes. This cycle of recruitment of specific modifying complexes in response to covalent modifications of histones is consistent with current views of the “histone code” as a three-dimensional platform for recruitment of coregulatory complexes. Potential relationships between histone marks and factor/cofactor recruitment can actually impose regulatory constraints on transcription factors that might otherwise function as constitutive activators or repressors. The actions of three-dimensional histone/factor recruitment platforms imply that multiple recognition motifs combinatorially modulate cofactor/ enzyme complex recruitment events.
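The feed-forward logic described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the complex names and mark names (`complex_A`, `mark_1`, and so on) are hypothetical placeholders rather than real factors, and each complex is assumed to read exactly one mark, replace it, and write exactly one new mark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    name: str    # hypothetical cofactor complex
    reads: str   # histone mark required for its recruitment
    writes: str  # histone mark it leaves behind for the next complex


def run_cascade(start_mark, complexes, max_steps=10):
    """Return (recruitment order, final mark) for a toy feed-forward cascade.

    At each step the complex that recognizes the current mark is recruited,
    the mark is replaced by the one that complex writes, and the cycle
    continues until no complex recognizes the current mark.
    """
    by_mark = {c.reads: c for c in complexes}
    current, order = start_mark, []
    for _ in range(max_steps):
        recruited = by_mark.get(current)
        if recruited is None:
            break
        order.append(recruited.name)
        current = recruited.writes
    return order, current


# Deliberately listed out of order: the marks, not the list, set the sequence.
COMPLEXES = [
    Complex("complex_C", reads="mark_3", writes="mark_4"),
    Complex("complex_A", reads="mark_1", writes="mark_2"),
    Complex("complex_B", reads="mark_2", writes="mark_3"),
]

order, final_mark = run_cascade("mark_1", COMPLEXES)
print(order, final_mark)
```

The point of the sketch is that the order of recruitment is not stored anywhere as a list; it follows from which mark each complex reads and which mark it writes, which is the sense in which each step depends on the action of the preceding complex.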

Corepressor/coactivator exchange complexes are targets of multiple extracellular and intracellular signaling pathways. 
For many signal-dependent transcription units, transcriptional activation requires active removal of corepressors in addition to recruitment of coactivators. In the case of nuclear receptors, TBLR1 is used as a sensor of ligand binding, which activates its E3 ubiquitin ligase activity and leads to the ubiquitylation, clearance, and most likely proteasome-dependent degradation of corepressor complexes. Corepressor clearance results in gene derepression and is a prerequisite to the subsequent recruitment of coactivator complexes. This corepressor/coactivator exchange is mechanistically linked to changes in histone marks; for example, loss of histone H3K9 and H3K27 methylation and gain of H3K9 and H3K14 acetylation. The factor exchange complexes themselves are proposed to represent targets of regulation, enabling an additional level of integration of multiple signaling inputs.

Ordered Recruitment: Gene-Specific Mechanism of Transcription Activation
Activators, chromatin-modifying enzymes, and basal transcription factors unite to activate genes, but are recruited in a precise order to promoters. The timing of the activation of transcription and the ordered recruitment of factors to promoters are the engines which, at the right moment and for the right length of time, drive the transcriptional regulation of each gene throughout the life of a cell. 2
In the many years of investigations aimed at an understanding of how genes encode their protein products, many efforts focused initially on the identification of basal (general) transcription factors, activators, repressors, and consensus sequences and, subsequently, on the comprehension of how these factors and their cognate binding regions within promoters maintain their cross-talk. The following “transcription scenario” emerged:

(1) Activators bind one or multiple regulatory sequences; 
(2) In cooperation with TAF proteins, TBP (a subunit of TFIID) binds the TATA box; 
(3) TFIIB, which helps RNA Polymerase II to select the start site, adds to the TBP complex; 
(4) The RNA Pol II holoenzyme, in concert with TFIIF, TFIIE, and TFIIH, associates with the promoter and forms the preinitiation complex (PIC); 
(5) Promoter melting and transcription initiation occur; and 
(6) TFIIH phosphorylates the largest RNA Pol II subunit, the Rpb1 C-terminal domain (CTD), leading to promoter clearance and progression into the elongation phase. 

This is, with a few variations and exceptions, the step-wise sequence of transcription activation. An alternative model, however, is one in which basal transcription factors and RNA Pol II are preassembled together in a gigantic complex that associates at once with promoters. In both models, activators guide the basal transcription machinery through Mediator, a modular complex of factors that “mediate,” like a regulatory bridge, the signals between activators and RNA Pol II.
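The step-wise scenario in steps (1) to (6) can be sketched as a small state machine. This is only an illustration under stated assumptions: the `Promoter` class, its `recruit` method, and the shorthand step labels are hypothetical names invented here, and the sketch models only the step-wise model, not the alternative preassembled-complex model.

```python
# Shorthand labels for steps (1)-(6) of the step-wise scenario.
STEPS = (
    "activators bind regulatory sequences",
    "TBP (TFIID) binds the TATA box",
    "TFIIB joins the TBP complex",
    "RNA Pol II with TFIIF/TFIIE/TFIIH forms the PIC",
    "promoter melting and initiation",
    "TFIIH phosphorylates the CTD",
)


class Promoter:
    """Toy promoter that only accepts the steps in their listed order."""

    def __init__(self):
        self.completed = []

    @property
    def elongating(self):
        # True once every step, including CTD phosphorylation, has occurred.
        return len(self.completed) == len(STEPS)

    def recruit(self, step):
        if self.elongating:
            raise ValueError("all steps already completed")
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise ValueError(f"out of order: expected {expected!r}, got {step!r}")
        self.completed.append(step)


promoter = Promoter()
for step in STEPS:
    promoter.recruit(step)
print(promoter.elongating)
```

Trying to call `recruit` with a later step before the earlier ones raises an error, which mirrors the idea that each factor joins only after the preceding one is in place.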

Simultaneous with the transcription activation studies, other researchers tried to explain how DNA sequences are arranged within the chromosomes. This led to the recognition of chromosomes as being organized into nucleosomes, and of nucleosomes as being formed by DNA coiled around the histone octamers. However, the functional interplay between transcription and chromatin structure was not obvious for many years, and science went on in two parallel directions. Chromatin was divided into two different categories: euchromatin, which is highly transcribed and per se permissive to the binding of transcription factors, and heterochromatin, which was thought to be silenced and never transcribed. Today, the general view has dramatically changed. We now know that heterochromatin is rich in repetitive sequences but is not at all devoid of genes. Moreover, euchromatic regions of chromosomes can be silenced by packaging in a heterochromatic form.

It has become obvious that gene transcription activation and chromatin structure are two functionally correlated fields of study. This conclusion has been made possible by the discovery in almost all organisms of chromatin-modifying complexes, which were recognized to be involved in modifying histone residues and nucleosome conformation, thereby leading to gene transcription activation. They can be divided into two main groups: 

- the ATP-dependent remodeling complexes, which use energy to modify chromatin structure in a noncovalent manner, and 
- the histone-modifying complexes, which add or remove covalent modifications from histone tails. 

The functional link between chromatin structure and transcription activation is the “histone code,” which is generated by methylation or acetylation of specific arginine and lysine residues within histones H3 and H4. In fact, the transcriptional apparatus reads this histone code and, as a consequence, activates or represses the neighboring genes. Epigenetic readers recognize specific epigenetic modifications on DNA or histones and include

- chromatin remodeling enzymes
- chromatin modifiers
- chromatin architectural proteins
- adaptor proteins.

Epigenetic plasticity is further regulated by the three-dimensional and higher-order chromatin structure, including nucleosome repositioning, DNA looping, and long-range chromatin interactions. Chromatin remodeling enzymes use the energy released from hydrolysis of ATP to reposition nucleosomes and include large multi-subunit complexes, such as SWI/SNF, ISWI, Nurd/Mi/CHD, SWR1, and INO80. These complexes must be recruited to the appropriate regulatory region to enable remodeling, which then facilitates access to the transcription factors and transcriptional machinery. 

Both the signaling kinases, PKC-theta and ERK2, have been found to have nuclear functions as chromatin-associated proteins. While it has been known that protein kinases operate by communicating signals from the cytoplasm to the nucleus, it is also evident that these and other nuclear kinases can associate with chromatin in the nucleus, impacting the chromatin landscape directly by phosphorylating histone proteins.

8. Molecular Mechanisms of Autonomy in Biological Systems Relativity of Code, Energy and Mass, page 15

Last edited by Admin on Sun Oct 28, 2018 1:53 pm; edited 60 times in total



Regulation of transcription in eukaryotes: changes in chromatin structure and DNA methylation
In eukaryotes, DNA is associated with proteins to form a structure called chromatin, the complex of DNA and proteins that makes up eukaryotic chromosomes. How does the structure of chromatin affect gene transcription? Nucleosomes are composed of DNA wrapped around an octamer of histone proteins. Depending on the locations and arrangements of nucleosomes, a region containing a gene may be in a closed conformation, and transcription may be difficult or impossible. Transcription requires changes in chromatin structure that allow transcription factors to gain access to and bind to the DNA in the promoter region. Such chromatin, said to be in an open conformation, is accessible to general transcription factors (GTFs) and RNA polymerase II, so transcription can take place. Chromatin is converted from a closed to an open conformation. DNA methylation (the attachment of methyl groups to the base cytosine) affects chromatin conformation and gene expression.

Transcription is controlled by changes in chromatin structure
In recent years, geneticists have been trying to identify the steps that promote the interconversion between the closed and open conformations of chromatin. One way to change chromatin structure is through ATP-dependent chromatin-remodeling complexes, which are a group of proteins that alter chromatin structure. Such complexes use energy from ATP hydrolysis to drive a change in the locations and/or compositions of nucleosomes, thereby making the DNA more or less amenable to transcription. Therefore, chromatin remodeling is important for both the activation and repression of transcription. How do ATP-dependent chromatin-remodeling complexes change chromatin structure? Three effects are possible.

- One result is that these complexes may bind to chromatin and change the locations of nucleosomes (Figure a below). This may involve a shift of the relative positions of a few nucleosomes or a change in the relative spacing of nucleosomes over a long stretch of DNA.
- A second effect is that remodeling complexes may evict histone octamers from the DNA, thereby creating gaps where nucleosomes are not found (Figure b).
- A third possibility is that chromatin-remodeling complexes may change the composition of nucleosomes by removing standard histone proteins from an octamer and replacing them with histone variants (Figure c). A histone variant is a histone protein that has a slightly different amino acid sequence from the standard histone proteins. Some histone variants promote gene transcription, whereas others inhibit it.

ATP-dependent chromatin remodeling. 
Chromatin-remodeling complexes may 
(a) change the locations of nucleosomes,
(b) remove histones from the DNA, or 
(c) replace standard histones with variant histones. 
The chromatin-remodeling complex, which is composed of a group of proteins, is not shown in this figure.
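The three remodeling outcomes (a), (b) and (c) can be pictured as three operations on a list of nucleosomes. The following is a minimal sketch, assuming a hypothetical `Nucleosome` record with a base-pair position and a histone variant label; the coordinates and the use of H2A.Z as the example variant are illustrative choices, not taken from the text above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Nucleosome:
    position: int              # hypothetical base-pair coordinate on the DNA
    variant: str = "standard"  # "standard" or the name of a histone variant


def slide(chromatin, index, shift_bp):
    """(a) Change the location of one nucleosome."""
    out = list(chromatin)
    out[index] = replace(out[index], position=out[index].position + shift_bp)
    return out


def evict(chromatin, index):
    """(b) Remove a histone octamer, leaving a nucleosome-free gap."""
    return [n for i, n in enumerate(chromatin) if i != index]


def exchange(chromatin, index, variant):
    """(c) Replace standard histones with a variant."""
    out = list(chromatin)
    out[index] = replace(out[index], variant=variant)
    return out


chromatin = [Nucleosome(0), Nucleosome(200), Nucleosome(400)]
print(slide(chromatin, 1, 50))
print(evict(chromatin, 1))
print(exchange(chromatin, 2, "H2A.Z"))
```

Each function returns a new list and leaves the input untouched, so the three outcomes can be compared side by side against the same starting arrangement.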

In biology, histones are highly alkaline proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. They are the chief protein components of chromatin, acting as spools around which DNA winds, and playing a role in gene regulation. Without histones, the unwound DNA in chromosomes would be very long (a length to width ratio of more than 10 million to 1 in human DNA). 23 Gene regulation differs between prokaryotes and eukaryotes. Histones are among the most evolutionarily conserved proteins known; they are vital for the well-being of eukaryotes and brook little change. When a specific gene is tightly bound with histone, that gene is "off." But how, then, do eukaryotic genes manage to escape this silencing? This is where the histone code comes into play. This code includes modifications of the histones' positively charged amino acids to create some domains in which DNA is more open and others in which it is very tightly bound up. DNA methylation is one mechanism that appears to be coordinated with histone modifications, particularly those that lead to silencing of gene expression. Small noncoding RNAs, such as those involved in RNA interference (RNAi), can also take part in the regulatory processes that form "silent" chromatin. On the other hand, when the tails of histone molecules are acetylated at specific locations, these molecules have less interaction with DNA, thereby leaving it more open. The regulation of the opening of such domains is a hot topic in research. For instance, researchers now know that complexes of proteins called chromatin remodeling complexes use ATP to repackage DNA in more open configurations. 24

Histones, which represent the protein component of chromatin, are the site of many dynamic and reversible post-translational modifications that play a fundamental role in the regulation of the underlying genes. There is an ever-growing list of these modifications and the complexity of their action is only just beginning to be understood. However, it is clear that histone modifications play fundamental roles in most biological processes that are involved in the manipulation and expression of DNA. There are a large number of different histone post-translational modifications (PTMs). 26

The N-terminal histone tails provide a molecular “handle” to manipulate DNA accessibility in chromatin. A wide range of marks has been identified at many sites in the histone N-terminal tails and elsewhere. Histones with acetylated lysines are generally associated with “open” chromatin that is permissive for RNA transcription, while histones with methylated lysines can be associated with either “open” or “closed” chromatin states.

Post-translational modifications (PTMs) of histones affect gene transcription

Last edited by Admin on Tue Oct 16, 2018 3:35 pm; edited 45 times in total



The modification of histones is dynamic, and the corresponding enzymes include

- kinases/phosphatases
- acetyltransferases/deacetylases
- methyltransferases/demethylases
- ubiquitin ligases/deubiquitinating enzymes
- SUMO ligases/deSUMOylating enzymes

Among the enzymes that modify histones, kinases mainly depend on the activation of specific upstream signaling pathways leading to cascades of protein phosphorylation and regulation of transcription in the nucleus. Recently, significant progress in understanding the roles of this particular type of modification has been made through the elucidation of mechanisms by which gene expression is directly affected through specific kinase-dependent phosphorylation of histones. Furthermore, accumulating evidence suggests that some kinases that modify histones can also modify nonhistone substrates, including chromatin remodeling factors and transcription factors.

In comparison to other histone modifications, phosphorylation of histones requires specific activation of upstream signaling pathways.  Although these upstream kinases in the signaling pathways regulate transcription via activation of phosphorylation-dependent signal transduction cascades, the central kinases are not considered to directly modify histones or histone modifiers in the nucleus due to their major cytoplasmic localization. However, accumulating evidence has shown that the upstream kinases have a more profound effect on gene expression through direct phosphorylation of the chromatin. The activated signal transduction kinases may frequently occupy target genes by binding to transcription factors and coregulators in the nucleus  18 

The specific recognition of PTMs by readers recruits various components of the nuclear signaling network to chromatin, mediating fundamental processes such as gene transcription, DNA replication and recombination, DNA damage response and chromatin remodeling. Chromatin-associating complexes often contain multiple readers within one or several subunits that show specificities for distinct PTMs. Coordinated binding to multiple PTMs can provide a lock-and-key–type mechanism for targeting particular genomic sites and ensuring the proper biological outcomes. Misreading of epigenetic marks has been shown to underlie a host of human diseases, including autoimmune and developmental abnormalities and cancer. It was not until 20 years ago that enzymatic activities of a histone acetyltransferase (HAT) and a deacetylase were directly linked to transcriptional regulation. In addition to lysine acetylation, numerous PTMs have now been identified, including

- Lysine methylation
- Arginine methylation
- Histone demethylases
- propionylation of lysine residues
- ADP-ribosylation of arginine residues
- glycosylation of serine and threonine residues
- ADP ribosylation
- Histone tail clipping
- Histone proline isomerization
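The lock-and-key idea of coordinated binding to multiple PTMs can be sketched as a simple set test. This is only an illustration: the function name `is_recruited` and the `READER` set are hypothetical, and the requirement for acetylation at both H3 K9 and K14 follows the double bromodomain example given earlier in this thread. Real readers bind with graded affinities rather than all-or-nothing.

```python
def is_recruited(required_marks, marks_present):
    """True only if every mark the reader module needs is present."""
    return set(required_marks) <= set(marks_present)


# Marks required by a hypothetical two-reader module.
READER = {"H3K9ac", "H3K14ac"}

print(is_recruited(READER, {"H3K9ac", "H3K14ac", "H3K4me3"}))
print(is_recruited(READER, {"H3K9ac"}))
```

Under this toy rule a site carrying only one of the two marks is not targeted, which is the sense in which multiple readers together give more specificity than any single one.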

Intricate control of nucleosomal structure and assembly governs access of RNA polymerase II to DNA and consequent RNA synthesis.  27 Histone modifications not only play important roles in regulating chromatin structure and nuclear processes but also can be passed to daughter cells as epigenetic marks. The key function of histone modifications is to signal for recruitment or activity of downstream effectors, through a histone language stored in the histone tails. 

Numerous chromatin-associated factors have been shown to specifically interact with modified histones via many distinct domains. In terms of gene expression, acetylation of the N-terminal domains of histones, particularly H3 and H4, is of the greatest importance. These lysine-rich domains each have numerous lysine residues that are targets for acetylation. Acetylation serves to reduce the stability of the nucleosome complex so that the DNA is more readily accessible to the transcription machinery. The mechanisms by which histone acetylation is controlled are extremely complex. Levels of acetylation at any locus are dictated by the combined activities of histone acetyltransferases (HATs) and histone deacetylase complexes (HDACs). The activity of these large protein complexes is determined, in turn, by mechanisms that control their recruitment to different nuclear sites.
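The statement that acetylation levels are set by the combined activities of HATs and HDACs can be illustrated with a deliberately simple two-state sketch. It assumes, purely for illustration, that a site is either acetylated or not, that HATs add the mark at a rate k_hat and HDACs remove it at a rate k_hdac, so that da/dt = k_hat * (1 - a) - k_hdac * a. The function name and the rate values are hypothetical and are not taken from any of the cited papers.

```python
def steady_state_acetylation(k_hat: float, k_hdac: float) -> float:
    """Fraction of acetylated sites at equilibrium in a two-state model.

    Assumes da/dt = k_hat * (1 - a) - k_hdac * a,
    which gives a* = k_hat / (k_hat + k_hdac).
    """
    if k_hat < 0 or k_hdac < 0 or k_hat + k_hdac == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_hat / (k_hat + k_hdac)


# Hypothetical rate pairs: balanced, HAT-dominated, HDAC-dominated.
for k_hat, k_hdac in [(1.0, 1.0), (3.0, 1.0), (1.0, 9.0)]:
    print(k_hat, k_hdac, steady_state_acetylation(k_hat, k_hdac))
```

In this toy picture, recruiting more HAT activity to a locus raises the equilibrium level of acetylation and recruiting more HDAC activity lowers it. The real control mechanisms are, as the text says, far more complex.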

As the substantial degree of interplay between PTMs began to unfold, it led researchers to propose the histone code hypothesis, which states that “multiple histone modifications, acting in a combinatorial or sequential fashion on one or multiple histone tails, specify the unique downstream function.”

Soon afterward, the terms ‘writer’, ‘eraser’ and ‘reader’ were formulated to describe proteins that deposit, remove and recognize PTMs, respectively. In addition to histone PTMs, other factors—including

hydroxymethylation of DNA
histone variants
nucleosome positioning
noncoding RNAs
histone chaperones

are necessary for fine-tuning chromatin structure and function, and together they constitute the powerful and dynamic epigenetic machinery. The intricate relationship between the epigenetic elements represents one of the most intriguing concepts in modern chromatin biology. Clearly, the spatial and temporal modulation of, and cross-talk between, histone PTMs has a very important role in defining the chromatin landscape.  Many readers can distinguish a particular sequence surrounding a PTM, affording specific chromatin targeting ability to their host proteins.

Distinct histone modifications, on one or more tails, act sequentially or in combination to form a `histone code' that is read by other proteins to bring about distinct downstream events.

How eukaryotic genomes are manipulated within a chromatin environment is a fundamental issue in biology.  At the heart of chromatin structure are highly conserved histone proteins (H3, H4, H2A, H2B and H1) that function as building blocks to package eukaryotic DNA into repeating nucleosomal units that are folded into higher-order chromatin fibres. 

Chromatin organization and the tail of histone H3.

a, General chromatin organization. Like other histone `tails', the N terminus of H3 (red) represents a highly conserved domain that is likely to be exposed or extend outwards from the chromatin fibre. A number of distinct post-translational modifications are known to occur at the N terminus of H3 including acetylation (green tag), phosphorylation (grey circle) and methylation (yellow hexagon). Other modifications are known and may also occur in the globular domain.

b, The N terminus of human H3 is shown in single-letter amino-acid code. For comparison, the N termini of human CENP-A, a centromere-specific H3 variant, and human H4, the nucleosomal partner to H3, are shown. Note the regular spacing of acetylatable lysines (red), and potential phosphorylation (blue) and methylation (purple) sites. The asterisk indicates the lysine residue in H3 that is known to be targeted for acetylation as well as for methylation; lysine 9 in CENP-A (bold) may also be chemically modified (see text). The above depictions of chromatin structure and H3 are schematic; no attempt has been made to accurately portray these structures.

In general, the best known PTMs are classified as having an effect of transcription activation or repression. Acetylations are mostly activating, methylation can be either activating or repressing. A good table of the main histone PTMs can be found here:

Histone Modification Table

Now, some of these modifications have been mapped genome-wide in different types of cells. Their combinatorial aggregation can rather well predict different functional states of different parts of the genome and of chromatin. For example, the following paper:

Multi-scale chromatin state annotation using a hierarchical hidden Markov model

Chromatin structure contains multidimensional nucleosome structural information along the single-dimension genomic coordinates. Millions of regulatory regions, such as enhancers and promoters, have been mapped in various cell types. Chromatin forms higher-order three-dimensional structures by folding and looping, which facilitate long-range interactions between enhancers and target genes. The process is likely related to the distribution of histone marks over broad domains. Cell-fate transitions are accompanied by extensive remodeling of chromatin architecture. While most studies have focused on nucleosome-scale dynamics, several experimental methods have revealed higher-order chromatin reorganization. At the domain level, our diHMM analysis has identified three distinct patterns of enhancer domains (super-enhancer, upstream enhancer, and intron/enhancer), thereby unraveling significant complexity among different enhancers. We further find that the functionality of an enhancer strongly depends on the domain-level chromatin-state context, with the super-enhancer domain conferring the strongest regulatory potential. Our analysis is consistent with the recent discovery that multiple regulatory elements may cluster together, spanning regions of over 10 kb, and cooperatively regulate cell identity. 16
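To give a feeling for how a hidden Markov model turns a noisy track of histone marks into chromatin-state labels, here is a minimal sketch. It is an ordinary single-level HMM decoded with the Viterbi algorithm, not the hierarchical diHMM of the paper, and the two state names, the single binary mark and all the probabilities are assumptions made up for the illustration.

```python
import math

# Hypothetical two-state model: each genomic bin is "active" or "quiescent".
STATES = ("active", "quiescent")
START = {"active": 0.5, "quiescent": 0.5}
TRANS = {
    "active": {"active": 0.8, "quiescent": 0.2},
    "quiescent": {"active": 0.2, "quiescent": 0.8},
}
# Probability of seeing the mark (1) or not (0) in a bin, given the state.
EMIT = {"active": {1: 0.9, 0: 0.1}, "quiescent": {1: 0.1, 0: 0.9}}


def viterbi(obs):
    """Most probable state path for a non-empty list of 0/1 observations."""
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # one dict of back-pointers per step after the first
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][o]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


print(viterbi([1, 1, 1, 0, 0, 0]))  # a domain boundary in the middle
print(viterbi([1, 1, 0, 1, 1]))     # a single missing mark inside a domain
```

Because switching states is penalised, an isolated bin without the mark does not break up a domain. That smoothing over neighbouring bins is what lets such models annotate broad chromatin states rather than individual nucleosomes.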

Long-range chromatin interactions play important roles in diverse biological processes including gene regulation, DNA replication and repair.

Histones are integral and dynamic components of the machinery responsible for regulating gene transcription. An extensive literature documents an elaborate collection of posttranslational modifications, including acetylation, phosphorylation, methylation, ubiquitination and ADP-ribosylation, that take place on the `tail' domains of histones. 3 These tails, which protrude from the surface of the chromatin polymer and are protease sensitive, comprise ~25–30% of the mass of individual histones 3,4, thus providing an exposed surface for potential interactions with other proteins. Growing awareness of the remarkable diversity and biological specificity associated with distinct patterns of covalent histone marks has caused us and others to favor the view that a histone `language' may be encoded on these tail domains that is read by other proteins or protein modules. We refer to this language as the `histone code' and present evidence supporting the existence of this language.

A phosphorylation mark alone, or in combination with other marks (such as phosphorylation at serine 28), may recruit a binding factor that, in turn, has a role in mediating chromosome condensation and segregation. In contrast, a distinct mark or set of marks (for example, phosphorylation and acetylation at residues 10 and 14, respectively) may provide a unique binding surface to recruit factors promoting decondensation and transcription.
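The two readouts described above can be written down as a small lookup, purely as an illustration of what "combinatorial" means here. The mark names (S10ph, S28ph, K14ac), the rule table and the function read_marks are hypothetical labels chosen for this sketch, and it assumes that the phosphorylation mark in question sits at serine 10, as the mention of residues 10 and 14 suggests. Real readers are proteins with graded affinities, not a table.

```python
# Each rule: (set of marks that must all be present, proposed outcome).
# The outcomes paraphrase the paragraph above; the encoding is an assumption.
RULES = [
    (frozenset({"S10ph", "S28ph"}), "condensation and segregation"),
    (frozenset({"S10ph", "K14ac"}), "decondensation and transcription"),
    (frozenset({"S10ph"}), "condensation and segregation"),
]


def read_marks(marks):
    """Return the outcome of the most specific rule matched by `marks`, or None."""
    present = frozenset(marks)
    # Rules requiring more marks are tried first, so a combination
    # takes precedence over the single mark it contains.
    for required, outcome in sorted(RULES, key=lambda r: len(r[0]), reverse=True):
        if required <= present:
            return outcome
    return None


print(read_marks({"S10ph"}))
print(read_marks({"S10ph", "K14ac"}))
print(read_marks({"K14ac"}))
```

The same phosphorylation mark leads to opposite outcomes depending on its neighbour, which is the core claim of the histone code hypothesis.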

Regulating, governing, controlling, recruiting, interpreting, recognizing, orchestrating, elaborating strategies, guiding, instructing the blocking or allowing of transcription, selecting which genes are turned on or repressed, programming the right timing of an action, responding correctly to signals, setting up specificity, setting up morphogen gradients (which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become), programming the right cell differentiation, and programming the expression of the appropriate cell types at the correct stage of development are all terms and actions related to intelligence.

Secular science has enormous difficulty explaining the principles that govern biological processes without resorting to teleology, or purpose. For that reason, its practitioners often use a descriptive vocabulary that is suspiciously teleological but then try to find excuses to eradicate the implications: they attribute intention to evolutionary processes while nonetheless claiming there was no intention to reach specific goals.

Combinatorial readout of and cross-talk between PTMs
The list of newly identified histone readers has grown rapidly, yet the finite number of readers and the numerous biological processes they mediate suggest that explicit mechanisms exist to differentiate functions of the effector-containing proteins and elicit distinct biological outcomes. Given the extensive and complex nature of the chromatin landscape, several mechanisms involving the combinatorial readout of epigenetic marks have been uncovered. 1 The activity of a reader toward a particular modification can be influenced by neighboring PTMs.

Combinatorial readout of PTMs.
(a) Recognition of a target PTM is influenced by adjacent PTMs on the same histone tail and
(b,c) the combined action of multiple readers within the same protein.
(d) Multivalent engagement of readers within individual subunits of the complex. The reader-harboring proteins can also contain the catalytic domains (which act as writers and erasers) or scaffolding domains that bridge their host proteins with other subunits of the complex. Readers can recognize PTMs on a single histone tail (cis mechanism) or different histone tails (trans mechanism).

The activity of a reader toward a particular modification can be influenced by neighboring PTMs (Fig. a). In fact, many readers associate with a substantial stretch of the histone tail, allowing for the sensing of multiple marks. The neighboring PTMs can enhance or impede interaction with the target PTM. Recruitment of a reader to a specific genomic region can be further modulated through the combinatorial action of multiple effectors (Fig. b,c). A number of chromatin-associated proteins contain more than one reader, in the form of several copies of the same reader or a combination of various readers that are often specific for distinct PTMs (Fig. b). Multiple PTMs can also be recognized on separate tails of a single nucleosome or on adjacent nucleosomes that are directly linked by DNA or are otherwise in close spatial proximity (Fig. c). An even more complex combinatorial trans readout can be seen in chromatin complexes containing histone readers in multiple subunits (Fig. d). The interplay between such effectors generates a multifaceted network of intertwined contacts that can provide a high degree of specificity. Furthermore, many nuclear complexes are not static and undergo component swapping, a powerful mechanism for altering their chromatin-targeting capabilities. Beyond mediating specific chromatin anchoring, PTM cross-talk can also trigger a cascading series of writing, erasing and reading events.

The DNA methylation code and language

Chromatin remodeling

Last edited by Admin on Tue Oct 16, 2018 3:28 pm; edited 15 times in total


41 What defines body structures and architecture? on Mon Oct 08, 2018 4:16 pm


What defines body structures and architecture?

One of the aims of developmental biology is to discover the developmental origins of morphological variation. 11 Genomic regulatory systems drive embryonic development of the body plan. 17 Many studies have revealed that genes encoding transcription factors and signalling molecules are critical controllers of pattern formation and cell fate specification during development. 18

The process of morphogenesis, which can be defined as an evolution of the form of an organism, is one of the most intriguing mysteries in the life sciences.  The discovery and description of the spatial-temporal distribution of the gene expression pattern during morphogenesis, together with its key regulators, is one of the main recent achievements in developmental biology. Nevertheless, gene expression patterns cannot explain the development of the precise geometry of an organism and its parts in space. First, we suggest that the geometry of the organism and its parts is coded by a molecular code located on the cell surfaces in such a way that, with each cell, there can be associated a corresponding matrix, containing this code. As a particular model, we propose coding by several types of oligosaccharide residues of glycoconjugates. 7

To understand the major trends in animal diversity and if the various kinds of morphology are due to evolution, we must first understand how animal form is generated. 

Morphology is the product of development, the process through which a single fertilized egg cell gives rise to an entire organism. Given that the DNA of (most) all cells in an animal is identical, how do different cells acquire the unique morphologies and functional properties required in the diverse organs and tissues of the body? We now understand that this process occurs through the selective expression of distinct subsets of the many thousands of genes in any animal’s genome in different cells. How genes are turned on and off in different cells over the course of animal development is an exquisitely orchestrated regulatory program. If morphological diversity is all about development, and development results from genetic regulatory programs, then is the origin of diversity directly related to genetic regulatory programs? Simply put, yes. 

But to understand how diversity emerged, and if evolution is an adequate explanation, it must first be understood how the genetic regulatory mechanisms operate in development. What is the genetic toolkit of development and how does it operate to build animals? The foremost challenge for embryology has been to identify the genes and proteins that control the development of animals from an egg into an adult. Early embryologists discovered that localized regions of embryos and tissues possess properties that have long-range effects on the formation and patterning of the primary body axes and appendages. Based on these discoveries, they postulated the existence of substances responsible for these activities. A small fraction of all genes in any given animal constitute the toolkit that is devoted to the formation and patterning of the body plan and body parts. 

Two classes of gene products with the most global effects on development are of special interest: families of proteins called transcription factors that regulate the expression of many other genes during development, and members of signalling pathways that mediate short- and long-range interactions between cells. The expression of specific transcription factors and signalling proteins marks the location of many classically defined regions within the embryo. These proteins control the formation, identity, and patterning of most major features of animal design and diversity.

Thom (1989) writes: We consider the concept of predetermination of a geometrical shape/form of living species to be the most appropriate. The matrix on a cell surface will be changed after each cell event according to the rule(s) dictated by the morphogenetic field of an organism. There is a connection between the morphogenetic code on the cell surface, cell motion law(s), and the geometry of an embryo. It is impossible to create a formalization of morphogenesis that is not based on a “deterministic concept”.

The developmental behaviour of a cell depends on the instructive signals from the surrounding space (area), and different areas in a developing embryo contain precise instructions about the shape of corresponding organs. Because these features cannot be ignored in any model aiming to formalize the developmental process, we continue to exploit the morphogenetic field term in the framework of our model, as a possible convenient tool to describe the connection between biological information, encoded in the cells, and the realization of the geometrical form of a developing organism. A morphogenetic field is, in a mathematical sense, a “field” or a structure containing a space-time-dependent mechanism, which mediates the transformation of biological information, contained in cells, into the corresponding geometrical form of an organism in space-time; or, more precisely, into an instructive signal for a cell motion (cell event), depending on the position of this cell in the developing embryo. Thus, if the mathematical description of morphogenesis can be made in the framework of field theory, then it should be modified so as to consider the behaviour of the objects, whose nature depends on law(s) of coding of the information (i.e. geometrical information).

An overview of the various mechanisms that define body shape and form
Much attention has been focused on the processes that lead to determination of new structures during development. The major determinants of size and shape are more likely to involve just four major morphogenetic processes. These processes are

i) spatial regulation of mitotic density
ii) orientation of cell division
iii) biased rearrangements and intercalation of cells
iv) differential cell death

The orientation of division plays a key role in determining organ shape. In many tissues, cells can change relative positions by remodelling their contacts with neighbour cells. Biased orientation of the rearrangements results in tissue elongation, as is observed during the elongation of the Drosophila embryo. Differential cell death can result in a dramatic remodelling of tissue shape. For example, spacing between vertebrate digits is the consequence of inter-digital cell death. Moreover, an explanation of how variation in wing shape is generated requires a prior understanding of how these processes are regulated during development. 11

Signals regulating morphogenesis can be placed in two categories. On the one hand, morphogenetic cell behaviours are governed by extracellular signals secreted by cells, or by membrane-bound signals. For example, Bmp and Wnt-like proteins are generally involved in the control of cell proliferation and establish tissue fates, thus also playing a crucial role in cell differentiation. Because of their capacity to move from cell to cell, these proteins can establish concentration gradients pointing towards the source. Cells can sense the direction of such global gradients and translate these signals to establish their planar polarity (i.e., the polarity in the plane of the tissue). This is important because in many cases polarity defines the orientation of cell intercalations and divisions. In addition, the magnitude of that polarity has been proposed to regulate growth in some cases.

The second category of signals comprises mechanical forces, such as stretching or compression of a tissue, and the local growth environment, such as availability of nutrients. Mechanotransduction is the sensitivity of cells to mechanical signals. For example, the mammalian YAP/TAZ pathway, involved in growth regulation, is modulated by mechanical properties of the extracellular matrix. Forces extrinsic to epithelial cells can reorganize orientation patterns of cell rearrangements and divisions as well as cell fate, differentiation, and shape. Cell shape plays a special role in morphogenesis because it can directly modify tissue shape, and it also regulates morphogenetic processes such as the orientation of cell division and growth. Finally, there is growing evidence that cell death can be triggered by mechanical forces, as well as chemical signals.

The genetic toolkit
Toolkit genes are those whose products govern construction: the toolkit determines the overall body plan and the number, identity, and pattern of body parts. Toolkit genes have generally first been identified based on the catastrophes or monstrosities that arise when they are mutated.

1. The toolkit is composed of a small fraction of all genes. Only a small subset of the entire complement of genes in the genome affects development in discrete ways.
2. Most toolkit genes encode either transcription factors or components of signalling pathways. Therefore, toolkit genes generally act, directly or indirectly, to control the expression of other genes.
3. The spatial and temporal expression of toolkit genes is often closely correlated with the regions of the animal in which the genes function.
4. Toolkit genes can be classified according to the phenotypes caused by their mutation. Similar mutant phenotypes often reflect genes that function in a single developmental pathway. Distinct pathways exist for the generation of body axes, for example, and for the formation and identity of fields.
5. Many toolkit genes are widely conserved among different animal phyla.

Most toolkit genes can be classified according to their function in controlling the identity of fields (for example, different segments and appendages), the formation of fields (for example, organs and appendages), the formation of cell types (for example, muscle and neural cells), and the specification of the primary body axes. 

The second major category of toolkit genes encode proteins involved in the process of cell signalling, either as ligands, receptors for ligands, or components involved in the intracellular transduction of signals. At least seven major signalling pathways operate in the Drosophila embryo: 

- the Hedgehog, 
- Notch, 
- Wingless, 
- Dpp/transforming growth factor-β (TGF-β), 
- Toll, 
- epidermal growth factor (EGF), 
- fibroblast growth factor (FGF) signalling pathways

Field-specific selector genes
Another class of selector genes acts within specific developing fields to regulate the formation and/or the patterning of entire structures. Perhaps the best-known Drosophila field-specific selector gene is the eyeless (ey) gene. Flies that lack ey function can reach adulthood but never develop a compound eye. Molecular characterization of the ey gene revealed that it encodes a member of a particular homeobox gene family (Pax6), suggesting that the Ey protein acts as a DNA-binding transcription factor to regulate the expression of other genes. The ey gene is expressed in the developing eye field in the embryo, and in the larval eye imaginal disc, before the formation of the units (ommatidia) that make up the compound fly eye (b below).

Field-specific selector genes
(a) Development of parts of the Drosophila adult depends upon the function of the ey (eyes), vg (flight appendages), and Dll (limbs) selector genes.
(b–d) These genes are expressed in both the embryonic primordia (left) and larval imaginal discs (right), which will give rise to these structures.

Compartment selector genes
Several genes have been identified in Drosophila that act within certain developing fields to subdivide them into separate cell populations, or compartments. The specification of cell fate is important for organizing cells into functional tissues during animal development.  These compartment boundaries play a crucial role in tissue patterning and growth by stably positioning organizers. In Drosophila, the wing imaginal disc is subdivided into a dorsal and a ventral compartment. Cells of the dorsal, but not ventral, compartment express the selector gene apterous. Apterous expression sets in motion a gene regulatory cascade that leads to the activation of Notch signaling in a few cell rows on either side of the dorsoventral compartment boundary. Signalling between cells with different fates sets up a local source of organizers along compartments. Signalling molecules emanating from these organizer regions influence cell fate and growth of the entire tissue. Compartment boundaries thus serve as a reference line during pattern formation and growth.

For a flavour of the careful planning that goes into building even a relatively simple animal, let’s look briefly and sketchily at some of what’s been learned from studies of Drosophila development in recent years. The mother fly starts the process off by depositing in the egg, at the end that will become the head, a concentration of the instructions to make one kind of protein, called “bicoid,” and, at the end that will become the tail, a second kind of protein, called “nanos.” The bottom of the embryo is marked by the mother fly in a somewhat different way. The genes coding for proteins that specify the sides of the egg (front, back, top, bottom) are called “egg-polarity” genes. Critically, the proteins (or other proteins they affect) can stray in the egg, drifting away from their source; as they do they become more diffuse. As the egg initially divides into many cells, the high concentration of signal protein at one end of the fly turns on one set of control genes, the middle concentration turns on a different set of control genes in the middle portion of the embryo, and the lowest concentration activates a third set. Once the front, back, top, and bottom are marked (caution—it’s critical to keep in mind that the signal genes don’t actually form the structures found in those regions of the developing fly; they simply mark the location of a cell, like a surveying crew mapping out land for a construction project), positions are further refined with other control proteins. Several groups of proteins controlled by “segmentation genes” subdivide the embryo further. One group of about six so-called “gap genes” is switched on, marking chunks of segments; if one of these control proteins is defective, several neighbouring segments of the embryo will be missing. Oddly, another group of eight genes called “pair-rule genes,” affect alternate segments. If one of these is broken, a fly embryo will have only half its normal complement of segments. 
Finally, a group of ten “segment-polarity” genes helps differentiate each segment. Although in a normal fly the front of each segment looks a bit different from the back, in some segment-polarity mutants the two ends look the same.
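The concentration-dependent step described above (a high concentration of the signal protein turning on one set of control genes, a middle concentration a second set, and the lowest concentration a third) can be sketched in a few lines. The exponential shape of the gradient, the decay length and the two thresholds are assumptions chosen only to make the idea concrete, and the function names are hypothetical.

```python
import math


def concentration(x, c0=1.0, decay_length=0.2):
    """Signal level at relative position x along the egg (0 = source end, 1 = far end).

    Assumes a simple exponential fall-off away from the source.
    """
    return c0 * math.exp(-x / decay_length)


def control_gene_set(c, high=0.5, low=0.1):
    """Which set of control genes a cell turns on at signal level c (assumed thresholds)."""
    if c >= high:
        return "set A (high concentration)"
    if c >= low:
        return "set B (middle concentration)"
    return "set C (low concentration)"


for x in (0.0, 0.3, 0.8):
    c = concentration(x)
    print(f"position {x:.1f}: level {c:.3f} -> {control_gene_set(c)}")
```

Each cell only compares the local signal level with its thresholds. As the text stresses, the signal marks the position of a cell and does not itself build the structure found there.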

The details can be mind-numbing, but the shape of the process is important: from egg-polarity genes to gap genes to pair-rule genes to segment-polarity genes, and we still aren’t ready to build the fly. The lifespan of all of the proteins coded by these control genes is brief, but they turn on genes for the more permanent Hox proteins, and thus permanently mark the position of cells in the developing animal embryo. Similar processes subsequently lay out compartments at finer and finer levels of the fly. For example, as a wing is built, the front, back, top, and bottom are marked by control genes, sometimes the same control genes that earlier marked various regions of the entire embryo. But now, working in a defined region of the developing animal, they mark the divisions and edges of the subcompartment. Remember, individual control genes don’t by themselves embody the instructions to build a wing—they just mark areas of the fly, and signal other genes to turn on or off. This short description leaves out many, many known details of the developmental process, including other means of cell-cell communication and the mechanics of how a signal is received and interpreted. But it at least gives a taste of how the body plan of a simple organism is set in motion. 16

What is a compartment boundary?
Within most proliferating tissues newborn cells can intermingle freely, and therefore occupy any position within the tissue. In some tissues, however, cells and their descendants are restricted to areas called compartments. The common border between two adjacent compartments is termed a compartment boundary. A key feature of compartment boundaries is that they involve a cell segregation mechanism. Cells from one compartment are kept segregated from cells of a neighbouring compartment, leading to a straight boundary between compartments (see picture below). Thus, a compartment boundary is a lineage boundary coupled to a cell-segregation mechanism. 9

Compartment boundaries are lineage borders linked to a cell segregation mechanism and serve to maintain the position and shape of organizers during growth of a tissue (exemplified by the Drosophila anteroposterior compartment boundary in the wing imaginal disc). 
(a) A tissue is subdivided into two founder cell populations that differ in the expression of a ‘selector’ gene [engrailed; ‘on’ (green) or ‘off’ (white)]. The state of expression of this selector gene becomes fixed and heritably perpetuated. During cell proliferation, cells from both populations tend to intermingle, partly because the position of newly emerging daughter cells is not restrained. The border between the descendants of the two founder populations is wiggly (left panel). By establishing a cell segregation system that can sort out ‘on’ cells from ‘off’ cells based on their lineage, the border between the two populations remains straight during cell proliferation (right panel). The lineage border linked to a cell segregation system is termed the compartment boundary (dashed line).
(b) The selector gene directs the expression of the short-range signalling molecule HH. The short-range signalling molecule moves to the adjacent ‘selector gene off’ cells, where it induces the expression of the morphogen DPP (red) in a few rows of cells, which act as an organizer. The morphogen moves away from its site of expression forming a graded distribution (red dots). The morphogen induces expression of target genes in a concentration-dependent manner leading to growth and patterning of the whole tissue. A wiggly border between ‘on’ and ‘off’ cells leads to an irregular and spatially unstable organizer incapable of directing precise patterning (left panel). In contrast, the compartment boundary between ‘on’ and ‘off’ cells leads to a straight and stable organizer and thereby to a precise patterning of the tissue (right panel). Furthermore, the position of the organizer is maintained during growth of the tissue.
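The contrast between the left and right panels can be mimicked with a one-dimensional row of cells. In the sketch below a cell is a 1 if the selector gene is on and a 0 if it is off, and an 'off' cell is counted as part of the organizer if it lies within reach of the short-range signal from any 'on' cell. The row of cells, the range of two cell diameters and the function name are all assumptions made for the illustration.

```python
def organizer_cells(selector, hh_range=2):
    """Indices of 'off' cells close enough to an 'on' cell to receive the HH signal.

    selector: list of 0/1 values, one per cell in a row (1 = selector gene on).
    hh_range: how many cell diameters the short-range signal travels (assumed).
    """
    on_positions = [i for i, s in enumerate(selector) if s == 1]
    return [
        i for i, s in enumerate(selector)
        if s == 0 and any(abs(i - j) <= hh_range for j in on_positions)
    ]


straight = [0, 0, 0, 0, 1, 1, 1, 1]  # segregated populations, one clean border
wiggly = [0, 1, 0, 0, 1, 0, 1, 1]    # intermingled populations

print(organizer_cells(straight))  # one compact stripe next to the border
print(organizer_cells(wiggly))    # organizer cells scattered along the row
```

With segregated populations the organizer is a single narrow stripe at a fixed position. With intermingled populations it is spread irregularly, which is the situation the caption describes as incapable of directing precise patterning.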

Why are boundaries important?
During development, cells are ‘organized’ (i.e. instructed about their position and fate) by special, localized cell groups that secrete long-range signalling molecules. The exact positioning of these special cell groups, or organizers, is most critical for the patterning of tissues. Importantly, in growing tissues, cell proliferation leads to some cell mixing that might affect the spatial organization of the tissue. Nonetheless, the position and integrity of the organizer has to be maintained. There are at least two mechanisms by which this can be accomplished. The position of an organizer can itself depend on, and be maintained by, the influence of a more globally acting signalling molecule. Alternatively, an organizer can be positioned initially by a globally acting signalling system but then become independent of it. One important function of compartment boundaries is to maintain the position of organizers in growing tissues independent of globally acting signalling systems. This has been best illustrated for Drosophila wing development. 

Like all appendages, the adult wing is derived from an imaginal disc, which is formed by invagination of the embryonic epidermis. The patterning process can be divided into several steps. In the first step, two adjacent groups of cells are selected according to their position along the anteroposterior axis of the embryo. Under the control of the embryonic patterning machinery, one, but not the other, cell group expresses a ‘selector’ gene.

In the case of the wing disc, this selector gene is engrailed and encodes a homeodomain transcription factor.

Modulation of Phase Shift between Wnt and Notch Signaling Oscillations Controls Mesoderm Segmentation
How signalling dynamics encode information is a central question in biology. During vertebrate development, dynamic Notch signalling oscillations control segmentation of the presomitic mesoderm (PSM). In mouse embryos, this molecular clock comprises signalling oscillations of several pathways, i.e., Notch, Wnt, and FGF signalling. We find that Wnt and Notch signalling are coupled at the level of their oscillation dynamics.
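What a "phase shift" between two signalling oscillations means can be shown with two idealised sine waves. The sketch below generates two oscillations with the same period, delays one of them, and recovers the delay by finding the lag at which the two traces match best. The sine shape, the period of 20 samples and the function names are assumptions for illustration only; they are not measurements or methods from the paper.

```python
import math


def oscillation(n_samples, period, phase=0.0):
    """Idealised signalling oscillation sampled at integer time steps."""
    return [math.sin(2 * math.pi * t / period - phase) for t in range(n_samples)]


def phase_shift(a, b, period):
    """Phase (radians, in [0, 2*pi)) by which trace b lags trace a.

    Uses circular cross-correlation, so both traces should cover
    a whole number of periods.
    """
    n = len(a)
    best_lag = max(
        range(period),
        key=lambda lag: sum(a[i] * b[(i + lag) % n] for i in range(n)),
    )
    return 2 * math.pi * best_lag / period


PERIOD = 20  # samples per cycle (assumed)
notch_like = oscillation(60, PERIOD)
wnt_like = oscillation(60, PERIOD, phase=math.pi / 2)  # delayed by a quarter cycle

print(phase_shift(notch_like, wnt_like, PERIOD))    # quarter cycle, in radians
print(phase_shift(notch_like, notch_like, PERIOD))  # identical traces, no shift
```

Two pathways with identical periods can therefore still carry different information through their relative timing alone.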

Segmentation of the body axis into a series of repeating units is a canonical strategy in morphogenesis, and evidence of this can be seen in the skeletal system of all vertebrates. In vertebrates (and even some insects), the developmental process of segmentation is characterized by the rhythmic and sequential addition of segments (called somites) to an elongating body axis and is regulated by an oscillatory mechanism, the segmentation clock. 13

Typical image data before and after semantic segmentation

Combinatorial gene regulation by modulation of relative pulse timing
The pulsatile transcription factors Msn2 and Mig1 combinatorially regulate their target genes through modulation of their relative pulse timing. Regulation through relative signal timing is common in engineering and neurobiology, and it also functions within the signalling and regulatory systems of the cell. In order to respond to environmental conditions, cells make extensive use of combinatorial gene regulation, in which two or more transcription factors co-regulate common target genes.

Most analysis of combinatorial regulation presumes that the concentrations of transcription factors in the nucleus are regulated in a continuous (non-pulsatile) manner.

According to Wiki, regulation is an abstract concept of management of complex systems according to a set of rules and trends. These types of rules exist in various fields of biology.14
Question: Is setting rules and trends, and managing a system, not something done only by intelligence? Can a mindless process like evolution set rules of regulation? To do so, is knowledge not required, before anything else, of what has to be regulated, why, and how? Is knowledge not required to implement a functional regulation of biological systems, so that they can adapt to the environment for successful survival and development? For successful gene regulation, cells must know how to combine the right transcription factors and how these find the right target genes (the interface compatibility between the TFs and the right promoter sequence on the gene has to be pre-programmed).

Recent work has identified a large and growing list of transcription factors that activate in pulses. In such systems, a single pulse begins when many molecules of a given transcription factor are activated simultaneously, and ends when they are deactivated. Such pulses can occur repetitively, even under constant conditions. Pulsatile regulation has been observed in bacteria, yeast, and mammalian stress response and signalling pathways.  In these systems, inputs typically modulate the pulse frequency, amplitude, and/or duration of individual transcription factors to regulate genes. 
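The three modulation modes named above (pulse frequency, amplitude, and duration) can be illustrated with a toy model. This is only a minimal sketch, assuming idealized square pulses on a discrete time axis; the function names and all parameter values are hypothetical and are not taken from the cited work.

```python
def pulse_train(period, duration, amplitude, n_steps):
    """Idealized square pulse train sampled at integer time steps.

    The factor is 'on' (at `amplitude`) for the first `duration` steps of
    every `period` steps and 'off' (0.0) otherwise. All values are
    hypothetical illustration values.
    """
    return [amplitude if (t % period) < duration else 0.0
            for t in range(n_steps)]


def mean_activity(signal):
    """Time-averaged activity of the pulsing transcription factor."""
    return sum(signal) / len(signal)


# One baseline train and three ways of modulating it.
baseline = pulse_train(period=10, duration=2, amplitude=1.0, n_steps=100)
more_frequent = pulse_train(period=5, duration=2, amplitude=1.0, n_steps=100)
longer_pulses = pulse_train(period=10, duration=4, amplitude=1.0, n_steps=100)
taller_pulses = pulse_train(period=10, duration=2, amplitude=2.0, n_steps=100)

for name, signal in [("baseline", baseline),
                     ("more frequent", more_frequent),
                     ("longer pulses", longer_pulses),
                     ("taller pulses", taller_pulses)]:
    print(name, mean_activity(signal))
```

In this toy model, doubling the frequency, the duration, or the amplitude each doubles the time-averaged activity, although the three signals look different over time.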

What functions could relative pulse timing modulation provide for the cell? One of the most fundamental concepts in combinatorial regulation is that cooperative interactions between transcription factors can increase their probability of simultaneous binding to a promoter, to implement cis-regulatory logic. Relative timing between signals plays many important roles throughout science and engineering. In neuroscience, the relative timing of action potentials at pre- and post-synaptic neurons controls the strength of synaptic connectivity through spike-timing-dependent plasticity. In communications, modulating the phase of a periodic signal relative to a reference signal is widely used to encode information. Cells seem to have evolved a related strategy by encoding aspects of the extracellular environment in the relative timing with which different transcription factors pulse. 15

Evolution seems to be an impotent mechanism to explain the origin of these ultrasophisticated coding and communication systems. The author writes that cells "seem to have evolved a related strategy". This is clearly a teleological term. According to Wiki, strategy (from the Greek for "art of troop leader; office of general, command, generalship") is a high-level plan to achieve one or more goals under conditions of uncertainty.

Pulsatile dynamics (both periodic and aperiodic) have been discovered in a growing list of central signalling and regulatory pathways, which are known to interact, or crosstalk, with one another.

The oscillation phase shift between Wnt and Notch signalling is critical for PSM segmentation. Dynamic signalling, i.e., the relative timing between oscillatory signals, encodes essential information during multicellular development.

Periodic segmentation of mesoderm into somites, the precursors of vertebrae, is controlled by a molecular clock, which in embryos includes ultradian (period ~2 hr) oscillations in Notch, Wnt, and FGF pathway activity. Notch signalling oscillations have been shown to be slightly phase shifted from one cell to the next along the anteroposterior axis in a spatially graded manner, and hence, oscillations generate periodic Notch activity waves that traverse the embryo from posterior to anterior. Moreover, unlike Wnt signalling oscillations, both the Notch and the FGF signalling oscillations depend on the transcriptional repressor Hes7, a core component of the segmentation clock, suggesting that oscillations of Notch and FGF signalling constitute outputs of a single clock mechanism. Between oscillatory Notch and FGF signals, it has been shown that FGF signalling needs to be periodically shut off in newly forming segments within the anterior presomitic mesoderm (PSM) to allow active Notch signalling to induce expression of the differentiation marker Mesp2 in this region. The differential regulation of individual oscillatory pathways may be critical for segmentation. There is a mechanism in which the local phase shift between oscillatory Wnt and Notch signalling in the PSM encodes information for mesoderm segmentation. Notch and Wnt signalling oscillations are coupled within the PSM, which is reflected in their ability to mutually synchronize each other upon entrainment of one signalling pathway.
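What "phase shift between two oscillating signals" means can be made concrete with a small numerical sketch. It assumes two idealized sine waves with the same ~2 hr period, sampled every 5 minutes (24 samples per cycle), and recovers the delay between them by a simple cross-correlation. The labels `notch` and `wnt`, the sine shape, the sampling interval, and the 30-minute delay are all illustration assumptions, not measurements or the method of the cited study.

```python
import math


def oscillation(period, delay, n_steps):
    """Idealized sine-wave signal, delayed by `delay` samples."""
    return [math.sin(2 * math.pi * (t - delay) / period)
            for t in range(n_steps)]


def estimate_delay(reference, other, period):
    """Delay (in samples, 0..period-1) of `other` relative to `reference`.

    Tries every lag within one period and keeps the one at which the two
    signals line up best (largest sum of products).
    """
    n = len(reference) - period
    best_lag, best_score = 0, float("-inf")
    for lag in range(period):
        score = sum(reference[t] * other[t + lag] for t in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


PERIOD = 24          # 24 samples x 5 min = 2 hr (assumed sampling)
notch = oscillation(PERIOD, delay=0, n_steps=240)
wnt = oscillation(PERIOD, delay=6, n_steps=240)   # hypothetical 30 min delay

lag = estimate_delay(notch, wnt, PERIOD)
phase_shift_degrees = 360.0 * lag / PERIOD
print(lag, phase_shift_degrees)
```

With these assumed inputs the recovered lag is 6 samples, i.e. a quarter of a cycle (90 degrees). Changing `delay` changes the relative timing while leaving the period and amplitude of both signals untouched, which is the sense in which relative timing can carry information independently of the other signal properties.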

My comment: Pulses activating transcription factors. Pulsatile regulation observed in bacteria, yeast, and mammalian signalling pathways. Pulse inputs modulating the pulse frequency, amplitude, and/or duration of individual transcription factors to regulate genes. Combinatorial regulation where cooperative interactions between transcription factors implement cis-regulatory logic. Cells using the strategy of modulating the phase of a periodic signal relative to a reference signal, which is widely used to encode information. Pulsatile dynamics (both periodic and aperiodic) being part of a growing list of central signalling and regulatory pathways that interact, or crosstalk, with one another. As if that were not enough, cells use an oscillation phase shift between Wnt and Notch signalling, which is critical for segmentation during development. Dynamic signalling, i.e., the relative timing between oscillatory signals, encodes essential information during multicellular development.

Polarity Formation
One of the most remarkable properties of biological systems is the capacity of polarity formation, which also can be defined as the symmetry breaking capacity of the system. Polarity formation involves autonomous generation and assembling of highly specific geometrical structures, which can grow asymmetrically through different layers of complexity from nanostructures to micro- and macro structures. Therefore, self-organization is accompanied by the formation of geometrical structures with highly specific patterns and morphologies. The process also is called morphogenesis. Polarity formation in a self-organizing system requires a precise internal program and a source of energy to guide and enforce the progress of the process. Thus, self-organization only occurs through a dedicated preprogrammed process

Sets of morphogenetic markers on the cell surface can be written in the form of a matrix
We assume that the information regarding the geometry of an organism is contained on the cell surface, in the form of a code composed of biological molecules of a special type. Our prevailing assumption is that, most likely, such a code can consist of oligosaccharide residues of glycoconjugates on the cell surface. There are 12 types of monosaccharide that exist in oligosaccharide residues of glycoconjugates (with six of them being of hexose type), and there have been numerous observations indicating that these oligosaccharide residues are connected with the determination of cellular morphogenetic pathways.  This information can be written in the form of a matrix, in which every element corresponds to the level of a certain type of oligosaccharide (or monosaccharide) in a given region (section) of the cell surface.
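The matrix described above can be sketched as a simple data structure: one row per region of the cell surface and one column per monosaccharide type (12, as stated in the text). This is only an illustration of the bookkeeping; the number of regions, the level values, and the helper names are hypothetical, and the columns are left unnamed rather than guessing which sugars are meant.

```python
N_MONOSACCHARIDE_TYPES = 12  # number of types stated in the text


def empty_marker_matrix(n_regions):
    """Matrix of marker levels: rows are surface regions, columns are types."""
    return [[0.0] * N_MONOSACCHARIDE_TYPES for _ in range(n_regions)]


def marker_difference(matrix_a, matrix_b):
    """Sum of absolute element-wise differences between two matrices."""
    return sum(abs(x - y)
               for row_a, row_b in zip(matrix_a, matrix_b)
               for x, y in zip(row_a, row_b))


# Two hypothetical cells, each with 4 surface regions (assumed number).
cell_a = empty_marker_matrix(4)
cell_b = empty_marker_matrix(4)

cell_a[0][2] = 0.75    # made-up level of type 3 in region 1
cell_b[0][2] = 0.25
cell_b[3][11] = 0.5    # made-up level of type 12 in region 4

print(marker_difference(cell_a, cell_b))
```

Two cells with identical matrices would give a difference of zero under this measure; any other way of comparing the matrices could be substituted, since the text does not specify one.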

Morphogenesis, the shaping of an organism by cell movements, cell-cell interactions, collective cell behaviour, cell shape changes, cell divisions, and cell death, is a dynamic process on many scales, from fast subcellular rearrangements to slow structural changes at the whole-organism level. 8  

In which sequences of the genome do the causal differences responsible for morphological diversity in fact reside, and how exactly do they function? A large part of the answer lies in the gene control circuitry encoded in the DNA, its structure, and its functional organization. The regulatory interactions mandated by this circuitry determine whether each gene is expressed in every cell, throughout developmental space and time. The control circuitry encoded in the DNA is comprised of cis-regulatory elements, i.e., the regions in the vicinity of each gene which contain the specific sequence motifs at which those regulatory proteins which affect its expression bind; plus the set of genes which encode these specific regulatory proteins (i.e., transcription factors).

Just as an industrial complex has, at the top, a master plan (a general layout of the various interlinked factory buildings) and, below it, the blueprints to make the individual factories, compartments, assembly lines, machines, robots, machine elements, basic building blocks, and material specifications, so in biology there is a master plan which outlines the entire structure and body architecture of an organism. That master plan is stored in the genome, in a section called the homeobox. 2

Homeobox genes are a large family of similar genes that direct the formation of many body structures during early embryonic development. A homeobox is a DNA sequence found within genes that are involved in the regulation of patterns of anatomical development (morphogenesis) in animals, fungi, and plants. In humans, the homeobox gene family contains an estimated 235 functional genes and 65 pseudogenes, which are structurally similar genes that do not provide instructions for making proteins. Homeobox genes are present on every human chromosome, and they often appear in clusters. Many classes and subfamilies of homeobox genes have been described, although these groupings are used inconsistently. 1

Homeoboxes have been found in fungi, plants and animals. In each "kingdom", homeobox genes occupy a key position in the genetic control of cell differentiation, morphogenesis, and/or body plan specification. 4 The degree of sequence conservation of the homeodomain is extremely high, indicating strong functional constraints that put high pressure on keeping the homeobox sequences constant.

That raises the question of how these sequences emerged in the first place. 

Certain clusters of homeotic genes, such as the Antp cluster in Drosophila and its homologous cluster, called HoxA-D in vertebrates, are responsible for anterior-posterior specification of body segments as well as being functionally tied to limb generation in mammals.

Homeobox genes encode transcription factors that bind DNA in a sequence-specific fashion through the homeodomain motif and control the expression of their target genes in a huge range of developmental processes.  It is difficult to find a developmental gene network in animals that does not include a homeobox gene. These genes are taxonomically widespread, being found in animals, plants, fungi, and protists. Another notable feature of animal homeobox genes is that a number of them exist in clusters that are widespread across the animal kingdom. These include clusters of genes from the

- ANTP-class (e.g., Hox, ParaHox, NK, Mega-homeobox, and SuperHox clusters), the
- PRD-class (the HRO cluster and its extension), the
- TALE-class (Irx cluster), and the
- SINE-class (SIX cluster),
as well as an intriguing “pharyngeal” gene cluster composed of different classes of homeobox gene as well as other gene families.

A homeobox is a DNA sequence, around 180 base pairs long, found within genes that are involved in the regulation of patterns of anatomical development (morphogenesis) in animals, fungi and plants. 1 The Hox family of clustered homeobox genes plays a fundamental role in the morphogenesis of the vertebrate embryo, providing cells with regional information along the main body axis. Homeobox genes are master developmental control genes that act at the top of genetic hierarchies regulating aspects of morphogenesis and cell differentiation in animals. The homeobox was shown to occur in all metazoa, ranging from sponges to vertebrates, and also in plants and fungi, and has thus been conserved (not changed) throughout the three kingdoms of multicellular organisms.

There is uncertainty in our understanding of homeobox gene cluster evolution at present. This relates to our still rudimentary understanding of the dynamics of genome rearrangements and evolution over the evolutionary timescales being considered when we compare lineages from across the animal kingdom. 3

Transcriptional Regulation by Trithorax-Group Proteins
Trithorax proteins induce an open configuration of DNA-chromatin (euchromatin), activating the HOX network. 5

All cells in an organism must be able to “remember” what type of cell they are meant to be. This process, referred to as “cellular memory” or “transcriptional memory,” requires two basic classes of mechanisms. The first class functions to maintain an “off” state for genes that, if turned on, would specify an inappropriate cell type. The Polycomb-group (PcG) proteins have as their primary function a repressive role in cellular memory. The second class of mechanisms is composed of those that are required to maintain key genes in an “on” state. Any cell type requires the expression of master regulatory proteins that direct the specific functions required for that cell type. The genes that encode these master regulatory proteins must be maintained in an “on” state throughout the lifetime of an organism to maintain the proper cell types within that organism.

The proteins that are involved in maintaining the “on” state are called trithorax-group (trxG) proteins in honor of the trithorax gene, the founding member of this group of regulatory proteins. A large group of proteins with diverse functions make up the trxG. The roles these proteins play in the epigenetic mechanisms that maintain the “on” state appear more complex at this juncture than the roles for PcG proteins in repression. The first complexity is that a very large number of proteins and mechanisms are needed to actively transcribe RNA from any gene. Thus, in contrast to repression, which might be accomplished by comparatively simple mechanisms that block access of all proteins, activation of a gene requires numerous steps, any of which might play a role in maintaining an “on” state. Thus, there are numerous possible stages in which a trxG protein might work.

Numerous developmental decisions—including the determination of cell fates—are made in response to transient positional information in the early embryo. These decisions are dependent on changes in gene expression. This allows cells with identical genetic blueprints to acquire unique identities and follow distinct pathways of differentiation. The changes in gene expression underlying the determination of cell fates are heritable; a cell’s fate rarely changes once it is determined, even after numerous cell divisions and lengthy periods of developmental time. Understanding the molecular mechanisms underlying the maintenance of the determined state has long been a goal of developmental and molecular biologists. 6

Chromatin state often plays a critical role in gene regulation, but alterations in chromatin state may not necessarily lead to heritable changes. 3 Much remains unknown about the potential mechanisms for heritability of histone modifications, but evidence indicates that differences in histone modifications between two cells will not necessarily be transmitted through mitosis or meiosis.

Epigenetic Gene Regulation
The term epigenetics was first coined by British biologist Conrad Waddington in 1941. (The prefix epi- means “over.”) In the past few decades, researchers have used this term to describe certain types of variation in gene expression that are not based on variation in DNA sequences. How do geneticists distinguish epigenetic events from other types of gene regulation? In epigenetic gene regulation, an initial event causes a change in gene expression. For example, DNA methylation may inhibit transcription. For such a change to be an epigenetic phenomenon, it must be passed from cell to cell and must not involve a change in the DNA sequence. Therefore, a key feature of epigenetic gene regulation is the long-term maintenance of a change in gene expression. However, epigenetic changes are also reversible from one generation to the next. For example, a gene that is silenced in an individual may be active in that individual’s offspring. Although researchers are still debating the proper definition, one way to define epigenetic gene regulation is as follows:

Epigenetic gene regulation involves changes in gene expression that can be passed from cell to cell and are reversible but does not involve a change in the base sequence of DNA.

Some epigenetic changes are passed from parent to offspring. In multicellular species that reproduce via gametes (sperm and egg cells), the passing of an epigenetic change from parent to offspring is called epigenetic inheritance. Genomic imprinting is an epigenetic change that is passed from parent to offspring. However, not all epigenetic changes fall into this category. For example, an individual may be exposed to an environmental agent that causes an epigenetic change in a lung cell that is subsequently transmitted from cell to cell and promotes lung cancer. Such a change would not be transmitted to the individual’s offspring. In this section, we will begin by examining the molecular changes that cause epigenetic gene regulation. We will then consider how such changes may be programmed into an organism’s development or caused by environmental agents.

a  A cell capable of contributing to the establishment of one or more cell populations but is not a stem cell.
b  Segmentation is a difficult process to satisfactorily define. Many taxa (for instance the molluscs) have some form of serial repetition in their units, but are not conventionally thought of as segmented. Segmented animals are those considered to have organs that were repeated, or to have a body composed of self-similar units, but usually it is the parts of an organism that are referred to as being segmented. 12
Somites (outdated: primitive segments) are divisions of the body of an animal or embryo.
Somites are bilaterally paired blocks of paraxial mesoderm that form along the head-to-tail axis of the developing embryo in segmented animals. In vertebrates, somites subdivide into the sclerotomes, myotomes and dermatomes that give rise to the vertebrae of the vertebral column, rib cage, and part of the occipital bone; skeletal muscle, cartilage, tendons, and skin (of the back)

4. Spyros Papageorgiou, HOX Gene Expression, page 18
16. Behe, The Edge of Evolution, page 117

Last edited by Admin on Mon Nov 05, 2018 7:48 am; edited 49 times in total


42 Homeobox and Hox Genes on Sun Oct 14, 2018 5:47 am


Homeobox and Hox Genes

Homeotic genes
It is a fascinating thought that the single cell zygote contains all the information required for the development of the adult organism. Understanding how this information is encoded and deciphered is a major uncompleted scientific challenge. 10

Homeotic genes act within cells to select their developmental fate. Homeotic genes, and other genes with analogous functions in controlling cell fate are therefore known as selector genes. They determine segmental identity. Systematic screening for homeotic genes led to the identification of eight linked genes, collectively referred to as Hox genes, that affect the specification of particular segment identities. The complete loss of any Hox gene function causes transformations of segmental identity and is lethal in early development.

One of the most intriguing features of these Hox genes is that they are linked in two gene complexes in Drosophila, the Bithorax and Antennapedia Complexes; each complex contains several distinct homeotic genes. Furthermore, the order of the genes on the chromosome and within the two complexes corresponds to the rostral (head) to caudal (rear) order of the segments that they influence, a relationship described as colinearity

The Hox genes of Drosophila
Eight Hox genes regulate the identity of regions within the adult and embryo. The color coding represents the segments and structures that are affected by mutations in the various Hox genes.

The ability to visualize Hox and other gene expression patterns during development was crucial to understanding the correlation between gene function and phenotypes. All Hox genes are expressed in spatially restricted, sometimes overlapping domains within the embryo. These genes are also expressed in subsets of the developing larval imaginal discs, which proliferate during larval development and differentiate during the pupal stages to give rise to the adult fly. Homeotic gene products exert their effects by controlling gene expression during development, and the homeodomain binds to DNA in a sequence-specific manner. The homeobox gene family is large and diverse. In fact, the homeodomain motif is found in approximately 20 other distinct families of homeobox-containing genes, all of which encode DNA-binding proteins.

Sections of genes encode transcription factors, which the cell uses to turn other genes on or off. There is a class of proteins containing a region of about sixty amino acids called the homeodomain, which is encoded by the DNA sequence known as the “homeobox.” The best-known members of this class are the Hox proteins. In subsequent years homeotic proteins and other classes of control proteins have proven to be master regulators of developmental programs in animals. In animals, a master switch sets in train a whole cascade of lesser switches, where the initial regulatory protein turns on the genes for other regulatory proteins, which turn on other regulatory proteins, and so on. Eventually, after a pyramid of control switches, a regulatory protein activates a gene that actually does some of the construction work to build an animal’s body. But there’s another complication. A gene in an animal cell might be regulated not by just one or a few proteins, but by more than ten. What’s more, there may be dozens of sites near the gene at which the regulatory proteins might bind, with multiple separate sites for some regulatory proteins. 1
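The "pyramid of control switches" described above can be sketched as a tiny Boolean network in which each gene switches on once all of its required regulators are on. This is only a toy model: the gene names (`master`, `reg_a`, and so on), the wiring, and the all-regulators-required rule are hypothetical simplifications, and real regulatory logic is far richer.

```python
def propagate(network, initially_on):
    """Return the set of genes that end up switched on.

    `network` maps each gene to the list of regulators it requires.
    A gene turns on once every one of its required regulators is on.
    """
    on = set(initially_on)
    changed = True
    while changed:
        changed = False
        for gene, required in network.items():
            if gene not in on and required and all(r in on for r in required):
                on.add(gene)
                changed = True
    return on


# Hypothetical three-level cascade ending in one structural gene.
network = {
    "reg_a": ["master"],
    "reg_b": ["master"],
    "reg_c": ["reg_a", "reg_b"],       # needs two regulators at once
    "structural_gene": ["reg_c"],
}

print(sorted(propagate(network, {"master"})))
print(sorted(propagate(network, {"reg_a"})))
```

In this sketch, switching on `master` activates the whole cascade down to `structural_gene`, whereas switching on `reg_a` alone goes no further, because `reg_c` requires both of its regulators.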

Animal bodies contain many different kinds of cells that have to be positioned in definite relationships to other cells, in order to be formed into organs and to connect to other parts of the body.

Cal Tech biologist Eric Davidson emphasizes what the task of building an animal demands:
The most cursory consideration of the developmental process produces the realization that the program must have remarkable capacities, for development imposes extreme regulatory demands…Metaphors often have undesirable lives of their own, but a useful one here is to consider the regulatory demands of building a large and complex edifice, the way this is done by modern construction firms. All of the structural characters of the edifice, from its overall form to minute aspects that determine its local functionalities such as placement of wiring and windows, must be specified in the architect’s blueprints. The blueprints determine the activities of the construction crews from beginning to end.

Homeobox Genes and the Vertebrate Body Plan
This family of related genes determines the shape of the body. It subdivides the embryo along the head-to-tail axis into fields of cells that eventually become limbs and other structures.  Starting as a fertilized egg with a homogeneous appearance, an embryo made of skin, muscles, nerves and other tissues gradually arises through the division of cells. Long before most cells in the emerging body begin to specialize, however, a plan that designates major regions of the body-the head, the trunk, the tail and so on is established. This plan helps seemingly identical combinations of tissues arrange themselves into distinctly different anatomical structures, such as arms and legs. Individual genes mediate some of the developmental decisions involved in establishing the embryonic body plan.

The making of plans and blueprints before something is made, and the making of decisions, is ALWAYS the result of intelligence.

Key is a family of genes, known as homeobox genes, that subdivides the early embryo into fields of cells with the potential to become specific tissues and organs. 

Hox genes encode a group of transcription factors responsible for developmental processes and the establishment of the body plan. All Hox genes and many other developmental transcription factors contain the homeobox, a DNA sequence encoding the functional DNA-binding domain. Hox genes are known for their colinearity: a conserved arrangement on chromosomes that is the same as their order of activation along the body axis. The regulation is very precise: for example, the regions of activity of Hox genes are tightly confined to specific rhombomeres (see picture below) or to segments of the vertebrate anteroposterior body axis.

The vertebrate Hox genes are synchronized: the expression domains of paralogs (either of a pair of genes that derive from the same ancestral gene) from the A, B, C and D clusters are virtually identical.

The mechanisms responsible for the synchronous regulation of Hox genes and the molecular function of their colinearity remain unknown. Despite 35 years of active research, the mechanisms of Hox gene regulation have remained elusive. It has been argued that chromatin structure and histone demethylation play important roles in activation of Hox genes, but the mechanism precisely directing chromatin modifications to specific loci at the right time remains mysterious. Ultraconserved regions and regulatory elements have been found within the coding sequences of Hox genes, but the key questions remain unanswered. It is unknown what mechanism could be responsible for the exceptional synchronous colinearity of Hox gene clusters and the conserved synteny of other pairs of groups of homeobox-containing genes; however, the topology of chromatin has been proposed to play a role in the regulation of these genes. 2

The Homeotic Selector Genes

1. Behe, The Edge of Evolution, page 116

Last edited by Admin on Thu Dec 06, 2018 8:39 am; edited 11 times in total



Design Principles of Regulatory Networks: Searching for the Molecular Algorithms of the Cell 11
The word design is often considered taboo within the biological community, given its close association with the term intelligent design—the notion that living systems were purposefully constructed by an intelligent force rather than through a random, evolutionary process. We argue that there is an important scientific role for considering design principles and how they influence biological systems.

They attribute a

"search process",
"molecular implementation",
"exploring",
"achieving particular functions",
"processing information",
"finding solutions to functional needs",
"organizing",
"architecture and algorithm implementation",
"governing",
"preferring",
"engineering",
"nature having preference",
"constructing",

implementing:

"design principles",
"design logic into regulatory networks",
"autoregulatory circuits",
"feedback regulation systems",
"ON or OFF states",
"systems of noise resistance to perturbations",
"regulatory circuits and architecture",
"organizational rules",
"information processing systems",
"logic and mechanisms into complex biological processes",
"feedforward loops (FFLs)",
"network design",
"critical structural patterns",
"molecular first principles",
"preferred biochemical solutions for particular tasks",
"links by chemical paths and branch points",
"enzymes regulation by a network",

to evolution, admit that there is a "huge size and large space of possible networks", and conclude that

"After all, biological systems have evolved under selective pressures to perform certain functions that increase organismal fitness.",
"Prevalence of Particular network motifs because of historical evolutionary accidents that locked in these types of solutions",
"prediction of evolutionary convergence".

The problem above is that they attribute terminology which is clearly teleological to evolution, which is by definition a process that involves no distant goals or purpose, but is a mindless natural process of development towards more biological complexity. There is a clear break in the logical flow from the evidence to the conclusion.

The gene regulatory network is like a movie director with the movie script
A movie director is someone who puts the author's script into practice while guiding the technical crew and actors in the fulfillment of that vision. Based on the script, he must choose which actors will play which role in the movie, when they enter the scenes, and what they have to do, where, and how. Each actor is chosen for his specific acting skills.

There are parallels in the living world, where the gene regulatory network, together with genetic and epigenetic information, has the same task as a movie director with the script.

In a multicellular organism, development begins with a fertilized egg, the zygote. Through cell division, in about twenty years, a human reaches adulthood, when the body contains about 37.2 trillion cells. The genome and information from over twenty other sources in the cell (epigenetic information) are like a movie script and contain information that coordinates the expression of sets of genes at the proper times and in the proper places. Transcription factors (TFs) are like the director, who triggers the machinery (RNA polymerase) to express the right genes. They are proteins that bind to specific sequences on the DNA near their target genes, thus modulating transcription initiation. The function of TFs is to regulate genes, turning them on and off, to make sure that they are expressed in the right cell, at the right time, and in the right amount throughout the life of the cell and the organism. While the director instructs himself through the script on how to proceed, in cells it is the inverse: specific gene regions and epigenetic information recruit the transcription factors, which signal to the RNA polymerase which gene to express. The human body has about 200 different cell types. It is as if the director of a movie with 200 actors had to choose, based on the instructions in the script, which one will play which scene, when, and how.
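To get a feeling for the figure of 37.2 trillion cells, one can ask how many rounds of doubling it takes to get from one zygote to that number. The following is only a back-of-the-envelope sketch under a strongly idealized assumption: every cell divides in every round and no cell ever dies, which is not how real development proceeds.

```python
import math

ADULT_CELL_COUNT = 37.2e12  # figure quoted in the text above


def min_doublings(target_cells):
    """Smallest number of synchronous doublings that reaches the target.

    Idealized assumption: starting from one cell, every cell divides in
    every round and none are lost.
    """
    return math.ceil(math.log2(target_cells))


rounds = min_doublings(ADULT_CELL_COUNT)
print(rounds, 2 ** (rounds - 1), 2 ** rounds)
```

Under this idealization, fewer than fifty rounds of doubling are enough to pass 37.2 trillion cells, so the remarkable part is not the raw number of divisions but that each division has to yield the right cell type in the right place.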

Information from the gene regulatory network (GRN):

Director: In a movie with 200 actors, he must choose which actor will play what scene, and what role.
GRN: Has to choose which kind or type of cell (histology) to employ, where (spatial organization), and when (the point in development at which the specific cell type has to be expressed).

Director: The director has to choose which actor best fits the role in terms of physical appearance.
GRN: Each cell requires the right size and form.

Director: The director has to assign a specific role in the movie to each actor. While one might be the bandit, another is the policeman enforcing the law.
GRN: Although the number of different cell types in the human body is rather small, they need to perform remarkably different tasks and specific functions, so the cell requires the capacity for differential gene expression. Differentiated eukaryotic cells possess a remarkable capacity for the selective expression of specific genes. The synthesis rates of a particular protein in two cells of the same organism may differ by as much as a factor of 10^9, or 1 billion. For example, reticulocytes (immature red blood cells) synthesize large amounts of haemoglobin but no detectable insulin, whereas the pancreatic cells produce large quantities of insulin but no haemoglobin.
The capacity of the organism to produce the right cells for the right task at the right place is remarkable and requires foreknowledge.

Director: Needs to inform the actor where to act in the movie and in which scene.
GRN: Position and place in the body. This is crucial. Body parts such as legs, fins, and eyes must all be placed in the right location.

Director: The director must hand the script over to each actor. They need to know how to interact with the other actors, how to exchange dialogue, and so on.
GRN: Has to instruct how each cell is interconnected with other cells, what signals it needs to communicate with them, and how the communication channels are set up.

Director: How many actors and which ones will be employed for each scene
GRN: Precisely how many new cell types must be produced for each tissue and organ?

Director: Decides when each actor ends his participation in the movie (sometimes his character is killed, for example) and the movie goes on without him.
GRN: Programs how long the cell stays alive in the body, and when it is time for it to self-destruct and be replaced by newly produced cells of the same kind.

Director: Has to make sure that all actors receive shelter, food, etc. during filming.
GRN: Sets up the specific nutritional demands of each cell.

Each of the above tasks is essential on its own. If one of them is not performed, the movie director cannot make the film. In the same sense, if one of the above is missing, cells cannot build multicellular organisms. The difference between the two is that the orchestration required to build multicellular organisms is far more complex than any movie made by man.

To be, or not to be. Design, or no design? That's the big question...

Example of a Movie Script Markup Language (MSML), a document specification for the structural representation of screenplay narratives which is comparable to a Gene regulatory network diagram 12

A few key points:
- A gene that is “turned on” is being transcribed to produce messenger RNA (mRNA) that is translated to make its corresponding protein 7  
- Cell differentiation results from the expression of different combinations of genes. Differentiation is controlled by turning specific sets of genes on or off.  
- DNA packing in eukaryotic chromosomes helps regulate gene expression. 
- Multiple mechanisms, such as signal transduction pathways, regulate gene expression in eukaryotes.
- Small RNAs play multiple roles in controlling gene expression. 
- Transcriptional regulatory networks (TRNs) encode instructions for animal development and physiological responses.

Vast gene regulatory networks (GRNs) that connect transcription factors to their target regulatory sequences control gene expression in time and space and therefore determine the tissue-specific genetic programs that shape morphological structures. 10

A gene (or genetic) regulatory network (GRN) is a collection of molecular regulators that interact with each other and with other substances in the cell to govern the gene expression levels of messenger RNA (mRNA) and proteins. These play a central role in morphogenesis, the creation of body structures, which in turn is central to evolutionary developmental biology (evo-devo). 5 Sometimes a 'self-sustaining feedback loop' ensures that a cell maintains its identity and passes it on. Less understood is the mechanism of epigenetics by which chromatin modification may provide cellular memory by blocking or allowing transcription.
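The 'self-sustaining feedback loop' mentioned above can be illustrated with a toy Boolean network. This is a minimal sketch, not a model of any real gene: the genes "A" and "B", the update rules, and the function names are all hypothetical. Gene A is switched on by a transient signal and then keeps itself on through its own product; gene B is a downstream target that follows A.

```python
def step(state, signal):
    """One synchronous update of a hypothetical two-gene network.

    state  -- dict holding the boolean activity of genes 'A' and 'B'
    signal -- True while a transient external signal is present
    """
    return {
        # A turns on with the signal, or stays on via its own product
        # (positive feedback).
        "A": signal or state["A"],
        # B is a downstream target that follows A's previous activity.
        "B": state["A"],
    }


def simulate(signals):
    """Run the network from the all-off state over a sequence of signals."""
    state = {"A": False, "B": False}
    history = []
    for signal in signals:
        state = step(state, signal)
        history.append((state["A"], state["B"]))
    return history


# The signal is present only during the second time step.
history = simulate([False, True, False, False, False])
for t, (a, b) in enumerate(history):
    print(t, "A on" if a else "A off", "B on" if b else "B off")
```

In this sketch both genes remain active after the signal has disappeared, which is the sense in which a feedback loop can let a cell "remember" an earlier decision.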

The expression of the genetic information of a cell starts with transcription, in which a particular segment of DNA is copied into RNA (especially mRNA) by the enzyme RNA polymerase. 1 Transcriptional regulation is one of the most fundamental mechanisms for controlling the amount of protein produced by cells under different environmental conditions and developmental stages. 2 Some proteins serve only to activate genes, and these are the transcription factors that are the main players in regulatory networks or cascades.

Regulating, governing, controlling, recruiting, interpreting, recognizing, orchestrating, elaborating strategies, guiding, instructing the blocking or allowing of transcription, selecting which genes are turned on or repressed, programming the right timing of an action, responding correctly to signals, setting up specificity, setting up morphogen gradients (which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become), programming the right cell differentiation, and programming the expression of the appropriate cell types at the correct stage of development: these are all terms and actions related to intelligence.

Developmental gene regulatory networks (GRNs) have a modular structure, or architecture, consisting of subcircuits— each with a given function that is determined by the set of available transcription factors 13

Mutation in gene switches is the key to evolution from one kind to another.

Transcription factors (TF)


The making and maintenance of specialized cell types

The organization of DNA into an intricate, dynamic nucleoprotein assembly termed chromatin is a remarkable feat of biological engineering. Although all cells must be able to switch genes on and off in response to changes in their environments, the cells of multicellular organisms have this capacity to an extreme degree. Transcription factors are positioned at multiple sites along long stretches of DNA, and these proteins bring into play coactivators and co-repressors. Expression of the Drosophila Even-skipped (Eve) gene plays an important part in the development of the Drosophila embryo. If this gene is inactivated by mutation, many parts of the embryo fail to form, and the embryo dies early in development. At the stage when Eve begins to be expressed, the embryo is a single giant cell containing multiple nuclei in a common cytoplasm. This cytoplasm contains a mixture of transcription factors that are distributed unevenly along the length of the embryo, thus providing positional information that distinguishes one part of the embryo from another. Although the nuclei are initially identical, they rapidly begin to express different genes because they are exposed to different transcription regulators.

Molecular genetic mechanisms that create and maintain specialized cell types
Although all cells must be able to switch genes on and off in response to changes in their environments, the cells of multicellular organisms have this capacity to an extreme degree. In particular, once a cell in a multicellular organism becomes committed to differentiate into a specific cell type, the cell maintains this choice through many subsequent cell generations, which means that it remembers the changes in gene expression involved in the choice. This phenomenon of cell memory is a prerequisite for the creation of organized tissues and for the maintenance of stably differentiated cell types.

Complex genetic switches that regulate Drosophila development are built up from smaller modules
Drosophila Even-skipped (Eve) gene expression plays an important part in the development of the Drosophila embryo. If this gene is inactivated by mutation, many parts of the embryo fail to form, and the embryo dies early in development. At the stage of development when Eve begins to be expressed, the embryo is a single giant cell containing multiple nuclei in a common cytoplasm. This cytoplasm contains a mixture of transcription factors that are distributed unevenly along the length of the embryo, thus providing positional information that distinguishes one part of the embryo from another.

The nonuniform distribution of transcription regulators in an early Drosophila embryo. 
At this stage, the embryo is a syncytium; that is, multiple nuclei are contained in a common cytoplasm. Although not shown in these drawings, all of these proteins are concentrated in the nuclei.

Although the nuclei are initially identical, they rapidly begin to express different genes because they are exposed to different transcription factors. For example, the nuclei near the anterior end of the developing embryo are exposed to a set of transcription factors that is distinct from the set that influences nuclei at the middle or at the posterior end of the embryo. The regulatory DNA sequences that control the Eve gene “read” the concentrations of transcription factors at each position along the length of the embryo, and they cause the Eve gene to be expressed in seven precisely positioned stripes, each initially five to six nuclei wide. How is this remarkable feat of information processing carried out? Although there is still much to learn, several general principles have emerged from studies of Eve and other genes that are similarly regulated.
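How a regulatory sequence could "read" transcription factor concentrations and switch a gene on in a narrow band can be sketched with a simple threshold model. Everything in this sketch is an assumption made for illustration: the three gradient shapes, the thresholds, the 100 positions along the axis, and the function names are hypothetical and are not measured values for Eve. The rule is that the gene is on only where an activator is abundant enough and two repressors are both scarce enough.

```python
import math

N_POSITIONS = 100  # hypothetical positions along the anterior-posterior axis


def activator(x):
    # Hypothetical activator: decays exponentially from the anterior end (x = 0).
    return math.exp(-x / 40.0)


def anterior_repressor(x):
    # Hypothetical repressor that is high at the anterior end and falls off quickly.
    return math.exp(-x / 15.0)


def posterior_repressor(x):
    # Hypothetical repressor forming a bell-shaped domain around x = 50.
    return math.exp(-((x - 50) / 8.0) ** 2)


def gene_on(x, act_min=0.3, rep_max=0.2):
    """Threshold readout: enough activator AND little of either repressor."""
    return (
        activator(x) >= act_min
        and anterior_repressor(x) <= rep_max
        and posterior_repressor(x) <= rep_max
    )


stripe = [x for x in range(N_POSITIONS) if gene_on(x)]
print("gene is on from position", stripe[0], "to position", stripe[-1])
```

With these made-up gradients the gene is expressed in a single contiguous band bounded on each side by a repressor, even though none of the three inputs is itself stripe-shaped.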

The blue stripes (top and bottom panels) are eve expression, the red in the center (bottom panel) is Kruppel expression, and each green dot represents a single nucleus. 1 The seven stripes of the protein encoded by the Even-skipped (Eve) gene in a developing Drosophila embryo. At this stage in development, the egg contains approximately 4000 nuclei. The Eve and Giant proteins are both located in the nuclei, and the Eve stripes are about four nuclei wide.

The regulatory region of the Eve gene is very large (approximately 20,000 nucleotide pairs). It is formed from a series of relatively simple regulatory modules, each of which contains multiple cis-regulatory sequences and is responsible for specifying a particular stripe of Eve expression along the embryo.
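The modular arrangement described above can be sketched as a set of independent modules combined with a logical OR. This is only an illustration under stated assumptions: the stripe positions, the width of four positions, the 100-position axis, and all names are hypothetical, and each module is reduced to a fixed window rather than a real readout of transcription factor concentrations.

```python
AXIS_LENGTH = 100  # hypothetical number of positions along the embryo


def make_module(start, width=4):
    """Return a hypothetical module that is active inside one window."""
    def module(x):
        return start <= x < start + width
    return module


# Seven evenly spaced windows, one per stripe (positions are made up).
MODULES = [make_module(10 + 12 * i) for i in range(7)]


def pattern_for(modules):
    """The gene is on wherever at least one module is active."""
    return "".join(
        "#" if any(m(x) for m in modules) else "."
        for x in range(AXIS_LENGTH)
    )


full_pattern = pattern_for(MODULES)
# Removing one module from the list removes only its own stripe in this sketch.
without_third = pattern_for(MODULES[:2] + MODULES[3:])

print(full_pattern)
print(without_third)
```

Because each module contributes its stripe independently in this sketch, the overall pattern is simply the union of the individual windows.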

