Intelligent Design, the best explanation of Origins

This is my personal virtual library, where I collect information which, in my view, leads to Intelligent Design as the best explanation of the origin of the physical Universe, life, and biodiversity.



Control of Gene Expression, and gene regulatory networks point to intelligent design


Admin



http://reasonandscience.catsboard.com/t2194-control-of-gene-expression-and-gene-regulatory-networks-point-to-intelligent-design

Comparisons of the levels of morphological and protein divergence between humans and chimps demonstrated that the level of protein divergence was too small to account for the anatomical differences between these two species. To reconcile the level of divergence between proteins and morphology, it has been proposed that morphological divergence is based mostly on changes in the mechanisms controlling gene expression and not on changes in the protein-coding genes themselves. The past decades have seen major advances in developmental genetics that have changed the way we approach the origin of morphological characters. These advances have produced several generalizations about the relationship between genetics and phenotypes. Among the most widely recognized is the concept of toolbox genes, that is, the idea that different body plans are realized with a conserved set of developmental genes, namely transcription factors and signalling molecules. 20

Toolbox genes do not change their functions, although their expression patterns can change. 

That raises the question of how toolbox genes emerged in the first place. 

Systems Biology
http://reasonandscience.catsboard.com/t2194-control-of-gene-expression-and-gene-regulatory-networks-point-to-intelligent-design#6322

Internal Signaling and Information
http://reasonandscience.catsboard.com/t2194-control-of-gene-expression-and-gene-regulatory-networks-point-to-intelligent-design#6343

Developmental Gene Regulatory Networks (dGRNs)
Animal body plan design, along with cell type differentiation, is controlled by the precise regulation of gene expression in time and space, which in turn is driven by developmental gene regulatory networks (dGRNs). Changes in the control genes regulating development are related to morphological change and divergence, suggesting that changes in morphology result from nucleotide substitutions in cis-regulatory elements and amino-acid substitutions in transcription factors that affect the regulation of gene expression. Intergenic genomic regions and differences in protein-coding sequences have an important role in determining differences in gene regulatory patterns and, consequently, in animal body plan diversity. Such networks comprise a constellation of elements including regulatory genes (transcription factors, signalling molecules, noncoding RNAs), regulatory sequences (cis-regulatory modules, enhancers, promoters, insulators), and target genes (differentiation and structural genes), which interact tightly to trigger induction or repression of gene expression. The right execution of this molecular choreography, repeated anew in every generation, is fundamental to the life of every animal on Earth. Recent advances have determined the regulatory gene interactions that underpin dGRNs and how these interactions control the regulation of gene expression during animal development. Many molecular mechanisms that underlie dGRNs and influence cell type and animal body plan development have been characterized in vertebrates. Thus, dGRNs play a key role in explaining animal diversity. 16

Just as development is a system property of the regulatory genome, change in the developmental process must also be considered at the system level. Modification of the body plan depends on alteration of the structure of developmental gene regulatory networks as a whole. The hierarchy and multiple additional design features of these networks act to produce Boolean regulatory state specification functions at upstream phases of development of the body plan. These are created by the logic outputs of network subcircuits, and in modern animals these outputs are impervious to continuous adaptive variation. The animal body plan is a system-level property of the developmental gene regulatory networks (dGRNs) which control ontogeny of the body plan. It follows that gross morphological novelty requires dramatic alterations in dGRN architecture, always involving multiple regulatory genes, and typically affecting the deployment of whole network subcircuits. Because dGRNs are deeply hierarchical, and it is the upper levels of these GRNs that control major morphological features in development, a question arises: how can we think about selection in respect to dGRN organization? The answers lie in the architecture of dGRNs and the developmental logic they generate at the system level, far from micro-evolutionary mechanism. While adaptive evolutionary variation occurs constantly in modern animals at the periphery of dGRNs, the stability over geological epochs of the developmental properties that define the major attributes of their body plans requires special explanations rooted deep in the structure/function relations of dGRNs. 19 (http://sci-hub.tw/https://www.sciencedirect.com/science/article/pii/S0012160611000911#bb0110)

Mechanistic developmental biology has shown that the fundamental concepts of neo-Darwinian theory are largely irrelevant to the process by which the body plan is formed in ontogeny. In addition, that theory gives rise to lethal errors in respect to evolutionary process. Neo-Darwinian evolution is uniformitarian in that it assumes that all process works the same way, so that evolution of enzymes or flower colors can be used as current proxies for study of evolution of the body plan. It erroneously assumes that change in protein coding sequence is the basic cause of change in developmental program, and it erroneously assumes that evolutionary change in body plan morphology occurs by a continuous process. All of these assumptions are basically counterfactual. This cannot be surprising, since the neo-Darwinian synthesis from which these ideas stem was a pre-molecular biology concoction focused on population genetics and adaptation natural history, neither of which has any direct mechanistic import for the genomic regulatory systems that drive embryonic development of the body plan.

When the properties of the gene regulatory networks that actually generate body plans and body parts are taken into account, it can be seen that many entirely new and different mechanistic factors come into play. The result is that just as the paleontological record of change in animal morphology is the opposite of uniformitarian, so, for very good reasons that are embedded in their structure/function relations, are the mechanisms of dGRN emergence.

No observations on single genes can ever illuminate the overall mechanisms of the development of the body plan or of body parts. 

The architecture of animal body plans as change and conservation of developmental Gene Regulatory Network (dGRN) structure: mechanistic consequences:

Since dGRN structure depends on cis-regulatory linkages at nodes, change in dGRN structure occurs by redeployment of the cis-regulatory modules controlling regulatory gene expression.
Since dGRNs are deeply hierarchical, the effects of a given cis-regulatory change depend specifically on its location in the dGRN.
Since dGRNs are deeply hierarchical, subcircuits operating at upper levels (early in the developmental process) preclude certain downstream linkages and mediate others, i.e., they canalize dGRN structure (and the developmental process).
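
The point that the effect of a change depends on where in the hierarchy it sits can be sketched with a toy network. This is only an illustration under stated assumptions: the gene names and the wiring in TOY_DGRN are hypothetical, not taken from any real organism, and the network is assumed to be a simple directed graph.

```python
from collections import deque

# Hypothetical toy network: each key regulates the genes in its list.
TOY_DGRN = {
    "kernel_A": ["pattern_B", "pattern_C"],   # upper level, early development
    "pattern_B": ["diff_D", "diff_E"],        # intermediate level
    "pattern_C": ["diff_F"],
    "diff_D": [],                             # periphery (differentiation genes)
    "diff_E": [],
    "diff_F": [],
}

def downstream(network, gene):
    """Return every gene reachable from `gene` through regulatory links."""
    seen = set()
    queue = deque(network[gene])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(network[current])
    return seen

# A change at the top reaches the whole network; one at the periphery reaches nothing else.
for name in ("kernel_A", "pattern_B", "diff_D"):
    print(name, "affects", len(downstream(TOY_DGRN, name)), "downstream genes")
```

In this toy wiring a change at kernel_A propagates to all five other genes, while a change at diff_D stays local.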

http://reasonandscience.catsboard.com/t2318-gene-regulatory-networks-controlling-body-plan-development#4804

dGRN architecture
dGRNs can be represented as complex logic maps that state in detail the interactions between developmental control genes (transcription factors and components of cell signaling pathways) and cis-regulatory modules (promoters, enhancers, and insulators) in order to visualize how differentiation and structural genes (target genes) are turned off or on at a given time and location during development. In addition, dGRNs have a modular architecture, consisting of multiple sub-circuits, each in charge of individual regulatory tasks defined by a set of specific developmental control genes and their cis-regulatory modules.

Building on this modular architecture, dGRNs are hierarchical as they are divided into different components. For example, the components controlling the initial stages of development are at the top of the hierarchy, while the portions governing intermediate processes, such as spatial subdivision and morphological patterns are in the middle, and the components controlling more specific functions, including cell differentiation and organogenesis/morphogenesis, are at the periphery.

Elucidating the network of interactions between genes that govern cell differentiation through development is one of the core challenges in genome research. These networks are known as developmental gene regulatory networks (dGRNs) and consist largely of the functional linkage between developmental control genes, cis-regulatory modules, and differentiation genes, which generate spatially and temporally refined patterns of gene expression. 16 
The components known as kernels consist of highly conserved regulatory interactions among transcription factors; they are responsible for the progenitor states of a developing structure. Other components of the network, known as intermediate and peripheral, have great impacts on the phenotype. Understanding how the components of dGRNs have emerged is a central issue in evo-devo biology.

Initial strategies for unravelling dGRNs
http://reasonandscience.catsboard.com/t2194-control-of-gene-expression-and-gene-regulatory-networks-point-to-intelligent-design#6323

Regulatory information at different levels of network organization, from single node to subcircuit to large-scale GRNs depends on regulatory design features such as network architecture, hierarchical organization, and cis-regulatory logic which contribute to the developmental function of network circuits. 15

Transcriptional regulatory circuits: A cell senses its environment and calculates the amount of protein it needs for its various functions. This information processing is done by transcription networks. These networks, a major study object of systems biology, often contain recurring network topologies called 'motifs'. Composition and engineering concepts for these circuits have been extensively studied. dGRNs include functional elements such as oscillators, frequency multipliers and frequency band-pass filters. Transcriptional regulatory circuits can be seen as an analog of electronic circuits. Data input, data processing and data output is an abstraction found in both circuit types. Transcriptional circuits have chemicals as input. Data processing happens as functional clusters of genes impact each other's expression through inducible transcription factors and cis-regulatory elements. The output is, for example, proteins. Promoters control the expression of genes in response to one or more transcription factors. Rules for programming gene expression with combinatorial promoters have been identified. These networks are engineered to perform a wide range of logic functions.
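
To make the comparison with electronic circuits concrete, here is a minimal sketch of combinatorial promoters treated as two-input logic gates. It is an idealization: real promoters respond in a graded way, and the function names below are my own labels for illustration, not terms from the cited literature.

```python
from itertools import product

def and_promoter(activator_a, activator_b):
    """Gene is transcribed only when both activators are bound."""
    return activator_a and activator_b

def or_promoter(activator_a, activator_b):
    """Gene is transcribed when at least one activator is bound."""
    return activator_a or activator_b

def nor_promoter(repressor_a, repressor_b):
    """Gene is transcribed only when neither repressor is bound."""
    return not (repressor_a or repressor_b)

def truth_table(promoter):
    """Map every combination of two inputs to the promoter's on/off output."""
    return {inputs: promoter(*inputs) for inputs in product([False, True], repeat=2)}

for gate in (and_promoter, or_promoter, nor_promoter):
    print(gate.__name__, truth_table(gate))
```

Input here is the presence or absence of two transcription factors, processing is the promoter logic, and output is whether the gene is transcribed.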

Gene regulatory networks are pre-programmed to instruct ordered gene expression and, as a result, the assembly of basic functional units into structures of higher-order complexity, such as cell factories and multicellular organisms, which can be compared to a city of interconnected factories. A change at the DNA level alone, that is, the production of one or more slightly different molecular machines, is not going to produce a changed body structure or, by comparison, change the architecture and structure of an entire city, which is what macroevolution is claimed to be capable of. For that to occur, a change to the assembly process, the dGRN, is needed. That would be like mutating the City of New York until it is transformed into the City of Los Angeles. After the dGRN has done its work and the body plan has been produced, variety can occur. But that would be microevolution, which isn’t disputed. It is as if a skyscraper in Manhattan were imploded and a new, completely different building constructed in its place. That would not, however, change the structure or road map of Manhattan as a whole. Fiddling with dGRNs is always catastrophically bad. Far more is needed than new genetic material to create a new kind of organism. New genetic material is required, but its existence alone will not produce a new body plan. To get that, you also need a new or altered dGRN.


Animal morphology results from the functional organization of the gene regulatory networks (GRNs) that control development of the body plan. 13 The body plan is formed by the execution of an inherited genomic regulatory program for embryonic development. A major mechanism in dGRNs that determines gene expression is the alteration of the structure and architecture of cis-regulatory modules. The basic control task is to determine transcriptional activity throughout embryonic time and space, and here ultimately lies causality in the developmental process. The genomic control apparatus for any given developmental episode consists of the specifically expressed genes that encode the transcription factors required to direct the events of that episode, most importantly including the cis-regulatory control regions of these genes. The cis-regulatory sequences combinatorially determine which regulatory inputs will affect the expression of each gene and what other genes it will affect; that is, they hard-wire the functional linkages among the regulatory genes, forming network subcircuits. The subcircuits perform biologically meaningful jobs, for example, acting as logic gates, interpreting signals, stabilizing given regulatory states, or establishing specific regulatory states in given cell lineages. GRNs are inherently hierarchical in structure: the networks controlling each phase of development are assemblages of subcircuits, the subcircuits are assemblages of specific regulatory linkages among specific genes, and the linkages are individually determined by assemblages of cis-regulatory transcription factor target sites. At the highest level of its organization, the developmental GRN is hierarchical in time as well. Development progresses from phase to phase, and this fundamental phenomenon reflects the underlying sequential hierarchy of the GRN control system.
In the earliest embryonic phases, the function of the developmental GRN is establishment of specific regulatory states in the spatial domains of the developing organism. In this way the design of the future body plan is mapped out in regional regulatory landscapes, which differentially endow the potentialities of the future parts. Lower down in the hierarchy, GRN apparatus continues regional regulatory specification on finer scales. Ultimately, precisely confined regulatory states determine how the differentiation and morphogenetic gene batteries at the terminal periphery of the GRN will be deployed.
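
One of the subcircuit jobs mentioned above, stabilizing a given regulatory state, can be illustrated with a small Boolean sketch. The assumptions are mine: two hypothetical regulatory genes x and y that activate each other, a transient input signal acting on x, and synchronous on/off updates in discrete time steps.

```python
def simulate(signal_series):
    """Run a two-gene positive-feedback subcircuit through a series of input signals.

    x turns on if the signal is present or y was on; y turns on if x was on.
    Both updates use the values from the previous time step.
    """
    x = y = False
    history = []
    for signal in signal_series:
        x, y = (signal or y), x
        history.append((x, y))
    return history

# Signal present for two steps, then withdrawn: the state stays locked on.
print(simulate([True, True, False, False, False]))
# No signal at all: the subcircuit stays off.
print(simulate([False] * 5))
```

Once both genes are on, the mutual activation keeps them on after the inducing signal is gone, which is the sense in which such a subcircuit stabilizes a regulatory state.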

Understanding the regulation of gene expression is one of the key problems in current biology. 10 Transcriptional regulation is a key mechanism for cells to accomplish changes in gene expression levels. 11 Gene regulation plays a key role in the control of fundamental processes in living organisms, ranging from development to nutrition and metabolic coordination. 12 Genes are regulated at several levels of integration but one key step is the control of gene transcription. Determining the fundamental structure of transcriptional Gene Regulatory Networks (GRNs, considered here as the relationships of transcription factors (TFs) and their targets) is a major challenge of systems biology.

The genes included in dGRNs encode transcription factors, components of signal transduction pathways, and often effector genes as markers of differentiated cell states. 1 dGRNs have the potential of providing a causal understanding of how upstream specification controls downstream events (i.e. differentiation or cell biological functions).

Coded information can always be traced back to an intelligence, which has to set up the convention of meaning of the code and the information carrier, which can be a book, the hardware of a computer, or the smoke of a fire with which one Indian tribe signals to another. All communication systems have an encoder which produces a message which is processed by a decoder. In the cell there are several code systems. DNA is the best known; it stores coded information through the four nucleic acid bases. But there are several others, less well known. Recently there was some hype about a second DNA code. In fact, it is essential for the expression of genes. The cell uses several formal communication systems according to Shannon’s model, because they encode and decode messages using a system of symbols. As Hubert Yockey wrote:

“Information, transcription, translation, code, redundancy, synonymous, messenger, editing, and proofreading are all appropriate terms in biology. They take their meaning from information theory (Shannon, 1948) and are not synonyms, metaphors, or analogies.” (Hubert P. Yockey,  Information Theory, Evolution, and the Origin of Life,  Cambridge University Press, 2005).

An organism’s DNA encodes all of the RNA and protein molecules required to construct its cells. Yet a complete description of the DNA sequence of an organism (be it the few million nucleotides of a bacterium or the few billion nucleotides of a human) no more enables us to reconstruct the organism than a list of English words enables us to reconstruct a play by Shakespeare. In both cases, the problem is to know how the elements in the DNA sequence or the words on the list are used. Under what conditions is each gene product made, and, once made, what does it do? The different cell types in a multicellular organism differ dramatically in both structure and function. If we compare a mammalian neuron with a liver cell, for example, the differences are so extreme that it is difficult to imagine that the two cells contain the same genome. The genome of an organism contains the instructions to make all its different cells, and the expression of genes in either a neuron or a liver cell can be regulated at many of the steps in the pathway from DNA to RNA to protein. The most important, in my opinion, is control of transcription by sequence-specific DNA-binding proteins, called transcription factors or regulators. These proteins recognize specific sequences of DNA (typically 5–10 nucleotide pairs in length) that are often called cis-regulatory sequences. Transcription regulators bind to these sequences, which are dispersed throughout genomes, and this binding puts into motion a series of reactions that ultimately specify which genes are to be transcribed and at what rate. Approximately 10% of the protein-coding genes of most organisms are devoted to transcription regulators. Transcription regulators must recognize short, specific cis-regulatory sequences within the DNA double helix.
The outside of the double helix is studded with DNA sequence information that transcription regulators recognize: the edge of each base pair presents a distinctive pattern of hydrogen-bond donors, hydrogen-bond acceptors, and hydrophobic patches in both the major and minor grooves. The 20 or so contacts that are typically formed at the protein–DNA interface add together to ensure that the interaction is both highly specific and very strong.
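
As a rough sketch of what "recognizing a short, specific sequence" means at the level of the DNA text, the following scans both strands of a sequence for an exact match to a recognition site. It is a deliberate simplification: real transcription regulators tolerate mismatches and bind with graded affinity, and the motif and the example sequence below are made up for illustration.

```python
# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(genome, motif):
    """List (position, strand) for every exact occurrence of `motif` on either strand."""
    hits = []
    motif_rc = reverse_complement(motif)
    for i in range(len(genome) - len(motif) + 1):
        window = genome[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == motif_rc:
            hits.append((i, "-"))
    return hits

# Hypothetical 6-bp recognition site and a made-up stretch of DNA.
print(find_sites("AATGACTCGGGAGTCATT", "TGACTC"))
```

The example sequence contains the site once on each strand, so the scan reports one hit marked "+" and one marked "-".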

These instructions are written in a language that is often called the ‘gene regulatory code’. The preference for a given nucleotide at a specific position is mainly determined by physical interactions between the amino-acid side chains of the TF (transcription factor) and the accessible edges of the base pairs that are contacted. It is possible that some complex code, comprising rules from each of the different layers, contributes to TF–DNA binding; however, determining the precise rules of TF binding to the genome will require further scientific research. Genomes thus contain both a genetic code specifying amino acids and a parallel regulatory code specifying transcription factor (TF) recognition sequences. The genetic and regulatory codes have been assumed to operate independently of one another and to be segregated physically into the coding and non-coding genomic compartments, although the potential for some coding exons to accommodate transcriptional enhancers or splicing signals has long been recognized. It has been reported that ~15% of human codons are dual-use codons (`duons') that simultaneously specify both amino acids and TF recognition sites.
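
The idea of a dual-use codon can be shown on a toy example: a stretch of coding DNA is read in triplets for amino acids, and at the same time part of it matches a TF recognition site. Everything specific here is an assumption for illustration: the 12-base coding sequence and the 7-base site are made up, only the first occurrence of the site is considered, and CODONS lists only the four codons that this example needs.

```python
# Only the codons used in this toy example (standard genetic code assignments).
CODONS = {"ATG": "M", "ACT": "T", "CAG": "Q", "GCT": "A"}

def dual_use_codons(coding_seq, tf_site):
    """Return (codon index, codon, amino acid) for codons overlapping the first TF site."""
    start = coding_seq.find(tf_site)
    if start == -1:
        return []
    end = start + len(tf_site)  # exclusive end of the site
    duons = []
    for codon_index in range(len(coding_seq) // 3):
        codon_start = codon_index * 3
        # A codon overlaps the site if their position ranges intersect.
        if codon_start < end and codon_start + 3 > start:
            codon = coding_seq[codon_start:codon_start + 3]
            duons.append((codon_index, codon, CODONS[codon]))
    return duons

# Made-up coding sequence (ATG ACT CAG GCT) containing a made-up site at position 1.
print(dual_use_codons("ATGACTCAGGCT", "TGACTCA"))
```

In this toy case three of the four codons both specify an amino acid and form part of the recognition site.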

In order for communication to happen, two things are required: 1. the sequence of DNA bases located in the regulatory region of the gene, and 2. transcription factors that read the code. If either is missing, communication fails, the gene that has to be expressed cannot be found, and the whole procedure of gene expression fails. This is an irreducibly complex system. The gene regulatory code could not have arisen in a stepwise manner either, since the code only has the right significance once fully developed. That's an example par excellence of intelligent design. The fact that these transcription factor binding sequences overlap protein-coding sequences suggests that both sequences were designed together, in order to optimize the efficiency of the DNA code. As we learn more and more about DNA structure and function, it is apparent that the code was not just cobbled together by the trial-and-error method of natural selection, but that it was specifically designed to provide optimal efficiency and function.

 

The control of gene expression involves complex circuits that exhibit enormous variation in design. 2 For years the most convenient explanation for these variations was historical accident. According to this view, evolution is a haphazard process in which many different designs are generated by chance. The central importance of gene regulation in modern molecular biology provides strong motivation to search for more of these underlying design principles. Gene circuits sense their environmental context and orchestrate the expression of a set of genes to produce appropriate patterns of cellular response. The importance of this role has made the experimental study of gene regulation central to nearly all areas of modern molecular biology. The fruits of several decades of intensive investigation have been the discovery of a plethora of both molecular mechanisms and circuitry by which these are interconnected.  Several elements of design, each exhibiting a variety of realizations, have been identified among elementary gene circuits in prokaryotic organisms. Design principles appear to govern the realization of these elements. Experimental studies of specific gene systems by molecular biologists have revealed an immense variety of molecular mechanisms that are combined into complex gene circuits, and the patterns of gene expression observed in response to environmental and developmental signals are equally diverse. Are these variations in design the result of historical accident or have they been selected for specific functional reasons? 

There are several different levels of hierarchical organization that intervene between the genotype and the phenotype. These levels are linked by gene circuits that can be characterized in terms of the following elements of design: 

- transcription unit, 
- input signalling, 
- mode of control, 
- logic unit, 
- expression cascade, 
- connectivity

Transcription unit: The transcription unit consists of a set of coordinately regulated structural genes that encode proteins, an upstream promoter site at which transcription of the genes is initiated, and a downstream terminator site at which transcription ceases. Transcription units are the principal feature around which gene circuits are organized. On the input side, signals in the extracellular or intracellular environment are detected by binding to specific receptor molecules, which propagate the signal to specific regulatory molecules in a process called transduction, although in many cases the regulator molecules are also the receptor molecules. Regulator molecules in turn bind to the modulator sites of transcription units in one of two alternative modes, and the signals are combined in a logic unit to determine the rate of transcription. On the output side, transcription initiates an expression cascade that yields one or many mRNA products, one or many protein products, and possibly one or many products of enzymatic activity. Thus, the transcription unit emits a fan-out of signals, which are then connected in a diverse fashion to the receptors of other transcription units to complete the interlocking gene circuitry.

Input signalling: The input signals for transcription units can arise either from the external environment or from within the cell. When signals originate in the extracellular environment, they often involve binding of signal molecules to specific receptors in the cellular membrane. In bacteria, alterations in the membrane-bound receptor are communicated directly to regulator proteins via short signal transduction pathways called “two-component systems.” In other cases, signal molecules in the environment are transported across the membrane, and in some cases are subsequently modified metabolically, to become signal molecules that bind directly to regulator proteins. In these cases the receptor and regulator are one and the same molecule.

Mode of control: Regulators exert their control over gene expression by acting in one of two different modes. In the positive mode, they stimulate expression of an otherwise quiescent gene, and induction of gene expression is achieved by supplying the functional form of the regulator. In the negative mode, regulators block expression of an otherwise active gene, and induction of gene expression is achieved by removing the functional form of the regulator. Each of these two designs (positive or negative) requires the transcription unit to have the appropriate modulator site (initiator type or operator type) and promoter function (low level or high level).
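
The two modes can be summarized as a pair of rules. The sketch below is a bare on/off simplification with a function name of my own choosing; it only restates the paragraph above in executable form.

```python
def is_expressed(mode, functional_regulator_present):
    """Return True if the gene is expressed under the given mode of control.

    positive mode: the regulator stimulates an otherwise quiescent gene.
    negative mode: the regulator blocks an otherwise active gene.
    """
    if mode == "positive":
        return functional_regulator_present
    if mode == "negative":
        return not functional_regulator_present
    raise ValueError(f"unknown mode of control: {mode!r}")

for mode in ("positive", "negative"):
    for present in (False, True):
        print(mode, "regulator present:", present, "-> expressed:", is_expressed(mode, present))
```

Induction therefore means supplying the regulator in the positive mode and removing it in the negative mode.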

Logic unit: The control regions associated with transcription units may be considered the logic unit where input signals from various regulators are integrated to govern the rate of transcription initiation. There are two lines of evidence suggesting that most transcription units in bacteria have only a few regulatory inputs. First, in studies of network behaviour, if the average number of inputs was lower, the behaviour of the network was too fixed; whereas if it was greater, the behaviour was too chaotic. The optimal behaviour associated with a few inputs is often described as “operating at the edge of chaos.” Second, with the arrival of the genomic era and the sequencing of the complete genome for a number of bacteria, there is now experimental evidence regarding the distribution of inputs per transcription unit. The sequence for Escherichia coli has shown that the number of modulator sites located near the promoters of transcription units is on average approximately two to three. The large majority have two and a few have as many as five.
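
The text above does not say which kind of network model was studied. One common way to explore how the number of inputs per unit affects network behaviour is a random Boolean network, and the sketch below assumes that model. All names and parameters (make_network, step, cycle_length, ten genes, the seed) are my own choices for illustration, and a single run like this proves nothing about real bacteria.

```python
import random

def make_network(n_genes, k_inputs, rng):
    """Give each gene k randomly chosen input genes and a random Boolean rule."""
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    rules = [[rng.random() < 0.5 for _ in range(2 ** k_inputs)] for _ in range(n_genes)]
    return inputs, rules

def step(state, inputs, rules):
    """Update every gene at once from the current state of its inputs."""
    new_state = []
    for gene_inputs, rule in zip(inputs, rules):
        index = 0
        for g in gene_inputs:
            index = (index << 1) | int(state[g])  # encode input states as a table index
        new_state.append(rule[index])
    return tuple(new_state)

def cycle_length(state, inputs, rules):
    """Follow the network until a state repeats; return the length of the repeating cycle."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, rules)
        t += 1
    return t - seen[state]

rng = random.Random(0)  # fixed seed so the run is repeatable
start = tuple([False] * 10)
for k in (1, 2, 5):
    inputs, rules = make_network(10, k, rng)
    print("inputs per gene:", k, "cycle length:", cycle_length(start, inputs, rules))
```

A short cycle corresponds to fixed, frozen behaviour and a long one to erratic behaviour; comparing many random networks at each number of inputs would be needed before drawing any conclusion.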

Expression cascades: Expression cascades produce the output signals from transcription units. They typically reflect the flow of information from DNA to RNA to protein to metabolites, which has been called the “Central Dogma” of molecular biology.
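
The first two steps of that flow, DNA to RNA to protein, can be sketched in a few lines. This is a minimal sketch under assumptions: the input is taken to be the coding strand, CODON_TABLE lists only the four codons needed for the made-up example (with None marking the stop codon), and the metabolite step is left out.

```python
# Only the codons used in this toy example; None marks a stop codon.
CODON_TABLE = {"AUG": "M", "GCU": "A", "AAA": "K", "UAA": None}

def transcribe(coding_dna):
    """DNA coding strand to mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGGCTAAATAA")
print(mrna, "->", translate(mrna))
```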

Gene regulatory networks (GRNs) provide system level explanations of developmental and physiological functions in the terms of the genomic regulatory code. Depending on their developmental functions, GRNs differ in their degree of hierarchy, and also in the types of modular sub-circuit of which they are composed, although there is a commonly employed sub-circuit repertoire. Mathematical modelling of some types of GRN sub-circuit has deepened biological understanding of the functions they mediate. The structural organization of various kinds of GRN reflects their roles in the life process, and causally illuminates developmental processes. GRNs determine the main events of postembryonic development, including organogenesis and formation of adult parts and cell types. Beyond that, GRNs control a vast array of physiological capabilities and modes of response to environmental fluctuations and challenges. GRNs are composed of multiple sub-circuits, that is, the individual regulatory tasks into which a process can be parsed are each accomplished by a given GRN sub-circuit. Thus the operational significance of a GRN structure will be indicated by the types of sub-circuit it contains. However, GRNs have more global organizational properties as well. GRNs may be deeply layered, generating successive regulatory transactions, or they may be shallow, in the sense that they mandate few transactions between the initial inputs and the terminal activation of effector genes. 
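
The difference between a deeply layered and a shallow GRN can be expressed as the number of regulatory transactions between an initial input and the terminal effector genes. The sketch below assumes the network has no feedback loops (otherwise the recursion would not end), and both example networks are hypothetical.

```python
def depth(network, gene):
    """Longest chain of regulatory steps from `gene` down to a terminal gene.

    Assumes the network is acyclic; a gene with no targets has depth 0.
    """
    targets = network[gene]
    if not targets:
        return 0
    return 1 + max(depth(network, target) for target in targets)

# Hypothetical deep network: input -> tf1 -> tf2 -> effector.
DEEP = {"input": ["tf1"], "tf1": ["tf2"], "tf2": ["effector"], "effector": []}
# Hypothetical shallow network: the input activates effector genes directly.
SHALLOW = {"input": ["effector_a", "effector_b"], "effector_a": [], "effector_b": []}

print("deep:", depth(DEEP, "input"), "shallow:", depth(SHALLOW, "input"))
```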

GRN subcircuits
‘The overall control principle is that the embryonic process is finely divided into precise little “jobs” to be done, and each is assigned to a specific subcircuit or wiring feature in the upper-level dGRN. There is always an observable consequence if a dGRN subcircuit is interrupted. Since these consequences are always catastrophically bad, flexibility is minimal, and since the subcircuits are all interconnected, the whole network partakes of the quality that there is only one way for things to work. And indeed the embryos of each species develop in only one way.’ 7

Thus we can think of a crown group dGRN as a terminal, finely divided, extremely elegant control system that allows continuing alteration and variation only after the body plan per se has formed, i.e., in structural terms, at the dGRN periphery, and in developmental terms, late in the process. It is no surprise, from this point of view, that cell type re-specification by insertion of alternative differentiation drivers is changing only at the dGRN periphery, quite a different matter from altering body plan.  

The significance of GRN subcircuits
Some things never change, and a principle is that developmental jobs are controlled through the logic outputs of genetic subcircuits. How the animal body plan has emerged is a question that in the end can only be addressed in the terms of transcriptional regulatory systems biology.

Five different sub-circuits of the GRN control five distinct cell biological activities. 6 The individual components of a complex developmental process are in general controlled by GRN subcircuits, and it is their architecture that illuminates the basic logic of development. The same topology of each of the subcircuits appears repeatedly in diverse developmental GRNs and executes similar network functions. But in each instance, subcircuits of a given type are composed of entirely different regulatory genes. In fact, in these examples, there is no case where the same regulatory genes are used more than once for a given type of circuit. This demonstrates that subcircuit function depends exclusively on topological design and is not determined by the specific biochemical nature of the constituent regulatory factors, except for their general properties as activating or repressing transcription factors. Perhaps most surprising is the fundamental similarity of inductive signalling logic despite the tremendous varieties of signal transduction pathways responding to diverse ligands.  Genomic regulatory functions in development depend on subcircuit logic functions and not on unique properties of particular individual transcription factors. Feedback subcircuits, the various feedforward subcircuits, double-negative gate subcircuits, Janus signalling subcircuits, etc. are required constituents of GRNs for embryonic development. 

The recurrent use of the same subcircuits implies that there may exist a finite complement of genomically encoded subcircuits which recur regularly in developmental GRNs, and from which these GRNs are constructed. This idea has deep implications for the origins of the morphological programs that control bilaterian development.

The utilization of given subcircuit architectures is clearly not what would be expected on the basis of random occurrence of regulatory interactions. To give one striking example, one might expect that isolated autoregulatory wiring would occur with equal frequency throughout the different regions of a GRN, that is, any gene might have a similar probability of engaging in autoregulation positive or negative. But what is observed is clearly different from this. Here we found among all these examples only three occurrences of isolated positive autoregulations but 10 occurrences of positive autoregulation in the context of positive intergenic feedback subcircuits, even though there are many times more genes in these networks as a whole than are included in intergenic feedback subcircuits. Additional evidence that isolated positive autoregulation has been disfavored is the relatively high frequency of isolated negative autoregulation that we observe, 18 occurrences in the same set of networks, obviously none in the context of positive intergenic feedbacks. On a random basis, the likelihood of a positive and negative autoregulatory site should be quite similar, but in fact we see 3 versus 18 occurrences outside of intergenic feedback subcircuits. 

Subcircuit occurrence depends on requirement in GRN context for given subcircuit functions, which of course is not expected to be random. We can predict that as comparative subcircuit databases expand beyond the initial attempts we have made here, a large body of evidence will accumulate displaying the nonrandom occurrence of all the canonical subcircuit topologies we here consider.

Modular GRN sub-circuits are defined by their topologies, and the topology of a sub-circuit directly indicates its function in life. Sub-circuits perform developmental biology jobs that can be defined uniquely, and not with very common ‘motifs’ such as the coherent feed-forward loop, which although it has specific dynamic properties, appears in so many different contexts that no unique developmental biology function can be associated with it. 

Design principles govern transcriptional regulation networks that control gene expression in cells. 4 ‘Network motifs’ are patterns of interconnections that recur in many different parts of a network at frequencies much higher than those found in randomized networks. Each network motif has a specific function in determining gene expression, such as generating temporal expression programs and governing the responses to fluctuating external signals. The transcriptional network can be represented as a directed graph, in which each node represents an operon (a group of contiguous genes that are transcribed into a single mRNA molecule) and edges represent direct transcriptional interactions. Each edge is directed from an operon that encodes a transcription factor to an operon that is regulated by that transcription factor.

The ‘feedforward loop’ motif is defined by a transcription factor X that regulates a second transcription factor Y, such that both X and Y jointly regulate an operon Z.   We term X the ‘general transcription factor’, Y the ‘specific transcription factor’, and Z the ‘effector operon(s)’.  For example, if X and Y both positively regulate Z, and X positively regulates Y, the feedforward loop is coherent. If, on the other hand, X represses Y, then the motif is incoherent.

An understanding of the design principles of biochemical networks such as gene regulatory, metabolic, or intracellular signalling networks is a central concern of systems biology. In particular, the intricate interplay between network topology and resulting dynamics is crucial to our understanding of such networks, as is their presumed modular structure. 5 A topological feature of central interest is the existence of positive and negative feedback loops. Feedback loops have a decisive effect on dynamics, which has been studied extensively through the analysis of mathematical network models, both continuous and discrete. In Boolean network models, each variable can only attain two values (0/1 or on/off).
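The effect of feedback sign on dynamics can be illustrated with a very small Boolean model. The sketch below is only an illustration: the two toy circuits, the synchronous update rule and all names are assumptions made here, not taken from the cited paper.

```python
from itertools import product

def step(state, rules):
    """Synchronously update every gene from the current state."""
    return tuple(rule(state) for rule in rules)

def fixed_points(rules):
    """Return all states that map to themselves under one update."""
    n = len(rules)
    return [s for s in product((0, 1), repeat=n) if step(s, rules) == s]

# Hypothetical positive feedback loop: two mutual repressors (A = NOT B, B = NOT A).
toggle = [lambda s: 1 - s[1], lambda s: 1 - s[0]]

# Hypothetical negative feedback loop: a gene that represses itself (A = NOT A).
self_repressor = [lambda s: 1 - s[0]]

print("toggle fixed points:", fixed_points(toggle))
print("self-repressor fixed points:", fixed_points(self_repressor))
```

In this idealization the positive loop has two stable states, while the negative loop has none and keeps switching between 0 and 1.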

Logic gates evoke images of circuit boards, but cells are arguably equally good in relying on logic computations. A classic example is the Lac operon, which activates itself upon the condition “lactose AND NOT glucose”. In recent years, there have been multiple reports on rationally designed, genetically encoded logic gates and circuits in living cells. Just like the Lac operon, these gates receive two or more molecular signals (inputs) and generate a product (output) whose level is logically linked to the inputs. 18
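The condition “lactose AND NOT glucose” can be written out as a truth table. This is a deliberately idealized sketch: the function name is hypothetical, and real promoter activity is graded rather than strictly on/off.

```python
def lac_operon_on(lactose: bool, glucose: bool) -> bool:
    """Idealized Boolean rule: on when lactose is present AND glucose is absent."""
    return lactose and not glucose

# Print the full truth table for the two inputs.
for lactose in (False, True):
    for glucose in (False, True):
        state = "on" if lac_operon_on(lactose, glucose) else "off"
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```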

Network motifs
Stephen C. Meyer, Darwin's Doubt, page 228:
Think, again, of arranging Lego blocks. There are many ways of arranging small numbers of Lego blocks. These various arrangements form common structural motifs such as: two blocks stuck together at right angles; several curved blocks forming circular rings, stacked blocks forming hollow squares or walls or cube-like shapes; blocks arranged as prisms or cylinders; flat layers of blocks stacked two bumps thick or three bumps thick or more. Though these structural elements stick together because of interactions between the bumps and indentations on each block, those bumps and indentations themselves do not specify any particular larger structure—a castle or an airplane, for example—because each motif may be combined or recombined with many other structural motifs in numerous different ways. The shape and properties of the modular elements do not dictate the type of larger structure that must be built from them. Instead, to build a particular structure, the modular elements must be arranged in particular ways. And since there are many possible ways to arrange these modular elements, only one or a few of which will result in a desired structure, every Lego set includes a blueprint with step-by-step instructions—in other words, additional information.

Producing a body plan from the different types of cell clusters generated by Newman’s dynamical patterning modules (DPMs), would also require additional information. Newman does not account for this information. He correctly highlights the way certain recurrent motifs for organizing groups of cells seem to form spontaneously as the result of physical interactions between individual cells (his DPMs). He does not, however, establish that these groups of cells must arrange themselves into specific tissues, organs, or body plans in response to any known physical process or law. Instead, it seems entirely possible that these modular elements (cell clusters) have many “degrees of freedom” and can be arranged in innumerable ways. If so, then some additional information—an overall organismal blueprint or set of assembly instructions—would need to direct the arrangement of these modular elements. Newman does not consider this possibility. Nor does he cite any law-like self-organizational process that would eliminate the need for such information to direct animal development.

Transcription regulation networks control the expression of genes. The transcription networks of well-studied microorganisms appear to be made up of a small set of recurring regulation patterns, called network motifs. The same network motifs have recently been found in diverse organisms from bacteria to humans, suggesting that they serve as basic building blocks of transcription networks. Transcription factors respond to biological signals and accordingly change the transcription rate of genes, allowing cells to make the proteins they need at the appropriate times and amounts. Network motifs can be thought of as recurring circuits of interactions from which the networks are built. Network motifs were first systematically defined in Escherichia coli, in which they were detected as patterns that occurred in the transcription network much more often than would be expected in random networks. The same motifs have since been found in organisms from bacteria and yeast to plants and animals. 14

There are two types of transcription networks: 
- sensory networks that respond to signals such as stresses and nutrients, and 
- developmental networks that guide differentiation events.

I will first consider sensory networks, the motifs of which are common to both types of network. I will then turn to motifs that are specific to developmental networks.  Network motifs are also found in other biological networks, such as those that involve protein modifications or interactions between neuronal cells. Each network motif can carry out specific information-processing functions. Simple regulation occurs when transcription factor Y regulates gene X with no additional interactions. Figure a, below:



Simple regulation and autoregulation. 
a | In simple regulation, transcription factor Y is activated by a signal Sy . When active, it binds the promoter of gene X to enhance or inhibit its transcription rate. 
b | In negative autoregulation (NAR), X is a transcription factor that represses its own promoter. 
c | In positive autoregulation (PAR), X activates its own promoter. 
d | NAR speeds the response time (the time needed to reach halfway to the steady-state concentration) relative to a simple-regulation system that reaches the same steady-state expression. PAR slows the response time. 
e | An experimental study of NAR, using a synthetic gene circuit in which the repressor TetR fused to GFP represses its own promoter. High-resolution fluorescence measurements in living Escherichia coli cells show that this NAR motif has a response time about fivefold faster than a simple-regulation design. 
f | A schematic cell-cell distribution of protein levels. NAR tends to make this distribution narrower in comparison with simple regulation, whereas PAR tends to make it wider and in extreme cases bimodal with two populations of cells. X/Xst, X concentration relative to steady state Xst

Simple regulation 
Y is usually activated by a signal, Sy . The signal can be an inducer molecule that directly binds Y, or a modification of Y by a signal-transduction cascade, and so on. When transcription begins, the concentration of the gene product X rises and converges to a steady-state level (FIG d). This level is equal to the ratio of the production and degradation rates, where degradation includes both active degradation and the effect of dilution by cell growth. When production stops, the concentration of the gene product decays exponentially. In both cases, the response time, which is defined as the time it takes to reach halfway between the initial and final levels, is equal to the half-life of the gene product a. The faster the degradation rate, the shorter the response time. For proteins that are not actively degraded, as is the case for most proteins in growing bacterial cells, the response time is equal to one cell-generation time. This is a result of the dilution effect from cell growth.
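The statements above follow from the simplest rate equation, dX/dt = beta - alpha*X, whose solution from X = 0 is X(t) = (beta/alpha)(1 - exp(-alpha*t)). The sketch below evaluates it; the symbols beta (production rate) and alpha (degradation plus dilution rate) and the numerical values are assumptions chosen for illustration.

```python
import math

def simple_regulation(t, beta, alpha):
    """Level of X at time t after production starts from X = 0."""
    return (beta / alpha) * (1.0 - math.exp(-alpha * t))

def response_time(alpha):
    """Time to reach halfway to the steady state: ln(2) / alpha."""
    return math.log(2) / alpha

# Hypothetical stable protein: removal is due to dilution only, so the
# effective half-life equals the cell-generation time.
generation_time = 30.0                     # minutes (illustrative value)
alpha = math.log(2) / generation_time      # dilution rate
beta = 10.0                                # production rate (arbitrary units)

x_steady = beta / alpha                    # production rate / removal rate
t_half = response_time(alpha)
print("steady state:", x_steady)
print("response time (min):", t_half)
print("level at response time:", simple_regulation(t_half, beta, alpha))
```

With dilution as the only removal process, the computed response time comes out equal to the generation time, as stated in the text.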

Negative autoregulation
Negative autoregulation (NAR) occurs when a transcription factor represses the transcription of its own gene (FIG. b). This network motif occurs in about half of the repressors in E. coli, and in many eukaryotic repressors. NAR has been shown to display two important functions. First, NAR speeds up the response time of gene circuits. This occurs when NAR uses a strong promoter to obtain a rapid initial rise in the concentration of protein X. When X concentration reaches the repression threshold for its own promoter, the production rate of new X decreases. Thus, the concentration of X locks into a steady-state level that is close to its repression threshold. By contrast, a simply regulated gene that is designed to reach the same steady-state level must use a weaker promoter. As a result, an NAR system reaches 50% of its steady state faster than a simply regulated gene (FIG. d). The dynamics of NAR show a rapid initial rise followed by a sudden locking into the steady state, possibly accompanied by an overshoot or damped oscillations. Response acceleration (or speed-up) by NAR has been demonstrated experimentally. Speed-up in a natural context was demonstrated in the SOS DNA-repair system of E. coli, in which the master regulator, LexA, represses its own promoter.

Second, in addition to speeding responses, NAR can reduce cell–cell variation in protein levels. These variations are due to an inherent source of noise: the production rates of proteins fluctuate by tens of percent (FIG. f). This noise results in cell–cell variation in protein level. NAR can, in many cases, reduce these variations: high concentrations of X reduce its own rate of production, whereas low concentrations cause an increased production rate. The result is a narrower distribution of protein levels than would be expected in simply regulated genes (FIG. f). However, if the NAR feedback contains a long delay, noise can also be amplified.

Positive autoregulation 
Positive autoregulation (PAR) occurs when a transcription factor enhances its own rate of production (FIG. c). The effects are opposite to those of NAR: response times are slowed and variation is usually enhanced. PAR slows the response time because at early stages, when levels of X are low, production is slow. Production picks up only when X concentration approaches the activation threshold for its own promoter. Thus, the desired steady state is reached in an S-shaped curve (FIG. d). The response time is longer than in a corresponding simple-regulation system, as shown theoretically and experimentally. PAR tends to increase cell–cell variability. If PAR is weak (that is, X moderately enhances its own production rate), the cell–cell distribution of X concentration is expected to be broader than in the case of a simply regulated gene (FIG. f). Strong PAR can lead to bimodal distributions, whereby the concentration of X is low in some cells but high in others. In cells in which the concentration is high, X activates its own production and keeps it high indefinitely. Strong PAR can, therefore, lead to a differentiation-like partitioning of cells into two populations (FIG. f). In some cases, PAR can be useful as a memory to maintain gene expression. In other cases, a bimodal distribution is thought to help cell populations to maintain a mixed phenotype so that they can better respond to a stochastic environment.
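The ordering of response times (NAR fastest, simple regulation in between, PAR slowest) can be checked with a small numerical sketch. Everything specific here is an assumption made for illustration: the Hill-type production functions, the parameter values, and the simple Euler integration. It is not the model used in the cited study.

```python
def response_time(production, alpha=1.0, dt=0.001, t_end=60.0):
    """Euler-integrate dX/dt = production(X) - alpha*X from X = 0.

    Returns (time to reach half of the final level, final level).
    """
    n_steps = round(t_end / dt)
    x = 0.0
    trajectory = []
    for _ in range(n_steps):
        trajectory.append(x)
        x += dt * (production(x) - alpha * x)
    x_final = x
    for i, value in enumerate(trajectory):
        if value >= 0.5 * x_final:
            return i * dt, x_final
    raise ValueError("half of the final level was never reached")

# All parameter values below are illustrative assumptions.
def nar(x):
    """Strong promoter repressed by its own product (Hill repression)."""
    return 10.0 / (1.0 + x ** 2)

def par(x):
    """Weak basal production plus self-activation (Hill activation)."""
    return 0.2 + 2.0 * x ** 2 / (1.0 + x ** 2)

t_nar, x_nar = response_time(nar)
# Simple regulation: constant production chosen so that, with alpha = 1,
# it reaches the same steady state as the NAR circuit.
t_simple, x_simple = response_time(lambda x: x_nar)
t_par, x_par = response_time(par)

print(f"NAR:    response time {t_nar:.3f}, steady state {x_nar:.3f}")
print(f"simple: response time {t_simple:.3f}, steady state {x_simple:.3f}")
print(f"PAR:    response time {t_par:.3f}, steady state {x_par:.3f}")
```

Times are in units of 1/alpha. The ordering does not depend on the particular numbers: below the steady state, a self-repressed promoter always produces more than the matched constant promoter, and a self-activated one always produces less.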

Feedforward loops
The second family of network motifs is the feedforward loop (FFL). It appears in hundreds of gene systems in E. coli and yeast, as well as in other organisms. This motif consists of three genes: a regulator, X, which regulates Y, and gene Z, which is regulated by both X and Y. Because each of the three regulatory interactions in the FFL can be either activation or repression, there are eight possible structural types of FFL (FIG a).
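The eight structural types can be enumerated directly. The sketch below encodes activation as +1 and repression as -1 (an encoding chosen here for illustration) and calls an FFL coherent when the sign of the direct path X to Z equals the product of the signs along the indirect path through Y.

```python
from itertools import product

def is_coherent(x_to_y, y_to_z, x_to_z):
    """Coherent if the direct path sign equals the overall indirect path sign."""
    return x_to_z == x_to_y * y_to_z

# Each of the three interactions is either activation (+1) or repression (-1).
ffl_types = list(product((+1, -1), repeat=3))
coherent = [t for t in ffl_types if is_coherent(*t)]
incoherent = [t for t in ffl_types if not is_coherent(*t)]

for x_to_y, y_to_z, x_to_z in ffl_types:
    label = "coherent" if is_coherent(x_to_y, y_to_z, x_to_z) else "incoherent"
    print(f"X->Y {x_to_y:+d}, Y->Z {y_to_z:+d}, X->Z {x_to_z:+d}: {label}")
```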

Named after George Boole, a 19th-century mathematician, Boolean logic is the basis of digital circuit design. For example, the AND function means that both conditions must be on to trigger the circuit. In notational form, where 1 means present and 0 means absent,
1 AND 1 = true or “on”
1 AND 0 = false or “off”
0 AND 1 = false or “off”
0 AND 0 = false or “off”
In the OR function, either one (or both) of the inputs can be on to trigger the circuit:
1 OR 1 = true or “on”
1 OR 0 = true or “on”
0 OR 1 = true or “on”
0 OR 0 = false or “off”
The negations of these functions, called NAND and NOR, reverse the outputs. For a logic gate with two inputs and one output, such as a transistor, there are 16 possible operations (AND, NAND, OR, NOR, XOR, XNOR, …). 17
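The count of 16 follows from the fact that two binary inputs have four possible combinations, and each combination can be mapped to 0 or 1 (2^4 = 16). A short sketch that enumerates them; the helper names are chosen here for illustration.

```python
from itertools import product

# The four possible input combinations: (0,0), (0,1), (1,0), (1,1).
inputs = list(product((0, 1), repeat=2))

# Every possible assignment of an output bit to each input combination.
all_functions = list(product((0, 1), repeat=len(inputs)))

named = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}

def column(gate):
    """Output column of a gate, in the same order as `inputs`."""
    return tuple(gate(a, b) for a, b in inputs)

print("number of two-input functions:", len(all_functions))
for name, gate in named.items():
    print(f"{name:4} outputs for {inputs}: {column(gate)}")
```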


Feedforward loops (FFLs). 
a | The eight types of feedforward loops (FFLs) are shown. In coherent FFLs, the sign of the direct path from transcription factor X to output Z is the same as the overall sign of the indirect path through transcription factor Y. Incoherent FFLs have opposite signs for the two paths. 
b | The coherent type-1 FFL with an AND input function at the Z promoter. 
c | The incoherent type-1 FFL with an AND input function at the Z promoter. SX and SY are input signals for X and Y.

To understand the function of the FFLs, we need to understand how X and Y are integrated to regulate the Z promoter. Two common ‘input functions’ are an ‘AND gate’, in which both X and Y are needed to activate Z, and an ‘OR gate’, in which binding of either regulator is sufficient. Other input functions are possible, such as the additive input function in the flagella system and the hybrid of AND and OR logic in the lac promoter. However, much of the essential behaviour of FFLs can be understood by focusing on the stereotypical AND and OR gates. Each of the eight FFL types can thus appear with at least two input functions. In the best studied transcriptional networks (E. coli and yeast), two of the eight FFL types occur much more frequently than the other six types. These common types are the coherent type-1 FFL (C1-FFL) and the incoherent type-1 FFL (I1-FFL). Here I discuss their dynamical functions in detail.

The C1-FFL is a ‘sign-sensitive delay’ element and a persistence detector. In the C1-FFL, both X and Y are transcriptional activators (FIG. b above). I will first consider the behaviour of the FFL when the Z promoter has an AND input function, and then turn to the case of the OR input function. With an AND input function, the C1-FFL shows a delay after stimulation, but no delay when stimulation stops. To see this, let’s follow the behaviour of the FFL. When the signal Sx appears, X becomes active and rapidly binds its downstream promoters. As a result, Y begins to accumulate. However, owing to the AND input function, Z production starts only when Y concentration crosses the activation threshold for the Z promoter. This results in a delay of Z expression following the appearance of Sx (FIG. below).


The coherent type-1 feedforward loop (C1-FFL) and its dynamics. 
The C1-FFL with an AND input function shows delay after stimulus (SX) addition and no delay after stimulus removal. It thus acts as a sign-sensitive filter, which responds only to persistent stimuli.

In contrast, when the signal Sx is removed, X rapidly becomes inactive. As a result, Z production stops because deactivation of its promoter requires only one arm of the AND gate to be ‘shut off’. Hence, there is no delay in deactivation of Z after the signal Sx is removed (FIG. above). This dynamic behaviour is called sign-sensitive delay; that is, delay depends on the sign of the Sx step. An ON step (addition of Sx) causes a delay in Z expression, but an OFF step (removal of Sx) causes no delay. The duration of the delay is determined by the biochemical parameters of the regulator Y; for example, the higher the activation threshold for the Z promoter by Y, the longer the delay. The delay that is generated by the FFL can be useful to filter out brief spurious pulses of signal. A signal that appears only briefly does not allow Y to accumulate and cross its threshold, and thus does not induce a Z response. Only persistent signals lead to Z expression (FIG. above). The sign-sensitive delay function of this motif has been experimentally demonstrated in the arabinose utilization system of E. coli. 
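The sign-sensitive delay and the filtering of brief pulses can be reproduced with a minimal simulation. This is a sketch under stated assumptions: X is taken to follow the signal Sx instantly, Y obeys a simple production/decay equation, and the rate constants and the threshold `k_yz` are arbitrary illustrative values.

```python
def c1_ffl_and(signal, dt=0.01, alpha=1.0, beta_y=1.0, k_yz=0.5):
    """C1-FFL with an AND gate at the Z promoter.

    signal: sequence of 0/1 values of Sx, one per time step.
    Returns a list of booleans: is Z being produced at each step?
    """
    y = 0.0
    z_production = []
    for s in signal:
        x_active = bool(s)                     # X responds rapidly to Sx
        z_production.append(x_active and y > k_yz)
        y += dt * (beta_y * x_active - alpha * y)
    return z_production

# Persistent stimulus: ON for 500 steps, then OFF for 200 steps.
persistent = c1_ffl_and([1] * 500 + [0] * 200)
on_delay_steps = persistent.index(True)        # steps until Z production starts

# Brief pulse: ON for 30 steps only, too short for Y to cross its threshold.
brief = c1_ffl_and([1] * 30 + [0] * 200)

print("delay after ON step (time units):", on_delay_steps * 0.01)
print("Z produced in first step after OFF:", persistent[500])
print("Z ever produced during brief pulse:", any(brief))
```

Raising `k_yz` lengthens the ON delay, in line with the statement that a higher activation threshold for Y gives a longer delay.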


An experimental study of the C1-FFL in the arabinose system of Escherichia coli, using fluorescent-reporter strains and high-resolution measurements in living cells. 
This system (represented by red circles) shows a delay after addition of the input signal (cAMP), and no delay after its removal, relative to a simple-regulation system that responds to the same input signal (the lac system, represented by blue squares).




a Genes that direct differentiation of embryonic stem cells (hESCs) into posterior specialized differentiated progenitor cells

b A structural gene is a gene that codes for any RNA or protein product other than a regulatory factor (i.e. regulatory protein).

1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765714/
2. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/12779449
3. http://www.pnas.org/content/100/21/11980
4. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/11967538/
5. http://sci-hub.tw/https://www.sciencedirect.com/science/article/pii/S0006349508702297
6. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3957374/
7. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3135751/
8. https://en.wikipedia.org/wiki/Systems_biology
9. http://sci-hub.tw/https://www.sciencedirect.com/science/article/pii/S200103701460026X
10. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5898668/
11. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0092709
12. https://www.nature.com/articles/s41540-017-0019-y
13. https://www.cell.com/fulltext/S0092-8674(11)00131-0
14. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/17510665/
15. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5468647/
16. http://sci-hub.tw/https://academic.oup.com/icb/article-abstract/58/4/640/5039865?redirectedFrom=fulltext
17. https://evolutionnews.org/2013/05/bacteria_perfor/
18. http://sci-hub.tw/http://science.sciencemag.org/content/340/6132/554.summary
19. https://www.sciencedirect.com/science/article/pii/S0012160611000911#bb0110
20. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/18501470

https://nptel.ac.in/courses/102106035/Module%204/Lecture%205/Lecture%205.pdf



Last edited by Admin on Sun Dec 02, 2018 10:57 am; edited 81 times in total

A delay occurs after addition of the input signal cAMP, but not after its removal. This delay, of about 20 min, is on the same timescale as spurious pulses of cAMP that occur in the natural environment when E. coli transits between growth conditions. When the Z promoter has OR logic, the FFL has the opposite effect to the AND case: with an OR input function, the C1-FFL shows no delay after stimulation but does show a delay when stimulation stops. To see this, note that when the signal Sx appears, X alone is sufficient to activate Z because of the OR-gate logic. If the signal suddenly stops after a long period of stimulation, X is no longer active, but the presence of Y is still enough to allow production of Z. Thus, the C1-FFL with OR logic allows continued production in the face of a transient loss of the input signal. This behaviour was experimentally demonstrated in the flagella system of E. coli.
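The OR-gate case can be sketched in the same spirit. As before, the instant response of X to Sx, the rate constants and the threshold `k_yz` are illustrative assumptions, not measured values.

```python
def c1_ffl_or(signal, dt=0.01, alpha=1.0, beta_y=1.0, k_yz=0.5):
    """C1-FFL with an OR gate at the Z promoter.

    signal: sequence of 0/1 values of Sx, one per time step.
    Returns a list of booleans: is Z being produced at each step?
    """
    y = 0.0
    z_production = []
    for s in signal:
        x_active = bool(s)                     # X responds rapidly to Sx
        z_production.append(x_active or y > k_yz)
        y += dt * (beta_y * x_active - alpha * y)
    return z_production

# Long stimulation (500 steps) followed by signal loss (300 steps).
trace = c1_ffl_or([1] * 500 + [0] * 300)
off_delay_steps = sum(trace[500:])             # steps Z stays on after Sx is removed

print("Z produced at first ON step:", trace[0])
print("delay after OFF step (time units):", off_delay_steps * 0.01)
```

Here Z starts immediately because X alone satisfies the OR gate, and it persists after signal loss until Y decays below its threshold.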


The C1-FFL with an OR-like input function in the flagella system of E. coli shows a delay after signal removal but not after the onset of signal (represented by orange circles). Deletion of the ‘Y’ gene (FliA) abolishes this delay (represented by purple squares). Z/Zst, Z concentration relative to the steady state Zst.

The flagella motor genes are regulated in an FFL that has input functions that resemble OR gates (additive functions of the two activators FlhDC and FliA). The flagella FFL was found to prolong flagella gene expression after the input signal (active FlhDC) stopped, but no delay occurred when the input signal appeared. Mutations and conditions that inactivate the FliA gene in this FFL lead to a loss of this delay, resulting in immediate shut-off of the flagella genes once the input signal stops. The delay in the flagella system, of about 1 hour, is comparable to the time that is needed for the biogenesis of a complete flagella motor.

Structure and function of the feed-forward loop network motifs
Engineered systems are often built of recurring circuit modules that carry out key functions. Transcription networks that regulate the responses of living cells were recently found to obey similar principles: they contain several biochemical wiring patterns, termed network motifs, which recur throughout the network. One of these motifs is the feed-forward loop (FFL). 3 The FFL, a three-gene pattern, is composed of two input transcription factors, one of which regulates the other, both jointly regulating a target gene. The FFL has eight possible structural types, because each of the three interactions in the FFL can be activating or repressing. We find that four of the FFL types, termed incoherent FFLs, act as sign-sensitive accelerators: they speed up the response time of the target gene expression following stimulus steps in one direction (e.g., off to on) but not in the other direction (on to off). The other four types, coherent FFLs, act as sign-sensitive delays.


Feed Forward Loop
The structures for coherent and incoherent FFL

Feedback regulation describes the particular type of gene regulation, where the current status of gene expression will influence the future status of the same set of genes. In a cell, feedback regulation is an essential mechanism to ensure the stability of the cellular system under changing environmental conditions.


In a feedforward circuit, gene product A activates both gene B and gene C, and gene B also activates gene C. Feedforward circuits provide an efficient way to amplify a signal in one direction.

Control of the genes that differentiate the cells of the sea urchin skeleton operates on a feedforward process (Figure above). Here, regulatory gene A produces a transcription factor that is needed to activate differentiation gene C and also activates regulatory gene B, which produces a transcription factor that is also needed to activate gene C. This feedforward process stabilizes gene expression and makes the resulting cell type irreversible.

Cells contain networks of biochemical transcription interactions. These networks perform information-processing functions. The inputs to the network, such as external nutrients and stresses, affect the activity of transcription factor proteins. The transcription factors bind regulatory regions of specific genes and activate or repress their transcription. As a result, cell processes are modulated to fit the environmental conditions. Transcription networks can be described as directed graphs, in which the nodes are genes. Directed edges represent transcription interactions, where a transcription factor encoded by one gene modulates the transcription rate of the second gene. It was recently found that these networks contain significantly recurring wiring patterns termed “network motifs”. Network motifs are patterns that occur in the network far more often than in randomized networks with the same degree sequence. One of the most significant network motifs in both E. coli and yeast is the feed-forward loop (FFL).
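The comparison with “randomized networks with the same degree sequence” can be sketched as follows. The toy edge list is hypothetical, and the edge-swap procedure is one common way of preserving in- and out-degrees; it is shown as an illustration, not as the exact procedure of the cited work. A real analysis would average over many randomized networks before drawing any conclusion.

```python
import random
from collections import Counter

def count_ffls(edges):
    """Count triples (X, Y, Z) with edges X->Y, Y->Z and X->Z."""
    edge_set = set(edges)
    targets = {}
    for u, v in edge_set:
        targets.setdefault(u, set()).add(v)
    count = 0
    for x, y in edge_set:
        for z in targets.get(y, ()):
            if z != x and z != y and (x, z) in edge_set:
                count += 1
    return count

def degree_sequence(edges):
    """Out-degree and in-degree of every node."""
    return Counter(u for u, _ in edges), Counter(v for _, v in edges)

def randomize(edges, n_swaps, rng):
    """Swap the targets of random edge pairs, keeping every node's degrees."""
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:                       # would create a self-loop
            continue
        if (a, d) in edge_set or (c, b) in edge_set:  # would duplicate an edge
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

# Hypothetical toy network containing two feed-forward loops.
toy = [("X", "Y"), ("X", "Z"), ("Y", "Z"),
       ("A", "B"), ("A", "C"), ("B", "C"),
       ("Z", "A"), ("C", "D"), ("D", "E")]

shuffled = randomize(toy, n_swaps=200, rng=random.Random(0))
print("FFLs in toy network:", count_ffls(toy))
print("FFLs in one randomized network:", count_ffls(shuffled))
```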

The FFL is composed of a transcription factor X, which regulates a second transcription factor Y; X and Y then jointly regulate a target gene Z.


(a) FFL. Transcription factor X regulates transcription factor Y, and both jointly regulate Z. Sx and Sy are the inducers of X and Y, respectively. The action of X and Y is integrated at the Z promoter with a cis-regulatory input function, such as AND or OR logic. 
(b) Simple regulation of Z by X and Y.

The FFL has three transcription interactions. Each of these can be either positive (activation) or negative (repression). There are therefore eight possible structural configurations of activator and repressor interactions.

Networks are commonly used to represent the complex interactions between components found in natural and engineered systems. Making sense of these structures has so far relied on the analysis of global topological features, such as degree distributions or clustering coefficients, and the classification of significant localized structures called network motifs. Major progress has been made in the literature toward understanding how some motifs contribute to network structure and function. This has involved proving that motifs exist, that they are not there by accident, and that they make significant functional contributions to networks. Important families of motifs that are shared by diverse networks carrying out similar functions have been discovered, and attempts have been made to relate motif structure with motif function. This earlier work has shown that motifs play an important role in gene regulation, accelerated response times, dynamic stability, and responses to noise.  Even with these detailed studies, the functional importance of motifs is often uncertain and contested. In particular, it is not clear to what extent the functions of the motifs depend on the context in which they are found (that is, their specific dynamical parameters or their position and connections within the network).

Generalizations of motifs often maintain the dynamical function of the template motif, and specific examples of multi-input and multi-output feed-forward loops (FFLs) were shown to be capable of testing for signal persistence and the temporal ordering of events. Structures of Feed Forward Loops (FFLs) are organized in a range of natural and engineered networks.  Random networks have very different distributions compared to natural and engineered networks. Although many types of clustering are possible, often just one or two types dominate, forming more than 80% of the FFL clusters.

Biomolecular computing systems and logic circuits
A cell is a sophisticated device that performs three elaborate functions: sensing inputs, processing the input information for decision-making, and executing the outputs. To this end, cells have built-in sensors that can receive the input signals generated by various environmental factors. 6 Specifically, the plasma membrane and its integrated receptors can sense pressure, osmotic stress, intracellular contact, temperature, and chemicals. At the same time, reactive oxygen species, pH, nutrients, signaling factors, and other indicators of internal state are registered by internal receptors. Varying degrees of a single environmental input, or a combination of many of them, are presented to the cell at any given time, giving rise to a large array of input information sets. Cells continuously process this multitude of input signals to make decisions about their appropriate responses, which lead to changes in gene expression, enzymatic activity, and rewiring of their signaling networks. This decision-making process manifests itself in the form of migration, growth, or division, as well as programmed cell death, as the output information. In a computing device, the input information is mathematically processed into a digital signal. This signal is a code representation of the physical cues and assumes a sequence of discrete values. For instance, in the case of a binary code, the basic unit of information is denoted as a series of “0” and “1” digits. The binary digits indicate the two states of the logic circuit. A threshold is implemented to define the input and output range that can be categorized under each logic set. If the value is either lower or higher than the threshold, the state of the circuit is defined as either “0” or “1”, respectively. Digital circuits make extensive use of logic elements which are interconnected to create logic gates, capable of executing Boolean logic functions including NOT, OR, AND, and all their possible combinations.


Traditional symbols and truth table of Boolean logic gates are shown. When the information is sensed or released from the gate, the value is defined as “1”. If not, the value is “0”.

In these gates, the sensors read out inputs and then a computational core assigns them a value of either “0” or “1” depending on the threshold set in place. If the combination of these values meets the system requirements (i.e., in case of an AND gate, if the two different inputs are both “1”), the output is executed. Each gate can be defined by conventional symbols or a truth table (Fig. above). One remarkable property of these modular logic gates is the potential to network them together to make more complex circuits, making it possible to build integrated circuits that can process versatile inputs.
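
The threshold and gate behaviour described above can be sketched in a few lines of Python. This is only an illustration: the threshold of 0.5, the function names, and the "nutrient" and "stress" inputs are assumptions made for the example and are not taken from the cited paper.

```python
# Illustrative sketch only: threshold value and input names are assumptions.

def digitize(level: float, threshold: float = 0.5) -> int:
    """Map an analog input level to 1 if it is above the threshold, else 0."""
    return 1 if level > threshold else 0


def NOT(a: int) -> int:
    return 1 - a


def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def truth_table(gate):
    """Return the truth table of a two-input gate as (a, b, output) rows."""
    return [(a, b, gate(a, b)) for a in (0, 1) for b in (0, 1)]


def grow_decision(nutrient_level: float, stress_level: float) -> int:
    """Two gates networked together: output is 1 only if nutrient AND (NOT stress)."""
    nutrient = digitize(nutrient_level)
    stress = digitize(stress_level)
    return AND(nutrient, NOT(stress))


if __name__ == "__main__":
    for row in truth_table(AND):
        print("AND", row)
    print("grow?", grow_decision(nutrient_level=0.9, stress_level=0.1))
```

Chaining `AND` and `NOT` in `grow_decision` is a small instance of networking modular gates into a more complex circuit.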

Molecular and biomolecular logic gates and their networks process chemical input signals much as computers do. The similarities between information processing in biological systems and in human-designed devices are broadly recognized by many researchers. Biological information can mimic Boolean logic operations using binary definitions (1/0; YES/NO). Computer technology relies on logic functions and arithmetic operations, whereby the insertion of signals into electronic circuits allows the differentiation and processing of the output signals, just as in chemical or biochemical computing systems. Different stimuli trigger molecular architectures or biochemical ensembles, and the resulting chemical reactivity provides computing and processing functions. The cell is a complicated device that performs three elaborate functions: sensing inputs, processing the input information for decision-making, and executing the outputs. To this end, cells have built-in sensors that can receive the input signals generated by various external and internal environmental factors. Biomolecules such as deoxyribonucleic acid (DNA) or proteins have therefore been proposed as active components that function as logic gates. With this in mind, a number of researchers have proposed approaches for modelling genetic regulatory networks in terms of Boolean logic, in order to capture the holistic behavior of the relevant genes and biomolecules. In the Boolean idealization, the network consists of a number of "genes" or "biomolecules", each of which can be either active ("ON", i.e. the gene/biomolecule is expressed) or inactive ("OFF", i.e. the gene/biomolecule is not expressed). Because this physiological cellular behaviour is similar to information processing in a computing device, engineering principles have been applied in the field of synthetic biology to study fundamental biological components.
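
To make the Boolean idealization concrete, here is a minimal sketch of a Boolean network in Python. The three genes, their names, and their update rules are hypothetical and chosen only for illustration; the synchronous update scheme (all nodes updated at once) is also an assumption, one of several schemes used in such models.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical network: names and rules are invented for this example.
RULES: Dict[str, Callable[[State], bool]] = {
    "signal": lambda s: s["signal"],                   # external input, held constant
    "geneA": lambda s: s["signal"],                    # A is ON while the signal is present
    "geneB": lambda s: s["geneA"] and not s["geneC"],  # A activates B, C represses B
    "geneC": lambda s: s["geneB"],                     # B activates C
}


def step(state: State) -> State:
    """Synchronous update: every node's next value is computed from the current state."""
    return {name: rule(state) for name, rule in RULES.items()}


def simulate(state: State, n_steps: int) -> List[State]:
    """Return the list of states visited, starting with the initial state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    start = {"signal": True, "geneA": False, "geneB": False, "geneC": False}
    for t, s in enumerate(simulate(start, 6)):
        print(t, {k: int(v) for k, v in s.items()})
```

Each gene is simply ON or OFF, and the rules play the role of the logic gates; with the signal ON, this particular toy network settles into a repeating cycle of expression states.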

In any computing device, the input information is mathematically processed into a digital signal. Digital signals are represented in terms of binary code. The binary digits indicate the two states of the logic circuit. Digital circuits make extensive use of logic elements that are interconnected to create logic gates, capable of executing Boolean logic functions including NOT, OR, AND, and all their possible combinations. In these gates, the sensors read out inputs, and then a computational core assigns them a binary value depending on the threshold set in place. If the combination of these values meets the system requirements the output is executed. Each gate can be defined by conventional symbols or a truth table 5 


Operational illustration of some important logic gates.

Many proteins in living cells appear to have as their primary function the transfer and processing of information, rather than the chemical transformation of metabolic intermediates or the building of cellular structures. Such proteins are functionally linked through allosteric or other mechanisms into biochemical 'circuits' that perform a variety of simple computational tasks including amplification, integration and information storage.

The task of information processing, or computation, can be performed by natural and man-made ‘devices’. Man-made computers are made from silicon chips, whereas natural ‘computers’, such as the brain, use cells and molecules. Computation also occurs on a much smaller scale in regulatory and signalling pathways in individual cells and even within single biomolecules. Indeed, much of what we recognize as life results from the remarkable capacity of biological building blocks to compute in highly sophisticated ways. Rational design and engineering of biological computing systems can greatly enhance our ability to study and to control biological systems.

Such a computing system comprises sensors that collect information (input) from inside and outside the system; computers or processors that interpret this information to determine potential damage and to decide on the response; and actuators (output) that carry the response out. In animals, this description applies equally well to the brain, to the immune and homeostasis systems and to the regulatory pathways in individual cells. 

The gene is the fundamental unit of biological information.  The DNA double helix is beautiful not only because it is an elegant structure but because that structure reveals that DNA can act as a digital information storage device that can be precisely copied. At the structural level of the cell, phenomena such as general cellular homeostasis and the maintenance of cell integrity, the generation of spatial and temporal order, inter- and intracellular signalling, cell ‘memory’ and reproduction are not fully understood. This is also true for the levels of organization seen in tissues, organs and organisms, which feature more complex phenomena such as embryonic development and operation of the immune and nervous systems. Our past successes have led us to underestimate the complexity of living organisms. We need to focus more on how information is managed in living systems and how this brings about higher level biological phenomena.  2 

We need to describe the molecular interactions and biochemical transformations that take place in living organisms, and then translate these descriptions into the logic circuits that reveal how information is managed. This analysis should not be confined to the flow of information from gene to protein, but should also be applied to all functions operating in cells and organisms, including chemical interactions and transformations as well as physical phenomena, such as electrical signalling and mechanical processes.

The logic circuits that operate within cells need to be broken down into the individual segments that carry out specific computational functions. I shall call these segments ‘logic modules’. One example of such a module is the negative feedback loop, which often operates in a homeostatic manner. Another example is the positive feedback loop, which can generate irreversible switch behaviour from one state to another. Combinations of modules will produce more sophisticated outcomes: for example, reversible toggle switches, timers and oscillators.
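
The two logic modules named above can be sketched very simply in Python. This is a toy illustration under stated assumptions: the positive feedback loop is modelled as a Boolean latch, and the negative feedback loop as a proportional correction toward a set point with an arbitrary gain of 0.5. Neither is a model of any particular cellular pathway.

```python
from typing import List


def positive_feedback_switch(inputs: List[bool]) -> List[bool]:
    """Irreversible switch: once the input turns the node ON, its own output keeps it ON."""
    state = False
    history = []
    for signal in inputs:
        state = signal or state  # the node's own activity feeds back as an activator
        history.append(state)
    return history


def negative_feedback(level: float, set_point: float,
                      gain: float = 0.5, n_steps: int = 20) -> List[float]:
    """Homeostasis: at each step part of the deviation from the set point is corrected."""
    history = [level]
    for _ in range(n_steps):
        level = level - gain * (level - set_point)  # the correction opposes the deviation
        history.append(level)
    return history


if __name__ == "__main__":
    # A transient input pulse flips the switch permanently.
    print(positive_feedback_switch([False, True, False, False]))
    # A perturbed level relaxes back toward the set point.
    print([round(x, 3) for x in negative_feedback(10.0, set_point=2.0, n_steps=6)])
```

The switch stays ON after the input pulse has gone, while the negative feedback loop returns to the same set point whether it starts above or below it.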

A useful analogy is an electronic circuit. Representations of such circuits use symbols to define the nature and function of the electronic components used. They also describe the logic relationships between the components, making it clear how information flows through the circuit. A similar conceptualization is required of the logic modules that make up the circuits that manage information in cells.

Knowledge of which modules are operational and how these are linked into circuits will help us to understand the flow of information. We need to know how information is gathered from various sources, from the environment, from other cells and from the short and long-term memories in the cell; how that information is integrated and processed; and how it is then either used, rejected or stored for later use. The aim is to describe how information flows through the modules and brings about higher-level cellular phenomena, investigations that may well require the development of new methods and languages to describe the processes involved. 

Because any mapping of ‘all or none’ cues to an ‘all or none’ outcome is a logic function, diverse monitoring and control applications in cells can be enabled by systematic approaches to constructing molecular logic circuits. These circuits, which are also known as digital or Boolean circuits, belong to ‘circuit-based’ models in which data pass through ‘wires’ between small computational units (known as ‘gates’) that perform simple operations.


Theoretical proposals for biomolecular logic circuits. 
a | Representation of a ‘circuit machine’, with inputs feeding into interconnected gates, eventually producing outputs. 
b | Schematic representation and truth tables of basic one- and two-input logic gates.

Circuit models resemble coupled chemical reactions, in which the concentrations of individual species are interpreted as their values, and individual interactions are compared to circuit wires. In cells, regulatory networks encode logic operations that integrate environmental and cellular signals. Cellular regulatory networks are robustly designed for their functions. 4
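
One way to picture concentrations being "interpreted as values" is the sketch below. It assumes a Hill function as the response of an output to each activator and treats the product of two responses as an AND-like gate; the parameters `k` and `n`, the threshold, and the chosen concentrations are all illustrative assumptions and do not come from the cited paper.

```python
def hill(concentration: float, k: float = 1.0, n: float = 4.0) -> float:
    """Fractional activation by an activator (Hill function), between 0 and 1."""
    c = concentration ** n
    return c / (k ** n + c)


def and_gate_output(a: float, b: float) -> float:
    """Output level when both activators are required: product of the two activations."""
    return hill(a) * hill(b)


def read_bit(level: float, threshold: float = 0.5) -> int:
    """Interpret an analog output level as a logical 0 or 1."""
    return 1 if level > threshold else 0


if __name__ == "__main__":
    LOW, HIGH = 0.1, 10.0  # arbitrary concentration units
    for a in (LOW, HIGH):
        for b in (LOW, HIGH):
            out = and_gate_output(a, b)
            print(f"A={a:>4} B={b:>4} -> output={out:.4f} bit={read_bit(out)}")
```

Only when both activator concentrations are high does the output cross the threshold, so the analog chemistry reproduces the truth table of an AND gate.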


1. http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/22688678
2. http://sci-hub.tw/https://www.nature.com/articles/454424a
3. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3904220/
4. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC387325/
5. http://austinpublishinggroup.com/biotechnology-bioengineering/fulltext/ajbtbe-v1-id1016.php
6. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3603578/



Last edited by Admin on Sun Nov 25, 2018 5:04 pm; edited 22 times in total

Gene regulatory network 1
A gene regulatory network or genetic regulatory network (GRN) is a collection of regulators that interact with each other and with other substances in the cell to govern the expression levels of mRNA and proteins. The regulators can be DNA, RNA, proteins, or complexes of these. The interaction can be direct or indirect (through their transcribed RNA or translated protein). In general, each mRNA molecule goes on to make a specific protein (or set of proteins). In some cases this protein will be structural, and will accumulate at the cell membrane or within the cell to give it particular structural properties. In other cases the protein will be an enzyme, i.e., a micro-machine that catalyses a certain reaction, such as the breakdown of a food source or toxin. Some proteins, though, serve only to activate other genes, and these are the transcription factors that are the main players in regulatory networks or cascades. By binding to the promoter region at the start of other genes they turn them on, initiating the production of another protein, and so on. Some transcription factors are inhibitory.

In multicellular animals the same principle has been put in the service of gene cascades that control body shape. Each time a cell divides, two cells result which, although they contain the same genome in full, can differ in which genes are turned on and making proteins. Sometimes a 'self-sustaining feedback loop' ensures that a cell maintains its identity and passes it on. Less understood is the mechanism of epigenetics by which chromatin modification may provide cellular memory by blocking or allowing transcription. A major feature of multicellular animals is the use of morphogen gradients, which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become. A gene that is turned on in one cell may make a product that leaves the cell and diffuses through adjacent cells, entering them and turning on genes only when it is present above a certain threshold level. These cells are thus induced into a new fate, and may even generate other morphogens that signal back to the original cell. Over longer distances morphogens may use the active process of signal transduction. Such signals control embryogenesis, the building of a body plan from scratch through a series of sequential steps. They also control and maintain adult bodies through feedback processes, and the loss of such feedback because of a mutation can be responsible for the cell proliferation that is seen in cancer. In parallel with this process of building structure, the gene cascade turns on genes that make structural proteins that give each cell the physical properties it needs.
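
The idea of a morphogen gradient acting as a positioning system can be illustrated with a small Python sketch. It assumes an exponentially decaying concentration profile and two arbitrary thresholds; the decay length, the threshold values, and the fate names are hypothetical and serve only to show how a threshold readout turns a smooth gradient into discrete cell fates.

```python
import math
from typing import List


def morphogen_concentration(distance: float, source_level: float = 1.0,
                            decay_length: float = 3.0) -> float:
    """Assumed profile: concentration falls off exponentially with distance from the source."""
    return source_level * math.exp(-distance / decay_length)


def cell_fate(concentration: float, high: float = 0.5, low: float = 0.1) -> str:
    """A cell switches on a gene program only if the morphogen exceeds a threshold."""
    if concentration > high:
        return "fate_A"
    if concentration > low:
        return "fate_B"
    return "default"


def pattern(n_cells: int) -> List[str]:
    """Fates of a row of cells, with cell 0 sitting at the morphogen source."""
    return [cell_fate(morphogen_concentration(i)) for i in range(n_cells)]


if __name__ == "__main__":
    for position, fate in enumerate(pattern(9)):
        print(position, round(morphogen_concentration(position), 3), fate)
```

Cells near the source read a high concentration and take one fate, cells further away take another, and cells beyond the reach of the gradient stay in the default state.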

Gene regulatory networks for development 
Development of animal body plans proceeds by the progressive installation of transcriptional regulatory states, transiently positioned in embryonic space. The underlying mechanism is the localized expression of genes encoding sequence-specific transcription factors at specific times and places. The units of control are clusters of DNA sequence elements that serve as target sites for transcription factors, which usually correspond to enhancers, although there are other kinds of cis-regulatory DNA modules as well, such as silencers and insulators. We refer to all of these regulatory DNAs as “cis-regulatory modules.” Each module is typically 300 bp or more in length and contains on the order of 10 or more binding sites for at least four transcription factors. In general, a particular cis-regulatory module produces a specific pattern of gene expression in space or time, and multiple modules can produce complex patterns of gene expression. Because each module is regulated by multiple transcription factors and each transcription factor interacts with multiple modules, it is possible to represent developmental patterns of gene expression as an interlocking network. 2

Gene regulatory networks (GRNs) are logic maps that state in detail the inputs into each cis-regulatory module, so that one can see how a given gene is fired off at a given time and place. They also provide specifically testable sets of predictions of just what target sites are hardwired into the cis-regulatory DNA sequence. The specific linkages constituting these networks provide a causal structure/function answer to the question of how any given aspect of development is ultimately controlled by heritable genomic sequence information. The architecture reveals features that can never be appreciated at any other level of analysis but that turn out to embody distinguishing and deeply significant properties of each control system. These properties are composed of linkages of multiple genes that together perform specific operations, such as positive feedback loops, which drive stable circuits of cell differentiation.

Developmental gene regulation 
Genetic experiments over the last few decades have identified many developmental control genes critical for pattern formation and cell fate specification during the development of multicellular organisms. A large fraction of these genes encode transcription factors and signaling molecules, show highly dynamic expression patterns during development, and are deeply evolutionarily conserved and deregulated in various human diseases such as cancer. Because of their importance in development, evolution, and disease, a fundamental question in biology is how these developmental control genes are regulated in such an extensive and precise fashion. Using genomics methods, it has become clear that developmental control genes are a distinct group of genes with special regulatory characteristics. However, a systematic analysis of these characteristics has not been presented. Here we review how developmental control genes were discovered, evaluate their genome-wide regulation and gene structure, discuss emerging evidence for their mode of regulation, and estimate their overall abundance in the genome. Understanding the global regulation of developmental control genes may provide a new perspective on development in the era of genomics. 2

Darwin's Doubt, INTEGRATED CIRCUITRY: DEVELOPMENTAL GENE REGULATORY NETWORKS, page 270
Keep in mind, too, that animal forms have more than just genetic information. They also need tightly integrated networks of genes, proteins, and other molecules to regulate their development; in other words, they require developmental gene regulatory networks (dGRNs). Developing animals face two main challenges. First, they must produce different types of proteins and cells and, second, they must get those proteins and cells to the right place at the right time. 20 Davidson has shown that embryos accomplish this task by relying on networks of regulatory DNA-binding proteins (called transcription factors) and their physical targets. These physical targets are typically sections of DNA (genes) that produce other proteins or RNA molecules, which in turn regulate the expression of still other genes.

These interdependent networks of genes and gene products present a striking appearance of design. Davidson's graphical depictions of these dGRNs look for all the world like wiring diagrams in an electrical engineering blueprint or a schematic of an integrated circuit, an uncanny resemblance Davidson himself has often noted. "What emerges, from the analysis of animal dGRNs," he muses, "is almost astounding: a network of logic interactions programmed into the DNA sequence that amounts  essentially to a hardwired biological computational device." These molecules collectively form a tightly integrated network of signaling molecules that function as an integrated circuit. Integrated circuits in electronics are systems of individually functional components such as transistors, resistors, and capacitors that are connected together to perform an overarching function. Likewise, the functional components of dGRNs—the DNA-binding proteins, their DNA target sequences, and the other molecules that the binding proteins and target molecules produce and regulate—also form an integrated circuit, one that contributes to accomplishing the overall function of producing an adult animal form.

Davidson himself has made clear that the tight functional constraints under which these systems of molecules (the dGRNs) operate preclude their gradual alteration by the mutation and selection mechanism. For this reason, neo-Darwinism has failed to explain the origin of these systems of molecules and their functional integration. Like advocates of evolutionary developmental biology, Davidson himself favors a model of evolutionary change that envisions mutations generating large-scale developmental effects, thus perhaps bypassing nonfunctional intermediate circuits or systems. Nevertheless, neither proponents of "evo-devo," nor proponents of other recently proposed materialistic theories of evolution, have identified a mutational mechanism capable of generating a dGRN or anything even remotely resembling a complex integrated circuit. Yet, in our experience, complex integrated circuits—and the functional integration of parts in complex systems generally—are known to be produced by intelligent agents—specifically, by engineers. Moreover, intelligence is the only known cause of such effects. Since developing animals employ a form of integrated circuitry, and certainly one manifesting a tightly and functionally integrated system of parts and subsystems, and since intelligence is the only known cause of these features, the necessary presence of these features in developing Cambrian animals would seem to indicate that intelligent agency played a role in their origin 

Gene Expression and Regulation 
How does a gene, which consists of a string of DNA hidden in a cell's nucleus, know when it should express itself? How does this gene cause the production of a string of amino acids called a protein? How do different types of cells know which types of proteins they must manufacture? The answers to such questions lie in the study of gene expression. Thus, this collection of articles begins by showing how a quiet, well-guarded string of DNA is expressed to make RNA, and how the messenger RNA is translated from nucleic acid coding to protein coding to form a protein. Along the way, the article set also examines the nature of the genetic code, how the elements of code were predicted, and how the actual codons were determined. 4

Next, we turn to the regulation of genes. Genes can't control an organism on their own; rather, they must interact with and respond to the organism's environment. Some genes are constitutive, or always "on," regardless of environmental conditions. Such genes are among the most important elements of a cell's genome, and they control the ability of DNA to replicate, express itself, and repair itself. These genes also control protein synthesis and much of an organism's central metabolism. In contrast, regulated genes are needed only occasionally — but how do these genes get turned "on" and "off"? What specific molecules control when they are expressed?

It turns out that the regulation of such genes differs between prokaryotes and eukaryotes. For prokaryotes, most regulatory proteins are negative and therefore turn genes off. Here, the cells rely on protein–small molecule binding, in which a ligand or small molecule signals the state of the cell and whether gene expression is needed. The repressor or activator protein binds near its regulatory target: the gene. Some regulatory proteins must have a ligand attached to them to be able to bind, whereas others are unable to bind when attached to a ligand. In prokaryotes, most regulatory proteins are specific to one gene, although there are a few proteins that act more widely. For instance, some repressors bind near the start of mRNA production for an entire operon, or cluster of coregulated genes. Furthermore, some repressors have a fine-tuning system known as attenuation, which uses mRNA structure to stop both transcription and translation depending on the concentration of an operon's end-product enzymes. (In eukaryotes, there is no exact equivalent of attenuation, because transcription occurs in the nucleus and translation occurs in the cytoplasm, making this sort of coordinated effect impossible.) Yet another layer of prokaryotic regulation affects the structure of RNA polymerase, which turns on large groups of genes. Here, the sigma factor of RNA polymerase changes several times to produce heat- and desiccation-resistant spores. Here, the articles on prokaryotic regulation delve into each of these topics, leading to primary literature in many cases.
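
The two ligand-dependent modes of repressor binding described above reduce to simple logic, sketched here in Python. The function and parameter names are hypothetical, and the sketch deliberately ignores activators, attenuation, and sigma factors; it only shows how the presence of a ligand can switch an operon on in one case and off in the other.

```python
def repressor_bound(repressor_present: bool, ligand_present: bool,
                    needs_ligand_to_bind: bool) -> bool:
    """Whether a repressor sits on its DNA target near the gene.

    needs_ligand_to_bind=True  : the repressor binds only with the ligand attached.
    needs_ligand_to_bind=False : the repressor cannot bind when the ligand is attached.
    """
    if not repressor_present:
        return False
    return ligand_present if needs_ligand_to_bind else not ligand_present


def operon_transcribed(repressor_present: bool, ligand_present: bool,
                       needs_ligand_to_bind: bool) -> bool:
    """In this simplified picture the operon is transcribed whenever no repressor is bound."""
    return not repressor_bound(repressor_present, ligand_present, needs_ligand_to_bind)


if __name__ == "__main__":
    for needs_ligand in (False, True):
        for ligand in (False, True):
            on = operon_transcribed(True, ligand, needs_ligand)
            print(f"needs_ligand_to_bind={needs_ligand} ligand={ligand} -> transcribed={on}")
```

With a repressor that is released by its ligand, the ligand turns the genes on; with a repressor that requires its ligand, the same signal turns them off.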

For eukaryotes, cell-cell differences are determined by expression of different sets of genes. For instance, an undifferentiated fertilized egg looks and acts quite different from a skin cell, a neuron, or a muscle cell because of differences in the genes each cell expresses. A cancer cell acts different from a normal cell for the same reason: It expresses different genes. Interestingly, in eukaryotes, the default state of gene expression is "off" rather than "on," as in prokaryotes. Why is this the case? The secret lies in chromatin, or the complex of DNA and histone proteins found within the cellular nucleus. The histones are among the most evolutionarily conserved proteins known; they are vital for the well-being of eukaryotes and brook little change. When a specific gene is tightly bound with histone, that gene is "off." But how, then, do eukaryotic genes manage to escape this silencing? This is where the histone code comes into play. This code includes modifications of the histones' positively charged amino acids to create some domains in which DNA is more open and others in which it is very tightly bound up. DNA methylation is one mechanism that appears to be coordinated with histone modifications, particularly those that lead to silencing of gene expression. Small noncoding RNAs such as RNAi can also be involved in the regulatory processes that form "silent" chromatin. On the other hand, when the tails of histone molecules are acetylated at specific locations, these molecules have less interaction with DNA, thereby leaving it more open. The regulation of the opening of such domains is a hot topic in research. For instance, researchers now know that complexes of proteins called chromatin remodeling complexes use ATP to repackage DNA in more open configurations. Scientists have also determined that it is possible for cells to maintain the same histone code and DNA methylation patterns through many cell divisions. 
This persistence without reliance on base pairing is called epigenetics, and there is abundant evidence that epigenetic changes cause many human diseases.

For transcription to occur, the area around a prospective transcription zone needs to be unwound. This is a complex process requiring the coordination of histone modifications, transcription factor binding and other chromatin remodeling activities. Once the DNA is open, specific DNA sequences are then accessible for specific proteins to bind. Many of these proteins are activators, while others are repressors; in eukaryotes, all such proteins are often called transcription factors (TFs). Each TF has a specific DNA binding domain that recognizes a 6-10 base-pair motif in the DNA, as well as an effector domain. In the test tube, scientists can find a footprint of a TF if that protein binds to its matching motif in a piece of DNA. They can also see whether TF binding slows the migration of DNA in gel electrophoresis.

For an activating TF, the effector domain recruits RNA polymerase II, the eukaryotic mRNA-producing polymerase, to begin transcription of the corresponding gene. Some activating TFs even turn on multiple genes at once. All TFs bind at the promoters just upstream of eukaryotic genes, similar to bacterial regulatory proteins. However, they also bind at regions called enhancers, which can be oriented forward or backwards and located upstream or downstream or even in the introns of a gene, and still activate gene expression. Because many genes are coregulated, studying gene expression across the whole genome via microarrays or massively parallel sequencing allows investigators to see which groups of genes are coregulated during differentiation, cancer, and other states and processes.

Most eukaryotes also make use of small noncoding RNAs to regulate gene expression. For example, the enzyme Dicer finds double-stranded regions of RNA and cuts out short pieces that can serve in a regulatory role. Argonaute is another enzyme that is important in regulation of small noncoding RNA–dependent systems. Here we offer an introductory article on these RNAs.

Imprinting is yet another process involved in eukaryotic gene regulation; this process involves the silencing of one of the two alleles of a gene for a cell's entire life span. Imprinting affects a minority of genes, but several important growth regulators are included. For some genes, the maternal copy is always silenced, while for different genes, the paternal copy is always silenced. The epigenetic marks placed on these genes during egg or sperm formation are faithfully copied into each subsequent cell, thereby affecting these genes throughout the life of the organism.

Still another mechanism that causes some genes to be silenced for an organism's entire lifetime is X inactivation. In female mammals, for instance, one of the two copies of the X chromosome is shut off and compacted greatly. This shutoff process requires transcription, the participation of two noncoding RNAs (one of which coats the inactive X chromosome), and the participation of a DNA-binding protein called CTCF. As the possible role of regulatory noncoding RNAs in this process is investigated, more information regarding X inactivation will no doubt be discovered.


Darwin's Doubt, page 199:
Another line of research in developmental biology has revealed a related challenge to the creative  power of the neo-Darwinian mechanism. Developmental biologists have discovered that many gene products (proteins and RNAs) needed for the development of specific animal body plans transmit signals that influence the way individual cells develop and differentiate themselves. Additionally, these signals affect how cells are organized and interact with each other during embryological development. These signaling molecules influence each other to form circuits or networks of coordinated interaction, much like integrated circuits on a circuitboard. For example, exactly when a signaling molecule gets transmitted often depends upon when a signal from another molecule is received, which in turn affects the transmission of still others—all of which are coordinated and integrated to perform specific time-critical functions. The coordination and integration of these signaling molecules in cells ensures the proper differentiation and organization of distinct cell types during the development of an animal body plan. Consequently, just as mutating an individual regulatory gene early in the development of an animal will inevitably shut down development, so too will mutations or alterations in the whole network of interacting signaling molecules destroy a developing embryo.

No biologist has explored the regulatory logic of animal development more deeply than Eric Davidson, at the California Institute of Technology. Early in his career, collaborating with molecular biologist Roy Britten, Davidson formulated a theory of "gene regulation for higher cells." By "higher cells" Davidson and Britten meant the differentiated, or specialized, cells found in any animal after the earliest stages of embryological development. Davidson observed that the cells of an  individual animal, no matter how varied in form or function, "generally contain identical genomes." During the life cycle of an organism, the genomes of these specialized cells express only a small fraction of their DNA at any given time and produce different RNAs as a result. These facts strongly suggest that some animal-wide system of genetic control functions to turn specific genes on and off as needed throughout the life of the organism—and that such a system functions during the development of an animal from egg to adult as different cell types are being constructed. When they proposed their theory in 1969, Britten and Davidson acknowledged that "little is known . . . of the molecular mechanisms by which gene expression is controlled in differentiated cells." Nevertheless, they deduced that such a system must be at work. Given:

(1) that tens or hundreds of specialized cell types arise during the development of animals, and
(2) that each cell contains the same genome, they reasoned
(3) that some control system must determine which genes are expressed in different cells at different times to ensure the differentiation of different cell types from each other
—some system-wide regulatory logic must oversee and coordinate the expression of the genome.

Davidson has dedicated his career to discovering and describing the mechanisms by which these systems of gene regulation and control work during embryological development. During the last two decades, research in genomics has revealed that nonprotein-coding regions of the genome control and regulate the timing of the expression of the protein-coding regions of the genome. Davidson has shown that the nonprotein-coding regions of DNA that regulate and control gene expression and the protein-coding regions of the genome together function as circuits. These circuits, which Davidson calls "developmental gene regulatory networks" (or dGRNs), control the embryological development of animals.

Overview of the developmental gene regulatory network (dGRN) for the construction of a tissue called the endomesoderm in sea urchins, concentrating on the network after 21 hours. The diagram strongly resembles a complex electrical or computer-logic circuit. (http://sugp.caltech.edu/endomes)

On arriving at Caltech in 1971, Davidson chose the purple sea urchin, Strongylocentrotus purpuratus, as his experimental model system. The biology of S. purpuratus makes it an attractive laboratory subject: the species occurs abundantly along the Pacific coast, produces enormous quantities of easily fertilized eggs in the lab, and lives for many years. Davidson and his coworkers pioneered the technology and experimental protocols required to dissect the sea urchin's genetic regulatory system. The remarkable complexity of what they found needs to be depicted visually (see picture above). This last diagram represents a developmental gene regulatory network (or dGRN), an integrated network of protein and RNA-signaling molecules responsible for the differentiation and arrangement of the specialized cells that establish the rigid skeleton of the sea urchin. Notice that, to express the biomineralization genes that produce structural proteins that make the skeleton, genes far upstream, activated many hours earlier in development, must first play their role. This process does not happen fortuitously in the sea urchin but via highly regulated and precise control systems, as it does in all animals. Indeed, even one of the simplest animals, the worm C. elegans, possessing just over 1,000 cells as an adult, is constructed during development by dGRNs of remarkable precision and complexity. In all animals, the various dGRNs direct what Davidson describes as the embryo's "progressive increase in complexity," an increase that, he writes, can be measured in "informational terms." Davidson notes that, once established, the complexity of the dGRNs as integrated circuits makes them stubbornly resistant to mutational change, a point he has stressed in nearly every publication on the topic over the past fifteen years. "In the sea urchin embryo," he points out, "disarming any one of these subcircuits produces some abnormality in expression."
Developmental gene regulatory networks resist mutational change because they are organized hierarchically. This means that some developmental gene regulatory networks control other gene regulatory networks, while some influence only the individual genes and proteins under their control. At the center of this regulatory hierarchy are the regulatory networks that specify the axis and global form of the animal body plan during development. These dGRNs cannot vary without causing catastrophic effects to the organism.

Indeed, there are no examples of these deeply entrenched, functionally critical circuits varying at all. At the periphery of the hierarchy are gene regulatory networks that specify the arrangements for smaller-scale features that can sometimes vary. Yet, to produce a new body plan requires altering the axis and global form of the animal. This requires mutating the very circuits that do not vary without catastrophic effects. As Davidson emphasizes, mutations affecting the dGRNs that regulate body-plan  development lead to "catastrophic loss of the body part or loss of viability altogether." He explains  in more detail:

There is always an observable consequence if a dGRN subcircuit is interrupted. Since these  consequences are always catastrophically bad, flexibility is minimal, and since the  subcircuits are all interconnected, the whole network partakes of the quality that there is  only one way for things to work. And indeed the embryos of each species develop in only one  way.

Davidson's findings present a profound challenge to the adequacy of the neo-Darwinian mechanism.  Building a new animal body plan requires not just new genes and proteins, but new dGRNs. But to build a new dGRN from a preexisting dGRN by mutation and selection necessarily requires altering the preexisting developmental gene regulatory network (the very kind of change that  cannot arise without multiple coordinated mutations). In any case, Davidson's work has also shown that such alterations inevitably have catastrophic consequences.  Davidson's work highlights a profound contradiction between the neo-Darwinian account of how new animal body plans are built and one of the most basic principles of engineering—the principle of constraints. Engineers have long understood that the more functionally integrated a system is, the more difficult it is to change any part of it without damaging or destroying the system as a whole. Davidson's work confirms that this principle applies to developing organisms in spades. The system of gene regulation that controls animal-body-plan development is exquisitely integrated, so that significant alterations in these gene regulatory networks inevitably damage or destroy the developing  animal. But given this, how could a new animal body plan, and the new dGRNs necessary to  produce it, ever evolve gradually via mutation and selection from a preexisting body plan and set of  dGRNs?  Davidson makes clear that no one really knows: "contrary to classical evolution theory, the processes that drive the small changes observed as species diverge cannot be taken as models for the  evolution of the body plans of animals." He elaborates:

Neo-Darwinian evolution . . . assumes that all process works the same way, so that evolution of enzymes or flower colors can be used as current proxies for study of evolution of the body plan. It erroneously assumes that change in protein-coding sequence is the basic cause of  change in developmental program; and it erroneously assumes that evolutionary change in  body-plan morphology occurs by a continuous process. All of these assumptions are  basically counterfactual. This cannot be surprising, since the neo-Darwinian synthesis from  which these ideas stem was a premolecular biology concoction focused on population  genetics and . . . natural history, neither of which have any direct mechanistic import for the  genomic regulatory systems that drive embryonic development of the body plan.

Eric Davidson's work, like that of Nüsslein-Volhard and Wieschaus, highlights a difficulty of obvious relevance to the Cambrian explosion. Typically, paleontologists understand the Cambrian explosion as the geologically sudden appearance of new forms of animal life. Building these forms requires new developmental programs, including both new early-acting regulatory genes and new developmental gene regulatory networks. Yet if neither early-acting regulatory genes nor dGRNs can be altered by mutation without destroying existing developmental programs (and thus animal form), then mutating these entities will leave natural selection with nothing favorable to select, and the evolution of animal form will, at that point, terminate. Darwin's doubt about the Cambrian explosion centered on the problem of missing fossil intermediates. Not only have those forms not been found, but the Cambrian explosion itself illustrates a profound engineering problem that fossil evidence does not address: the problem of building a new form of animal life by gradually transforming one tightly integrated system of genetic components and their products into another.

Exonic transcription factor binding directs codon choice and impacts protein evolution 1
Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We find that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. The genetic and regulatory codes have been assumed to operate independently of one another, and to be segregated physically into the coding and non-coding genomic compartments. However, the potential for some coding exons to accommodate transcriptional enhancers or splicing signals has long been recognized.


1) https://en.wikipedia.org/wiki/Gene_regulatory_network
2) http://www.ncbi.nlm.nih.gov/pmc/articles/PMC555974/
3) http://www.cell.com/trends/biochemical-sciences/pdf/S0968-0004(14)00121-2.pdf
4) http://www.nature.com/scitable/topic/gene-expression-and-regulation-15

1) http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3967546/



Last edited by Admin on Thu Nov 29, 2018 4:43 am; edited 5 times in total

Admin
Design in DNA: Dual Coding Found in Nearly All Genes 
The information content in human DNA is enormous, but we are just beginning to understand how efficiently the DNA is encoded. Scientists had originally speculated that the human genome contained up to 100,000 genes. However, the human genome project showed that it contained only one quarter that number, mostly because each gene can code for multiple transcripts. Scientists also thought that only the protein coding DNA, comprising only 3% of the DNA, was useful. The other 97% of the DNA was thought to be junk. However, the last few decades of research have shown that the vast majority (>80%) of non-coding DNA is functional. Much of the non-coding DNA is involved in regulation of transcription (the intermediate step in which mRNA is generated, from which the protein is translated). However, scientists have now discovered that some of the protein coding DNA not only codes for the protein sequence, but simultaneously codes for sequences that bind transcription factors (proteins that regulate the transcription and expression of genes). These dual coding sequences have been termed "duons." 1

The scientists who authored the study used a naturally occurring enzyme called DNase I, which digests DNA. It turns out that the enzyme will only degrade DNA that is not bound to proteins. Since transcription factors are proteins that bind DNA, any stretch of DNA that is bound by a transcription factor when it is isolated is protected from digestion by DNase I. Scientists isolated the DNA from 81 different cell types and sequenced the fragments of DNA that were preserved by binding to transcription factors. They had to use different cell types because those different cells differentially express genes and transcription factors on the basis of their own particular function. An example dual coding region is shown in the figure to the right, which shows the gene CELSR2, found on chromosome 1. The gene consists of 34 exons (coding regions), with the ninth exon containing a binding site for the transcription factor CTCF, which is known to regulate the transcription of numerous genes. It is interesting to note that the stretch of exon covering this short binding site encodes two arginine residues, which are coded using two very different codons (AGG and CGC) in order to match the sequence to which CTCF binds. Although most genes consist of multiple exons, the vast majority of duon sequences occur in the first exon, which is what would be expected if the sequences were involved in the regulation of gene expression.
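The point about the two arginine codons can be made concrete with a short sketch. Because several codons encode the same amino acid, a stretch of protein can be written in many different DNA spellings, and only some of those spellings will also contain a given binding sequence. The codon lists below are taken from the standard genetic code; the peptide "RAR" and the motif "GGGCC" are made-up placeholders for illustration, not the actual CELSR2 or CTCF sequences.

```python
from itertools import product

# Subset of the standard genetic code (DNA codons), enough for this illustration.
SYNONYMS = {
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],  # arginine: six codons
    "A": ["GCT", "GCC", "GCA", "GCG"],                # alanine: four codons
    "W": ["TGG"],                                     # tryptophan: one codon
}

def encodings(peptide):
    """Yield every DNA sequence that encodes the given peptide."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)

def encodings_with_motif(peptide, motif):
    """Return the encodings of peptide that also contain motif."""
    return [seq for seq in encodings(peptide) if motif in seq]

# Hypothetical peptide and hypothetical 5-bp binding motif (assumptions).
peptide = "RAR"
motif = "GGGCC"
all_seqs = list(encodings(peptide))
hits = encodings_with_motif(peptide, motif)
print(len(all_seqs), "possible encodings;", len(hits), "also contain", motif)
```

In this toy case only a small fraction of the possible spellings carry both messages at once, which is the kind of constraint on codon choice that the duon study describes.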

The scientists had originally expected to find a few genes that simultaneously coded for both proteins and transcription factor binding. However, what they found was that 14% of coding sequence space consisted of duons (which represents over 400 million base pairs). An astounding 86% of all genes expressed at least one duon sequence. Scientists already knew that intronic sequences within the DNA coded for transcription factor binding in order to regulate gene expression. However, since exon coding regions are constrained by their need to code for specific amino acids, it was never imagined that such regions of DNA could simultaneously code for the binding of transcription factors as well. The finding shows the amazing efficiency of DNA sequences in complex organisms. Although the authors of the study recognized the obvious optimization of the code, they attributed such optimization to natural selection, rather than design:
"Our results indicate that simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information (34, 35), and this intrinsic flexibility has been extensively exploited by natural selection."
However, they failed to account for how selection could simultaneously select for two diverse functions in the same, overlapping sequence of DNA code.

Scientists have discovered that regulation of gene expression, originally thought to occur only in non-coding DNA sequences, is, in fact, additionally dual coded into the actual sequence of DNA that defines protein composition. Transcription factors, which bind to specific short sequences of DNA, regulate how the genes are expressed. The fact that these transcription factor binding sequences overlap protein coding sequences suggests that both sequences were designed together, in order to optimize the efficiency of the DNA code. As we learn more and more about DNA structure and function, it is apparent that the code was not just cobbled together by the trial and error method of natural selection, but that it was specifically designed to provide optimal efficiency and function.


1) http://www.godandscience.org/evolution/dual_coding_dna_design.html



Last edited by Admin on Thu Nov 29, 2018 4:47 am; edited 2 times in total

Admin
Language of gene switches unchanged across the evolution 1

http://reasonandscience.heavenforum.org/t2194-control-of-gene-expression#4114

Flies look very different from humans, but both are descended from a common ancestor that existed over 600 million years ago. Some differences between animal species are due to them having different genes: stretches of DNA that contain the instructions to make proteins and other molecules. However, often differences are caused by the same or similar genes being switched on and off at different times and in different tissues in each species.

The instructions that control when and where a gene is expressed are written in the sequence of DNA bases located in the regulatory region of the gene. These instructions are written in a language that is often called the ‘gene regulatory code’. This code is read and interpreted by proteins called transcription factors that bind to specific sequences of DNA (or ‘DNA words’) and increase or decrease gene expression. Changes in gene expression between species could therefore be due to changes in the transcription factors and/or changes in the instructions within the regulatory regions of specific genes.
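As a rough illustration of what "reading DNA words" means, the sketch below scans a regulatory region for exact occurrences of short words. Everything in it is an assumption made for illustration: the region, the factor names and the words are invented, and real transcription factors tolerate some variation in their sites rather than requiring an exact match.

```python
def find_sites(sequence, word):
    """Return 0-based start positions where word occurs in sequence (overlaps allowed)."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return positions

# Hypothetical regulatory region and hypothetical 'DNA words' for two factors.
regulatory_region = "TTGACGTCATTTGACGTCAAGATAAGG"
dna_words = {"factor_A": "TGACGTCA", "factor_B": "GATAAG"}

# Map each factor to the positions in the region where its word is found.
sites = {name: find_sites(regulatory_region, word) for name, word in dna_words.items()}
print(sites)
```

The same words placed at different positions, or in different numbers, would give a different set of instructions even though the vocabulary stays the same.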


In order for communication to happen, two things are required: 1. the sequence of DNA bases located in the regulatory region of the gene, and 2. transcription factors that read the code. If either is missing, communication fails: the gene that has to be expressed cannot be located, and the whole procedure of gene expression fails. This is an irreducibly complex system. Nor could the gene regulatory code arise in a stepwise manner, since the code only carries the right meaning once it is fully developed. That is an example par excellence of intelligent design.

Gene regulatory regions are not well conserved between species. However, it is unclear if the instructions in these regions are written using the same gene regulatory code, and whether transcription factors found in different species recognize different DNA words.

The findings of Nitta et al. also indicate that transcription factors do not evolve to recognize subtly different DNA motifs, but instead appear constrained to recognize the same motifs. Thus, much like the genetic code that instructs how to build proteins, the gene regulatory code that determines how DNA sequences direct gene expression is also highly conserved in animals. The language used to guide the development of animals has, as such, remained very similar for millions of years. What makes animals different is differences in the content and length of the instructions that are written using this language into the regulatory regions of their genes.

1) http://elifesciences.org/content/4/e04837.full#sthash.mfnYCImJ.dpuf



Last edited by Admin on Mon Oct 12, 2015 5:04 am; edited 2 times in total

Admin

Regulatory genes 1


The genome consists not only of structural, or protein-coding genes, but also of regulatory genes (also known as homeotic, homeobox, Hox, or toolbox genes), which control the expression of one or more other genes and the pattern in which different parts of an embryo or larva develop. The study of regulatory genes is part of a growing field called evolutionary developmental biology, or evo-devo for short. Evolutionary developmental biologists argue that mutations affecting regulatory genes can generate large-scale morphological change and even whole new body plans. Homeobox genes are currently the best-known group of regulatory genes; they determine where limbs and other body segments form. In the 1990s researchers were astounded to discover that homeobox genes are almost identical in different multicellular animals; they control the development of analogous sections of the growing embryo of flies, reptiles, mice and humans – a finding entirely unanticipated by neo-Darwinism. The differences between species are said to depend on where and when certain homeobox genes are activated. When particular genes are turned on for certain lengths of time and in certain regions, a worm may emerge. If the same or other genes are expressed for different lengths of time and in different regions, a more complex organism may develop.
Scientists have discovered that the cell’s regulatory system displays ‘mind-boggling complexity’. Biochemist Michael Behe writes:

The control systems that affect when, where, and how much of a particular protein is made are becoming so complex, and their distribution in the DNA so widespread, that the very concept of a “gene” as a discrete region of DNA is no longer adequate. ... In animals, a master switch sets in train a whole cascade of lesser switches, where the initial regulatory protein turns on the genes for other regulatory proteins, which turn on other regulatory proteins, and so on.1

There may be more than ten regulatory proteins controlling each protein-encoding gene used in building an animal’s body. Note that control genes, like structural genes, do not embody the instructions to build particular bodily structures – they merely mark certain areas of the body, and signal other genes to turn on or off.

For decades, fruit flies have been deliberately irradiated in the laboratory to induce genetic mutations. The mutations have succeeded mainly in producing monstrosities: the mutant flies may have varied colouring of the eyes, stunted and deformed wings, extra wings, no wings at all, or extra eyes on different parts of their anatomy. The elimination of certain genes may prevent the formation of a given organ but, as Stuart Pivar says, ‘this does not mean that the gene shaped the organ, any more than a lamp switch creates light’.2 Moreover, although many mutants have been produced, they are still fruit flies, and no mutated fruit fly has ever reproduced a fruit fly with the same characteristics.



Fig. 3.4. Above: A normal specimen of the fruit fly Drosophila (top), and a mutant fly in which the third thoracic segment has been transformed so that it duplicates the second thoracic segment. Below: On the left, the head of a normal fruit fly; on the right, the head of a mutant fly in which the antennae are transformed into legs.3

 


In one experiment, researchers mutated the homeotic gene Pax-6, which is related to eye development, causing eyes to grow on the antennae and legs of fruit flies. Pax-6 helps regulate the development of compound eyes (composed of hundreds of separate lenses), as found in fruit flies (arthropods), and camera-type eyes (with a single lens and retinal surface), as found in squid and mice (cephalopods and vertebrates respectively). Darwinists have concluded that Pax-6 is the master control gene for eye morphogenesis. But as Jonathan Wells points out:

If the same gene can ‘determine’ structures as radically different as ... an insect’s eyes and the eyes of humans and squids then that gene is not determining much of anything. ... Except for telling us how an embryo directs its cells into one of several built-in developmental pathways, homeotic genes tell us nothing about how biological structures are formed.4

Moreover, Pax-6 is also involved in the development of other organs, including nose, brain, pituitary gland, gut and pancreas. It is also expressed in nematodes, which are eyeless. In at least one group of animals (flatworms), Pax-6 is involved in eye formation but if it is ‘knocked out’ during the process of regeneration (for which flatworms are famous) eyes still form.5
Here is another example of largely identical genetic sequences regulating the development of very different structures in different organisms:


In fruit flies, the gene distal-less regulates the development of compound limbs with exoskeletons and multiple joints. In sea urchins, however, the homologous gene regulates the development of spines. In vertebrates, by contrast, it regulates the development of another type of limb, with multiple joints but an internal bony skeleton. Except insofar as these structures all exemplify a broad general class, namely, appendages, they have little in common with each other. ... The gene distal-less and its homologues function as switches, but in each case a switch that regulates many different downstream genes, leading to different anatomical features, depending upon the large informational context in which the gene finds itself.6

Many biologists found this surprising because orthodox evolutionary theory had led them to assume that genes control the development of organisms and anatomical structures and that homologous genes should therefore produce homologous structures and organisms.
The evo-devo hope is that cells’ regulatory systems somehow make evolution easier. In reality, the exact opposite is the case. As Behe puts it, ‘the elaborate assembly control instructions for whole animals are a further layer of complexity, beyond the complexity of the animal’s anatomy itself’.7 Supposedly blind, random mutations need to change both structural and regulatory genes in just the right ways to produce proteins in the right places at the right times to build a new organ or body plan. But experiments have shown that mutating the genes that regulate body-plan construction tends to destroy animal forms.8
Moreover, neither structural nor regulatory genes actually determine the form of any body structures. What does determine them is essentially unknown. That there are additional, epigenetic (i.e. nongenetic) formative influences at work is shown by experimental tissue-grafting work on frog eggs and developing tadpoles. For instance, if a limb bud is removed and a tail bud grafted in its place, the tail bud is converted into a limb. And if the tissues in a developing frog egg are transposed by cutting and grafting, material that would have become skin is converted into a spinal cord, and vice versa. In another experiment, a portion of a newt embryo was transplanted into another developing newt embryo, which then produced two bodies, each with a head and tail, but joined together at the belly; the anatomy of the embryo was thus dramatically altered even though its DNA remained unchanged. Epigenetic information is thought to be contained in cell structures other than DNA, and to involve, for example, patterns in the cytoskeleton (the cellular scaffolding or skeleton in a cell’s cytoplasm) and in the cell membrane,10 but these patterns, too, are probably effects of more fundamental causes. Darwinism therefore fails to account for the origin of both the genetic and nongenetic information necessary to produce new forms of life, and cannot explain what actually determines an organism's physical shape. As Stuart Pivar says, during embryogenesis ‘cells seem to run about helter-skelter, organizing themselves into organs as though they knew in advance where to go, all to the utter confusion of embryologists. ... It is difficult, if not impossible, to assign epigenetic, mechanically causative effects to the successive steps of observed embryology. Instead, it is as though the cells give the illusion of filling an invisible mold.’11

1) http://davidpratt.info/evod1.htm

Admin
Genetic Signaling: Transcription Factor Cascades and Segmentation 1

You may take for granted the fact that your body isn't the same from head to toe. Have you stopped and wondered why? Controlling gene expression to turn on and off at specific times is no simple feat.

Many genes encode transcription factors that, in turn, induce the expression of other transcription factors, thus creating cascades of gene expression wherein a multistep signaling pathway results in amplification of the initial signal. This results in a high level of control over expression of the target gene or compound, all from a small initial signal. Within most differentiated cells, several levels of induction produce a multitude of transcription factors. Moreover, during embryonic development, the actual process of differentiation itself requires an integrated system of transcription factors that turn gene expression on and off with strict precision and timing.
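The amplification described above is simple arithmetic, and a small sketch may help. It assumes a strictly linear cascade in which each step multiplies the signal by a fixed fold-change; the fold values are invented for illustration and are not measured quantities.

```python
def cascade_output(initial_signal, fold_per_step):
    """Return the signal level after each step of a linear cascade.

    fold_per_step lists the assumed fold-activation contributed by each step.
    """
    levels = [initial_signal]
    for fold in fold_per_step:
        levels.append(levels[-1] * fold)
    return levels

# One unit of initial signal passed through three hypothetical 10-fold steps.
levels = cascade_output(1, [10, 10, 10])
print(levels)
```

With these assumed numbers a single unit of input becomes a thousand units of output after three steps, which is why the timing and strength of the first signal matter so much.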

Question: how was this strict precision and timing achieved? By trial and error?

This is particularly true in the case of segmentation. Segmentation is the term applied to the differentiation patterns of vertebrates and arthropods, in which serial repetition of similar anatomical modules creates a body axis. In vertebrates, segmentation results in the development of the spine, head, and limbs; in insects, it results in formation of the abdomen, head, and thorax. Segmentation is believed to be the result of the rhythmic activation and deactivation of certain signals that dictate specific relevant cellular processes. This oscillating process is called the segmentation clock. As scientists learn more about the timing and mechanisms of the segmentation clock, they also continue to uncover information about regulatory interactions and signal cascades in general.

The Role of Cascades in Insect Segmentation

A large portion of the existing research on genetic control of segmentation has been conducted using fruit flies. For instance, over the past 25 years, more than 40 genes have been identified as playing a role in segmentation in Drosophila melanogaster (Peel et al., 2005). Segmentation in fruit flies is controlled by a cascade of transcription factors, some of which have been found in other arthropods. Typically, cascade systems occur within individual cells; however, during segmentation processes, signal cascades are carried across multiple cells within specific regions of a developing embryo.

Pre-Existing Transcripts in Eggs Promote Formation of Transcription Factor Gradients

Expression of messenger RNA (mRNA) transcripts is controlled in time and space so that the protein products of these transcripts appear in gradients along each fertilized egg. Some of these proteins repress translation of other proteins, which results in higher concentrations of different polypeptides in the anterior and posterior parts of the developing embryo. For example, nanos inhibits hunchback translation; thus, while hunchback mRNA is ubiquitous, hunchback protein is more concentrated in the anterior portion of the embryo. Bicoid and hunchback are two important gene products involved in this process. In fact, mutations in the genes that code for these products have been shown to result in abnormal larval development. Larvae that carry these mutations may feature missing heads, thoracic regions, or abdominal regions, depending on the exact mutation.
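The nanos and hunchback example can be sketched as a toy calculation. The model below is an assumption made purely for illustration: hunchback mRNA is taken to be uniform along the embryo, nanos is taken to rise linearly from anterior (position 0.0) to posterior (position 1.0), and translation is blocked in proportion to the nanos level. None of these numbers come from measurements.

```python
def hunchback_protein(positions, mrna_level=1.0):
    """Toy model of hunchback protein along the anterior-posterior axis.

    positions run from 0.0 (anterior) to 1.0 (posterior).
    """
    profile = []
    for x in positions:
        nanos = x  # assumed linear nanos gradient, highest at the posterior
        profile.append(mrna_level * (1.0 - nanos))  # translation blocked in proportion
    return profile

# Five sample positions from anterior to posterior.
positions = [i / 4 for i in range(5)]
profile = hunchback_protein(positions)
print(list(zip(positions, profile)))
```

Even though the mRNA is the same everywhere in this sketch, the protein ends up concentrated at the anterior end, which is the pattern the paragraph describes.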


Early Gradients Activate Gap Genes

The cascade of gene expression within fruit fly embryos begins when signals from the maternal proteins activate a set of genes known as the "gap" genes along the axis between the anterior and posterior halves of the embryo. Each of these gap genes, namely Krüppel (Kr), tailless (tll), giant (gt), and knirps (kni), is expressed in a specific domain in the embryo.






These domains begin to be further refined through the action of another group of genes called the pair-rule genes; these genes recognize the gap genes and generate periodic gene expression events called pair-rule stripes. The pair-rule stripes predict where later boundaries between segments will be established. Thus, this entire cascade of gene expression ultimately results in the differentiation and formation of the segments that make up a fruit fly's body.

Pair-rule gene expression is not stable, however, and many of these genes serve as transcription factors to induce the expression of segment polarity genes, like engrailed (en) and wingless (wg). Boundaries between developing segments are defined by en or wg expression patterns that are established through the periodic expression of the pair-rule factors.

Regulation of Transcription and Gene Expression in Eukaryotes 2

If our genes are so similar, what really makes a eukaryote different from a prokaryote, or a human from E. coli? The answer lies in the difference in gene expression and regulation used.

It is estimated that the human genome encodes approximately 25,000 genes, about the same number as that for corn and nearly twice as many as that for the common fruit fly. Even more interesting is the fact that those 25,000 genes are encoded in about 1.5% of the genome. So, what exactly does the other 98.5% of our DNA do? While many mysteries remain about what all of that extra sequence is for, we know that it does contain complex instructions that direct the intricate turning on and off of gene transcription.

Eukaryotes Require Complex Controls Over Gene Expression
While basic similarities in gene transcription exist between prokaryotes and eukaryotes—including the fact that RNA polymerase binds upstream of the gene on its promoter to initiate the process of transcription—multicellular eukaryotes control cell differentiation through more complex and precise temporal and spatial regulation of gene expression.

Multicellular eukaryotes have a much larger genome than prokaryotes, which is organized into multiple chromosomes with greater sequence complexity. Many eukaryotic species carry genes with the same sequences as other plants and animals. In addition, the same DNA sequences (though not the same proteins) are found within all of an organism's diploid, nucleated cells, even though these cells form tissues with drastically different appearances, properties, and functions. Why then, is there such great variation among and within such organisms? Quite simply, the way in which different genes are turned on and off in specific cells generates the variety we observe in nature. In other words, specific functions of different cell types are generated through differential gene regulation.

Of course, higher eukaryotes still respond to environmental signals by regulating their genes. But there is an additional layer of regulation that results from cell-to-cell interactions within the organism that orchestrate development. Specifically, gene expression is controlled on two levels. First, transcription is controlled by limiting the amount of mRNA that is produced from a particular gene. The second level of control is through post-transcriptional events that regulate the translation of mRNA into proteins. Even after a protein is made, post-translational modifications can affect its activity.

1) http://www.nature.com/scitable/topicpage/genetic-signaling-transcription-factor-cascades-and-segmentation-1058
2) http://www.nature.com/scitable/topicpage/regulation-of-transcription-and-gene-expression-in-1086

Admin
The Princeton Guide to Evolution, page 448

THE EVOLUTION OF NOVEL TRAITS AND THEIR UNDERLYING GENE REGULATORY NETWORKS

Mutations to single developmental genes often modify the expression of many downstream targets and have a large impact on an organism’s final phenotype. The group of affected genes depends on the topology of the regulatory network, that is, how many targets are downstream of the mutated gene, including both direct and indirect targets. Some gene-regulatory networks are modular in their effects and may be quite important in body plan evolution. For instance, the Distal-less and Pax6 TFs are important early regulators of limb and eye development, respectively, throughout the Metazoa. These genes, when ectopically expressed in several other parts of the body of a fly, are able to promote limb duplications and ectopic eyes; that is, they control the initiation of gene regulatory networks that lead to limb and eye differentiation. These networks have modular qualities in that they can be initiated in a context-independent manner at multiple locations in the body, somewhat independently of the cocktail of other TFs present at those locations. The deployment and co-option of these modular networks into novel places in the body, and their recruitment to create repeated or serial homologous traits, and potentially also novel traits, is an active area of research in evo-devo. The idea is that the origination of novel traits may proceed by the co-option and the mixing and matching of modular networks, in novel combinations and at novel places in the body, rather than by the elaboration of preexisting networks one gene at a time (see figure 2). Evolution of novel traits would proceed via the genetic tinkering of modules of interacting genes by modification of the cis-regulatory regions of only a small set of individual genes regulating the initiation of each of these modules. The above-mentioned Distal-less (Dll) gene in the context of the evolution of appendages provides a nice example of the way these modular gene networks may originate.
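The statement that the group of affected genes depends on network topology can be illustrated with a small graph search. The sketch below collects every direct and indirect target of a given regulator by breadth-first search. The network is a made-up toy: the gene names are placeholders and do not represent real regulatory links.

```python
from collections import deque

def downstream_targets(network, gene):
    """Return all direct and indirect targets of gene.

    network maps each regulator to the list of genes it regulates directly.
    """
    seen = set()
    queue = deque(network.get(gene, []))
    while queue:
        target = queue.popleft()
        if target in seen:
            continue  # already visited through another path
        seen.add(target)
        queue.extend(network.get(target, []))
    return seen

# Hypothetical toy network (assumption for illustration only).
network = {
    "master": ["tf1", "tf2"],
    "tf1": ["gene_a", "gene_b"],
    "tf2": ["gene_b", "gene_c"],
    "gene_c": ["gene_d"],
}

for gene in ("master", "tf1", "gene_d"):
    print(gene, "->", sorted(downstream_targets(network, gene)))
```

In this toy network a change to "master" reaches every other gene, a change to "tf1" reaches two, and a change to "gene_d" reaches none, so the position of a gene in the network decides how far the effects of a mutation spread.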



The Molecular Basis of Phenotypic Plasticity

While much is beginning to be known about the molecular details of morphological evolution, an area that is still lagging behind concerns investigating the molecular basis of the integration of environmental factors into regulatory gene networks to induce distinct phenotypes. Phenotypic plasticity, or the ability of the same genome to give rise to very different morphological, physiological, or behavioral traits depending on rearing environment, is still poorly understood at the molecular level. A variety of environmental factors such as temperature, light, pressure, food availability, and certain chemicals are known to induce alternative developmental pathways, but the molecular details of the mechanisms by which these factors influence gene regulatory networks are poorly understood. The evolution of adaptive phenotypic plasticity usually involves changes to gene-regulatory networks that better adapt the organism to different and predictable environments. In many cases, hormones appear to play important roles in coordinating plastic development as they circulate among all the tissues in the body, and are thus able to coordinate changes in multiple modular gene regulatory networks underlying the development of various traits. But how these hormonal signaling systems evolve to interact with specific gene networks and how hormonal systems themselves become sensitive to the environment are still areas of active investigation.

Control of Gene Expression

Summary

Transcription regulators recognize short stretches of double-helical DNA of defined sequence called cis-regulatory sequences, and thereby determine which of the thousands of genes in a cell will be transcribed. Approximately 10% of the protein-coding genes in most organisms produce transcription regulators, and they control many features of cells. Although each of these transcription regulators has unique features, most bind to DNA as homodimers or heterodimers and recognize DNA through one of a small number of structural motifs. Transcription regulators typically work in groups and bind DNA cooperatively, a feature that has several underlying mechanisms, some of which exploit the packaging of DNA in nucleosomes.


An organism’s DNA encodes all of the RNA and protein molecules required to construct its cells. Yet a complete description of the DNA sequence of an organism—be it the few million nucleotides of a bacterium or the few billion nucleotides of a human—no more enables us to reconstruct the organism than a list of English words enables us to reconstruct a play by Shakespeare. In both cases, the problem is to know how the elements in the DNA sequence or the words on the list are used. Under what conditions is each gene product made, and, once made, what does it do? In this chapter, we focus on the first half of this problem—the rules and mechanisms that enable a subset of genes to be selectively expressed in each cell. These mechanisms operate at many levels, and we shall discuss each level in turn. But first we present some of the basic principles involved.

An Overview of Gene Control

The different cell types in a multicellular organism differ dramatically in both structure and function. If we compare a mammalian neuron with a liver cell, for example, the differences are so extreme that it is difficult to imagine that the two cells contain the same genome.



For this reason, and because cell differentiation often seemed irreversible, biologists originally suspected that genes might be selectively lost when a cell differentiates. We now know, however, that cell differentiation generally occurs without changes in the nucleotide sequence of a cell’s genome.

The Different Cell Types of a Multicellular Organism Contain the Same DNA

The cell types in a multicellular organism become different from one another because they synthesize and accumulate different sets of RNA and protein molecules. The initial evidence that they do this without altering the sequence of their DNA came from a classic set of experiments in frogs. When the nucleus of a fully differentiated frog cell is injected into a frog egg whose nucleus has been removed, the injected donor nucleus is capable of directing the recipient egg to produce a normal tadpole (Figure A).



Differentiated cells contain all the genetic instructions necessary to direct the formation of a complete organism. (A) The nucleus of a skin cell from an adult frog transplanted into an enucleated egg can give rise to an entire tadpole. The broken arrow indicates that, to give the transplanted genome time to adjust to an embryonic environment, a further transfer step is required in which one of the nuclei is taken from an early embryo that begins to develop and is put back into a second enucleated egg. (B) In many types of plants, differentiated cells retain the ability to “de-differentiate,” so that a single cell can form a clone of progeny cells that later give rise to an entire plant. (C) A nucleus removed from a differentiated cell from an adult cow and introduced into an enucleated egg from a different cow can give rise to a calf. Different calves produced from the same differentiated cell donor are all clones of the donor and are therefore genetically identical.

The tadpole contains a full range of differentiated cells that derived their DNA sequences from the nucleus of the original donor cell. Thus, the differentiated donor cell cannot have lost any important DNA sequences. A similar conclusion came from experiments performed with plants. When differentiated pieces of plant tissue are placed in culture and then dissociated into single cells, often one of these individual cells can regenerate an entire adult plant (Figure B). And the same principle has been more recently demonstrated in mammals that include sheep, cattle, pigs, goats, dogs, and mice (Figure C). Most recently, detailed DNA sequencing has confirmed the conclusion that the changes in gene expression that underlie the development of multicellular organisms do not generally involve changes in the DNA sequence of the genome.

Different Cell Types Synthesize Different Sets of RNAs and Proteins

As a first step in understanding cell differentiation, we would like to know how many differences there are between any one cell type and another. Although we still do not have an exact answer to this fundamental question, we can make several general statements:

1. Many processes are common to all cells, and any two cells in a single organism therefore have many gene products in common. These include the structural proteins of chromosomes, RNA and DNA polymerases, DNA repair enzymes, ribosomal proteins and RNAs, the enzymes that catalyze the central reactions of metabolism, and many of the proteins that form the cytoskeleton, such as actin.

2. Some RNAs and proteins are abundant in the specialized cells in which they function and cannot be detected elsewhere, even by sensitive tests. Hemoglobin, for example, is expressed specifically in red blood cells, where it carries oxygen, and the enzyme tyrosine aminotransferase (which breaks down tyrosine in food) is expressed in liver but not in most other tissues.

3. Studies of the number of different RNAs suggest that, at any one time, a typical human cell expresses 30–60% of its approximately 30,000 genes at some level. There are about 21,000 protein-coding genes and a roughly estimated 9000 noncoding RNA genes in humans. When the patterns of RNA expression in different human cell lines are compared, the level of expression of almost every gene is found to vary from one cell type to another. A few of these differences are striking, like those of hemoglobin and tyrosine aminotransferase noted above, but most are much more subtle. But even those genes that are expressed in all cell types usually vary in their level of expression from one cell type to the next.

4. Although there are striking differences in coding RNAs (mRNAs) in specialized cell types, they underestimate the full range of differences in the final pattern of protein production. As we discuss in this chapter, there are many steps after RNA production at which gene expression can be regulated. And, as we saw in Chapter 3, proteins are often covalently modified after they are synthesized. The radical differences in gene expression between cell types are therefore most fully revealed through methods that directly display the levels of proteins along with their post-translational modifications.

Gene Expression Can Be Regulated at Many of the Steps in the Pathway from DNA to RNA to Protein

If differences among the various cell types of an organism depend on the particular genes that the cells express, at what level is the control of gene expression exercised? As we saw in the previous chapter, there are many steps in the pathway leading from DNA to protein. We now know that all of them can in principle be regulated. Thus a cell can control the proteins it makes by

(1) controlling when and how often a given gene is transcribed (transcriptional control),
(2) controlling the splicing and processing of RNA transcripts (RNA processing control),
(3) selecting which completed mRNAs are exported from the nucleus to the cytosol and determining where in the cytosol they are localized (RNA transport and localization control),
(4) selecting which mRNAs in the cytoplasm are translated by ribosomes (translational control),
(5) selectively destabilizing certain mRNA molecules in the cytoplasm (mRNA degradation control), or
(6) selectively activating, inactivating, degrading, or localizing specific protein molecules after they have been made (protein activity control) (Figure below).



For most genes, transcriptional controls are paramount. This makes sense because, of all the possible control points illustrated in the figure above, only transcriptional control ensures that the cell will not synthesize superfluous intermediates. In the following sections, we discuss the DNA and protein components that perform this function by regulating the initiation of gene transcription. We shall then return to the additional ways of regulating gene expression.

CONTROL OF TRANSCRIPTION BY SEQUENCE-SPECIFIC DNA-BINDING PROTEINS


How does a cell determine which of its thousands of genes to transcribe? Perhaps the most important concept, one that applies to all species on Earth, is based on a group of proteins known as transcription regulators. These proteins recognize specific sequences of DNA (typically 5–10 nucleotide pairs in length) that are often called cis-regulatory sequences, because they must be on the same chromosome (that is, in cis) to the genes they control. Transcription regulators bind to these sequences, which are dispersed throughout genomes, and this binding puts into motion a series of reactions that ultimately specify which genes are to be transcribed and at what rate. Approximately 10% of the protein-coding genes of most organisms are devoted to transcription regulators, making them one of the largest classes of proteins in the cell. In most cases, a given transcription regulator recognizes its own cis-regulatory sequence, which is different from those recognized by all the other regulators in the cell. Transcription of each gene is, in turn, controlled by its own collection of cis-regulatory sequences. These typically lie near the gene, often in the intergenic region directly upstream from the transcription start point of the gene. Although a few genes are controlled by a single cis-regulatory sequence that is recognized by a single transcription regulator, the majority have complex arrangements of cis-regulatory sequences, each of which is recognized by a different transcription regulator. It is therefore the positions, identity, and arrangement of cis-regulatory sequences—which are an important part of the information embedded in the genome—that ultimately determine the time and place that each gene is transcribed. We begin our discussion by describing how transcription regulators recognize cis-regulatory sequences.

The Sequence of Nucleotides in the DNA Double Helix Can Be Read by Proteins

The DNA in a chromosome consists of a very long double helix that has both a major and a minor groove.



Transcription regulators must recognize short, specific cis-regulatory sequences within this structure. When first discovered in the 1960s, it was thought that these proteins might require direct access to the interior of the double helix to distinguish between one DNA sequence and another. It is now clear, however, that the outside of the double helix is studded with DNA sequence information that transcription regulators recognize: the edge of each base pair presents a distinctive pattern of hydrogen-bond donors, hydrogen-bond acceptors, and hydrophobic patches in both the major and minor grooves.



How the different base pairs in DNA can be recognized from their edges without the need to open the double helix. The four possible configurations of base pairs are shown, with potential hydrogen-bond donors indicated in blue, potential hydrogen-bond acceptors in red, and hydrogen bonds of the base pairs themselves as a series of short, parallel red lines. Methyl groups, which form hydrophobic protuberances, are shown in yellow, and hydrogen atoms that are attached to carbons, and are therefore unavailable for hydrogen-bonding, are white. From the major groove, each of the four base-pair configurations projects a unique pattern of features.

Because the major groove is wider and displays more molecular features than does the minor groove, nearly all transcription regulators make the majority of their contacts with the major groove.

Transcription Regulators Contain Structural Motifs That Can Read DNA Sequences

Molecular recognition in biology generally relies on an exact fit between the surfaces of two molecules, and the study of transcription regulators has provided some of the clearest examples of this principle. A transcription regulator recognizes a specific cis-regulatory sequence because the surface of the protein is extensively complementary to the special surface features of the double helix that displays that sequence. Each transcription regulator makes a series of contacts with the DNA, involving hydrogen bonds, ionic bonds, and hydrophobic interactions. Although each individual contact is weak, the 20 or so contacts that are typically formed at the protein–DNA interface add together to ensure that the interaction is both highly specific and very strong.

Question: How could that recognition have arisen? Would it not have had to be fully developed and functioning right from the beginning, since otherwise the cell would be unable to recognize which genes to express?



The binding of a transcription regulator to a specific DNA sequence. On the left, a single contact is shown between a transcription regulator and DNA; such contacts allow the protein to “read” the DNA sequence. On the right, the complete set of contacts between a transcription regulator (a member of the homeodomain family) and its cis-regulatory sequence is shown. The DNA-binding portion of the protein is 60 amino acids long. Although the interactions in the major groove are the most important, the protein is also seen to contact both the minor groove and phosphates in the sugar–phosphate DNA backbone.

In fact, DNA–protein interactions include some of the tightest and most specific molecular interactions known in biology. Although each example of protein–DNA recognition is unique in detail, x-ray crystallographic and nuclear magnetic resonance (NMR) spectroscopic studies of hundreds of transcription regulators have revealed that many of them contain one or another of a small set of DNA-binding structural motifs.





These motifs generally use either α helices or β sheets to bind to the major groove of DNA. The amino acid side chains that extend from these protein motifs make the specific contacts with the DNA. Thus, a given structural motif can be used to recognize many different cis-regulatory sequences depending on the specific side chains present.

Dimerization of Transcription Regulators Increases Their Affinity and Specificity for DNA

A monomer of a typical transcription regulator recognizes about 6–8 nucleotide pairs of DNA. However, sequence-specific DNA-binding proteins do not bind tightly to a single DNA sequence and reject all others; rather, they recognize a range of closely related sequences, with the affinity of the protein for the DNA varying according to how closely the DNA matches the optimal sequence. Hence, cis-regulatory sequences are often depicted as “logos” which display the range of sequences recognized by a particular transcription regulator.
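
The idea that a regulator accepts a range of related sequences, with affinity falling off as the match worsens, can be sketched in a few lines of Python. The sketch below builds a per-position frequency table from a handful of binding sites (the information a logo displays) and scores candidate sequences against it. The site sequences, the pseudocount, the uniform background, and the function names are all invented for illustration and do not describe any real transcription regulator.

```python
import math

# Hypothetical aligned binding sites for an imaginary regulator (illustrative only).
SITES = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC", "TTACTC"]
BASES = "ACGT"

def frequency_matrix(sites, pseudocount=0.5):
    """Per-position base frequencies, with a pseudocount so no frequency is zero."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in BASES})
    return matrix

def log_odds_score(seq, matrix, background=0.25):
    """Sum of log2(frequency / background); a higher score means a closer match."""
    return sum(math.log2(matrix[i][base] / background) for i, base in enumerate(seq))

matrix = frequency_matrix(SITES)
for candidate in ["TGACTC", "TGAGTC", "CCCGGG"]:
    print(candidate, round(log_odds_score(candidate, matrix), 2))
```

In this toy table the optimal sequence scores highest, a one-base variant scores somewhat lower, and an unrelated sequence scores far below both, which is the graded recognition the paragraph describes.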



The DNA sequence recognized by a monomer does not contain sufficient information to be picked out from the background of such sequences that would occur at random all over the genome. For example, an exact six-nucleotide DNA sequence would be expected to occur by chance approximately once every 4096 (4^6) nucleotides, and the range of six-nucleotide sequences described by a typical logo would be expected to occur by chance much more often, perhaps every 1000 nucleotides. Clearly, for a bacterial genome of 4.6 × 10^6 nucleotide pairs, not to mention a mammalian genome of 3 × 10^9 nucleotide pairs, this is insufficient information to accurately control the transcription of individual genes. Additional contributions to DNA-binding specificity must therefore be present. Many transcription regulators form dimers, with both monomers making nearly identical contacts with DNA (Figure C above). This arrangement doubles the length of the cis-regulatory sequence recognized and greatly increases both the affinity and the specificity of transcription regulator binding. Because the DNA sequence recognized by the protein has increased from approximately 6 nucleotide pairs to 12 nucleotide pairs, there are many fewer random occurrences of matching sequences. Heterodimers are often formed from two different transcription regulators. Transcription regulators may form heterodimers with more than one partner protein; in this way, the same transcription regulator can be “reused” to create several distinct DNA-binding specificities (see Figure C).
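
The arithmetic behind this argument can be checked with a short Python sketch. It assumes, as a simplification not stated in the text, that the four nucleotides are equally frequent and independent and that only one strand is scanned; the function name is made up for this example.

```python
def expected_random_matches(site_length, genome_length):
    """Expected chance occurrences of one exact site on one strand,
    assuming equally likely, independent nucleotides (a simplification)."""
    return genome_length / 4 ** site_length

GENOMES = {"bacterium (4.6e6 bp)": 4.6e6, "mammal (3e9 bp)": 3e9}

for name, size in GENOMES.items():
    for length in (6, 12):
        hits = expected_random_matches(length, size)
        print(f"{name}: exact {length}-mer expected about {hits:,.1f} times")
```

Under these assumptions an exact six-nucleotide site turns up by chance more than a thousand times in the bacterial genome, while a twelve-nucleotide site is expected less than once there and still a couple of hundred times in the mammalian genome.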

Transcription Regulators Bind Cooperatively to DNA

In the simplest case, the collection of noncovalent bonds that holds the above dimers or heterodimers together is so extensive that these structures form obligatorily and never fall apart. In this case, the unit of binding is the dimer or heterodimer, and the binding curve for the transcription regulator (the fraction of DNA bound as a function of protein concentration) has a standard saturating (hyperbolic) shape (Figure A).



In many cases, however, the dimers and heterodimers are held together very weakly; they exist predominantly as monomers in solution, and yet dimers are observed on the appropriate DNA sequence. Here, the proteins are said to bind to DNA cooperatively, and the curve describing their binding is sigmoidal in shape (Figure B). Cooperative binding means that, over a range of concentrations of the transcription regulator, binding is more of an all-or-none phenomenon than for noncooperative binding; that is, at most protein concentrations, the cis-regulatory sequence is either nearly empty or nearly fully occupied and rarely is somewhere in between.
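
The contrast between the two binding curves can be illustrated with a toy calculation. The sketch below uses the Hill equation as a stand-in for cooperative binding; the Hill coefficient of 2, the dissociation constant of 1 (arbitrary units), and the function name are assumptions chosen for illustration, not values from the text.

```python
def fraction_bound(conc, kd=1.0, hill=1.0):
    """Fraction of sites occupied; hill=1 is noncooperative, hill>1 is cooperative."""
    return conc ** hill / (kd ** hill + conc ** hill)

for conc in (0.1, 0.3, 1.0, 3.0, 10.0):
    simple = fraction_bound(conc)
    coop = fraction_bound(conc, hill=2.0)
    print(f"[regulator]={conc:>4}: noncooperative {simple:.2f}, cooperative {coop:.2f}")
```

With the cooperative setting, occupancy stays closer to zero at low concentrations and closer to one at high concentrations, so the site is more often nearly empty or nearly full.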

Nucleosome Structure Promotes Cooperative Binding of Transcription Regulators

Cooperative binding of transcription regulators to DNA often occurs because the monomers have only a weak affinity for each other. However, there is a second, indirect mechanism for cooperative binding, one that arises from the nucleosome structure of eukaryotic chromosomes. In general, transcription regulators bind to DNA in nucleosomes with lower affinity than they do to naked DNA. There are two reasons for this difference. First, the surface of the cis-regulatory sequence recognized by the transcription regulator may be facing inward on the nucleosome, toward the histone core, and therefore not be readily available to the regulatory protein. Second, even if the face of the cis-regulatory sequence is exposed on the outside of the nucleosome, many transcription regulators subtly alter the conformation of the DNA when they bind, and these changes are generally opposed by the tight wrapping of the DNA around the histone core. For example, many transcription regulators induce a bend or kink in the DNA when they bind. Nucleosome remodeling can alter the structure of the nucleosome, allowing transcription regulators access to the DNA. Even without remodeling, however, transcription regulators can still gain limited access to DNA in a nucleosome. The DNA at the end of a nucleosome “breathes,” transiently exposing the DNA and allowing regulators to bind. This breathing happens at a much lower rate in the middle of the nucleosome; therefore, the positions where the DNA exits the nucleosome are much easier to occupy.

How nucleosomes affect the binding of transcription regulators.



These properties of the nucleosome promote cooperative DNA binding by transcription regulators. If a regulatory protein enters the DNA of a nucleosome and prevents the DNA from tightly rewrapping around the nucleosome core, it will increase the affinity of a second transcription regulator for a nearby cis-regulatory sequence. If the two transcription regulators also interact with each other (as described above), the cooperative effect is even greater. In some cases, the combined action of the regulatory proteins can eventually displace the histone core of the nucleosome altogether. The cooperation among transcription regulators can become much greater when nucleosome remodeling complexes are involved. If one transcription regulator binds its cis-regulatory sequence and attracts a chromatin remodeling complex, the localized action of the remodeling complex can allow a second transcription regulator to efficiently bind nearby. Moreover, transcription regulators can work together in pairs; in reality, larger numbers often cooperate by repeated use of the same principles. A highly cooperative binding of transcription regulators to DNA probably explains why many sites in eukaryotic genomes that are bound by transcription regulators are “nucleosome free.”

TRANSCRIPTION REGULATORS SWITCH GENES ON AND OFF

Having seen how transcription regulators bind to cis-regulatory sequences embedded in the genome, we can now discuss how, once bound, these proteins influence the transcription of genes. The situation in bacteria is simpler than in eukaryotes (for one thing, chromatin structure is not an issue), and we therefore discuss it first. Following this, we turn to the more complex situation in eukaryotes.

The Tryptophan Repressor Switches Genes Off

The genome of the bacterium E. coli consists of a single, circular DNA molecule of about 4.6 × 10^6 nucleotide pairs. This DNA encodes approximately 4300 proteins, although only a fraction of these are made at any one time. Bacteria regulate the expression of many of their genes according to the food sources that are available in the environment. For example, in E. coli, five genes code for enzymes that manufacture the amino acid tryptophan. These genes are arranged in a cluster on the chromosome and are transcribed from a single promoter as one long mRNA molecule; such coordinately transcribed clusters are called operons.



A cluster of bacterial genes can be transcribed from a single promoter. Each of these five genes encodes a different enzyme, and all of these enzymes are needed to synthesize the amino acid tryptophan from simpler molecules. The genes are transcribed as a single mRNA molecule, a feature that allows their expression to be coordinated. Clusters of genes transcribed as a single mRNA molecule are common in bacteria. Each of these clusters is called an operon because its expression is controlled by a cis-regulatory sequence called the operator (green), situated within the promoter. (In this and subsequent figures, the yellow blocks in the promoter represent DNA sequences that bind RNA polymerase.)

Although operons are common in bacteria, they are rare in eukaryotes, where genes are typically transcribed and regulated individually. When tryptophan concentrations are low, the operon is transcribed; the resulting mRNA is translated to produce a full set of biosynthetic enzymes, which work in tandem to synthesize tryptophan from much simpler molecules. When tryptophan is abundant, however (for example, when the bacterium is in the gut of a mammal that has just eaten a protein-rich meal), the amino acid is imported into the cell and shuts down production of the enzymes, which are no longer needed. We now understand exactly how this repression of the tryptophan operon comes about. Within the operon’s promoter is a cis-regulatory sequence that is recognized by a transcription regulator. When this regulator binds to this sequence, it blocks access of RNA polymerase to the promoter, thereby preventing transcription of the operon (and thus production of the tryptophan-producing enzymes). The transcription regulator is known as the tryptophan repressor and its cis-regulatory sequence is called the tryptophan operator. These components are controlled in a simple way: the repressor can bind to DNA only if it has also bound several molecules of tryptophan.



Genes can be switched off by repressor proteins. If the concentration of tryptophan inside a bacterium is low (left), RNA polymerase (blue) binds to the promoter and transcribes the five genes of the tryptophan operon. However, if the concentration of tryptophan is high (right), the repressor protein (dark green) becomes active and binds to the operator (light green), where it blocks the binding of RNA polymerase to the promoter. Whenever the concentration of intracellular tryptophan drops, the repressor falls off the DNA, allowing the polymerase to again transcribe the operon. Although not shown in the figure, the repressor is a stable dimer.

The tryptophan repressor is an allosteric protein, and the binding of tryptophan causes a subtle change in its three-dimensional structure so that the protein can bind to the operator sequence. Whenever the concentration of free tryptophan in the bacterium drops, tryptophan dissociates from the repressor, the repressor no longer binds to DNA, and the tryptophan operon is transcribed. The repressor is thus a simple device that switches production of a set of biosynthetic enzymes on and off according to the availability of the end product of the pathway that the enzymes catalyze. The tryptophan repressor protein itself is always present in the cell. The gene that encodes it is continuously transcribed at a low level, so that a small amount of the repressor protein is always being made. Thus the bacterium can respond very rapidly to a rise or fall in tryptophan concentration.

Repressors Turn Genes Off and Activators Turn Them On

The tryptophan repressor, as its name suggests, is a transcriptional repressor protein: in its active form, it switches genes off, or represses them. Some bacterial transcription regulators do the opposite: they switch genes on, or activate them. These transcriptional activator proteins work on promoters that, in contrast to the promoter for the tryptophan operon, are only marginally able to bind and position RNA polymerase on their own. However, these poorly functioning promoters can be made fully functional by activator proteins that bind to nearby cis-regulatory sequences and contact the RNA polymerase to help it initiate transcription.



DNA-bound activator proteins can increase the rate of transcription initiation as much as 1000-fold, a value consistent with a relatively weak and nonspecific interaction between the transcription regulator and RNA polymerase. For example, a 1000-fold change in the affinity of RNA polymerase for its promoter corresponds to a change in ΔG of ≈18 kJ/mole, which could be accounted for by just a few weak, noncovalent bonds. Thus, many activator proteins work simply by providing a few favorable interactions that help to attract RNA polymerase to the promoter. To provide this assistance, however, the activator protein must be bound to its cis-regulatory sequence, and this sequence must be positioned, with respect to the promoter, so that the favorable interactions can occur. Like the tryptophan repressor, activator proteins often have to interact with a second molecule to be able to bind DNA. For example, the bacterial activator protein CAP has to bind cyclic AMP (cAMP) before it can bind to DNA. Genes activated by CAP are switched on in response to an increase in intracellular cAMP concentration, which rises when glucose, the bacterium’s preferred carbon source, is no longer available; as a result, CAP drives the production of enzymes that allow the bacterium to digest other sugars.
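
The free-energy figure quoted above follows from the standard relation ΔG = RT ln(fold change). The short Python check below assumes a temperature of 310 K (about 37 °C), which the text does not specify; the function name is made up for this example.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def delta_g_for_fold_change(fold, temperature_k=310.0):
    """Magnitude of the free-energy change (kJ/mol) for a fold change in affinity."""
    return R * temperature_k * math.log(fold)

print(round(delta_g_for_fold_change(1000), 1))
```

At that assumed temperature the result is a little under 18 kJ/mole, in line with the value given in the text.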

An Activator and a Repressor Control the Lac Operon

In many instances, the activity of a single promoter is controlled by several different transcription regulators. The Lac operon in E. coli, for example, is controlled by both the Lac repressor and the CAP activator. The Lac operon encodes proteins required to import and digest the disaccharide lactose. In the absence of glucose, the bacterium makes cAMP, which activates CAP to switch on genes that allow the cell to utilize alternative sources of carbon, including lactose. It would be wasteful, however, for CAP to induce expression of the Lac operon if lactose itself were not present. Thus the Lac repressor shuts off the operon in the absence of lactose. This arrangement enables the control region of the Lac operon to integrate two different signals, so that the operon is highly expressed only when two conditions are met: glucose must be absent and lactose must be present.



This genetic circuit thus behaves much like a switch that carries out a logic operation in a computer. When lactose is present AND glucose is absent, the cell executes the appropriate program: in this case, transcription of the genes that permit the uptake and utilization of lactose. All transcription regulators, whether they are repressors or activators, must be bound to DNA to exert their effects. In this way, each regulatory protein acts selectively, controlling only those genes that bear a cis-regulatory sequence recognized by it. The logic of the Lac operon first attracted the attention of biologists more than 50 years ago. The way it works was uncovered by a combination of genetics and biochemistry, providing some of the first insights into how transcription is controlled in any organism.
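
The logic described for the Lac operon can be written out as a small truth table. The Python sketch below is a purely qualitative model of the two signals named in the text; the function and variable names are invented for this example, and it ignores intermediate expression levels.

```python
from itertools import product

def lac_operon_highly_expressed(glucose_present, lactose_present):
    """Qualitative model: high expression needs active CAP and no bound repressor."""
    cap_active = not glucose_present        # no glucose -> cAMP is made -> CAP is active
    repressor_bound = not lactose_present   # no lactose -> Lac repressor shuts off the operon
    return cap_active and not repressor_bound

for glucose, lactose in product((False, True), repeat=2):
    state = lac_operon_highly_expressed(glucose, lactose)
    print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> highly expressed: {state}")
```

Only the combination of absent glucose and present lactose returns True, matching the AND operation described above.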

Complex Switches Control Gene Transcription in Eukaryotes

When compared to the situation in bacteria, transcription regulation in eukaryotes involves many more proteins and much longer stretches of DNA. It often seems bewilderingly complex. Yet many of the same principles apply. As in bacteria, the time and place that each gene is to be transcribed is specified by its cis-regulatory sequences, which are “read” by the transcription regulators that bind to them. Once bound to DNA, positive transcription regulators (activators) help RNA polymerase begin transcribing genes, and negative regulators (repressors) block this from happening. In bacteria, most of the interactions between DNA-bound transcription regulators and RNA polymerases (whether they activate or repress transcription) are direct. In contrast, these interactions are almost always indirect in eukaryotes: many intermediate proteins, including the histones, act between the DNA-bound transcription regulator and RNA polymerase. Moreover, in multicellular organisms, it is common for dozens of transcription regulators to control a single gene, with cis-regulatory sequences spread over tens of thousands of nucleotide pairs. DNA looping allows the DNA-bound regulatory proteins to interact with each other and ultimately with RNA polymerase at the promoter. Finally, because nearly all of the DNA in eukaryotic organisms is compacted by nucleosomes and higher-order structures, transcription initiation in eukaryotes must overcome this inherent block. In the next sections, we discuss these features of transcription initiation in eukaryotes, emphasizing how they provide extra levels of control not found in bacteria.

A Eukaryotic Gene Control Region Consists of a Promoter Plus Many cis-Regulatory Sequences
In eukaryotes, RNA polymerase II transcribes all the protein-coding genes and many noncoding RNA genes. This polymerase requires five general transcription factors (27 subunits in toto), in contrast to bacterial RNA polymerase, which needs only a single general transcription factor (the σ subunit). The stepwise assembly of the general transcription factors at a eukaryotic promoter provides, in principle, multiple steps at which the cell can speed up or slow down the rate of transcription initiation in response to transcription regulators. Because the many cis-regulatory sequences that control the expression of a typical gene are often spread over long stretches of DNA, we use the term gene control region to describe the whole expanse of DNA involved in regulating and initiating transcription of a eukaryotic gene. This includes the promoter, where the general transcription factors and the polymerase assemble, plus all of the cis-regulatory sequences to which transcription regulators bind to control the rate of the assembly processes at the promoter.



The gene control region for a typical eukaryotic gene. 
The promoter is the DNA sequence where the general transcription factors and the polymerase assemble. The cis-regulatory sequences are binding sites for transcription regulators, whose presence on the DNA affects the rate of transcription initiation. These sequences can be located adjacent to the promoter, far upstream of it, or even within introns or entirely downstream of the gene. The broken stretches of DNA signify that the length of DNA between the cis-regulatory sequences and the start of transcription varies, sometimes reaching tens of thousands of nucleotide pairs in length. The TATA box is a DNA recognition sequence for the general transcription factor TFIID. As shown in the lower panel, DNA looping allows transcription regulators bound at any of these positions to interact with the proteins that assemble at the promoter. Many transcription regulators act through Mediator, while some interact with the general transcription factors and RNA polymerase directly. Transcription regulators also act by recruiting proteins that alter the chromatin structure of the promoter. Whereas Mediator and the general transcription factors are the same for all RNA polymerase II-transcribed genes, the transcription regulators and the locations of their binding sites relative to the promoter differ for each gene.

In animals and plants, it is not unusual to find the regulatory sequences of a gene dotted over stretches of DNA as large as 100,000 nucleotide pairs. Some of this DNA is transcribed (but not translated). For now, we can regard much of this DNA as “spacer” sequences that transcription regulators do not directly recognize. It is important to keep in mind that, like other regions of eukaryotic chromosomes, most of the DNA in gene control regions is packaged into nucleosomes and higher-order forms of chromatin, thereby compacting its overall length and altering its properties. We shall loosely use the term gene to refer to a segment of DNA that is transcribed into a functional RNA molecule, one that either codes for a protein or has a different role in the cell. However, the classical view of a gene includes the gene control region as well, since mutations in it can produce an altered phenotype. Alternative RNA splicing further complicates the definition of a gene, a point we shall return to later. In contrast to the small number of general transcription factors, which are abundant proteins that assemble on the promoters of all genes transcribed by RNA polymerase II, there are thousands of different transcription regulators devoted to turning individual genes on and off. In eukaryotes, operons (sets of genes transcribed as a unit) are rare, and, instead, each gene is regulated individually. Not surprisingly, the regulation of each gene is different in detail from that of every other gene, and it is difficult to formulate simple rules for gene regulation that apply in every case. We can, however, make some generalizations about how transcription regulators, once bound to gene control regions on DNA, set in motion the series of events that lead to gene activation or repression.

Eukaryotic Transcription Regulators Work in Groups
In bacteria, we saw that proteins such as the tryptophan repressor, the Lac repressor, and the CAP protein bind to DNA on their own and directly affect RNA polymerase at the promoter. Eukaryotic transcription regulators, in contrast, usually assemble in groups at their cis-regulatory sequences. Often two or more regulators bind cooperatively. In addition, a broad class of multisubunit proteins termed coactivators and co-repressors assemble on DNA with them. Typically, these coactivators and co-repressors do not recognize specific DNA sequences themselves; they are brought to those sequences by the transcription regulators. 



As their names imply, coactivators are typically involved in activating transcription and co-repressors in repressing it. Coactivators and co-repressors can act in a variety of different ways to influence transcription after they have been localized on the genome by transcription regulators. An individual transcription regulator can often participate in more than one type of regulatory complex. A protein might function, for example, in one case as part of a complex that activates transcription and in another case as part of a complex that represses transcription. Thus, individual eukaryotic transcription regulators function as regulatory parts that are used to build complexes whose function depends on the final assembly of all of the individual components. Each eukaryotic gene is therefore regulated by a “committee” of proteins, all of which must be present to express the gene at its proper level.

Activator Proteins Promote the Assembly of RNA Polymerase at the Start Point of Transcription
The cis-regulatory sequences to which eukaryotic transcription activator proteins bind were originally called enhancers because their presence “enhanced” the rate of transcription initiation. It came as a surprise when it was discovered that these sequences could be found tens of thousands of nucleotide pairs away from the promoter; DNA looping can now explain this initially puzzling observation. Once bound to DNA, how do assemblies of activator proteins increase the rate of transcription initiation? At most genes, several mechanisms work in concert. Their function is both to attract and position RNA polymerase II at the promoter and to release it so that transcription can begin. Some activator proteins bind directly to one or more of the general transcription factors, accelerating their assembly on a promoter that has been brought in proximity, through DNA looping, to that activator. Most transcription activators, however, attract coactivators that then perform the biochemical tasks needed to initiate transcription. One of the most prevalent coactivators is the large Mediator protein complex, composed of more than 30 subunits. About the same size as RNA polymerase itself, Mediator serves as a bridge between DNA-bound transcription activators, RNA polymerase, and the general transcription factors, facilitating their assembly at the promoter.

Eukaryotic Transcription Factors Direct the Modification of Local Chromatin Structure
The eukaryotic general transcription factors and RNA polymerase are unable, on their own, to assemble on a promoter that is packaged in nucleosomes. Thus, in addition to directing the assembly of the transcription machinery at the promoter, eukaryotic transcription activators promote transcription by triggering changes to the chromatin structure of the promoters, making the underlying DNA more accessible. The most important ways of locally altering chromatin are through covalent histone modifications, nucleosome remodeling, nucleosome removal, and histone replacement. Eukaryotic transcription activators use all four of these mechanisms (how did they learn that feat? By trial and error?): thus they attract coactivators that include histone modification enzymes, ATP-dependent chromatin remodeling complexes, and histone chaperones, each of which can alter the chromatin structure of promoters.



Eukaryotic transcription activator proteins direct local alterations in chromatin structure. Nucleosome remodeling, nucleosome removal, histone replacement, and certain types of histone modifications favor transcription initiation. These alterations increase the accessibility of DNA and facilitate the binding of RNA polymerase and the general transcription factors.

These local alterations in chromatin structure provide greater access to DNA, thereby facilitating the assembly of the general transcription factors at the promoter. In addition, some histone modifications specifically attract these proteins to the promoter. These mechanisms often work together during transcription initiation.



Successive histone modifications during transcription initiation. 
In this example, taken from the human interferon gene promoter, a transcription factor binds to DNA packaged into chromatin and attracts a histone acetyl transferase that acetylates lysine 9 of histone H3 and lysine 8 of histone H4. Then a histone kinase, also attracted by the transcription factor, phosphorylates serine 10 of histone H3 but it can only do so after lysine 9 has been acetylated. This serine modification signals the histone acetyl transferase to acetylate position K14 of histone H3. Next, the general transcription factor TFIID and a chromatin remodeling complex bind to the chromatin to promote the subsequent steps of transcription initiation. TFIID and the remodeling complex both recognize acetylated histone tails through a bromodomain, a protein domain specialized to read this particular mark on histones; a bromodomain is carried in a subunit of each protein complex. The histone acetyl transferase, the histone kinase, and the chromatin remodeling complex are all coactivators. The order of events shown applies to a specific promoter; at other genes, the steps may occur in a different order or individual steps may be omitted altogether.
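The ordering described in this caption can be read as a small dependency chain, in which each modification is possible only once its prerequisite mark is in place. The following is a minimal sketch of that reading, not a biochemical model. The label strings and the `apply_in_order` helper are hypothetical, and making TFIID and remodeler binding depend on the H3K14 acetylation is an assumption about which acetyl mark is being read.

```python
# Marks that must already be present before each step can occur.
# Labels are illustrative shorthand for the events named in the caption.
PREREQS = {
    "H3K9ac": set(),                    # acetyl transferase recruited by the bound factor
    "H4K8ac": set(),                    # same acetyl transferase, histone H4
    "H3S10ph": {"H3K9ac"},              # kinase acts only after K9 is acetylated
    "H3K14ac": {"H3S10ph"},             # S10 phosphorylation signals K14 acetylation
    "TFIID_and_remodeler_bound": {"H3K14ac"},  # assumption: binding follows K14ac
}

def apply_in_order(attempts):
    """Try each step in turn; a step succeeds only if its prerequisites are met."""
    marks = set()
    for step in attempts:
        if PREREQS[step] <= marks:      # subset test: all prerequisites present
            marks.add(step)
    return marks

# The order given in the caption: every step succeeds.
ordered = apply_in_order(
    ["H3K9ac", "H4K8ac", "H3S10ph", "H3K14ac", "TFIID_and_remodeler_bound"]
)

# Attempting the kinase step first: it is skipped, and so is K14 acetylation.
premature = apply_in_order(["H3S10ph", "H3K14ac", "H3K9ac"])

print(sorted(ordered))
print(sorted(premature))
```

As the caption notes, this order applies to one specific promoter; for other genes the `PREREQS` table would look different.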

Finally,  the local chromatin changes directed by one transcriptional regulator can allow the binding of additional regulators. By repeated use of this principle, large assemblies of proteins can form on control regions of genes to regulate their transcription. The alterations of chromatin structure that occur during transcription initiation can persist for different lengths of time. In some cases, as soon as the transcription regulator dissociates from DNA, the chromatin modifications are rapidly reversed, restoring the gene to its pre-activated state. This rapid reversal is especially important for genes that the cell must quickly switch on and off in response to external signals. In other cases, the altered chromatin structure persists, even after the transcription factor that directed its establishment has dissociated from DNA. In principle, this memory can extend into the next cell generation because chromatin structure can be self-renewing. The fact that different histone modifications persist for different times provides the cell with a mechanism that makes possible both longer- and shorter-term memory of gene expression patterns. A special type of chromatin modification occurs as RNA polymerase II transcribes through a gene. The histones just ahead of the polymerase can be acetylated by enzymes carried by the polymerase, removed by histone chaperones, and deposited behind the moving polymerase. These histones are then rapidly deacetylated and methylated, also by complexes that are carried by the polymerase, leaving behind nucleosomes that are especially resistant to transcription. This remarkable process seems to prevent spurious transcription reinitiation behind a moving polymerase, which, in essence, must clear a path through chromatin as it transcribes. Later in this chapter, when we discuss RNA interference, the potential dangers to the cell of such inappropriate transcription will become especially obvious. 
The modification of nucleosomes behind a moving RNA polymerase also plays an important role in RNA splicing.

Transcription Activators Can Promote Transcription by Releasing RNA Polymerase from Promoters

In some cases, transcription initiation requires that a DNA-bound transcription activator releases RNA polymerase from the promoter so as to allow it to begin transcribing the gene. In other cases, the RNA polymerase halts after transcribing about 50 nucleotides of RNA, and further elongation requires a transcription activator bound behind it.



These paused polymerases are common in humans, where a significant fraction of genes that are not being transcribed have a paused polymerase located just downstream from the promoter. The release of RNA polymerase can occur in several ways. In some cases, the activator brings in a chromatin remodeling complex that removes a nucleosome block to the elongating RNA polymerase. In other cases, the activator communicates with RNA polymerase (typically through a coactivator), signaling it to move ahead. Finally,  RNA polymerase requires elongation factors to effectively transcribe through chromatin. In some cases, the key step in gene activation is the loading of these factors onto RNA polymerase, which can be directed by DNA-bound transcription activators. Once loaded, these factors allow the polymerase to move through blocks imposed by chromatin structure and begin transcribing the gene in earnest. Having RNA polymerase already poised on a promoter in the beginning stages of transcription bypasses the step of assembling many components at the promoter, which is often slow. This mechanism can therefore allow cells to begin transcribing a gene as a rapid response to an extracellular signal.

Transcription Activators Work Synergistically

We have seen that complexes of transcription activators and coactivators assemble cooperatively on DNA. We have also seen that these assemblies can promote different steps in transcription initiation. In general, where several factors work together to enhance a reaction rate, the joint effect is not merely the sum of the enhancements that each factor alone contributes, but the product. If, for example, factor A lowers the free-energy barrier for a reaction by a certain amount and thereby speeds up the reaction 100-fold, and factor B, by acting on another aspect of the reaction, does likewise, then A and B acting in parallel will lower the barrier by a double amount and speed up the reaction 10,000-fold. Even if A and B work simply by attracting the same protein, the affinity of that protein for the reaction site increases multiplicatively. Thus, transcription activators often exhibit transcriptional synergy, where several DNA-bound activator proteins working together produce a transcription rate that is much higher than the sum of their transcription rates working alone.
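The arithmetic behind this multiplicative effect can be checked with a short sketch. It assumes the usual exponential dependence of a reaction rate on its free-energy barrier (rate proportional to exp(-ΔG/RT)); the temperature and the function names are illustrative choices, not taken from the text.

```python
import math

R = 8.314e-3   # gas constant in kJ/(mol*K)
T = 310.0      # assumed temperature in kelvin (about 37 degrees C)

def barrier_drop_for_fold(fold):
    """Barrier reduction (kJ/mol) that yields a `fold` speed-up."""
    return R * T * math.log(fold)

def fold_for_barrier_drop(delta_g):
    """Speed-up factor produced by lowering the barrier by `delta_g` kJ/mol."""
    return math.exp(delta_g / (R * T))

drop_a = barrier_drop_for_fold(100.0)   # factor A alone: 100-fold
drop_b = barrier_drop_for_fold(100.0)   # factor B alone: 100-fold

# Barrier reductions add, so the speed-ups multiply.
combined = fold_for_barrier_drop(drop_a + drop_b)

print(round(drop_a, 2), "kJ/mol barrier reduction per factor")
print(round(combined), "fold speed-up when A and B act together")
```

Because the barrier reductions add inside the exponent, the combined speed-up is 100 × 100 rather than 100 + 100, which is the point of the example in the text.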



An important point is that a transcription activator protein must be bound to DNA to influence transcription of its target gene. And the rate of transcription of a gene ultimately depends upon the spectrum of regulatory proteins bound upstream and downstream of its transcription start site, along with the coactivator proteins they bring to DNA.

Further reading:

The Davidson lab: http://sugp.caltech.edu/endomes/


10 Systems Biology on Sat Nov 24, 2018 5:45 am

Admin
Systems Biology
From the year 2000 onwards, systems biology, a new branch of biology, has come to be used widely in a variety of contexts. The term systems biology was created by Bertalanffy in 1928. Systems biology focuses on complex interactions in biological systems by applying a holistic perspective. This kind of thinking has identified the ideas behind data processing in machines, such as silicon computers, and has also built a bridge to related approaches to the architecture and complex structure of biological systems in nature. Cells and organisms work on the basis of data flow. Data processing can be found in nature all the way down to the atomic and molecular level. Examples are DNA information storage and the histone code. Moreover, cells have the potential to compute (process data), both intracellularly (e.g. transcription networks) and during cell-to-cell communication. Higher-order cell systems such as the immune system, the endocrine system, the homeostasis system, and the nervous system can be described as computational systems. The most powerful biological computer we know is the human brain.

The brain is an "Uber-Computer", far more sophisticated than man-made computers
http://reasonandscience.catsboard.com/t2736-the-brain-is-a-uber-computer-far-more-sophisticated-that-man-made-computers

Plato introduced a concept called System in his dialogue Philebus. According to Plato, a system is a model for thinking about how complex structures are developed. Another idealistic philosopher, Kant, introduced the concept of self-organization in his Critique of Judgment in 1790. Systemics based on idealistic concepts has become important in contemporary science for understanding complexity and big data problems. 9 Cybernetics explains complex systems that consist of a large number of interacting and interrelated parts.
 
Systems biology is a revolutionary paradigm shift in scientific thinking and also has major implications for the historical sciences and for the elucidation of the origins of life and of biological diversity. Nature and computers are words that used to mean unrelated things. However, this view has changed with the paradigm shift towards systems biology. One of the aims of systems biology is to model and discover properties of cells, tissues and organisms functioning as a system whose theoretical description is only possible using techniques of systems biology. These typically involve metabolic networks or cell signalling networks. As a field of study, it focuses on the interactions between the components of biological systems and on how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway, or the heartbeat). Much effort has been made to elucidate the function of most of the biomolecular components and many of their interactions, but that alone does not offer concepts or methods for understanding how biological systems work holistically. The pluralism of causes and effects in biological networks is better addressed by observing, through quantitative measures, multiple components simultaneously and by rigorous data integration with mathematical models. 8 Such inquiry can also give answers to how such systems could have emerged.

DNA is a blueprint for producing, amongst other things, proteins: molecular machines that are discrete units, components of complex biological systems, and useful only in the completion of organismal subcomponents such as organs, or of organisms as a functional whole. The developmental gene regulatory network (dGRN) is like a central processing unit (CPU), the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logic, controlling and input/output (I/O) operations specified by the instructions. It is like the control unit that orchestrates the fetching (from memory) and execution of instructions by directing the coordinated operations of the arithmetic logic unit (ALU).

In most, if not all, animals, the multicellular state is established in each generation through serial divisions of the zygote, in which the daughter cells produced by these divisions become independent and fully specialized cell types. This functional specialization occurs largely during development and involves the tight coordination of cell proliferation, cell differentiation, tissue growth, and developmental genetic programs. Genes encoding transcription factors and signaling molecules are critical controllers of pattern formation and cell fate specification during development. Notably, most of these genes are highly conserved across animals (i.e., metazoans) and even their closest unicellular relatives. This striking level of conservation suggests that cell types and animal body plans are, at least partially, controlled by the regulatory capacities of these highly conserved genes. Yet, we cannot help but be intrigued by how such a conserved set of genes, with few examples of gene expansions and little change in their functionality, can lead to the vast diversity of cell types and body plan forms found in animals. Transcription factors and signalling molecules participate in multiple, independent developmental processes.


11 Initial strategies for unraveling dGRNs on Sat Nov 24, 2018 6:36 am

Admin
Initial strategies for unravelling dGRNs
The first theoretical model describing the mechanisms controlling gene regulation in higher eukaryotes was postulated by Eric H. Davidson 2 and his long-time colleague Roy J. Britten 3 in 1971. 1



Eric Harris Davidson (April 13, 1937 – September 1, 2015) was an American developmental biologist at the California Institute of Technology.

Eric H. Davidson:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3135751/

"No observations on single genes can ever illuminate the overall mechanisms of the development of the body plan or of body parts."

That single sentence has enormous significance for developmental and evolutionary biology.

Since then, dGRNs have been primarily studied via the careful and minute characterization of regulatory interactions in specific tissues and developmental stages in various organisms. Eric H. Davidson was also the founder and pioneer of studying the molecular interactions happening during development as a whole system. For nearly 30 years, Davidson’s lab was dedicated to untangling all specific interactions occurring during sea urchin (Strongylocentrotus purpuratus) development through targeted experiments. This journey was largely complemented by concurrent advances in DNA technologies. With the development of recombinant DNA in the early 1980s and through a series of experiments, Davidson and his colleagues characterized the genomic cis-regulatory sequence of the Endo16 gene (an endoderm-specific gene that is expressed at the onset of sea urchin gastrulation). These studies showed that protein-coding genes are controlled by nearby DNA regulatory sequences that serve as binding sites for transcription factors. They also showed that these binding sites are modular and that different modules govern different temporal aspects of gene expression. After years of these painstaking analyses of the function of developmental control genes and their cis-regulatory modules during developmental progression in the sea urchin, the description of the dGRN controlling the specification of the endomesoderm was published. 4 This study gave support to Britten and Davidson’s theoretical model envisioned in 1969.

Embryonic development is directed by the dynamic progression of regulatory states, that is, by a series of sub-circuits, each of which confers particular developmental functions. The last decade has seen a radical change in the way experiments in developmental and evolutionary biology are performed.

GRN reconstruction applied to embryonic development faces the challenge, both experimental and computational, of integrating spatial and temporal gene expression data as well as gene perturbation data in order to infer the actual regulators and targets in the dGRN. Although we can now generate high-throughput information on molecular composition, abundance, and spatial and temporal organization, we still lack precise ways to detect interactions accurately in a one-to-one manner. We expect that in the near future information derived from classical model systems will be coupled with dGRNs from non-classical model systems to identify key mechanisms that define the fascinating diversity of animal forms seen in nature. 5


1. Britten RJ, Davidson EH. 1971. Repetitive and non-repetitive DNA sequences and a speculation on the origins of evolutionary novelty. Q Rev Biol 46:111–38.
http://sci-hub.tw/https://www.jstor.org/stable/2822073?seq=1#page_scan_tab_contents
2. http://www.nasonline.org/publications/biographical-memoirs/memoir-pdfs/davidson-eric.pdf
3. http://sci-hub.tw/http://science.sciencemag.org/content/335/6073/1183
4. Davidson EH, Rast JP, Oliveri P, Ransick A, Calestani C, Yuh C-H, Minokawa T, Amore G, Hinman V, Arenas MC, et al. 2002. A genomic regulatory network for development. Science 295:1669–78.
http://sci-hub.tw/https://www.ncbi.nlm.nih.gov/pubmed/11872831
5. http://sci-hub.tw/https://academic.oup.com/icb/article-abstract/58/4/640/5039865?redirectedFrom=fulltext



Last edited by Admin on Tue Nov 27, 2018 2:56 pm; edited 1 time in total


12 Boolean Spatial Output on Sat Nov 24, 2018 8:04 pm

Admin
Boolean Spatial Output

Animal bodies display a fundamentally Boolean character at every level of organization. The body parts of which they are composed are discrete and their structure and function as well as location within the body are deterministically programmed and spatially bounded. Heads are not partly legs, eyes do not blend into mouths, and the pelvis does not grade into the spine. Similarly, at a microscopic level, differentiated cell types are discrete. Neurons, muscle cells, keratinocytes, and gland cells each express discrete sets of effector genes which determine their respective functions, and their spatial location within each body part is also discretely determined in development. This discrete organization of the body plan is the outcome of a sequential series of regulatory definitions of space in the developing organism. In molecular terms, such definitions are achieved by the spatially discrete expression of sets of regulatory genes. From the point at which the cells of an embryo begin to express their genes, the patterns of expression are discrete and Boolean. That is, cells in every domain of the embryo express some genes that are not expressed in other domains. The expression of genes in some cells but not in others requires that their regulators be present only in some but not in other cells.
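The Boolean, domain-by-domain character of expression described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the domain names, the regulators `act_A`, `act_B` and `rep_R`, and the two target genes with their logic rules are all hypothetical and do not correspond to any real embryo.

```python
# Hypothetical regulatory states: which regulators are present in each domain.
DOMAINS = {
    "domain_1": {"act_A"},
    "domain_2": {"act_A", "rep_R"},
    "domain_3": {"act_B"},
}

def gene_x(regulators):
    """Hypothetical rule: on where activator A is present and repressor R is absent."""
    return "act_A" in regulators and "rep_R" not in regulators

def gene_y(regulators):
    """Hypothetical rule: on wherever activator B is present."""
    return "act_B" in regulators

# Each gene is either on or off in each domain; there are no intermediate values.
pattern = {
    name: {"gene_x": gene_x(regs), "gene_y": gene_y(regs)}
    for name, regs in DOMAINS.items()
}

for name, genes in pattern.items():
    print(name, genes)
```

The sketch reflects the last sentence of the paragraph: a target gene can be on in some domains and off in others only because its regulators are themselves present in some domains and absent from others.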


13 Evolution of the dGRN on Mon Nov 26, 2018 5:05 am

Admin
Origin of the gene regulatory network, through evolution, or design?

Just as development is a system property of the regulatory genome, changes in the developmental process must also be considered at the system level. Modification of the body plan depends on alteration of the structure of developmental gene regulatory networks as a whole. The hierarchy and multiple additional design features of these networks act to produce Boolean regulatory state specification functions at upstream phases of development of the body plan. These are created by the logic outputs of network subcircuits, and in modern animals these outputs are impervious to continuous adaptive variation. The animal body plan is a system-level property of the developmental gene regulatory networks (dGRNs) which control ontogeny of the body plan. It follows that gross morphological novelty requires dramatic alterations in dGRN architecture, always involving multiple regulatory genes, and typically affecting the deployment of whole network subcircuits. Because dGRNs are deeply hierarchical, and it is the upper levels of these GRNs that control major morphological features in development, a question dealt with below in this essay arises: how can we think about selection in respect to dGRN organization? The answers lie in the architecture of dGRNs and the developmental logic they generate at the system level, far from micro-evolutionary mechanism. While adaptive evolutionary variation occurs constantly in modern animals at the periphery of dGRNs, the stability over geological epochs of the developmental properties that define the major attributes of their body plans requires special explanations rooted deep in the structure/function relations of dGRNs.


1. http://sci-hub.tw/https://www.sciencedirect.com/science/article/pii/S0012160611000911#bb0110


14 Internal Signaling and Information on Sat Dec 01, 2018 2:05 pm

Admin
Wolpert introduced the conceptual framework of ‘positional information’ in which developmental pattern formation is dependent on cells interpreting positional values that they have acquired from external signals. 1 The molecular genetics revolution of the 1980s and 1990s led to the identification of several molecules that behave as graded patterning signals. Complementary studies have aimed to understand how cells respond to graded signals to control differential gene expression. Finally, a combination of genetics, genomics, misexpression studies, network analysis and mathematical modeling has led to new views of morphogen interpretation. A set of common design principles underpins the patterning of both the Drosophila blastoderm and the vertebrate neural tube. These principles form a basis for a revised theory of morphogen-mediated pattern formation. We argue that this theory is likely to be relevant to many tissues and discuss the rationale that might account for this strategy of tissue patterning.

Anterior-posterior (AP) patterning of the Drosophila blastoderm



AP patterning of the early Drosophila embryo involves maternal gradients of two homeodomain proteins: Bicoid (Bcd) and Caudal (Cad). Bcd protein is translated from a source of mRNA at the anterior pole and diffuses posteriorly through the syncytial blastoderm, forming a long-range AP gradient, with highest levels at the anterior end. Complementing the Bcd gradient is an anti-parallel gradient of Cad, which is shaped by Bcd-mediated translational repression. These gradients are initially formed near the cortex of the oocyte, while nuclei divide rapidly in the central region. After ten nuclear division cycles, nuclei migrate to the periphery and import different amounts of Bcd and Cad, depending on their position along the AP axis.

Bcd activates target genes that create boundaries at defined positions along the AP axis, dividing the body plan into regions that will become cephalic (C), thoracic (T1-T3) and abdominal (A1-A8) segments. Bcd target genes include sloppy-paired 1 (slp1), giant (gt) and hunchback (hb), which are activated in overlapping domains in anterior regions. slp1, gt and hb encode repressors, which prevent expression of run, Kr and kni, respectively. Mutual repression between these pairs of repressors refines their patterns, creating sharp gene expression boundaries that foreshadow the organization of the body plan.
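A toy sketch can show how a graded signal is turned into boundaries at defined positions. It assumes an exponentially decaying Bcd profile and simple threshold readout, and it models only the repression step named above; the decay length, the threshold values and their ordering, and the `readout` helper are hypothetical illustrations, not measured quantities. Mutual repression and the activators of run, Kr and kni are left out.

```python
import math

LENGTH_SCALE = 0.2  # assumed decay length, as a fraction of embryo length

def bcd(x):
    """Relative Bcd level at position x (0 = anterior pole, 1 = posterior pole)."""
    return math.exp(-x / LENGTH_SCALE)

# Hypothetical activation thresholds (relative Bcd level) for each target gene.
THRESHOLDS = {"slp1": 0.5, "gt": 0.3, "hb": 0.1}

# Repressor -> gene whose expression it prevents, as stated in the text.
REPRESSES = {"slp1": "run", "gt": "Kr", "hb": "kni"}

def readout(x):
    """Return (Bcd targets that are on, downstream genes not repressed) at x."""
    level = bcd(x)
    on = {gene for gene, threshold in THRESHOLDS.items() if level >= threshold}
    permitted = {target for rep, target in REPRESSES.items() if rep not in on}
    return on, permitted

for x in (0.05, 0.2, 0.3, 0.6):
    on, permitted = readout(x)
    print(f"x={x:.2f}  on={sorted(on)}  not repressed={sorted(permitted)}")
```

With these made-up thresholds the three targets switch off one after another along the axis, giving overlapping anterior domains, and each downstream gene becomes permitted only behind the boundary of its repressor.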

The Drosophila blastoderm and the vertebrate neural tube: distinct but alike
In the developing chordate (including vertebrates), the neural tube is the embryonic precursor to the central nervous system, which is made up of the brain and spinal cord. 3
The Drosophila blastoderm and the vertebrate neural tube have distinct development. The signals that act as positional cues in the two tissues are unrelated, the transcription factors (TFs) involved are not orthologous, and the time scales of pattern formation are dissimilar. Establishing the Bcd gradient and the emergence of the gap gene pattern happens within the first ∼2 h of Drosophila development. Indeed, gap gene expression is first detected at nuclear cycle 10 and pattern is generally considered fully manifest during nuclear cycle 14, a period of ∼60 min after these genes are initially expressed. In the neural tube, by contrast, the period of patterning varies between species but takes many hours. For example, in chick and mouse embryos the establishment and elaboration of pattern occur over a period of more than 18 h. This difference in timing might be directly related to the substantial differences in the cell biology of the two tissues. The Drosophila blastoderm is a syncytium with nuclei residing in a shared cytoplasm undergoing synchronized divisions. The absence of cytoplasmic divisions in the blastoderm allows the relatively unfettered movement of TFs between neighboring nuclei, especially during mitosis when the nuclear membranes have broken down. By contrast, the neural tube is a pseudostratified epithelial sheet composed of multiple individual cells proliferating asynchronously. Long-range signaling within the neural tube relies on secreted proteins, which are received by transmembrane receptors and transduced by intracellular signaling pathways.

Generating the Dorsal-Ventral Axis




Internal Signaling and Information
What follows demonstrates the connection of the signalling framework to gene regulation and shows how this provides a different way to think about genetic information. Positional information is one example where informational concepts in signalling systems are clearly applied, so signalling networks and pathways have a significant role in developmental biology. The information is computed from a broad range of incoming cues, and this computation can affect the genetic switches doing the processing. Thus, there is great flexibility in the relation the information can bear to the environmental context that induces it, and this relation can be fine-tuned to adapt the organism to the environmental conditions.

The following analogy is useful when looking at more complex signalling in gene regulatory networks. Imagine a game, "crosstown traffic", in which a car has to be maneuvered from one side of town to the other without crashing. The car contains a driver and a passenger. The driver operates the car normally in most respects but ignores the color of traffic lights. Instead, when the driver comes to a traffic light, he pays attention to the passenger, who has his hand on the driver’s knee. Upon seeing a red or green traffic light, the passenger either squeezes or relaxes his hand. The driver responds to this signal by either accelerating or braking (see the figure below, top row). The game has all the basic components of a signalling game. There are two states of the world (red/green), two signals (squeeze/relax), and two actions (accelerate/brake). There is a signal sender (the passenger) and a signal receiver (the driver) who share in the success or failure of their interaction.


Two types of internal signaling. Top: Signaling in the game “crosstown traffic.” Bottom: Signaling between genes.

To do well in the game, a driver-passenger team needs to map the states of the world to the right actions: accelerate on green, and brake on red. Two things mediate the mapping between states and actions: how the passenger maps a state to a signal, and how the driver maps a signal to an action. If both driver and passenger always follow the same rules for how they do this mapping, then the best teams will follow one of these two rule sets:

Rule set 1: The passenger squeezes on red and relaxes on green; the driver brakes on squeeze and accelerates on relax. 
Rule set 2: The passenger relaxes on red and squeezes on green; the driver accelerates on squeeze and brakes on relax.
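
That exactly these two rule sets succeed can be checked by brute force. The sketch below enumerates every possible passenger rule and every possible driver rule and keeps the pairs that always take the right action. The dictionary encoding and the function names are illustrative choices, not part of the original description.

```python
from itertools import product

STATES = ("red", "green")
SIGNALS = ("squeeze", "relax")
ACTIONS = ("brake", "accelerate")

# The action that avoids a crash in each state of the world.
CORRECT = {"red": "brake", "green": "accelerate"}


def all_rules(inputs, outputs):
    """Every possible mapping from inputs to outputs, as a list of dicts."""
    return [dict(zip(inputs, choice)) for choice in product(outputs, repeat=len(inputs))]


def team_succeeds(sender, receiver):
    """True if the passenger-driver pair takes the correct action in every state."""
    return all(receiver[sender[state]] == CORRECT[state] for state in STATES)


# 4 passenger rules x 4 driver rules = 16 possible teams.
winning = [
    (sender, receiver)
    for sender in all_rules(STATES, SIGNALS)
    for receiver in all_rules(SIGNALS, ACTIONS)
    if team_succeeds(sender, receiver)
]

for sender, receiver in winning:
    print("passenger:", sender, "| driver:", receiver)
```

Of the sixteen possible teams, only the two listed above map red to braking and green to accelerating every time.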

Now imagine a fanciful scenario where selection plays a role in crosstown traffic. Cars with teams begin driving across town, and initially, both driver and passenger follow a random rule set. Cars that crash are selected against. Cars making it to the other side of town inspire new teams to copy their rule set, making the occasional copying errors. The new teams then drive back across town, and the cycle repeats. Given enough teams and enough time, either rule set 1 or rule set 2 would predominate, as teams following any other rule set would have a far higher chance of crashing. In this selective scenario, the car is an individual, reacting to the local state by modifying its behaviour. The driver and passenger are internal parts producing the behaviour and sharing the same evolutionary fate—crashing or surviving together. Selection acts on the behaviour of the system as a whole (the car) and in doing so produces an internal signalling system between individual parts (the driver and passenger). This is why it is called internal signalling. 
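
The fanciful scenario can be written down as a small simulation. This is only a sketch under stated assumptions: the number of teams, the number of traffic lights per trip, the copying error rate, the rule that any wrong action causes a crash, and the random seed are all invented for illustration and are not given in the text above.

```python
import random

STATES = ("red", "green")
SIGNALS = ("squeeze", "relax")
ACTIONS = ("brake", "accelerate")
CORRECT = {"red": "brake", "green": "accelerate"}


def random_team(rng):
    """A team whose passenger rule and driver rule are both chosen at random."""
    sender = {state: rng.choice(SIGNALS) for state in STATES}
    receiver = {signal: rng.choice(ACTIONS) for signal in SIGNALS}
    return sender, receiver


def survives(team, rng, lights=5):
    """Drive past a few random lights; any wrong action counts as a crash (assumption)."""
    sender, receiver = team
    for _ in range(lights):
        state = rng.choice(STATES)
        if receiver[sender[state]] != CORRECT[state]:
            return False
    return True


def copy_with_errors(team, rng, error_rate):
    """Copy a rule set, occasionally replacing an entry with a random one."""
    sender, receiver = team
    new_sender = {s: (rng.choice(SIGNALS) if rng.random() < error_rate else sig)
                  for s, sig in sender.items()}
    new_receiver = {sig: (rng.choice(ACTIONS) if rng.random() < error_rate else act)
                    for sig, act in receiver.items()}
    return new_sender, new_receiver


def is_winning(team):
    """True for rule set 1 or rule set 2."""
    sender, receiver = team
    return all(receiver[sender[state]] == CORRECT[state] for state in STATES)


def simulate(teams=200, trips=30, error_rate=0.02, seed=0):
    """Return the fraction of teams using a winning rule set after the last trip."""
    rng = random.Random(seed)
    population = [random_team(rng) for _ in range(teams)]
    for _ in range(trips):
        survivors = [team for team in population if survives(team, rng)]
        if not survivors:  # everyone crashed: start again from random teams
            survivors = [random_team(rng) for _ in range(teams)]
        population = [copy_with_errors(rng.choice(survivors), rng, error_rate)
                      for _ in range(teams)]
    return sum(is_winning(team) for team in population) / teams


fraction = simulate()
print(f"fraction of teams using rule set 1 or 2: {fraction:.2f}")
```

Note that the crash rule, the copying step and the repetition of trips are all written into the program by hand; the simulation only shows what follows once those are in place.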

Question: Why would natural mechanisms begin this complex behaviour of trial and error at all? And why would they set rules, which is something that intelligence does? There is no goal to do well or to reach a certain outcome. Why would matter try to take the right actions? How could it set a distant goal if matter has no goal of getting from A to B?

Information is created in the signals during this game. When a successful signalling system (rule set 1 or 2) is established between the internal parts, the signal (squeeze/relax) carries information, in the following way.

Why would nature keep that information for upcoming drives if there are no such goals? 

Imagine being a second passenger sitting in the back seat of a car, whose driver and passenger play using one of the successful rule sets, and suppose you know the exact rule set they are using. Seeing a signal, such as the passenger squeezing the driver’s knee, provides information about two things. First, the signal tells you something about the state of the world (is the light green or red?). Second, the signal tells you something about what the driver will do next (will they accelerate or brake?). An evolved signal inside the car thus carries information about both states of the world and about the actions of the driver, and hence about the actions of the car as a whole.
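
One way to make "carries information" concrete is to measure it in bits. The sketch below uses Shannon's mutual information, which is a choice made here for illustration and is not named in the text. It also assumes that red and green lights are equally likely, and the always_squeeze rule is a made-up contrast case.

```python
from collections import Counter
from math import log2

STATES = ("red", "green")  # assumed to be equally likely


def mutual_information(pairs):
    """Mutual information in bits between x and y for equally likely (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (count / n) * log2((count / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), count in joint.items()
    )


def signal_information(sender, receiver):
    """Bits the signal carries about the state, and about the driver's action."""
    state_signal = [(state, sender[state]) for state in STATES]
    signal_action = [(sender[state], receiver[sender[state]]) for state in STATES]
    return mutual_information(state_signal), mutual_information(signal_action)


# Rule set 1 from the text.
rule_1_sender = {"red": "squeeze", "green": "relax"}
rule_1_receiver = {"squeeze": "brake", "relax": "accelerate"}

# A hypothetical passenger who squeezes no matter what the light shows.
always_squeeze = {"red": "squeeze", "green": "squeeze"}

print(signal_information(rule_1_sender, rule_1_receiver))
print(signal_information(always_squeeze, rule_1_receiver))
```

Under rule set 1 the squeeze tells the back-seat observer one full bit about the light and one full bit about what the driver will do; a passenger who always squeezes tells the observer nothing.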

Gene Regulation
All the components of an internal signaling game are present in gene regulatory networks (see the figure above, bottom row). Genes are transcribed into RNA by RNA polymerase, which walks along the bases of a strand of DNA and produces a corresponding strand of RNA. This RNA is then used to produce a protein. To begin transcription, the RNA polymerase must first bind to the promoter region. The polymerase must stick to the promoter long enough to kickstart transcription, and this crucial moment can be either encouraged or prevented by a protein called a transcription factor. Transcription factors bind to a short portion of the DNA strand called a cis-regulatory element (cis-element for short). They bind to cis-elements because the shape of the protein fits the shape of the short sequence of bases on the DNA, so specific transcription factors bind only to specific cis-elements. Many transcription factors exist in two different stable shapes, and the presence of a small molecule, such as a hormone, can switch them to their active shape, which is the only one capable of binding to the cis-element. So the transcription of a gene can be sensitive to the local conditions in the cellular environment. Once these transcription factors bind to the cis-elements, they do one of two things: they prevent the RNA polymerase from getting to the promoter and thus repress transcription, or they stick to the RNA polymerase and help stabilize it long enough to activate transcription. Several cis-elements may exist for a single gene, and they can act in concert, binding multiple transcription factors to repress or activate a gene. One transcription factor might activate a gene, while another represses activation. The transcription of a gene can thus be a function of the presence of several transcription factors. We have a gene, a promoter, and some cis-elements. Together, I refer to these as a genetic switch.

A genetic switch can act by transcribing a protein that has some effect on fitness in the current environment—producing hemoglobin or muscle fiber, for example. A genetic switch can respond to some external state, as many transcription factors bind to DNA only in the presence of certain molecules from the environment. In addition, our switch can even do some basic information processing, responding to the combined presence and absence of transcription factors. What we do not have, yet, are signals.
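
A genetic switch of this kind can be sketched as a small Boolean function. Everything named below (the factor names, the ligands, the gene name) is hypothetical, and the rule "at least one activator bound and no repressor bound" is only one simple assumption about how several factors might combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticSwitch:
    """A gene plus the transcription factors its cis-elements can bind."""
    gene: str
    activators: frozenset  # factors that help the polymerase stay on the promoter
    repressors: frozenset  # factors that keep the polymerase off the promoter

    def is_transcribed(self, bound_factors):
        """Assumed logic: some activator is bound and no repressor is bound."""
        bound = set(bound_factors)
        activated = bool(self.activators & bound)
        repressed = bool(self.repressors & bound)
        return activated and not repressed


def active_factors(factors, molecules_present):
    """Factors in their binding shape: those whose required small molecule is present."""
    return {
        factor
        for factor, ligand in factors.items()
        if ligand is None or ligand in molecules_present
    }


# Hypothetical factors, each switched to its active shape by a small molecule.
factors = {"activator_tf": "hormone", "repressor_tf": "stress_molecule"}

switch = GeneticSwitch(
    gene="structural_gene",
    activators=frozenset({"activator_tf"}),
    repressors=frozenset({"repressor_tf"}),
)

for molecules in (set(), {"hormone"}, {"hormone", "stress_molecule"}):
    bound = active_factors(factors, molecules)
    print(sorted(molecules), "->", switch.is_transcribed(bound))
```

The switch responds to the combined presence and absence of the two factors, which is the basic information processing described above.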

Gene Regulation as Signaling 
The genes that produce proteins like hemoglobin or muscle fiber play a direct role in the fitness of the individual. Genes like this are called structural genes. They contrast with regulatory genes, which produce the transcription factors that activate or repress the transcription of further genes. These are what Maynard Smith had in mind when he talked of genes sending signals to other genes. Like crosstown traffic, the final mapping from state to action in a gene regulatory network is determined by a series of intermediate mappings. Each genetic switch maps incoming information from the environment (or from the output of another genetic switch) by regulating the transcription of a protein. If the gene being transcribed is structural, the protein goes on to perform an action; a genetic switch like this is a receiver. If the genetic switch transcribes a regulatory gene, then the protein can signal further downstream genes; a genetic switch like this is a sender. Some genetic switches may act as both sender and receiver, for gene regulation can consist of chains of gene regulation, where genes regulate genes that regulate still more genes. The complex regulatory connections between genetic switches form a gene regulatory network. Each individual mapping at a genetic switch is, like crosstown traffic, governed by a modifiable rule set. The rules are determined by the sequence of DNA composing the cis-element of the genetic switch and the DNA that is transcribed into a protein. One kind of modification changes the function governing the switch. Different proteins interact in different ways, so mutations affecting the region of DNA containing the cis-elements can affect the function controlling whether a gene is turned on or not. Mutations can thus affect the localized rules governing a single genetic switch.
Perhaps more importantly, mutations within the cis-element regions can affect which upstream signals the genetic switch responds to, and mutations within the transcribed DNA can change the shape of the transcription factor produced, thus changing the signal being sent.

Mutations that change the signals sent from and received at each genetic switch rewire the topology of the network, changing its global information processing capacity. A final similarity is worth pointing out. In a signalling game, such as crosstown traffic, a number of different mappings can produce equivalent results (recall that both rule set 1 and rule set 2 did an equivalent job). So the particulars of an intermediary signal do not matter, as long as the state-action mapping produces a successful behaviour. The signals are conventions, as they could have been otherwise. We see this conventional aspect in gene signalling too, for it is possible for different transcription factors to fulfil the same intermediary regulatory task in related species. Transcription factors are signals, and gene regulatory networks make up complex signalling networks. They produce actions in response to local states and use chains of intermediate signals to do so. Gene networks do not just relay information, however; they also process information. We can incorporate this idea into the signalling framework too.

Information Processing and Signaling
A genetic network is considerably more complex than a passenger-driver team, as many genetic switches can be wired together. A single genetic switch can already do some minimal processing, integrating information from several sources using a simple function. Wiring together several genetic switches makes a regulatory network capable of even more complex information processing. Information processing in gene networks takes place at many levels. We find recognizable network motifs—small ensembles of two, three, or four genetic switches wired together—which can perform a recognizable logic function or signal processing of some kind. More complex subnetworks containing many genetic switches perform precise developmental tasks, such as maintaining boundaries between differentiating cells.

The ensemble of genetic switches in a gene regulatory network is capable of using a broad range of complex inputs to control and organize many possible actions, much like an ensemble of neurons in the brain does. We can incorporate information processing into the signalling framework in the following way. A sender may need to process information because it does not have the relevant state of the world handed to it but must calculate it from a variety of cues. A receiver may need to process information because taking the appropriate action may depend on more than one signal. Genetic switches are already capable of approximating some simple logical operations. Genetic switches acting together can integrate information to map world states to signals and signals to actions. A sender in a genetic network might be a gene or, if complex processing is required, a network of genes. A network could integrate upstream information from a series of cues to produce a single signal, or transcription factor. A receiver, equally, may be a gene or network of genes integrating signals to produce some action. An example where signalling and information processing are combined occurs in pattern formation of early multicellular development.

Multicellular Development and Positional Information 
While each cell contains a copy of the same genes, the cells in different parts of our bodies look and act very differently. The making of organisms requires a genetic network capable of a broad range of different actions, to produce a variety of cell types and context-sensitive cellular behaviour. But how, and where, is each cell programmed with the particular action it has to take? During development, this question is answered, at least in part, by looking to the spatial expression patterns of different transcription factors. Localized regions of cells switch on the same transcription factors, carving out a spatial landscape in a developing embryo that controls the behaviour of cells in the respective regions. In the development of a fly, for example, the location of the limbs is specified by integrating three upstream transcription factors, each of which is expressed at overlapping locations in the embryo.


Positional information expressed by transcription factors in the fly embryo. 
Three upstream genes produce signals that carry positional information computed from a variety of inputs. These signals are integrated downstream to initiate an action: the production of limbs.

(A) One protein is expressed in cells lying in a set of bands along the body axis, 
(B) another is expressed in cells in the ventral region,
(C) and another in the posterior. 

These proteins bind to a further gene that integrates the expression of all three with a Boolean function (A and not (B or C)), resulting in the expression of a protein in a series of small regions along the body midline that initiates limb development. The three transcription factors themselves are produced from a variety of complex upstream inputs, including prior cell states and spatially localized concentrations of different substances called morphogens. Transcription factors expressed in such spatial patterns are said to contain positional information. This use of informational language is thoroughly appropriate when viewed using the signalling framework. We can see that each role in a signalling system is filled out, as well as how this licenses talk of information, in the same way that it does in other signalling games:

- Each of the three transcription factors is expressed as the result of mapping a complex set of upstream cues from morphogens and prior cell states into a single transcription factor. 
- These transcription factors induce a downstream genetic switch to act, and this switch maps the three incoming signals to another single protein that initiates limb development. 
- The three transcription factors thus act as intermediaries, or signals, between incoming states and some further downstream actions. 
- Furthermore, each transcription factor expressed in the three different regions delivers a signal containing information. Just as seeing what the passenger does in crosstown traffic tells you about what colour the lights are, seeing what regulatory genes are being expressed tells you about the location of the cell in the three-dimensional form that is developing.
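
The integration step, A and not (B or C), can be sketched on a toy grid of cells. The grid size and the three expression patterns below are invented for illustration only; they stand in for "bands along the body axis", "ventral region" and "posterior" and are not measured fly data.

```python
WIDTH, HEIGHT = 12, 4  # toy embryo: x runs anterior to posterior, y runs ventral to dorsal


def expresses_a(x, y):
    """Hypothetical protein A: a set of bands along the body axis."""
    return x % 3 == 1


def expresses_b(x, y):
    """Hypothetical protein B: the ventral-most row of cells."""
    return y == 0


def expresses_c(x, y):
    """Hypothetical protein C: the posterior end."""
    return x >= 9


def limb_switch(a, b, c):
    """The Boolean function from the text: A and not (B or C)."""
    return a and not (b or c)


limb_cells = {
    (x, y)
    for x in range(WIDTH)
    for y in range(HEIGHT)
    if limb_switch(expresses_a(x, y), expresses_b(x, y), expresses_c(x, y))
}

# Print the grid with the dorsal side on top; '#' marks cells where the switch is on.
for y in reversed(range(HEIGHT)):
    print("".join("#" if (x, y) in limb_cells else "." for x in range(WIDTH)))
```

The downstream switch turns on only in the small patches where band A is present and neither B nor C is, so three broad upstream patterns are mapped to a few well-defined locations.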

Reusing Existing Information 
Positional information presents a more complex picture of signalling systems than crosstown traffic. In the simple game, the relationship between states, signals, and actions was straightforward: a state mapped to a signal, and a signal mapped to an action. Here, however, signals are sandwiched between both upstream and downstream information processing. Upstream, a number of cues are integrated into a signal that indicates spatial position. Downstream, these signals may coordinate several different actions. The signal itself, though it carries information about spatial location, bears no simple one-to-one relationship to either the input (the incoming cues from morphogens or prior cell states) or the output (the resulting proteins that produce limb development). The idea of internal signals carrying processed information has an important upshot. To evolve signals like this, the right configuration of genes must be selected so that they correctly process the multiple inputs into a single signal, ready to be passed downstream for further processing. The information carried by the signal obviously plays a role in the function it was directly selected for. But cells perform many simultaneous actions, so that same information may well prove useful for other downstream functions dependent on cell location. Such changes in gene regulation are thought to be a key part of the making of complex animal form.

Wing Spots for Free
As with the specification of limb development in fruit flies, a number of regulatory genes are expressed in different locations in the wing.



Upstream positional information produces a novel wing spot in D. biarmipes.

One transcription factor (A) is expressed along the anterior of the wing, while another (B) is expressed along the tip of the wing. These regulatory elements play a downstream role in the development of the existing architecture of the wing (C). The rapid appearance of wing spots arises from changes in a genetic switch controlling pigment expression. Mutations to the cis-element region enable existing transcription factors to bind upstream of this gene and hence control its transcription (see the figure above, lower right). The incoming signals from the two regions are integrated, and the genetic switch turns on when the Boolean combination of upstream regulators (A and not B) is true. When the gene is turned on, it up-regulates pigmentation in the anterior distal position of the wing (D), producing the distinctive spot.
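
The effect of the cis-element mutations can be sketched as a before-and-after truth table. The sketch assumes, for illustration only, that the ancestral pigment switch has no binding sites for A or B and that the derived switch has gained both; the function and variable names are hypothetical.

```python
from itertools import product


def pigment_switch(binding_sites, a_present, b_present):
    """Extra pigment is made only if A can bind and is present, and B is not repressing."""
    activated = "A" in binding_sites and a_present
    repressed = "B" in binding_sites and b_present
    return activated and not repressed


# Assumed ancestral state: the switch cannot bind A or B, so it ignores both signals.
ancestral_sites = frozenset()

# Assumed derived state: cis-element mutations have added binding sites for A and B.
derived_sites = ancestral_sites | {"A", "B"}

print("A      B      ancestral  derived")
for a, b in product([False, True], repeat=2):
    print(f"{a!s:<6} {b!s:<6} "
          f"{pigment_switch(ancestral_sites, a, b)!s:<10} "
          f"{pigment_switch(derived_sites, a, b)}")
```

The signals A and B are the same in both columns; only the switch's ability to read them differs, and with the binding sites in place it computes A and not B.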

For an organism to develop correctly, it is crucial that each cell does the right thing in the right place at the right time. Each cell has the same genetic network, so this network must be able to map complex upstream inputs to a broad range of cellular activities. To accomplish this, the network must integrate information from various sources, and, in the process, it creates internal signals that carry computed information. In some cases, a few mutations allow other genetic switches to make use of the already-existing information in these signals to perform a novel adaptive task. Hence, signalling systems do permit the use of pre-programmed information for adaptation.



1. http://sci-hub.tw/https://www.jstor.org/stable/10.1086/677687?seq=1#page_scan_tab_contents
2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4712844/
3. https://en.wikipedia.org/wiki/Neural_tube
