The DNA methylation code and language
Perhaps the most widely investigated mechanism of epigenetic control is DNA methylation, the modification of DNA by the addition of methyl groups to CpG residues by DNA methyltransferases. 17 CpG dinucleotides are found throughout the genome in dense clusters, referred to as CpG islands, that are associated with a large proportion of gene promoters. CpG methylation is generally, although not always, associated with gene silencing. It facilitates gene repression by acting as a recruiting surface for proteins that possess methyl-binding domains and initiate chromatin condensation. Interestingly, DNA methylation alone has also been shown to promote chromatin condensation. 17
DNA methylation takes place at cytosine or adenine, but mainly at cytosine when it is followed by a guanine (a so-called CpG dinucleotide). The subject is very complex, and I will not go into details. Suffice it to say that methylation at CpGs usually has a repressive effect on DNA (methylated DNA is generally not active). The following figure shows how unmethylated CpG islands are usually found at promoters or other active regions, while methylated CpGs correspond to inactive segments, for example, inactivated transposable elements.
Our finding suggests that the cytosine DNA methylation (CDM) signal comprises a language scheme properly constrained by molecular thermodynamic principles, and that it is part of an epigenomic communication system that obeys the same thermodynamic rules as current human communication systems. In a section on the binary language of cytosine DNA methylation, we show that the architecture of small clusters (“words”) of CDM is not only characterized by maximum entropy and least effort principles but also fits the statistical mechanics given by the Weibull distribution on both statistical and physical grounds. The results obtained may reflect the existence of a methylation language, with “words” written in the binary alphabet of methylated (1) and non-methylated (0) bases. 11
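The binary alphabet described above can be made concrete with a small sketch. It is only an illustration, not the authors' analysis: the input string, the fixed word length of 4, and the function names `methylation_words` and `shannon_entropy` are all assumptions introduced here, and the Weibull fit mentioned in the excerpt is not attempted. The code splits a string of methylation calls into fixed-length "words" and computes the Shannon entropy of the resulting word distribution.

```python
import math
from collections import Counter


def methylation_words(states, word_len):
    """Split a string of 0/1 methylation calls into non-overlapping words.

    Trailing positions that do not fill a whole word are dropped.
    """
    usable = len(states) - len(states) % word_len
    return [states[i:i + word_len] for i in range(0, usable, word_len)]


def shannon_entropy(words):
    """Shannon entropy, in bits, of the empirical distribution of words."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical methylation calls along a stretch of cytosines:
# 1 = methylated, 0 = non-methylated (made-up data for illustration).
states = "1100001111000011"
words = methylation_words(states, 4)
entropy = shannon_entropy(words)
print(words, entropy)
```

Here the two words "1100" and "0011" each occur twice, so the word distribution carries exactly one bit of entropy. A real analysis would use measured methylation calls and would have to justify how word boundaries are chosen.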
DNA methylation, the addition of a methyl group to one of the bases in the deoxyribonucleic acid chain, does not change the primary DNA sequence and is therefore considered to be an epigenetic modification. 9
Methylation of cytosine in DNA is linked with gene regulation, and this has profound implications in development, normal biology, and disease conditions in many eukaryotic organisms. A wide range of methods and approaches exists for its identification, quantification, and mapping within the genome. DNA methylation is the most extensively studied epigenetic mechanism, and it plays multiple roles in key cellular processes, including regulation of gene expression, embryonic development, genomic imprinting, and chromosome stability. A plethora of experimental studies demonstrate that deregulation of DNA methylation is intimately linked with many human diseases, most notably cancer. DNA methylation is generally repressive to transcription and therefore constitutes an important mechanism for gene silencing in embryonic development. Although cytosine methylation is the most studied modification, adenine has also been found to be methylated in prokaryotes and plants. In prokaryotes, DNA methylation is involved in processes such as determination of DNA host specificity, virulence, cell cycle regulation and gene expression. 4
Cytosine DNA methylation (CDM) is a stable epigenetic modification to the genome and a widespread regulatory process in living organisms that involves multicomponent molecular machines. Genome-wide cytosine methylation patterning participates in the epigenetic reprogramming of a cell, suggesting that the biological information contained within methylation positions may be amenable to decoding. Adaptation to a new cellular or organismal environment also implies the potential for genome-wide redistribution of CDM changes that will ensure the stability of DNA molecules. In higher eukaryotes, DNA methylation is involved in the regulation of several cellular processes such as chromatin stability, imprinting, X chromosome inactivation, and carcinogenesis. 5
In mammals, DNA methylation occurs mainly on the fifth carbon of the cytosine base, forming what is known as 5-methylcytosine or 5-methylcytidine (5-mC), and it is almost exclusively found at CpG dinucleotides. 5-mC is a potent epigenetic marker and regulator of gene expression. Methylated CpG clusters – named CpG islands – at gene promoters have been associated with gene inactivation. DNA methylation is catalyzed by a family of enzymes called DNA methyltransferases, which includes DNMT1, DNMT3a and DNMT3b. DNMT3a and DNMT3b are known as de novo methyltransferases because they are able to methylate previously unmethylated CpG dinucleotides.
DNA methylation can be regarded as a preprogrammed epigenetic mechanism of adaptation, or micro-evolution.
One of the best-known epigenetic mechanisms is DNA methylation, in which a small molecule (a methyl group) is added to the DNA macromolecule at particular locations. Like a barcode or marker, the methyl group indicates, for instance, which genes in the DNA are to be turned on. DNA methylation is carried out by a protein machine that adds the methyl group at precisely the right location in the DNA strand. It occurs at target sites along the DNA where specific short sequences appear. Special proteins find these sequences as they move along the DNA and add a methyl group to the adenine base that appears within the sequence. The protein binds to the DNA and twists the helix so that the adenine base rotates into a precisely shaped pocket in the protein; the protein then facilitates the transfer of the methyl group from a small donor molecule to the adenine. 7
DNA methylation has several uses in the vertebrate cell. A very important role is to work in conjunction with other gene expression control mechanisms to establish a particularly efficient form of gene repression. This combination of mechanisms ensures that unneeded eukaryotic genes can be repressed to very high degrees. For example, the rate at which a vertebrate gene is transcribed can vary 10^6-fold between one tissue and another. Unexpressed vertebrate genes are much less “leaky” in terms of transcription than bacterial genes, in which the largest known differences in transcription rates between expressed and unexpressed gene states are about 1000-fold. DNA methylation helps to repress transcription in several ways. The methyl groups on methylated cytosines lie in the major groove of DNA and interfere directly with the binding of proteins (transcription regulators as well as the general transcription factors) required for transcription initiation. In addition, the cell contains a repertoire of proteins that bind specifically to methylated DNA. The best characterized of these associate with histone-modifying enzymes, leading to a repressive chromatin state in which chromatin structure and DNA methylation act synergistically. One reflection of the importance of DNA methylation to humans is the widespread involvement of “incorrect” DNA methylation patterns in cancer progression.
DNA methylation inhibits gene transcription
Let’s now turn our attention to a mechanism that usually silences gene expression. DNA structure can be modified by the covalent attachment of methyl groups (—CH3) by an enzyme called DNA methyltransferase. This modification, termed DNA methylation, is common in some eukaryotic species but not all. For example, yeast and Drosophila have little or no detectable methylation of their DNA, whereas DNA methylation in vertebrates and plants is relatively abundant. In mammals, approximately 5% of the DNA is methylated. Eukaryotic DNA methylation occurs on the cytosine base. The sequence that is methylated is shown here:
DNA methylation usually inhibits the transcription of eukaryotic genes, particularly when it occurs in the vicinity of the promoter. In vertebrates and flowering plants, many genes contain sequences called CpG islands near their promoters. CpG refers to a C nucleotide followed by a G nucleotide on the same DNA strand, the two being connected by a phosphodiester linkage. A CpG island is a cluster of CpG sites. Unmethylated CpG islands are usually correlated with active genes, whereas repressed genes contain methylated CpG islands. In this way, DNA methylation may play an important role in the silencing of particular genes. How does DNA methylation inhibit transcription? This can occur in two general ways. First, methylation of CpG islands may prevent an activator from binding to an enhancer element, thus inhibiting the initiation of transcription. Second, methylation can inhibit transcription by altering chromatin structure. Proteins known as methyl-CpG-binding proteins bind methylated sequences. Once bound to the DNA, the methyl-CpG-binding protein recruits other proteins to the region that inhibit transcription.
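To make the notion of a CpG site and a cluster of CpG sites more tangible, here is a minimal sketch that locates CpG dinucleotides in a sequence and computes an observed/expected CpG ratio. The ratio (number of CpGs times sequence length, divided by the product of the C and G counts) is a heuristic commonly used when screening for CpG islands; it is not taken from the text above, and the function names and the example sequence are hypothetical.

```python
def cpg_positions(seq):
    """Return 0-based indices at which a C is immediately followed by a G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, so the ratio
    is always defined.
    """
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return len(cpg_positions(seq)) * len(seq) / (c_count * g_count)


# Made-up 10-base sequence, for illustration only.
seq = "ACGCGTTCGA"
positions = cpg_positions(seq)
ratio = cpg_obs_exp(seq)
print(positions, ratio)
```

A ratio well above what base composition alone would predict indicates that CpG sites are enriched in the window. Real island callers apply this kind of measure over sliding windows of a few hundred bases together with a GC-content cutoff, which this sketch leaves out.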