DNA is an information storage medium, comparable to a hard drive, and it carries a genetic code whose units are groups of three DNA nucleotides, called triplets or codons. Three-letter combinations from the four-letter alphabet of DNA bases (adenine, thymine, guanine, and cytosine) form the genetic code. Each triplet codon can be compared to a letter of an alphabet, like the letter A. It is a combinatorial scheme in which 4 nucleotides arranged 3 at a time specify 20 different amino acids. It is used to form a biological instruction manual (blueprint) for the construction of proteins, cells, and organisms, instructing organisms how to grow, develop, survive, and reproduce. Every triplet of nucleotides in a coding sequence specifies a single amino acid of a protein's polypeptide chain, apart from three triplets that signal where the chain ends. In other words, it instructs which of the twenty amino acids used in proteins is placed at each position of the polypeptide chain. This specification, from triplet codon to amino acid, is called a cypher. It is like a translation from one language to another: each base triplet specifies one of the letters of the 20-letter alphabet of amino acids. With three positions and four possible bases at each, there are 4^3 = 64 possible triplets. So the code is redundant, but in a purposeful way: more than one triplet can code for a particular amino acid (a possibility inherent in the fact that the 4 bases yield 64 possible triplets while only 20 amino acids need to be coded for). The redundancy absorbs many single-base errors, above all at the third codon position, where most substitutions leave the amino acid unchanged.
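The 4^3 = 64 arithmetic and the redundancy of the code can be checked with a short sketch. It assumes the standard genetic code (NCBI translation table 1), written here in the common compact form where the bases are ordered T, C, A, G and `*` marks a stop codon. The names `CODON_TABLE` and `degeneracy` are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"  # DNA alphabet, in the order used by the compact table below
# Standard genetic code (NCBI translation table 1); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate all 4^3 triplets and pair each one with its assigned meaning.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

# Degeneracy: how many codons specify each amino acid (or stop).
degeneracy = Counter(CODON_TABLE.values())

print(len(codons))                      # 64 possible triplets
print(len(degeneracy) - 1)              # 20 amino acids (stop excluded)
print(degeneracy["L"], degeneracy["W"], degeneracy["*"])  # 6 1 3
```

Leucine (`L`) is specified by six codons, tryptophan (`W`) by one, and three codons mean "stop", so 61 codons are shared among the 20 amino acids.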
There is a lot to be explained, namely:
- the origin of the hardware:
the gene regulatory network, DNA, mRNA, amino acids, the machinery of transcription and translation, tRNA, aminoacyl-tRNA synthetases and, in more advanced organisms, spliceosomes, together with error-checking and repair mechanisms along the way. How did the transition occur from the random occurrence of the building blocks (RNA, DNA, amino acids, lipids) on the early Earth to the hypercomplex synthesis in cells and the precise arrangement of information flow?
- the four-fold problem of the software ( information, alphabet, codes, and translation system ):
- the origin and selection of the alphabet, the four nucleobases, and not more or less
- the origin of the genetic code, the ‘dictionary’, the collection of rules, based on 64 codon triplets
- the origin of the instructional blueprint which directs the synthesis of proteins and instructs organisms how to grow, develop, survive and reproduce.
- the origin of the cypher, or the coding assignments, i.e., which triplets code for which amino acids, from 64 codons to 20 amino acids, and its redundancy, robustness, and tolerance of errors.
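How far the redundancy of the cypher buffers single-base errors can be estimated by brute force. The sketch below assumes the standard genetic code (NCBI translation table 1) and treats all point substitutions as equally likely, which is a simplification. The function name `substitution_outcomes` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def substitution_outcomes(table):
    """Classify every single-base substitution of every sense codon, per codon position."""
    counts = [{"synonymous": 0, "missense": 0, "nonsense": 0} for _ in range(3)]
    for codon, amino_acid in table.items():
        if amino_acid == "*":
            continue  # only mutate codons that specify an amino acid
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                outcome = table[mutant]
                if outcome == amino_acid:
                    kind = "synonymous"   # same amino acid: the error is silent
                elif outcome == "*":
                    kind = "nonsense"     # premature stop
                else:
                    kind = "missense"     # different amino acid
                counts[pos][kind] += 1
    return counts


outcomes = substitution_outcomes(CODON_TABLE)
for pos, tally in enumerate(outcomes, start=1):
    print(f"position {pos}: {tally}")

total = sum(sum(t.values()) for t in outcomes)
silent = sum(t["synonymous"] for t in outcomes)
print(f"{silent} of {total} substitutions are silent")  # 134 of 549
```

Under these assumptions, about a quarter of all single-base substitutions are silent, and nearly all of the silent ones fall at the third codon position.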
The DNA molecule is the hardware (information storage device).
The genetic code (software) is composed of three-letter combinations from the four-letter alphabet of DNA bases (adenine, thymine, guanine, cytosine). Each triplet is called a codon. It is the triplet recipe of these bases that makes up the ‘dictionary’. It is a collection of rules. For example, the three bases GGG (guanine-guanine-guanine) are the instruction to add the amino acid glycine, which is then assembled into a protein by the ribosome.
The biological instruction manual, or blueprint (information), is written in the genetic code (software). The sequences of triplet codons are also called a code, which is confusing: the specific instructional codon sequence, which instructs the ribosome how to compose the amino-acid sequence of a protein, is not the same as the dictionary that forms the collection of rules, which is also called the genetic code. They are different things. The instructional blueprint, or specific codon sequence, contains the information necessary to build a protein; it is first transcribed to messenger RNA and afterwards translated into the twenty-amino-acid alphabet necessary to build the protein.
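The difference between the dictionary and the message can be made concrete with a minimal sketch. Here `DNA_CODON_TABLE` plays the role of the dictionary (the standard code, NCBI translation table 1), while the string `gene` is a made-up message, not a real gene. The functions `transcribe` and `translate` are simplified stand-ins for the cellular processes and ignore promoters, splicing, and reading-frame selection.

```python
from itertools import product

BASES = "TCAG"
# The dictionary: standard genetic code (NCBI table 1); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
DNA_CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_strand: str) -> str:
    """mRNA carries the same sequence as the DNA coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # look up in the DNA-lettered table
        amino_acid = DNA_CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# The message: a hypothetical 12-base sequence (Met-Gly-Lys-stop).
gene = "ATGGGGAAATGA"
mrna = transcribe(gene)
print(mrna)             # AUGGGGAAAUGA
print(translate(mrna))  # MGK
```

Changing `gene` changes the protein while the dictionary stays fixed. Changing the dictionary would change what every message means.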
The confusion comes from the ambiguity in using the term “genetic code”. Here is a quote from Francis Crick, who seems to be the one who coined this term:
Unfortunately the phrase “genetic code” is now used in two quite distinct ways. Laymen often use it to mean the entire genetic message in an organism. Molecular biologists usually mean the little dictionary that shows how to relate the four-letter language of the nucleic acids to the twenty-letter language of the proteins, just as the Morse code relates the language of dots and dashes to the twenty-six letters of the alphabet… The proper technical term for such a translation is, strictly speaking, not a code but a cypher. In the same way, the Morse code should really be called the Morse cypher. I did not know this at the time, which was fortunate because “genetic code” sounds a lot more intriguing than “genetic cypher” (from “What Mad Pursuit”, 1988)
The specification, from triplet codon to amino acid, is called a cypher. It is like a translation from one language to another. Take Google Translate as an example: we type the English word "language", and the program returns the German word "Sprache", which is its equivalent. As in all translations, there must be someone or something that is bilingual, in this case to turn the coded instructions written in the nucleic-acid language into a result written in the amino-acid language.
In cells, the adaptor molecule, tRNA, performs this task. One end of the tRNA carries an anticodon that is complementary to the codon on the messenger RNA, and the other end is attached to the amino acid that is coded for. The correct amino acid is attached to the correct tRNA by an enzyme called aminoacyl-tRNA synthetase.
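The adaptor role of tRNA can be modelled as a toy data structure. This is a sketch under strong simplifying assumptions: strict Watson-Crick pairing only (no wobble pairing, no modified bases), and a hypothetical three-member pool `TRNA_POOL` in place of a cell's full tRNA set. The `amino_acid` field stands for the loading step performed by aminoacyl-tRNA synthetase.

```python
from dataclasses import dataclass

# Watson-Crick pairing between RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class TRNA:
    anticodon: str   # written 5'->3'
    amino_acid: str  # the cargo loaded by an aminoacyl-tRNA synthetase

    def recognizes(self, codon: str) -> bool:
        """Codon and anticodon pair antiparallel, so compare with the reverse complement."""
        expected = "".join(COMPLEMENT[base] for base in reversed(codon))
        return self.anticodon == expected


# Hypothetical, deliberately tiny pool of charged tRNAs.
TRNA_POOL = [TRNA("CAU", "Met"), TRNA("CCC", "Gly"), TRNA("UUU", "Lys")]


def decode(mrna: str, pool: list) -> list:
    """Build the amino-acid chain by finding, codon by codon, a tRNA that pairs with it."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        match = next((t for t in pool if t.recognizes(codon)), None)
        if match is None:
            break  # no tRNA pairs with this codon (e.g. a stop codon)
        chain.append(match.amino_acid)
    return chain


print(decode("AUGGGGAAAUGA", TRNA_POOL))  # ['Met', 'Gly', 'Lys']
```

In this model, the codon never touches the amino acid. The pairing rule only checks the anticodon, and the link between anticodon and amino acid is held entirely in the `TRNA` record.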
This raises an even tougher problem concerning the coding assignments, i.e., which triplets code for which amino acids. How did these designations come about? Because nucleic-acid bases and amino acids don’t recognize each other directly but have to deal via the tRNA chemical intermediary, there is no obvious reason why particular triplets should go with particular amino acids. Other translations are conceivable. Coded instructions are a good idea, but the actual code seems to be pretty arbitrary. Perhaps it is simply a frozen accident, a random choice that just locked itself in, with no deeper significance? That is what Crick proposed. How could that not be called just an ad hoc assertion, made in the absence of any other reasonable or likely explanation, unless, of course, we permit the divine into the picture?
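The statement that other translations are conceivable can be given a rough size. The sketch below counts formal assignments of 64 codons to 21 meanings (20 amino acids plus "stop"), first without restriction and then requiring every meaning to be used at least once. It is a purely combinatorial count. It assumes nothing about which assignments would be chemically workable, and the variable names are illustrative.

```python
from math import comb

CODONS = 64
MEANINGS = 21  # 20 amino acids plus "stop"

# Upper bound: each codon independently receives any one of the 21 meanings.
all_assignments = MEANINGS ** CODONS

# Inclusion-exclusion: keep only assignments in which every meaning is used at least once.
complete_codes = sum(
    (-1) ** k * comb(MEANINGS, k) * (MEANINGS - k) ** CODONS
    for k in range(MEANINGS + 1)
)

print(f"all assignments: {all_assignments:.2e}")
print(f"complete codes:  {complete_codes:.2e}")
```

Both figures are on the order of 10^84, which is the space of formally possible cyphers within which the one actual assignment sits.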
The origin of the hardware:
The RNA world, and the origins of life
The DNA double helix, evidence of design
Origin of the canonical twenty amino acids required for life
Proteins: how they provide striking evidence of design
The interdependent and irreducible structures required to make proteins
Control of Gene Expression and gene regulatory networks point to intelligent design
The complexity of transcription through RNA polymerase enzymes and general transcription factors in eukaryotes
Ribosomes amazing nanomachines
Transfer RNA, and its biogenesis
DNA and RNA error checking and repair, amazing evidence of design
Origin of translation of the 4 nucleic acid bases and the 20 amino acids, and the universal assignment of codons to amino acids
Origin of the software:
The origin of the genetic cypher, the most perplexing problem in biology
Main topics on complex, specified/instructional coded information in biochemical systems and life