Comparative unknowns Nirenberg and Matthaei solved the puzzle that eluded the greats.
British biologist Francis Crick and his American colleague James Watson laid the groundwork for modern molecular genetics when they determined the structure of DNA in 1953. While demonstrating how the strands of the double helix were put together, Watson and Crick also sought to learn how genetic information was coded into the DNA. Crick’s “central dogma” of molecular biology—“DNA makes RNA makes protein”—signified the importance of the processes of transcription (the creation of messenger RNA) and translation (the production of proteins). Once this became the canonical basis of the genetic transfer of information, the next item on the scientific agenda was the genetic code itself—the instructions that regulated the dogma’s implementation, the true “secret of life”.
The first scientist after Watson and Crick to find a measure of success with the coding problem was Russian émigré physicist George Gamow. He envisioned the relationship between DNA structure and protein synthesis as a numerical cryptanalytic problem. Gamow surmised that the goal for scientists was to learn how a long sequence of 4 nucleotides determines the assignment of long protein sequences composed of 20 amino acids. Gamow published a short piece in the October 1953 issue of Nature that proposed a solution called the “diamond code”, an overlapping triplet code based on a combinatorial scheme in which 4 nucleotides arranged 3-at-a-time would specify 20 amino acids. Somewhat like a language, this highly restrictive code was primarily hypothetical, based on then-current knowledge of the behavior of nucleic acids and proteins.
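The arithmetic behind the number 20 can be checked directly. The following is a minimal sketch, not Gamow's own derivation, and it does not reproduce the geometry of the diamond code. It only shows that counting triplets of 4 bases without regard to the order of the bases gives exactly 20 groupings. The names `BASES` and `unordered_triplets` are illustrative assumptions.

```python
from itertools import combinations_with_replacement

BASES = "ACGT"  # the four DNA nucleotides


def unordered_triplets(bases=BASES):
    """Return every 3-base grouping in which the order of bases is ignored."""
    return ["".join(t) for t in combinations_with_replacement(bases, 3)]


triplets = unordered_triplets()
print(len(triplets))  # 20 order-independent triplets
print(triplets[:5])   # ['AAA', 'AAC', 'AAG', 'AAT', 'ACC']
```

If order is taken into account instead, the same 4 bases give 4 × 4 × 4 = 64 triplets, which is more than the 20 amino acids require.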
Gamow’s coding scheme generated a great deal of enthusiasm among other scientists. To foster communication and camaraderie, Gamow founded the RNA Tie Club, a group of 20 hand-picked scientists—corresponding to the 20 amino acids—who would circulate notes and manuscripts on the coding problem and (not inconsequentially) consume wine, beer, and whiskey at periodic meetings. Each member of the club was given the moniker of an amino acid, and all were presented with a diagrammed tie and tiepin made to Gamow’s specification. Although geographically dispersed, the Tie Club brought physical scientists and biologists together to work on one of the most challenging and important problems in modern science.
By mid-1954, Gamow had accepted that his diamond code was not accurate, yet he and others continued to deliberate over the various codes presented by disparate researchers. In truth, the notion of a “code” as the key to information transfer was not articulated publicly until late 1954, when Gamow, Martynas Ycas, and Alexander Rich published an article that defined the code idiom for the first time since Watson and Crick casually mentioned it in a 1953 article. Yet the concept of coding applied to genetic specificity was somewhat misleading, as translation between the 4 nucleic acid bases and the 20 amino acids would obey the rules of a cipher instead of a code. As Crick acknowledged years later, in linguistic analysis, ciphers generally operate on units of regular length (as in the triplet DNA scheme), whereas codes operate on units of variable length (e.g., words, phrases). But the code metaphor worked well, even though it was literally inaccurate, and in Crick’s words, “‘Genetic code’ sounds a lot more intriguing than ‘genetic cipher’.” Codes and the information transfer metaphor were extraordinarily powerful, and heredity was often described as a biological form of electronic communication.
By 1955, research suggested that a nonoverlapping code was more plausible than Gamow’s original notion of overlapping triplets. By 1961, Crick and his colleagues (including Sydney Brenner) concluded that the nucleotides of each triplet did not belong to any other triplet. They also postulated that sets of triplets are arranged in continuous linear sequence starting at a fixed point in a polynucleotide chain without breaks, an aspect of the code that was termed “commaless”. The notion of a “degenerate code” was also introduced, which meant that more than one triplet can code for a particular amino acid (a possibility inherent in the fact that 4 bases taken three at a time give 4 × 4 × 4 = 64 possible triplets, while only 20 amino acids need to be coded for). These discoveries in the decade before 1961 brought scientists closer to a clear vision of what the genetic code might look like, but it was experimental biochemical investigation in the 1960s that finally led biologists to the solution of the code.
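The counting argument and the difference between overlapping and nonoverlapping reading can be made concrete. This is an illustrative sketch only: the function names and the sample sequence are invented for the example, and the overlapping reader assumes a fully overlapping scheme in which neighboring triplets share two bases.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases


def all_triplets(bases=BASES):
    """Every ordered triplet of bases: 4 * 4 * 4 possibilities."""
    return ["".join(p) for p in product(bases, repeat=3)]


def read_nonoverlapping(seq, start=0):
    """Read from a fixed start point; each base belongs to exactly one triplet."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def read_overlapping(seq):
    """Fully overlapping reading; neighboring triplets share two bases."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


print(len(all_triplets()))          # 64 triplets for only 20 amino acids
sample = "AUGGCUUAA"                # arbitrary sequence chosen for the example
print(read_nonoverlapping(sample))  # ['AUG', 'GCU', 'UAA']
print(read_overlapping(sample))     # 7 triplets, each shifted by one base
```

Because 64 exceeds 20, at least some amino acids must be specified by more than one triplet if every triplet is used, which is the sense in which the code is degenerate.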
In 1957, a 30-year-old biochemist named Marshall Nirenberg began work at the National Institutes of Health (NIH) in Bethesda, MD. Nirenberg was one among a veritable litter of young talent at the NIH, where biochemistry occupied a privileged place. The young scientist was fascinated with the role of genetic control and notions of “information flow” in biochemical reactions and cellular functions. Yet in a scientific culture that encouraged team science, Nirenberg tended to work alone. Years later, in a New York Times profile, he was described as a “genius because he does one thing superlatively well, but he has trouble driving cars, and he has been known to trip over his feet . . . works 12 hours a day 7 days a week and has no outside hobbies.” Although Nirenberg contested this description, he did recognize that working alone stymied his attempts to engage the major problems of contemporary biology. Still, although he was aware of the competition among leading laboratories to solve problems like protein synthesis, he worked in relative isolation for a few years, studying the cell-free synthesis of the enzyme penicillinase in the bacterium Bacillus cereus.
In 1960, Gordon Tomkins offered Nirenberg a position as a research biochemist in NIH’s Section of Metabolic Enzymes. There, he ceased study of the B. cereus system and focused his research on a cell-free E. coli system. By the summer of 1960, Nirenberg concluded that cell-free protein synthesis was dependent on the DNA template that specified the RNA messenger. Other researchers, including Nobel laureate Severo Ochoa, had arrived at similar conclusions and were determined to solve protein synthesis and crack the genetic code, but Nirenberg had never been as focused as he was at that moment. Auspiciously, he was joined at NIH in the autumn of that year by German plant physiologist Heinrich Matthaei, a postdoctoral fellow who planned to work with Nirenberg on protein synthesizing systems. Matthaei’s arrival signaled, in Nirenberg’s words, a “new phase” in the research.
By December 1960, Nirenberg and Matthaei were working intently on the E. coli system, attempting to incorporate amino acids in the DNase-supplemented system and to establish definitively that the system was dependent on an RNA template. The two scientists demonstrated that endogenous messenger RNA did stimulate protein synthesis, but they needed a better, less-contaminated RNA preparation. They ultimately decided that the RNA of the tobacco mosaic virus (TMV) would be a viable template. This conclusion converged with the efforts of Heinz Fraenkel-Conrat at the Berkeley Virus Laboratory, which was leading the race to decipher the code. In the spring of 1961, Nirenberg traveled to Berkeley, visiting Fraenkel-Conrat’s laboratory to gain facility with TMV. Meanwhile, Matthaei began the experiments on May 15, testing poly-A (polyadenylic acid), poly-U (polyuridylic acid), poly-(2A)U, and poly-(4A)U for amino acid incorporation. Each synthetic polynucleotide was tested in the presence of 19 unlabeled (cold) amino acids and 1 labeled (hot) amino acid. On the morning of May 27, the results of experiment 27Q indicated that poly-U specified the assembly of hot polyphenylalanine. It was the first break in the genetic code.
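The logic of the poly-U result can be sketched in a few lines. The sketch below is an after-the-fact illustration rather than a description of the experiment itself: it assumes a nonoverlapping triplet reading and uses only the single assignment implied by experiment 27Q (UUU to phenylalanine). The names `KNOWN_CODONS` and `translate` are hypothetical, and any triplet without a known assignment is reported as `?`.

```python
# Only the assignment implied by the poly-U result is included here.
KNOWN_CODONS = {"UUU": "Phe"}


def translate(rna, table=KNOWN_CODONS):
    """Read an RNA string in nonoverlapping triplets and look each one up."""
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    return [table.get(codon, "?") for codon in codons]


poly_u = "U" * 12  # a short stand-in for synthetic polyuridylic acid
print(translate(poly_u))    # ['Phe', 'Phe', 'Phe', 'Phe']

poly_a = "A" * 12  # no assignment for AAA is recorded in this sketch
print(translate(poly_a))    # ['?', '?', '?', '?']
```

A template made of a single repeated base is informative precisely because every triplet read from it is identical, so the one labeled amino acid that becomes incorporated identifies the assignment for that triplet.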
Matthaei phoned the poly-U results to Nirenberg at Berkeley, but Nirenberg did not release the results publicly. Neither man was a member of biochemistry’s “inner circle”, and thus, Nirenberg wanted to continue experimenting before announcing the momentous discovery. Returning to NIH, he and Matthaei continued their polynucleotide work. It was not until October that they published the results in two papers, although Nirenberg had already presented his findings at the International Congress of Biochemistry in August. The announcement was a significant defeat for Crick and other major players in the field, who had been trying to break the code for years. Some begrudged the obscure young scientists their success, and others claimed that it was a lucky strike. Gamow, in particular, resented not being cited for his work in their 1961 paper. (Nirenberg claimed never to have heard of Gamow’s work.) Nevertheless, Nirenberg and Matthaei accomplished one of the premier feats in the history of science, doing what Nobel laureates and other leading figures had been unable to do.
Biochemical research after this grand achievement was directed primarily toward completing the code, a task that took approximately six years. In 1964, Nirenberg and Philip Leder discovered a technique for making coding assignments using trinucleotides, and in the following year, Har Gobind Khorana perfected a precise technique for synthesizing long RNA chains of completely defined sequences. After these achievements, the entire code was elucidated within a year. All 20 amino acids were accounted for, and Crick devised a standard form in which to present the genetic code, a table that has the same importance for biology that the periodic table has for chemistry.
Watson and Crick had won their Nobel Prize in 1962 for discovering the structure of DNA. Nirenberg (along with Gobind Khorana and Robert Holley) matched them in 1968, when he was also awarded the Nobel Prize in Physiology or Medicine. Thus, an incredible achievement for an “outsider” was acknowledged at the end of two decades of amazing research in the nascent field of molecular biology. The discovery of the genetic code, or more accurately the “genetic cipher”, would inaugurate an era of astonishing genetic research that continues today.