Reprinted from Symposium on Informational Macromolecules
© 1963 Academic Press, Inc., New York

The Current Status of the RNA Code¹

MARSHALL W. NIRENBERG AND OLIVER W. JONES, JR.
National Heart Institute, National Institutes of Health, Bethesda, Maryland

Rather than review all of our work on the genetic coding problem, we present only one aspect which we have been investigating (up to September, 1962): the extent of degeneracy and its relationship to the general nature of the code.

A degenerate genetic code was suggested a number of years ago by Gamow (10) and by Crick (4). In such a code, an amino acid may be directed into protein by two or more codewords. Previous work demonstrated that C14-amino acids were directed into protein by synthetic polynucleotides in cell-free Escherichia coli extracts (19) and that leucine incorporation was stimulated by either poly UG,² UC, or UA (16, 17, 29). Thus the code was shown to be degenerate with respect to leucine (16, 17, 29).

Initially, all of the codewords found contained U. However, assuming a triplet code, the proportion of U compared with other nucleotides in codewords seemed unusually high, for natural template RNA, such as viral RNA, did not contain such a preponderance of U. To resolve this paradox, a more degenerate code was proposed with both non-U and U containing codewords (17). An alternative hypothesis was advanced by Roberts, who suggested a doublet code; for in such a code the proportions of nucleotides would be within the range found in viral RNA (25, 26).

¹ This report is limited to the data which were presented at the Symposium on Informational Macromolecules in September, 1962. Data obtained after this date are not included.

² The following abbreviations are used: poly U, polyuridylic acid; poly A, polyadenylic acid; poly C, polycytidylic acid; poly G, polyguanylic acid; poly UGAC, polyuridylic-guanylic-adenylic-cytidylic acid; poly ACG, polyadenylic-cytidylic-guanylic acid; poly AC, polyadenylic-cytidylic acid; poly CG, polycytidylic-guanylic acid; poly UG, polyuridylic-guanylic acid; poly UC, polyuridylic-cytidylic acid; poly UA, polyuridylic-adenylic acid; poly UCG, polyuridylic-cytidylic-guanylic acid; poly UAG, polyuridylic-adenylic-guanylic acid; G-G, guanylic-guanylic; A, adenylic acid; G, guanylic acid; C, cytidylic acid; U, uridylic acid.

The existence of non-U codewords was suggested when poly AC was found to direct small amounts of proline and threonine into protein (13, 21). Recently, in a careful study, Bretscher and Grunberg-Manago clearly demonstrated coding by non-U words (2). Several poly AC preparations were reported to code well for proline, threonine, histidine and, to a lesser extent, for glutamine. This work indicated that other non-U polynucleotides might have template activities. In this communication, further qualitative analysis of coding by such polynucleotides will be reported.

RESULTS

Base-Ratio Analysis

The synthetic polynucleotides used in this study are listed in Table I. The base-ratio analysis of each polymer is compared with the ratio of nucleoside diphosphates present during the synthesis of each polynucleotide. In many cases, the base-ratio of the polymer product differed slightly from the input ratio of the substrates. In polymers containing two or three different nucleotides, preferential incorporation into polynucleotide of either G or C relative to A was observed. Bretscher and Grunberg-Manago have reported that Azotobacter polynucleotide phosphorylase also catalyzes a preferential incorporation of C and G into poly UC and UG (2).

TABLE I
BASE RATIOSᵃ (MOLES PER CENT)

Designation   Polymer   Input ratio       Base ratio
                        of nucleotides    of nucleotides

                        U : G : A : C     U : G : A : C
Ap231         UGAC      40:20:20:20       55:32: 5: 8
Ap232         UGAC      58:14:14:14       56:25: 5:13
Ap233         UGAC      20:20:20:40        3:45: 9:43
Ap234         UGAC      12:12:12:64       23:21: 4:52
Ju258         UGAC      29:13:29:29       27:22:22:29
Ju2510        UGAC      29:29:29:13       27:43:21: 9

                        A : C : G         A : C : G
J251          ACG       60:20:20          46:32:22
M76           ACG        7:86: 7           2:89: 9
M75           ACG       10:80:10           4:77:19
M74           ACG       30:60:10          16:72:12

                        A : C             A : C
J104          AC         9:91              3:97
J103          AC        12:88              6:94
J102          AC        20:80             12:88
J101          AC        33:67             30:70
J109          AC        75:25             67:33
J108          AC        83:17             80:20

                        C : G             C : G
M141          CG        88:12             90:10
M71           CG        92: 8             87:13
F120          CG        88:12             82:18
F135          CG        50:50              9:91

                        A : G             A : G
J106          AG        80:20             73:27
J107          AG        66:33             48:52

ᵃ Polyribonucleotides were synthesized, as described previously, with the aid of polynucleotide phosphorylase partially purified from Micrococcus lysodeikticus according to the method of Singer and Guss (27). The base-ratio of each polynucleotide preparation was determined by analysis. Polynucleotides were hydrolyzed by incubation in 0.4 N KOH at 25° for 18 hours. Under these conditions, little deamination occurred.³ Such mild conditions were not sufficient to hydrolyze certain polymers; however, in such cases, incubation in 0.3 N KOH at 37° for 18 hours resulted in complete hydrolysis (5). Mononucleotide products were separated either by paper electrophoresis (Whatman No. 3 MM paper, 0.05 M ammonium formate, pH 3.7) or by descending paper chromatography (Whatman No. 3 MM paper and a solvent system containing 0.1 M sodium phosphate, pH 7.0 and 3 M ammonium sulfate). Contamination of polynucleotides by U of 2% or greater would have been detected. No contamination by U was found. Mononucleotides and appropriate blanks were eluted by shaking small paper strips immersed in 0.1 N or 0.01 N HCl, and UV absorption was determined at appropriate wavelengths in a Beckman DU spectrophotometer.

³ We thank Dr. M. Grunberg-Manago for this protocol.

Stimulation of Amino Acid Incorporation by Polynucleotides Containing Four Bases

The data of Table II demonstrate that synthetic polynucleotides containing four bases stimulate the incorporation of a large number of amino acids into protein. In the last column is given the basal level of C14-amino acid incorporation obtained in the absence of polynucleotide; other figures refer to the net increase above basal incorporation due to addition of polynucleotide. The base-ratios of the polynucleotides vary widely. The fifth polynucleotide (Ju-258) contains approximately equal proportions of U, G, A, and C, whereas the other polynucleotides contain predominant amounts of two or three nucleotides. All of the polynucleotides were active in directing amino acid incorporation, except polynucleotide Ju-2510.

TABLE II
STIMULATION OF AMINO ACID INCORPORATION BY POLY UGAC

Polynucleotide:             UGAC    UGAC    UGAC    UGAC    UGAC    UGAC       Minus
Base ratio                  U 55    U 56     U 3    U 23    U 27    U 27       poly-
(moles per cent)            G 32    G 25    G 45    G 21    G 22    G 43  nucleotide
                             A 5     A 5     A 9     A 4    A 22    A 21     control
                             C 8    C 13    C 43    C 52    C 29     C 9
Designation:               Ap231   Ap232   Ap233   Ap234   Ju258  Ju2510

C14-Amino acid          Incorporation above control, Δ µµmolesᵃ

Alanine                      110     127      62     152      31       5          10
Arginine                      69     270      68     212      99      57          11
Aspartic acid (-NH2?)         10      40       9      10      25      12          12
Glutamic acid (-NH2?)         16      52      12       9      14       -          23
Glycine                       62     663      25      40      12       6          13
Histidine                     20       8      11      24      13       0           4
Isoleucine                    68     301      60      90       0       0          22
Leucine                      168   1,243     125     418      12       0          41
Lysine                        10      21       0       3      25       -           4
Methionine                     9      64       0       9      15       4          12
Phenylalanine                152     606      86      64      14       0          10
Proline                       50     140     125   1,007     121      10           7
Serine                       179     807     181     445      37       0          47
Threonine                     15      54      19      78      44       0           7
Tryptophan                    23       8      16       8       1       -          45
Tyrosine                      14      80      11      12       6       0          17
Valine                       100     602      57      70      43      13           7

Total                      1,075   5,086     867   2,651     512     107         292

ᵃ Δ µµmoles represents the difference between C14-amino acid incorporation into protein in the presence and absence of polynucleotides. Basal incorporations obtained when polynucleotides were omitted are presented in the last column (minus polynucleotide). Reaction mixtures used to determine C14-L-amino acid incorporation into protein contained the following components: [illegible] Tris(hydroxymethyl)aminomethane, pH 7.8; 0.01 M magnesium acetate; 0.05 M KCl; 6 × 10⁻³ M mercaptoethanol; 1 × [illegible] ATP; 5 × 10⁻³ M potassium phosphoenolpyruvate; 5 µg of crystalline phosphoenolpyruvate kinase (California Biochemical Corporation); 0.8 × 10⁻⁴ M C14-amino acid (approximately 30,000-150,000 counts/minute/reaction mixture); 3.2 × [illegible] each of 19 C12-amino acids minus the C14-amino acid; 10 µg of polynucleotide/reaction mixture, when specified; and [illegible] preincubated S-30 extracts (1-2 mg protein/reaction mixture). Total volume of each reaction mixture was 0.3 ml. Reaction mixtures were incubated at 37° for 30 minutes; thus, total amino acid incorporation rather than rate of incorporation was measured. A Nuclear-Chicago thin-window, gas flow counter was used.

Although 10 µg of polynucleotide were added to each reaction mixture, the total amount of C14-amino acid directed into protein by each polynucleotide varied more than 50-fold. As we have shown previously, the template activity of polynucleotides is dependent upon factors other than nucleotide sequence. For example, large polymers of chain length greater than 100 units are considerably more active than shorter ones (17). Single-stranded polynucleotides are active, whereas double- or triple-stranded polymers are not (19). In addition, randomly-mixed copolymers which have a high degree of secondary structure are inactive in coding (28).
In particular, polymers containing much G have little activity, possibly because of G-G interactions. Thus the relative inactivity of the last poly UGAC preparation (Ju-2510) should not necessarily be ascribed to the presence of a high proportion of nonsense nucleotide sequences. Such considerations make it difficult to compare with validity the relative abilities of different polynucleotides to code for the same amino acid; thus, such comparisons should be made with caution. The fact that polynucleotides containing four bases coded so well for so many amino acids strongly suggested that most nucleotide sequences could be read. In addition, a high proportion of U clearly was not required for messenger RNA activity.

Stimulation of Amino Acid Incorporation by Poly ACG

The coding activities of polymers which did not contain U are given in Table III. Base-ratio analyses of each poly ACG preparation failed to detect contamination by U. Poly ACG preparations stimulated the incorporation of many amino acids tested, including alanine, arginine, glutamic acid, lysine, proline, and threonine. Such high incorporations of glutamic acid, lysine, and threonine were not observed previously. A number of amino acids did not appear to be coded by any ACG preparations, which suggested that U may be an absolute requirement in coding for some amino acids. Since the template activities of some poly ACG preparations equaled those of our best synthetic template RNA preparations, U clearly was not required for coding other amino acids.

Stimulation of Amino Acid Incorporation by Polynucleotides Containing Two Bases

The data of Table IV demonstrate stimulation of amino acid incorporation by poly AC preparations. The polynucleotides are listed in order of decreasing C content. In accord with the findings of Bretscher and Grunberg-Manago (2), poly AC stimulated incorporation of proline, threonine, and histidine.
In addition, poly AC was found to direct aspartic acid, glutamic acid, and lysine into protein. Bretscher and Grunberg-Manago (2) report that glutamine is coded by such polymers. We have not been able to obtain C14-asparagine or C14-glutamine and, thus, have not been able to study this point.⁴ Although the addition of C12-aspartic acid and C12-glutamic acid to reaction mixtures completely diluted the incorporation of C14-aspartic and C14-glutamic acids, respectively, the possibility of conversion of the free acid to the amide during incubation of reaction mixtures does not allow us to distinguish between the acid and amide forms. Many of the polynucleotides were found to have template activities equal to the most active poly U preparations tested.

⁴ Recently, we have confirmed the finding of Bretscher and Grunberg-Manago (2) that glutamine rather than glutamic acid is directed into protein by poly CA. In addition, we find that poly CA codes for asparagine rather than aspartic acid.

TABLE III
STIMULATION OF AMINO ACID INCORPORATION BY POLY ACG

Polynucleotide:              ACG     ACG     ACG     ACG       Minus
Base ratio                  A 46     A 2     A 4    A 16       poly-
(moles per cent)            C 32    C 89    C 77    C 72  nucleotide
                            G 22     G 9    G 19    G 12     control
Designation:                J251     M76     M75     M74

C14-Amino acid          Incorporation above control, Δ µµmolesᵃ

Alanine                      123      45      56      85           8
Arginine                     128      30      40      74           9
Aspartic acid (-NH2?)        167       0       0      24          13
Glutamic acid (-NH2?)        326       0       0      33          21
Glycine                        5       0       0       0          13
Histidine                     71       6       9      95           5
Isoleucine                     0       0       0       0          20
Leucine                        0      10       0       0          40
Lysine                       820       5       0      23           6
Methionine                     1       4       0       0          10
Phenylalanine                  0       0       1       6           9
Proline                      147     320     185      41           8
Serine                       182      24      30      55          45
Threonine                    250      11      13      11           8
Tryptophan                     1       0       0       0          43
Tyrosine                       0       4       0       0          18
Valine                         1       5       5       7           6

Total                      2,222     464     339     454         282

ᵃ Δ µµmoles represents the difference between C14-amino acid incorporation into protein in the presence and absence of polynucleotides. Assay procedures are described in the footnote of Table II.
Poly AC (J-104) contained 97% C, yet actively directed proline into protein. Thus, it appears probable that one codeword for proline may contain only C. Relatively large amounts of lysine were directed into protein by poly AC (J-109 and J-108), which contained 67 and 80% A, respectively. These data suggest that a codeword for lysine may contain only A.

The data of Table V demonstrate the effects of poly CG and AG preparations in directing amino acids into protein. The first three CG polymers contain high proportions of C and directed alanine, arginine, and proline into protein. The last poly CG preparation (F-135) contains 91% G and was inactive as template RNA. Poly AG directed incorporation of glutamic acid and lysine into protein.

TABLE IV
STIMULATION OF AMINO ACID INCORPORATION BY POLY AC

Polynucleotide:               AC      AC      AC      AC      AC      AC       Minus
Base ratio                   A 3     A 6    A 12    A 30    A 67    A 80       poly-
(moles per cent)            C 97    C 94    C 88    C 70    C 33    C 20  nucleotide
Designation:                J104    J103    J102    J101    J109    J108     control

C14-Amino acid          Incorporation above control, Δ µµmolesᵃ

Alanine                        0       1       0       0       0       0          11
Arginine                       0       4       1       1       0       0          12
Aspartic acid (-NH2?)          0       0       9      51     157      53          24
Glutamic acid (-NH2?)          4      19      24      53     135      53          15
Glycine                        5      16       0       2       0       0           6
Histidine                      0       0       5     198      85      17          23
Isoleucine                     6       0       0       0       0       0          42
Leucine                        0       0       -       -       -       1           3
Lysine                         5      10      14      47     909     441           5
Methionine                     0      10       0       0       0       0          10
Phenylalanine                  0       0       1       0       4       0          11
Proline                      625   1,132     643   1,102     140      20           9
Serine                        11      19      18      16       9       8          46
Threonine                     30      65      75     170     176     105           9
Tryptophan                    23       0       1       1       1       9          44
Tyrosine                      14       2       2       0       0       0          19
Valine                         0       0       0       0       0       0           5

Total                        723   1,278     793   1,641   1,616     707         294

ᵃ Δ µµmoles represents the difference between C14-amino acid incorporation into protein in the presence and absence of polynucleotides. Assay procedures are described in the footnote of Table II.
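The argument that a proline codeword may contain only C, and a lysine codeword only A, can be made concrete with a small calculation. The sketch below is illustrative only and is not part of the original analysis: it assumes that the bases in each copolymer are arranged at random and that the template is read as triplets, and the function name `triplet_frequencies` is hypothetical. The authors themselves caution that their data do not permit accurate quantitative comparisons of this kind.

```python
from itertools import product


def triplet_frequencies(base_ratio):
    """Expected triplet frequencies in a random-sequence copolymer.

    base_ratio maps each base letter to its share (any units, e.g. moles
    per cent). Random arrangement of bases is an assumption of this sketch.
    """
    total = sum(base_ratio.values())
    fractions = {base: share / total for base, share in base_ratio.items()}
    frequencies = {}
    for triplet in product(sorted(fractions), repeat=3):
        probability = 1.0
        for base in triplet:
            probability *= fractions[base]
        frequencies["".join(triplet)] = probability
    return frequencies


# Base ratios (moles per cent) as listed for J-104 and J-108 in Table IV.
j104 = triplet_frequencies({"A": 3, "C": 97})
j108 = triplet_frequencies({"A": 80, "C": 20})

print(f"J-104: CCC = {j104['CCC']:.3f}, AAA = {j104['AAA']:.6f}")
print(f"J-108: CCC = {j108['CCC']:.3f}, AAA = {j108['AAA']:.3f}")
```

Under these assumptions, roughly nine-tenths of the triplets in J-104 would be CCC, while about half of those in J-108 would be AAA, which is in line with the qualitative pattern of proline and lysine incorporation in Table IV.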
TABLE V
STIMULATION OF AMINO ACID INCORPORATION BY POLY CG AND AG

Polynucleotide:               CG      CG      CG      CG      AG      AG       Minus
Base ratio                  C 90    C 87    C 82     C 9    A 73    A 48       poly-
(moles per cent)            G 10    G 13    G 18    G 91    G 27    G 52  nucleotide
Designation:                M141     M71    F120    F135    J106    J107     control

C14-Amino acid          Incorporation above control, Δ µµmolesᵃ

Alanine                       30      20      63       0       0       0          14
Arginine                      39      16      86       1      10       8          13
Aspartic acid (-NH2?)          0       0       6       3      12      10          26
Glutamic acid (-NH2?)          0       0       0       0      44       5          11
Glycine                        5       0       8       0       2       0           4
Histidine                      0       0       0       0       0       0          26
Isoleucine                     0       0       0       0       1       0          39
Leucine                        0       0       0       5       0      11           7
Lysine                         2       0       0       0     110       8           3
Methionine                     0       0       0       0       0       0          12
Phenylalanine                  5       5       4       8       0       0          14
Proline                      144     202     356       2       1       1           8
Serine                        18       0       6       0       0       0          42
Threonine                      0       0       1       0       1       0           5
Tryptophan                     0       1      17       1       0       0          40
Tyrosine                       2       6       2       0       0       0          14
Valine                         1       1       0       0       0       0           4

Total                        246     251     549      20     181      43         282

ᵃ Δ µµmoles represents the difference between C14-amino acid incorporation into protein in the presence and absence of polynucleotides. Details of the assay procedures are described in the footnote of Table II.

Quantitative Aspects of Data

A comparative study of polynucleotides of varying base-ratios is helpful in evaluating amino acid incorporation data, for relative amino acid incorporations easily can be correlated with changes in base-ratio. Occasional inconsistencies and the significance of minor incorporations become apparent.

Isotope dilution experiments were performed routinely to detect the possible presence of radioactive impurities in C14-amino acids. The presence of C14-impurities seemed unlikely, for incorporation of a C14-amino acid was lowered sharply if the reaction mixture contained both a C14-amino acid (0.05 µmole) and the same C12-amino acid (1.0 µmole). The purity of each C14-amino acid also was determined by paper electrophoresis followed by radioautography as described previously (17).
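The size of the drop expected in the isotope dilution check can be estimated from the two quantities quoted above. This is a minimal sketch, not the authors' calculation: it assumes that labeled and unlabeled amino acid mix completely and that the endogenous pool in the extract is negligible, and the helper name `diluted_fraction` is hypothetical.

```python
def diluted_fraction(labeled_umoles, unlabeled_umoles):
    """Fraction of the original specific activity left after adding carrier.

    Assumes complete mixing and a negligible endogenous amino acid pool.
    """
    return labeled_umoles / (labeled_umoles + unlabeled_umoles)


# Amounts quoted in the text: 0.05 µmole C14-amino acid, 1.0 µmole C12-amino acid.
fraction = diluted_fraction(0.05, 1.0)
print(f"remaining fraction of label: {fraction:.3f}")
print(f"fold dilution: {1 / fraction:.0f}")
```

On these assumptions, incorporation of an authentic C14-amino acid should fall to about one twenty-first of its undiluted value, whereas counts arising from a radioactive impurity would not be diluted by the C12-amino acid and would persist.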
Limiting amounts of polynucleotides were added to reaction mixtures and total amino acid incorporations were measured rather than rates of amino acid incorporations. E. coli extracts contain nucleases which rapidly degrade synthetic polynucleotides and the nuclease content may vary from one preincubated S-30 preparation to another. Since many different enzyme extracts were used in this study, the data are not useful for quantitative analyses. Comparisons between theoretical frequencies of triplets, etc. in polynucleotides and relative amino acid incorporations have not been presented because the data do not permit such calculations to be made with accuracy. The data demonstrate only qualitative aspects of the code; that is, nucleotide compositions of codewords and the degree of code degeneracy.

Summary of Incorporation Data

Table VI summarizes all of the coding data previously published (19, 16, 17, 29, 15, 14, 30) and obtained in this study. Only polynucleotides containing the minimum number of bases capable of stimulating incorporation of an amino acid into protein are given in Table VI. The coding of proline by poly C and lysine by poly A was suggested by the poly AC experiments presented in Table IV.

TABLE VI
SUMMARY OF CODING DATAᵃ

C14-Amino acid            Stimulated by poly-

Phenylalanine             U(98)
Proline                   C(?)        CA(87)      CU([illegible])   CG(80)
Lysine                    A(?)        AC(53)      AG(60)            AU(?)
Threonine                 AC(15)
Serine                    UC(23)
Valine                    UG(15)
Leucine                   UG(14)      UC(13)      UA(?)
Glycine                   UG(5)
Cysteine                  UG(8-15)
Glutamic acid (-NH2?)     AC(7)       AG(20)
Isoleucine                UA(8)
Tryptophan                UG(6)
Tyrosine                  UA(9)
Arginine                  CG(15)
Methionine                UAG(1)
Histidine                 AC(10)
Alanine                   CG(11)
Aspartic acid (-NH2?)     AC(8)

One further entry, printed as UGG(23)?, cannot be assigned to a row from this copy; it falls between the lysine and leucine rows.

ᵃ Polymers used for these calculations represent optimal base-ratio directing C14-amino acids into protein. Numbers in parentheses refer to:

    (Amino acid incorporated × 100) / (Sum of incorporation of 17 amino acids)
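The normalization defined in the footnote of Table VI can be sketched as follows. The example is illustrative only: it applies the formula to the J251 column of Table III, which is not necessarily the preparation behind any entry in Table VI, and the names `j251` and `percent_of_total` are hypothetical.

```python
# Incorporation above control (Δ µµmoles) for poly ACG J251, from Table III.
j251 = {
    "Alanine": 123, "Arginine": 128, "Aspartic acid": 167,
    "Glutamic acid": 326, "Glycine": 5, "Histidine": 71,
    "Isoleucine": 0, "Leucine": 0, "Lysine": 820, "Methionine": 1,
    "Phenylalanine": 0, "Proline": 147, "Serine": 182, "Threonine": 250,
    "Tryptophan": 1, "Tyrosine": 0, "Valine": 1,
}


def percent_of_total(incorporation):
    """Amino acid incorporated x 100 / sum of incorporation of all amino acids."""
    total = sum(incorporation.values())
    return {name: 100.0 * value / total for name, value in incorporation.items()}


percentages = percent_of_total(j251)
print(f"sum of 17 amino acids: {sum(j251.values())}")
print(f"lysine: {percentages['Lysine']:.1f}% of total incorporation")
```

Expressing each amino acid as a share of the total in this way removes differences in overall template activity between preparations, which is why it suits the qualitative comparison the authors intend.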
The fact that poly C and poly A code so weakly may be due either to inhibitory effects of secondary structure or to difficulty in precipitating peptides. At acid pH, poly A in solution is double-stranded (9, 24), and poly C also may have ordered structure (8).

A surprising conclusion revealed by this summary is that almost every amino acid tested could be coded by a polymer containing only two bases. Methionine could be coded only by poly UGA as reported previously (17, 30), but the amount of methionine directed into protein was small; thus this codeword remains questionable.

TABLE VII
TENTATIVE SUMMARY OF CODEWORDS

C14-Amino acid            Codewordsᵃ

Alanine                   CCG
Arginine                  CGC
Aspartic acid (-NH2?)     ACA
Asparagine                UAC or UAAᵇ
Cysteine                  UUG or UGGᶜ
Glutamic acid (-NH2?)     ACA     AGA     AGUᵈ
Glycine                   UGG
Histidine                 ACC
Isoleucine                UUA
Leucine                   GUU     CUU     AUUᵇ    (UUU)
Lysine                    AAA     AAC     AAU
Methionine                UGAᵈ
Phenylalanine             UUU
Proline                   CCC     CCU     CCA     CCG
Serine                    UCG     UCU
Threonine                 CAC     CAA
Tryptophan                UGG
Tyrosine                  UAU
Valine                    UGU

ᵃ Nucleotide sequence in codewords is arbitrary.
ᵇ Proposed by Speyer et al. (30).
ᶜ We cannot differentiate between these possibilities at present.
ᵈ It is not entirely clear whether these codewords require U.

Assuming a triplet code, a summary of codewords estimated thus far is presented in Table VII. Previously, poly UCG was found to direct alanine and arginine into protein, and codewords containing U, C, and G were proposed for these amino acids (16, 17, 14, 30). The observed frequencies of incorporations (17) suggest coding of alanine and arginine by either UCG or CCG, but not by both codewords. In addition, the data of Table V show that poly CG codes for alanine and arginine; thus, codewords corresponding to these amino acids do not appear to contain U. Since it is not possible at this time to distinguish between triplet and doublet codes, etc., the assignments in Table VII represent current approximations of codewords.
It seems probable that additional codewords will be found.

DISCUSSION

Codeword Specificity in Protein Synthesis

The term degeneracy refers to the phenomenon whereby one amino acid is coded by two or more codewords. This term is inadequate when applied to the mechanism of coding, for it does not indicate codeword specificity. A degenerate code may have high or low specificity depending upon the fidelity of protein synthesis. In most cases the fidelity of protein synthesis in vivo appears to be high, and amino acid replacements other than those due to mutation have not been found. However, although the amino acid sequence analyses would reveal mistakes at one site occurring with a frequency higher than 1 or 2%, they would not reveal occasional mistakes occurring at different sites. Thus, occasional coding errors of 1 or 2%, distributed at random over entire protein molecules, might not be detected.

In the in vitro system, codewords direct amino acids into protein with very striking specificity (21). In Table IV, for example, poly AC preparations do not direct the incorporation into protein of alanine, arginine, glycine, isoleucine, leucine, methionine, phenylalanine, tryptophan, tyrosine, or valine. The specificity of coding by poly CG and AG preparations in Table V is equally apparent. Such negative data clearly demonstrate the very high fidelity of codeword recognition during protein synthesis in this cell-free system.

The codewords corresponding to both leucine and valine contain U and G (16, 17, 29). Although the nucleotide content of these codewords is identical, each word was shown to code only for the appropriate amino acid (21). Thus, nucleotide sequence as well as chemical structure confers specificity upon codewords.

However, one example of ambiguity has been found, but this occurs to a large extent in our experiments only under unusual conditions.
Poly U directs about 3-5% as much leucine into protein as phenylalanine (17). Bretscher and Grunberg-Manago also have reported this phenomenon (2). In the absence of phenylalanine, using well-dialyzed E. coli extracts, poly U coded for leucine about 50% as well as it would code for phenylalanine (20). The molecular basis of this ambiguity is unknown. In the absence of phenylalanine, it is possible that leucine is attached to phenylalanine transfer RNA and then is coded like phenylalanine. On the other hand, the ambiguity may occur at the level of the coding units. It is important to note that phenomena of this type also may occur in vivo (3).

Efficiency of Synthetic RNA in Coding

In spite of the previously mentioned difficulties in comparing template activities of RNA preparations with different chain lengths and degrees of secondary structure, it seems clear that synthetic polynucleotides containing 4, 3, or 2 bases code as well in this system as natural template RNA obtained from viruses (19, 22, 32, 18). The efficiency in coding displayed by synthetic polynucleotides suggests that most nucleotide sequences direct amino acids into protein and that relatively few nonsense nucleotide sequences are present.

Although alternative explanations of coding efficiency, such as nonrandom polynucleotides or nonsequential reading of template RNA, may be considered, such efficiency cannot be ascribed simply to random error in directing amino acids into protein, for amino acids are coded with marked specificity.

Considerations such as these may be used to approximate the coding ratio. In a doublet code, only 16 base permutations are possible; thus, the information content would be insufficient to code specifically for all amino acids. Triplet and quadruplet codes would contain 64 and 256 codewords, respectively.
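The counts quoted above follow from raising the number of bases to the power of the codeword length. The short sketch below reproduces them; the figure of 20 amino acids is an assumption brought in for comparison (the standard set), not a number stated in this passage, and `codeword_count` is a hypothetical helper name.

```python
BASES = "UCAG"
AMINO_ACIDS = 20  # assumed size of the standard amino acid set


def codeword_count(length, alphabet=BASES):
    """Number of distinct ordered codewords of a given length."""
    return len(alphabet) ** length


for length, name in ((2, "doublet"), (3, "triplet"), (4, "quadruplet")):
    count = codeword_count(length)
    verdict = "enough" if count >= AMINO_ACIDS else "too few"
    print(f"{name}: {count} codewords ({verdict} for {AMINO_ACIDS} amino acids)")
```

A pure doublet code falls short of one codeword per amino acid, a triplet code leaves room for degeneracy, and a quadruplet code would be far larger than the observed coding by two-base polymers seems to require.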
Since almost every amino acid tested was found to be coded by polynucleotides containing only two bases, specific and efficient coding by quadruplet words would not seem likely. The data suggest either coding of all amino acids by triplet words, or coding of some by triplets and others by doublets (mixed doublet-triplet code).

Recently, Weisblum, Benzer, and Holley (33) have established a molecular basis of degeneracy by demonstrating that multiple species of transfer RNA recognize different codewords with specificity. Multiple peaks of transfer RNA corresponding to at least four amino acids have been found independently by Holley et al. (11), Sueoka et al. (31), and Doctor et al. (6). If a triplet code is assumed, each cell would require almost 64 transfer RNA species. Alternatives which do not require so many transfer RNA species deserve consideration. For example, Donohue and others have described many models other than Watson-Crick pairing (7). The demonstrated interaction between poly A and poly I (23), and the type of base-pairing suggested by Hoogsteen (12) also might be cited. Theories which require recognition of either the 2- or 6-substituents of bases (34) are not supported by the demonstration that hypoxanthine functions in codewords like G (28, 1). The 2-amino group of G does not appear to be required for coding.

A triplet code may be constructed wherein correct hydrogen bonding between two out of three nucleotide pairs may, in some cases, suffice for coding. Correct pairing of a base at one position in the triplet sometimes may be optional. It should be noted that a triplet code of this type in some respects would bear a superficial resemblance to a doublet code and would be in accord with all of the data available.

Any theory concerning the physical basis of the code must attempt to explain the following experimentally obtained data:

(a) High coding efficiency by synthetic polynucleotides.
(b) Marked codeword specificity.
(c) Degenerate codewords.
(d) The 2-amino group of G is not essential for proper coding.
(e) RNA with a high degree of secondary structure has little ability to code.
(f) Almost all amino acids tested can be coded by polynucleotides containing only two bases.

SUMMARY

Synthetic polynucleotides containing 4, 3, or 2 bases have been found to direct amino acids into protein with high efficiency and specificity. Many additional RNA codewords which do not contain uridylic acid have been determined. Almost all amino acids could be coded by polynucleotides containing only 2 bases. These results have been discussed in terms of the general nature of the code.

REFERENCES

1. BASILIO, C., WAHBA, A. J., LENGYEL, P., SPEYER, J. F., AND OCHOA, S., Proc. Natl. Acad. Sci. U.S., 48, 613 (1962).
2. BRETSCHER, M. S., AND GRUNBERG-MANAGO, M., Nature, 195, 283 (1962).
3. COHEN, G. N., Ann. Inst. Pasteur, 94, 15 (1958).
4. CRICK, F. H. C., in "Structure and Function of Genetic Elements, Brookhaven Symposia in Biology, No. 12," 1959, p. 35.
5. DAVIDSON, J. N., AND SMELLIE, R. M. S., Biochem. J., 52, 594 (1952).
6. DOCTOR, B. P., APGAR, J., AND HOLLEY, R. W., J. Biol. Chem., 236, 1117 (1962).
7. DONOHUE, J., Proc. Natl. Acad. Sci. U.S., 42, 60 (1956).
8. FRESCO, J. R., Trans. N.Y. Acad. Sci., Series II, 21, 653 (1959).
9. FRESCO, J. R., AND DOTY, P., J. Am. Chem. Soc., 79, 3928 (1957).
10. GAMOW, G., Nature, 173, 318 (1954).
11. HOLLEY, R. W., DOCTOR, B. P., MERRILL, S. H., AND SAAD, F. M., Biochim. et Biophys. Acta, 35, 272 (1959).
12. HOOGSTEEN, K., Acta Cryst., 12, 822 (1959).
13. JONES, O. W., AND MARTIN, R. G., Federation Proc., 21, 414 (1962).
14. LENGYEL, P., SPEYER, J. F., BASILIO, C., AND OCHOA, S., Proc. Natl. Acad. Sci. U.S., 48, 282 (1962).
15. LENGYEL, P., SPEYER, J. F., AND OCHOA, S., Proc. Natl. Acad. Sci. U.S., 47, 1936 (1961).
16. MARTIN, R. G., MATTHAEI, J. H., JONES, O. W., AND NIRENBERG, M. W., Biochem. Biophys. Research Communs., 6, 410 (1962).
17. MATTHAEI, J. H., JONES, O. W., MARTIN, R. G., AND NIRENBERG, M. W., Proc. Natl. Acad. Sci. U.S., 48, 666 (1962).
18. NATHANS, D., NOTANI, G., SCHWARTZ, J. H., AND ZINDER, N. D., Proc. Natl. Acad. Sci. U.S., 48, 1424 (1962).
19. NIRENBERG, M. W., AND MATTHAEI, J. H., Proc. Natl. Acad. Sci. U.S., 47, 1588 (1961).
20. NIRENBERG, M. W., MATTHAEI, J. H., AND JONES, O. W., unpublished observations.
21. NIRENBERG, M. W., MATTHAEI, J. H., JONES, O. W., MARTIN, R. G., AND BARONDES, S. H., Federation Proc., 22, 55 (1963).
22. OFENGAND, J., AND HASELKORN, R., Biochem. Biophys. Research Communs., 6, 469 (1962).
23. RICH, A., Nature, 181, 521 (1958).
24. RICH, A., DAVIES, D. R., CRICK, F. H. C., AND WATSON, J. D., J. Mol. Biol., 3, 71 (1961).
25. ROBERTS, R. B., Proc. Natl. Acad. Sci. U.S., 48, 897 (1962).
26. ROBERTS, R. B., Proc. Natl. Acad. Sci. U.S., 48, 1245 (1962).
27. SINGER, M. F., AND GUSS, J. K., J. Biol. Chem., 237, 182 (1962).
28. SINGER, M. F., JONES, O. W., MATTHAEI, J. H., AND NIRENBERG, M. W., unpublished observations.
29. SPEYER, J. F., LENGYEL, P., BASILIO, C., AND OCHOA, S., Proc. Natl. Acad. Sci. U.S., 48, 63 (1962).
30. SPEYER, J. F., LENGYEL, P., BASILIO, C., AND OCHOA, S., Proc. Natl. Acad. Sci. U.S., 48, 441 (1962).
31. SUEOKA, N., AND YAMANE, T., Proc. Natl. Acad. Sci. U.S., 48, 1454 (1962).
32. TSUGITA, A., FRAENKEL-CONRAT, H., NIRENBERG, M. W., AND MATTHAEI, J. H., Proc. Natl. Acad. Sci. U.S., 48, 846 (1962).
33. WEISBLUM, B., BENZER, S., AND HOLLEY, R. W., Proc. Natl. Acad. Sci. U.S., 48, 1449 (1962).
34. WOESE, C. R., Nature, 194, 1114 (1962).