Genetic code: properties and functions

After the rules of the genetic code were discovered, according to which hereditary information is rewritten from the language of nucleotides into the language of amino acids, they were long considered universal. However, at least 30 cases are now known in which the genetic code is used in a slightly modified form. The changes vary widely: a codon may change its meaning, a stop codon may begin to encode an amino acid, or a regular codon may begin to act as a start codon. Here are ten of the most interesting deviations from the standard genetic code.

Despite the generally accepted “standard” nature of the genetic code, there are several dozen examples of living organisms that use a slightly modified version of it. Some changes are common to entire taxa, while others are found in just a few species. There are known cases in which part of the mRNA of a gene is translated according to the standard rules and the rest according to modified ones. For example, when human malate dehydrogenase mRNA, which is encoded in the nucleus, is translated, in 4% of cases the standard stop codon encodes tryptophan or arginine. Very often, deviations from the standard genetic code are observed only in certain organelles. Indeed, the existence of such deviations was first confirmed back in 1979, when the genetic code of human mitochondria was shown to differ from the nuclear code. This article is devoted to the most surprising deviations of the genetic code from the standard.

“Biomolecule” has written about the genetic code more than once. The article “Such different synonyms” is devoted to the phenomenon of codon preference. The articles “At the Origins of the Genetic Code: Soul Mates” and “Evolution of the genetic code” discuss the evolution of the genetic code, and the publications “Expanded genome” and “Four letter word” describe the prospects for its artificial expansion.

Blastocrithidia

In protozoa of the genus Blastocrithidia, which are related to trypanosomes (Fig. 1), the genetic code used to translate nuclear genes is literally “without brakes”: all three stop codons encode amino acids. The UGA codon codes for tryptophan, while UAG and UAA code for glutamate. However, UAA and, less commonly, UAG can still act as stop codons. It turned out that in eRF1, one of the proteins required to release the ribosome from the mRNA after translation, a critically important serine residue is replaced by another amino acid. This reduces the protein's affinity for UGA, allowing this stop codon to function as a sense codon. It is still not fully understood, however, how UAG and UAA can act as both sense and terminator codons.

Condylostoma magnum

In the ciliate Condylostoma magnum, each of the standard stop codons can act as a sense codon: UAA and UAG can encode glutamine, and UGA can encode tryptophan. However, the dual-coding mechanism in this organism is completely different from that of Blastocrithidia: the meaning of each standard stop codon depends on its position in the mRNA. Stop codons located in the middle of the transcript encode amino acids, whereas stop codons located near the 3′ end of the mRNA work “in their specialty” and act as terminators. The 3′ untranslated regions of Condylostoma magnum genes are probably very short and conserved, and play a role in stop codon recognition.

Acetohalobium arabaticum

Rhabdopleura compacta

Scenedesmus obliquus

The mitochondrial genetic code of the green alga Scenedesmus obliquus (Fig. 3) is unusual in that the UCA codon, which normally codes for serine, functions as a stop codon. The mitochondrial genome of this alga lacks a gene for the tRNA corresponding to the UCA codon. Conversely, in Scenedesmus obliquus mitochondria the standard stop codon UAG encodes leucine.
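The effect of such reassignments can be sketched in a few lines of Python. This is only an illustration: the variant table is simply the standard code with the two changes described above applied (UCA as a stop, UAG as leucine), and the mRNA string is a made-up example, not a real Scenedesmus sequence.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G ordering of the codon table.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Variant described in the text: UCA terminates, UAG encodes leucine.
SCENEDESMUS_MT = {**STANDARD, "UCA": "*", "UAG": "L"}

def translate(mrna, table):
    """Translate codon by codon; '*' marks a stop codon and ends the chain."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = "AUGUAGGCUUCAGGU"  # hypothetical message: AUG UAG GCU UCA GGU
print(translate(mrna, STANDARD))        # M   (UAG stops translation)
print(translate(mrna, SCENEDESMUS_MT))  # MLA (UAG is read as Leu, UCA stops)
```

The same message therefore yields two different products depending on which table the translation machinery follows.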

Flatworms of the class Rhabditophora

Radopholus similis

Slipper ciliates

The mitochondrial genetic code of slipper ciliates (genus Paramecium) differs from the standard one primarily in the number of start codons. As many as five or six codons can act as start codons: AUG, AUA, AUU, AUC, GUG, and possibly GUA. Because the mitochondrial genome of these organisms contains genes for only three tRNAs, most tRNAs are imported from the cytoplasm. As a result, in the mitochondria of slipper ciliates, as in the nucleus of many ciliates, the stop codons UAG and UAA encode glutamine.

Ashbya gossypii

In the mitochondria of the yeast Ashbya gossypii, the codons CUU and CUA, which normally code for leucine, code for alanine. Surprisingly, the other two leucine codons of this family, CUC and CUG, are completely absent from the mitochondrial genome, so in this organism leucine is encoded by only two codons, UUG and UUA, instead of the standard six.

Mycobacterium smegmatis

In the bacterium Mycobacterium smegmatis, aspartic codons take on an additional meaning in the stationary phase of growth, as well as under low-pH conditions. Even more curiously, because of this ambiguity of aspartic codons, substitutions arise in the β-subunit of RNA polymerase that preserve its function but make the enzyme resistant to the antibiotic rifampicin, which normally blocks it.

Of course, variations of the standard genetic code are not limited to these examples. Still, exceptions only prove the rule, and this is true of the genetic code as well. Despite the enormous diversity of living organisms, exceptions to the genetic code are so rare that they seem to be nothing more than curiosities. Nevertheless, these exceptions provide valuable material for reconstructing the evolution of the genetic code and help us to better understand its fundamental properties.

Literature

  1. Julia Hofhuis, Fabian Schueren, Christopher Nötzel, Thomas Lingner, Jutta Gärtner, et al. (2016). The functional readthrough extension of malate dehydrogenase reveals a modification of the genetic code. Open Biol. 6, 160246;
  2. B. G. Barrell, A. T. Bankier, J. Drouin. (1979). A different genetic code in human mitochondria. Nature. 282, 189-194;
  3. Such different synonyms;
  4. At the Origins of the Genetic Code: Soul Mates;
  5. Evolution of the genetic code;
  6. Expanded genome;
  7. Four letter word;
  8. Kristina Záhonová, Alexei Y. Kostygov, Tereza Ševčíková, Vyacheslav Yurchenko, Marek Eliáš. (2016). An Unprecedented Non-canonical Nuclear Genetic Code with All Three Termination Codons Reassigned as Sense Codons. Current Biology. 26, 2364-2369;
  9. Stephen M. Heaphy, Marco Mariotti, Vadim N. Gladyshev, John F. Atkins, Pavel V. Baranov. (2016). Novel Ciliate Genetic Code Variants Including the Reassignment of All Three Stop Codons to Sense Codons in Condylostoma magnum. Mol Biol Evol. 33, 2885-2889;
  10. L. Prat, I. U. Heinemann, H. R. Aerni, J. Rinehart, P. O'Donoghue, D. Soll. (2012). 109, 21070-21075;
  11. Marleen Perseke, Joerg Hetmank, Matthias Bernt, Peter F. Stadler, Martin Schlegel, Detlef Bernhard. (2011). The enigmatic mitochondrial genome of Rhabdopleura compacta (Pterobranchia) reveals insights into selection of an efficient tRNA system and supports monophyly of Ambulacraria. BMC Evol Biol. 11;
  12. A. M. Nedelcu. (2000). The Complete Mitochondrial DNA Sequence of Scenedesmus obliquus Reflects an Intermediate Stage in the Evolution of the Green Algal Mitochondrial Genome. Genome Research. 10, 819-831;
  13. M. J. Telford, E. A. Herniou, R. B. Russell, D. T. J. Littlewood. (2000). Changes in mitochondrial genetic codes as phylogenetic characters: Two examples from the flatworms. Proceedings of the National Academy of Sciences. 97, 11359-11364;
  14. Joachim E. M. Jacob, Bartel Vanholme, Thomas Van Leeuwen, Godelieve Gheysen. (2009). A unique genetic code change in the mitochondrial genome of the parasitic nematode Radopholus similis. BMC Research Notes. 2, 192;
  15. A. E. Pritchard, J. J. Seilhamer, R. Mahalingam, C. L. Sable, S. E. Venuti, D. J. Cummings. (1990). Nucleotide sequence of the mitochondrial genome of Paramecium. Nucleic Acids Res. 18, 173-180;
  16. Jiqiang Ling, Rachid Daoud, Marc J. Lajoie, George M. Church, Dieter Söll, B. Franz Lang. (2014). Natural reassignment of CUU and CUA sense codons to alanine in Ashbya mitochondria. Nucleic Acids Research. 42, 499-508;
  17. Jiqiang Ling, Patrick O'Donoghue, Dieter Söll. (2015). Genetic code flexibility in microorganisms: novel mechanisms and impact on physiology. Nat Rev Micro. 13, 707-721;
  18. B. Javid, F. Sorrentino, M. Toosky, W. Zheng, J. T. Pinkham, et al. (2014). Mycobacterial mistranslation is necessary and sufficient for rifampicin phenotypic resistance. Proceedings of the National Academy of Sciences. 111, 1132-1137;
  19. Alexander O. Frolov, Marina N. Malysheva, Anna I. Ganyukova, Vyacheslav Yurchenko, Alexei Y. Kostygov. (2017). Life cycle of Blastocrithidia papi sp. n. (Kinetoplastea, Trypanosomatidae) in Pyrrhocoris apterus (Hemiptera, Pyrrhocoridae). European Journal of Protistology. 57, 85-98;
  20. Johannes Sikorski, Alla Lapidus, Olga Chertkov, Susan Lucas, Alex Copeland, et al. (2010). Complete genome sequence of Acetohalobium arabaticum type strain (Z-7288T). Stand. Genomic Sci. 3, 57-65.

Today it is no secret that the life program of every living organism is written in its DNA molecule. The easiest way to picture a DNA molecule is as a long ladder. The vertical posts of this ladder are made of sugar and phosphate groups (which contain phosphorus and oxygen). All the important working information in the molecule is written on the rungs of the ladder. Each rung consists of two molecules, each attached to one of the vertical posts. These molecules, the nitrogenous bases, are called adenine, guanine, thymine, and cytosine, but they are usually designated simply by the letters A, G, T, and C. The shape of these molecules allows them to form bonds, and thus complete rungs, only of certain types: between the bases A and T and between the bases G and C (a pair formed in this way is called a “base pair”). No other types of pairing normally occur in a DNA molecule.

Going down the rungs along one strand of a DNA molecule, you obtain a sequence of bases. It is this message, in the form of a sequence of bases, that determines the course of chemical reactions in the cell and, consequently, the characteristics of the organism that carries this DNA. According to the central dogma of molecular biology, the DNA molecule encodes information about proteins, which in turn act as enzymes (see Catalysts and enzymes) and regulate all chemical reactions in living organisms.

The strict correspondence between the sequence of base pairs in a DNA molecule and the sequence of amino acids that make up proteins is called the genetic code. The genetic code was deciphered soon after the discovery of the double-stranded structure of DNA. It was already known that a newly discovered molecule, messenger (or template) RNA (mRNA), carries the information written in DNA. The biochemists Marshall W. Nirenberg and J. Heinrich Matthaei of the National Institutes of Health in Bethesda, near Washington, DC, carried out the first experiments that led to the cracking of the genetic code.

They began by synthesizing artificial mRNA molecules consisting only of the repeating nitrogenous base uracil (which is an analogue of thymine, "T", and forms bonds only with adenine, "A", from the DNA molecule). They added these mRNAs to test tubes with a mixture of amino acids, and in each tube only one of the amino acids was labeled with a radioactive label. The researchers discovered that the mRNA they artificially synthesized initiated protein formation in only one test tube, which contained the labeled amino acid phenylalanine. So they established that the sequence “—U—U—U—” on the mRNA molecule (and, therefore, the equivalent sequence “—A—A—A—” on the DNA molecule) encodes a protein consisting only of the amino acid phenylalanine. This was the first step towards deciphering the genetic code.
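The logic of this experiment can be mimicked in Python as a minimal sketch. The function names and the nine-nucleotide template are illustrative assumptions; the codon table is the standard one, in which F stands for phenylalanine.

```python
from itertools import product

# Standard genetic code in one-letter amino acid notation ('*' = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Base pairing between a DNA template and the RNA copied from it.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_dna):
    """Return the mRNA (5'->3') complementary to a DNA template strand given 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template_dna))

def translate(mrna):
    """Translate codon by codon until the message ends or a stop codon is met."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = STANDARD[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein

poly_u = transcribe("AAAAAAAAA")   # an all-A stretch of DNA gives poly-U
print(poly_u, translate(poly_u))   # UUUUUUUUU FFF
```

A message made only of U can produce nothing but phenylalanine, which is exactly what the labeled test tube revealed.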

Today it is known that three base pairs of a DNA molecule (such a triplet is called a codon) code for one amino acid in a protein. By performing experiments similar to those described above, geneticists eventually deciphered the entire genetic code, in which each of the 64 possible codons corresponds to a specific amino acid or to a stop signal.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a particular order of nucleotides in DNA or RNA that forms codons corresponding to the amino acids of a protein.

Properties of the genetic code.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also propose other properties of the code related to the chemical characteristics of the nucleotides included in the code or the frequency of occurrence of individual amino acids in the body’s proteins, etc. However, these properties follow from those listed above, so we will consider them there.

a. Triplet nature. The genetic code, like many complex systems, has a smallest structural unit and a smallest functional unit. The triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. The codon is the smallest functional unit of the genetic code; the triplets of mRNA are usually called codons. In the genetic code a codon performs several functions. First, its main function is to encode a single amino acid. Second, a codon may not code for an amino acid at all, in which case it performs another function (see below). As the definitions show, the triplet is a concept that describes the elementary structural unit of the genetic code (three nucleotides), whereas the codon describes its elementary semantic unit: three nucleotides that determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first worked out theoretically, and its existence was then confirmed experimentally. Indeed, 20 amino acids cannot be encoded by one or two nucleotides, because there are only 4 kinds of nucleotide: single nucleotides give 4 variants and pairs give 4² = 16, both fewer than 20. Triplets of the four nucleotides give 4³ = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).
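This counting argument is easy to check directly. The short Python sketch below enumerates all "words" of one, two and three nucleotides; the variable names are arbitrary and chosen only for this illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
AMINO_ACIDS = 20  # number of amino acids that must be distinguished

# Enumerate every possible word of a given length and compare with 20.
for length in (1, 2, 3):
    words = ["".join(w) for w in product(NUCLEOTIDES, repeat=length)]
    verdict = "enough" if len(words) >= AMINO_ACIDS else "too few"
    print(length, len(words), verdict)

# The same numbers from the formula 4**n, and the shortest sufficient word length.
word_counts = {n: len(NUCLEOTIDES) ** n for n in (1, 2, 3)}
shortest = min(n for n, count in word_counts.items() if count >= AMINO_ACIDS)
print(word_counts, shortest)   # {1: 4, 2: 16, 3: 64} 3
```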

The 64 nucleotide combinations presented in the table have two features. First, of the 64 triplets only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids but serve as stop signals marking the end of translation. These three triplets, UAA, UAG and UGA, are also called “meaningless” (nonsense) codons. A mutation that replaces one nucleotide in a triplet with another can turn a sense codon into a nonsense codon. This type of mutation is called a nonsense mutation. If such a stop signal arises inside a gene (in its coding part), protein synthesis will be interrupted at that point every time, and only the first part of the protein (up to the stop signal) will be synthesized. A person with this defect will lack the protein and will experience symptoms associated with that deficiency. For example, a mutation of this kind has been identified in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and quickly destroyed. As a result, a hemoglobin molecule devoid of beta chains is formed. Clearly, such a molecule is unlikely to perform its function fully. The result is a serious disease that develops as a hemolytic anemia (beta-zero thalassemia, from the Greek word “thalassa”, “sea”, referring to the Mediterranean, where the disease was first discovered).

The mechanism of action of stop codons differs from the mechanism of action of sense codons. This follows from the fact that for all codons encoding amino acids, corresponding tRNAs have been found. No tRNAs were found for nonsense codons. Consequently, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also serves as the initiator of translation.

b. Degeneracy or redundancy.

61 of the 64 triplets encode 20 amino acids. This threefold excess of triplets over amino acids means that information could in principle be encoded in two ways. First, only 20 of the 64 codons might be used to encode the 20 amino acids; second, each amino acid might be encoded by several codons. Research has shown that nature used the latter option.

The advantage of this option is obvious. If only 20 of the 64 triplets were involved in encoding amino acids, then 44 triplets would remain non-coding, that is, meaningless (nonsense codons). We noted earlier how dangerous it is for the cell when a mutation turns a coding triplet into a nonsense codon: it cuts protein synthesis short and can ultimately lead to disease. At present three codons in our genome are nonsense; now imagine what would happen if the number of nonsense codons were about 15 times greater. Clearly, in such a situation the probability of a normal codon mutating into a nonsense codon would be immeasurably higher.
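The argument can be made quantitative with a small Python sketch. It counts, for the standard code, how many single-nucleotide substitutions turn a sense codon into a nonsense one, and compares this with a purely hypothetical code that has only 20 sense codons. The choice of those 20 codons is random and serves only as an illustration; the helper names are assumptions of this sketch.

```python
from itertools import product
import random

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, AMINO))

def neighbours(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]

def nonsense_hits(sense):
    """Return (substitutions turning a sense codon into a nonsense one, all substitutions)."""
    hits = sum(1 for codon in sense for n in neighbours(codon) if n not in sense)
    return hits, 9 * len(sense)

# Real code: 61 sense codons, 3 nonsense codons.
real_sense = {codon for codon, aa in STANDARD.items() if aa != "*"}
real_hits, real_total = nonsense_hits(real_sense)
print(real_hits, real_total)   # 23 549

# Hypothetical code: 20 sense codons (picked at random), 44 nonsense codons.
hypothetical_sense = set(random.Random(0).sample(CODONS, 20))
hyp_hits, hyp_total = nonsense_hits(hypothetical_sense)
print(hyp_hits / hyp_total > real_hits / real_total)   # True
```

In the real code only about 4% of single-nucleotide substitutions in sense codons create a stop signal; in a code with 44 nonsense triplets the share would be many times larger, whichever 20 codons were chosen.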

A code in which one amino acid is encoded by several triplets is called degenerate, or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA and CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are encoded by a single codon each. This property, the recording of the same information with different symbols, is called degeneracy.
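These numbers can be read straight off the standard codon table. The following Python sketch groups the 61 sense codons by the amino acid they encode (one-letter notation: L is leucine, V valine, F phenylalanine, W tryptophan, M methionine); the variable names are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid, stop codons excluded.
degeneracy = Counter(aa for aa in STANDARD.values() if aa != "*")

# The synonymous codons themselves, grouped by amino acid.
synonyms = defaultdict(list)
for codon, aa in STANDARD.items():
    if aa != "*":
        synonyms[aa].append(codon)

print(degeneracy["L"], sorted(synonyms["L"]))   # 6 ['CUA', 'CUC', 'CUG', 'CUU', 'UUA', 'UUG']
print(degeneracy["V"], degeneracy["F"], degeneracy["W"], degeneracy["M"])   # 4 2 1 1
```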

The number of codons designated for one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the likelihood of its damage by mutagenic factors. Therefore, it is clear that a mutated codon has a greater chance of encoding the same amino acid if it is highly degenerate. From this perspective, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Most of the information in a codon is carried by its first two nucleotides; the base in the third position turns out to be of little importance. This phenomenon is called “degeneracy of the third base.” This feature minimizes the effect of mutations.

Consider an example. The main function of red blood cells is to carry oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes give rise to different variants of hemoglobin. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may encode a new amino acid in the hemoglobin polypeptide chain. Any nucleotide of a triplet can be replaced by mutation: the first, the second or the third. Several hundred mutations are known that affect the integrity of the globin genes. About 400 of them involve the replacement of a single nucleotide in the gene and a corresponding amino acid replacement in the polypeptide. Of these, only about 100 lead to instability of hemoglobin and to diseases ranging from mild to very severe. The other 300 or so substitution mutations (about three quarters of them) do not affect hemoglobin function and do not lead to pathology. One reason for this is the “degeneracy of the third base” mentioned above: replacing the third nucleotide in a triplet encoding serine, leucine, proline, arginine and some other amino acids produces a synonymous codon that encodes the same amino acid. Such a mutation does not manifest itself phenotypically.

In contrast, replacing the first or second nucleotide in a triplet almost always produces a new hemoglobin variant. But even then there may be no severe phenotypic disorder. The reason is that the amino acid in hemoglobin may be replaced by another with similar physicochemical properties, for example when one hydrophilic amino acid is replaced by another hydrophilic amino acid.

Hemoglobin consists of the iron-porphyrin group heme (to which oxygen and carbon dioxide molecules attach) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located in the short arm of chromosome 16, and the β-gene in the short arm of chromosome 11. Replacing the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of a new amino acid in the protein, disruption of hemoglobin function and serious consequences for the patient. For example, replacing “C” in one of the CAU (histidine) triplets with “U” produces a new triplet, UAU, which encodes a different amino acid, tyrosine. Phenotypically this manifests itself as a severe disease: such a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and methemoglobinemia develops. The mutational replacement of glutamic acid by valine at position 6 of the β-chain causes the most severe disease, sickle cell anemia. Let us not continue this sad list. We note only that replacing one of the first two nucleotides may also produce an amino acid whose physicochemical properties are similar to those of the original one. Thus, replacing the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with “U” produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with “A” produces the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in their physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, replacing hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin and ultimately leads to sickle cell anemia, whereas replacing hydrophilic glutamic acid with hydrophilic lysine alters hemoglobin function to a lesser extent, and patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil in the CAU triplet is replaced by cytosine, giving the triplet CAC, virtually no phenotypic change will be detected in the person. This is understandable, because both triplets code for the same amino acid, histidine.
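The substitutions discussed here, and the special role of the third base, can be checked against the standard codon table with a short Python sketch. The helper names are assumptions made for this illustration, and the counts refer to the standard code as a whole, not to the globin genes specifically.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def substitute(codon, position, base):
    """Replace the nucleotide at a 1-based position of the codon."""
    return codon[:position - 1] + base + codon[position:]

# The examples from the text (E = Glu, V = Val, K = Lys, H = His, Y = Tyr).
for codon, position, base in [("GAA", 2, "U"), ("GAA", 1, "A"), ("CAU", 1, "U"), ("CAU", 3, "C")]:
    mutant = substitute(codon, position, base)
    print(codon, STANDARD[codon], "->", mutant, STANDARD[mutant])

def synonymous_by_position(table):
    """Count single-nucleotide substitutions in sense codons that keep the amino acid."""
    counts = [0, 0, 0]
    for codon, aa in table.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos] and table[codon[:pos] + base + codon[pos + 1:]] == aa:
                    counts[pos] += 1
    return counts

print(synonymous_by_position(STANDARD))   # [8, 0, 126]
```

Each position of the 61 sense codons can be changed in 183 ways. At the third position 126 of these changes are silent, at the first position only 8, and at the second position none, which is why third-base substitutions so often go unnoticed.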

In conclusion, it is worth emphasizing that, from a general biological point of view, the degeneracy of the genetic code and the degeneracy of the third base are protective mechanisms that evolution has built into the unique structure of DNA and RNA.

c. Unambiguity.

Each triplet (except the nonsense triplets) encodes only one amino acid. Thus, in the direction from codon to amino acid the genetic code is unambiguous, while in the direction from amino acid to codon it is ambiguous (degenerate).

Codon → amino acid: unambiguous

Amino acid → codon: degenerate

Here, too, the need for unambiguity in the genetic code is obvious. Otherwise, translation of the same codon would insert different amino acids into the protein chain, producing proteins with different primary structures and different functions. Cell metabolism would switch to a “one gene, several polypeptides” mode of operation. Clearly, in such a situation the regulatory function of genes would be completely lost.

d. Polarity

Information is read from DNA and from mRNA in only one direction. Polarity is important for determining higher-order structures (secondary, tertiary, etc.). Earlier we discussed how lower-order structures determine higher-order ones. Tertiary and higher-order structures of RNAs and proteins begin to form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of the RNA or polypeptide is acquiring its tertiary structure, the other end of the chain is still being synthesized on the DNA (when RNA is being transcribed) or on the ribosome (when a polypeptide is being translated).

Therefore, the unidirectional reading of information (during the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized molecule, but also for the strict determination of its secondary, tertiary and higher structures.

e. Non-overlapping.

The code may be overlapping or non-overlapping. In most organisms the code does not overlap. Overlapping code is found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot simultaneously be a nucleotide of another codon. If the code were overlapping, a sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A), as with a non-overlapping code, but three (if one nucleotide is shared) (Fig. 33, B) or five (if two nucleotides are shared) (see Fig. 33, C). In the last two cases, a mutation of any one nucleotide would disturb two or three amino acids in the sequence.

However, it has been established that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument that the code is non-overlapping.

Let us explain this with Figure 34. Bold lines show the triplets encoding amino acids for a non-overlapping and for an overlapping code. Experiments have clearly shown that the genetic code is non-overlapping. Without going into the details of the experiment, we note what would happen if the third nucleotide of the sequence (see Fig. 34), U (marked with an asterisk), were replaced by some other nucleotide:

1. With a non-overlapping code (variant A), the protein controlled by this sequence would have a substitution of one (the first) amino acid (marked with asterisks).

2. With an overlapping code in variant B, a substitution would occur in two (the first and second) amino acids (marked with asterisks). In variant C, the replacement would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is disrupted, the disruption in the protein always affects only one amino acid, which is typical for a non-overlapping code.

A. Non-overlapping code: GCUGCUG is read as GCU GCU (Ala - Ala)

B. Overlapping code, one nucleotide shared: GCUGCUG is read as GCU UGC CUG (Ala - Cys - Leu)

C. Overlapping code, two nucleotides shared: GCUGCUG is read as GCU CUG UGC GCU CUG (Ala - Leu - Cys - Ala - Leu)

Fig. 34. A diagram explaining that the genetic code is non-overlapping (explanation in the text).
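The three reading schemes for GCUGCUG can be reproduced with a short Python sketch in which the "step" between successive triplets is 3 (non-overlapping), 2 (one shared nucleotide) or 1 (two shared nucleotides). The function names are illustrative assumptions.

```python
def read_triplets(sequence, step):
    """Split a sequence into triplets whose start positions are `step` nucleotides apart."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, step)]

def triplets_affected(sequence, step, index):
    """Count the triplets that contain the nucleotide at the given 0-based index."""
    return sum(1 for i in range(0, len(sequence) - 2, step) if i <= index < i + 3)

sequence = "GCUGCUG"
third_nucleotide = 2  # 0-based index of the U marked with an asterisk

for label, step in [("A", 3), ("B", 2), ("C", 1)]:
    print(label, read_triplets(sequence, step),
          triplets_affected(sequence, step, third_nucleotide))
# A ['GCU', 'GCU'] 1
# B ['GCU', 'UGC', 'CUG'] 2
# C ['GCU', 'CUG', 'UGC', 'GCU', 'CUG'] 3
```

Only with a step of 3 does a change in one nucleotide touch a single triplet, which is what the experiments show.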

The non-overlapping nature of the genetic code is linked to another property: reading begins at a definite point, the initiation signal. In mRNA this initiation signal is the methionine codon AUG.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

f. Compactness.

There is no punctuation between codons. In other words, triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of “punctuation marks” in the genetic code has been proven in experiments.

g. Universality.

The code is essentially the same for all organisms living on Earth. Direct proof of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that all bacterial and eukaryotic genomes use the same set of codon meanings. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which is read in the same way as the codon UGG, encoding the amino acid tryptophan. Other, rarer deviations from universality have also been found.

DNA code system.

The genetic code of DNA consists of 64 triplets of nucleotides. These triplets are called codons. All but three of them code for one of the 20 amino acids used in protein synthesis; the remaining three are stop signals. This gives the code some redundancy: most amino acids are coded for by more than one codon.
One codon performs two interrelated functions: it signals the beginning of translation and encodes the incorporation of the amino acid methionine (Met) into the growing polypeptide chain. The DNA coding system is such that the genetic code can be written either as RNA codons or as DNA codons. RNA codons occur in messenger RNA (mRNA) and are the ones actually read during the synthesis of polypeptides (a process called translation). Each mRNA molecule, in turn, acquires its nucleotide sequence through transcription from the corresponding gene.

All but two amino acids (Met and Trp) can be encoded by 2 to 6 different codons. However, the genomes of most organisms show that certain codons are favored over others. In humans, for example, alanine is encoded by GCC four times more often than by GCG. This probably reflects the fact that the translation apparatus (for example, the ribosome) handles some codons more efficiently than others.
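Codon preference of this kind is measured simply by counting codons in coding sequences. The Python sketch below shows the idea on a made-up stretch of seven alanine codons; the sequence is invented for illustration and is not real human data, and the function name is an assumption.

```python
from collections import Counter

def codon_usage(coding_sequence):
    """Count how often each codon occurs when the sequence is read in frame."""
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence) - 2, 3)]
    return Counter(codons)

# Hypothetical fragment: GCC GCG GCC GCU GCC GCA GCC (all encode alanine).
toy_sequence = "GCCGCGGCCGCUGCCGCAGCC"
usage = codon_usage(toy_sequence)

print(usage["GCC"], usage["GCG"], usage["GCC"] / usage["GCG"])   # 4 1 4.0
```

Applied to all the coding sequences of a genome, the same count gives the relative usage of synonymous codons.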

The genetic code is almost universal. The same codons are assigned to the same amino acids, and the start and stop signals are overwhelmingly the same, in animals, plants and microorganisms. However, some exceptions have been found. Most involve assigning one or two of the three stop codons to an amino acid.

In any cell and organism, all anatomical, morphological and functional features are determined by the structure of the proteins they contain. The hereditary property of an organism is its ability to synthesize certain proteins. The biological characteristics of a protein depend on the order of the amino acids in its polypeptide chain.
Each cell has its own sequence of nucleotides in the polynucleotide chain of DNA. This is the genetic code of DNA. Through it, information about the synthesis of particular proteins is recorded. This article describes what the genetic code is, what properties it has and what genetic information is.

A little history

The idea that a genetic code might exist was formulated by G. Gamow and A. Dounce in the mid-twentieth century. They proposed that the nucleotide sequence responsible for specifying a particular amino acid contains at least three units. It was later proved that the number is exactly three nucleotides (this is the unit of the genetic code), and the unit was called a triplet or codon. There are sixty-four codons in total, because nucleic acid molecules are built from four different nucleotide residues (4³ = 64).

What is genetic code

The way in which the sequence of amino acids in proteins is encoded by a sequence of nucleotides is common to all living cells and organisms. This is what the genetic code is.
There are four nucleotides in DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • thymine - T.

They are denoted by capital Latin or (in Russian-language literature) Russian letters.
RNA also contains four nucleotides, but one of them is different from DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • uracil - U.

All nucleotides are arranged in chains: DNA forms a double helix, while RNA is usually single-stranded.
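The difference between the two alphabets can be sketched in a few lines. This is a simplified illustration, assuming the input is the DNA coding strand written in the letters listed above; the function names are hypothetical, and the base pairing used for the second strand (A with T, G with C) is standard knowledge rather than something stated in this text.

```python
def dna_to_rna(coding_strand: str) -> str:
    """Rewrite a DNA coding strand in the RNA alphabet (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def complement(dna: str) -> str:
    """Return the partner strand of a DNA double helix, base by base."""
    pairs = str.maketrans("ACGT", "TGCA")  # A pairs with T, C pairs with G
    return dna.upper().translate(pairs)


print(dna_to_rna("ATGGTGCTT"))
print(complement("ATGGTGCTT"))
```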
Proteins are built from amino acids, which, arranged in a certain sequence, determine the protein's biological properties.

Properties of the genetic code

Triplet nature. A unit of the genetic code consists of three letters. This means that each of the twenty amino acids is encoded by a group of three specific nucleotides, called a codon or triplet. Sixty-four combinations can be formed from four nucleotides taken three at a time. This is more than enough to encode twenty amino acids.
Degeneracy. Each amino acid corresponds to more than one codon, with the exception of methionine and tryptophan.
Unambiguity. One codon codes for only one amino acid. For example, in a healthy person's gene for the beta chain of hemoglobin, the triplet GAG or GAA encodes glutamic acid; in everyone who has sickle cell disease, one nucleotide in this triplet is changed, so that it encodes valine instead.
Collinearity. The sequence of amino acids always corresponds to the sequence of nucleotides that the gene contains.
Continuity. The genetic code is continuous and compact, which means that it has no punctuation marks. That is, starting at a certain codon, reading proceeds without gaps. For example, AUGGUGCUUAAUGUG will be read as AUG, GUG, CUU, AAU, GUG, and not as AUG, UGG and so on (overlapping triplets) or in any other way.
Universality. It is the same for nearly all terrestrial organisms, from humans to fish, fungi and bacteria.
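Several of the properties above can be illustrated with a small lookup table. The sketch below builds the standard codon table from a compact 64-letter string (bases in the order U, C, A, G; one-letter amino acid codes; `*` for stop). The names `CODON_TABLE`, `split_codons` and `translate` are illustrative, and `translate` deliberately ignores start and stop logic.

```python
from collections import Counter

BASES = "UCAG"
# Standard code, one letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Unambiguity: a dict maps each codon to exactly one meaning.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(mrna: str) -> list[str]:
    """Triplet, non-overlapping reading from the first nucleotide."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna: str) -> str:
    """Map each codon to its one-letter amino acid code (no stop handling)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(mrna))


# Degeneracy: how many codons stand for each amino acid (or for stop).
degeneracy = Counter(CODON_TABLE.values())

print(split_codons("AUGGUGCUUAAUGUG"))
print(translate("AUGGUGCUUAAUGUG"))
print(degeneracy["M"], degeneracy["W"], degeneracy["L"])
print(CODON_TABLE["GAG"], CODON_TABLE["GUG"])  # sickle cell change: E to V
```

Only methionine (M) and tryptophan (W) have a single codon in this table; every other amino acid has between two and six.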

Table

Not all existing amino acids are included in the table. Hydroxyproline, hydroxylysine, phosphoserine, iodine derivatives of tyrosine, cystine and some others are absent, since they are derivatives of other amino acids that are encoded by mRNA; they are formed by modification of proteins after translation.
From the properties of the genetic code it is known that one codon encodes one amino acid. The exceptions are the codons for valine and methionine, which perform an additional function. When one of them stands at the beginning of the mRNA, it binds a tRNA carrying formylmethionine. Upon completion of synthesis, the formyl residue is cleaved off, leaving a methionine residue. Thus, these codons act as initiators of polypeptide chain synthesis. If they are not at the beginning, they are no different from the others.

Genetic information

This concept means a program of properties that is passed down from ancestors. It is embedded in heredity as a genetic code.
The genetic code is realized during protein synthesis with the participation of RNA:

  • messenger RNA (mRNA);
  • transfer RNA (tRNA);
  • ribosomal RNA (rRNA).

Information is transmitted through a direct pathway (DNA → RNA → protein) and through feedback (environment → protein → DNA).
Organisms can receive, store and transmit it, and use it most effectively.
Passed on by inheritance, information determines the development of a particular organism. But interaction with the environment alters the organism's response, and this drives evolution and development. In this way, new information is introduced into the organism.


Working out the laws of molecular biology and discovering the genetic code showed the need to combine genetics with Darwin's theory; on this basis the synthetic theory of evolution (non-classical biology) emerged.
Darwin's ideas of heredity, variability and natural selection are supplemented by genetically determined selection. Evolution is realized at the genetic level through random mutations and the inheritance of the most valuable traits, those best adapted to the environment.

Decoding the human code

In the 1990s the Human Genome Project was launched, and as a result, in the 2000s, genome fragments containing 99.99% of human genes were identified. Fragments that are not involved in protein synthesis and do not code for protein remain poorly understood; their role is unknown for now.

Chromosome 1, the last to be deciphered (in 2006), is the longest in the genome. More than three hundred and fifty diseases, including cancer, arise from disorders and mutations in it.

The importance of such studies can hardly be overestimated. Once it was understood what the genetic code is, it became clear what patterns govern development and how morphological structure, psyche, predisposition to certain diseases, metabolism and individual defects are formed.

Nucleotides line up in chains and thus form sequences of genetic letters.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

Codons with first base C (fragment of the codon table):

CUU (Leu/L) Leucine
CUC (Leu/L) Leucine
CUA (Leu/L) Leucine
CUG (Leu/L) Leucine

In some proteins, nonstandard amino acids, such as selenocysteine and pyrrolysine, are inserted by a ribosome reading the stop codon, depending on the sequences in the mRNA. Selenocysteine is now considered to be the 21st, and pyrrolysine the 22nd, amino acid that makes up proteins.

Despite these exceptions, the genetic codes of all living organisms share common features: a codon consists of three nucleotides, of which the first two are decisive; codons are translated by tRNA and ribosomes into an amino acid sequence.

Deviations from the standard genetic code

Example | Codon | Standard meaning | Read as
Some yeasts of the genus Candida | CUG | Leucine | Serine
Mitochondria, in particular in Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Threonine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Mitochondria in mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine, start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop
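A variant code of this kind can be modeled as the standard table plus a handful of overrides. The sketch below takes three rows from the table above that apply to mammalian mitochondria (UGA, AUA, and AGA/AGG); the names `STANDARD`, `MAMMALIAN_MITO_OVERRIDES` and `translate` are hypothetical, and the input mRNA is a made-up example.

```python
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard code: codon -> one-letter amino acid code, "*" means stop.
STANDARD = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: only the reassignments listed in the table above are applied.
MAMMALIAN_MITO_OVERRIDES = {
    "UGA": "W",  # stop -> tryptophan
    "AUA": "M",  # isoleucine -> methionine
    "AGA": "*",  # arginine -> stop
    "AGG": "*",  # arginine -> stop
}
MAMMALIAN_MITO = {**STANDARD, **MAMMALIAN_MITO_OVERRIDES}


def translate(mrna: str, code: dict[str, str]) -> str:
    """Translate codon by codon and halt at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = code[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mrna = "AUGAUAUGAAGA"  # made-up sequence: AUG AUA UGA AGA
print(translate(mrna, STANDARD))
print(translate(mrna, MAMMALIAN_MITO))
```

The same message gives a shorter product under the standard code, because UGA ends translation there, and a longer one under the modified code, where UGA is read as tryptophan and AGA ends the chain.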

History of ideas about the genetic code

However, in the early 1960s new data revealed that the "code without commas" hypothesis was untenable. Experiments then showed that codons considered meaningless by Crick could provoke protein synthesis in vitro, and by 1965 the meaning of all 64 triplets had been established. It turned out that the code is redundant: a whole series of amino acids are each encoded by two, four or even six triplets.

Wikimedia Foundation.