Since ancient times, thinkers have been fascinated by the observation that a plant develops from a seed or a bird from a fertilized egg. The seed and the egg must therefore already contain whatever determines what develops from them. How can this phenomenon be explained?
Even without empirical knowledge of the internal structure of a living organism, we can make an educated guess: the seed and the egg must contain an information store holding the building instructions for a plant or a bird, including the developmental path from seed or egg to adult. Thanks to the findings of molecular biology, we now know that this information store consists of single-stranded RNA or double-stranded DNA molecules. In multicellular organisms such as plants or birds, genetic information is stored in very long DNA molecules in the cell nucleus. The blueprints for the proteins that a cell needs are encoded in the DNA. Since each cell of a multicellular organism contains largely identical copies of the DNA, but not every protein is needed by the cell at all times, a cell must also know when to read which section of the DNA. How this situation-dependent program control is implemented is still only partially understood. It is the subject of current research on gene regulation, including epigenetics.
The translation mechanism from DNA via RNA to proteins is already well researched and understood. DNA and RNA each consist of a sequence of four basic molecules, the nucleic bases: adenine, cytosine, guanine and thymine in DNA, with uracil taking the place of thymine in RNA. The blueprint for each protein corresponds to the sequence of these nucleic bases in a specific section of the genome. Such a protein-coding section is provided with a start and a stop mark and is called a gene. Each group of three consecutive nucleic bases in such a section codes for a specific amino acid. DNA consists of two complementary strands. When the two strands are separated locally, one of them can be read and transcribed into a single-stranded RNA. When a ribosome, a molecular machine built from RNA and proteins, then moves along such an RNA molecule, it links the corresponding amino acids one after another to form a protein.
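The reading process described above can be sketched in a few lines of Python. This is only an illustration, not a model of the real cellular machinery: the function names `transcribe` and `translate` and the example sequence are assumptions made for this sketch, and the table is the standard genetic code written in one-letter amino acid symbols, with `*` marking a stop codon.

```python
# Standard genetic code in compact form: codons are enumerated with the
# bases in the order U, C, A, G (first, second, third position).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """Return the RNA copy of a DNA coding strand (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read triplets from the first AUG (start mark) to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start mark, so nothing is translated
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop mark reached
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example sequence containing a start codon and a stop codon.
print(translate(transcribe("GGATGGCTTGGAAATAACC")))  # MAWK
```

The example sequence is read as AUG, GCU, UGG, AAA and then the stop codon UAA, which gives methionine, alanine, tryptophan and lysine.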
The genetic code enables every living organism to produce the same proteins over and over again, in large numbers and with precision. Proteins are working machines that perform a variety of functions in every cell. For example, certain proteins are able to build other essential components of living organisms such as fats, sugars, amino acids, nucleic bases, RNA and DNA from other molecules.
The emergence of the genetic code is one of the most important and fascinating leaps in the evolution of life. With the emergence of the genetic code, the early forms of life became independent of the more or less random provision of organic molecules by their environment, since a cell was now able to produce its own components.
But how could the blind forces of chance produce a program code that, with minor variations, is used by all life forms on Earth today?
The model of emergence and submergence can help us arrive at a hypothesis as to how the genetic code arose.
We already know one branch: every cell contains ribosomes, a type of molecular machine that can translate a sequence of nucleic bases into the amino acid sequence of a protein. To achieve a self-stabilizing feedback loop, however, a second branch is needed - namely, a second molecular machine that can translate the amino acid sequence of a protein into a sequence of nucleic bases. When both machines are in close proximity (e.g., in a cell or precellular compartment), the RNA strands and corresponding proteins continue to accumulate in an avalanche-like process, as each newly produced protein is translated back into RNA, which in turn serves as a blueprint for new proteins. Of course, this self-reinforcing effect only occurs if both machines use the same code - otherwise protein A would become protein B after translation into RNA and retranslation.
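The requirement that both machines use the same code can be made concrete with a toy example. The sketch below is purely hypothetical: no back-translation machine is known, the names `back_translate` and `round_trip` are invented for illustration, and the code tables cover only four amino acids to keep the example short.

```python
# Toy forward code (RNA -> protein), limited to four codons for brevity.
FORWARD = {"AUG": "M", "GCU": "A", "UGG": "W", "AAA": "K"}

# Hypothetical reverse code (protein -> RNA) that agrees with FORWARD.
MATCHING_REVERSE = {amino_acid: codon for codon, amino_acid in FORWARD.items()}

# Hypothetical reverse code that writes A with the codon FORWARD reads as K.
MISMATCHED_REVERSE = dict(MATCHING_REVERSE, A="AAA")


def translate(rna, forward):
    """RNA -> protein: read the RNA three bases at a time."""
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return "".join(forward[codon] for codon in codons)


def back_translate(protein, reverse):
    """Protein -> RNA: write one codon per amino acid (hypothetical machine)."""
    return "".join(reverse[amino_acid] for amino_acid in protein)


def round_trip(protein, reverse, forward=FORWARD):
    """One full cycle: protein -> RNA -> protein."""
    return translate(back_translate(protein, reverse), forward)


print(round_trip("MAWK", MATCHING_REVERSE))    # MAWK
print(round_trip("MAWK", MISMATCHED_REVERSE))  # MKWK
```

With matching codes the protein reproduces itself in every cycle and can accumulate. With the mismatched code the original protein is replaced by a different one after a single cycle, so no self-reinforcing loop forms for it.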
So, if, by chance, parallel translation machines for protein => RNA and for RNA => protein had arisen in the course of evolutionary history, these machines would have encoded themselves in large numbers in RNA molecules and then produced themselves again from these RNA molecules. Due to this avalanche-like effect, the randomly created genetic code prevailed and is used by all living creatures on Earth today.
How likely is such a scenario?
The molecular readers that translate RNA into proteins are the ribosomes, and there are many of them in every cell. However, we do not know of any reverse reading machines in modern living organisms that translate the amino acid sequence of proteins back into RNA. This is not surprising. If these back-translation machines existed in cells, the cell's protein mixture would be constantly replicated and could not change depending on environmental conditions or the cell's developmental cycle. This would be completely dysfunctional and unphysiological. Therefore, it is almost inevitable that these retranslation machines have disappeared in the course of evolutionary history. That the protein => RNA translation machines disappeared, and not the RNA => protein translation machines, is because an RNA template can be translated into a protein several times, while the reverse process destroys the protein being read. RNA and the double-stranded DNA formed from it are better suited as information storage, while proteins can be much more versatile working machines.
Although there are no retranslation machines in today's cells, there are a large number of enzymatically active proteins that act as proteases and break down other proteins into short fragments and ultimately into individual amino acids. As a result, most proteins in a cell survive only minutes to days before they are broken down again. Broadly speaking, a cell therefore contains only those proteins whose building instructions have recently been read from the genetic information and implemented by the ribosomes.
Even if the hypothetical back-translation machines do not exist in today's cells, it is conceivable in principle to equip proteases with an additional function that allows them to convert the amino acid sequence of a degraded protein into an RNA sequence. Some proteases, the exopeptidases, “nibble” proteins from one end and cleave off one amino acid at a time. In total, living organisms use around 20 different amino acids to build proteins. Which amino acid is currently being cleaved by such a protease could therefore be “encoded” in a corresponding sequence of RNA nucleobases.
Presumably, during the transition from prebiotic to biotic evolution, there were many different molecular machines of this kind, using different codes for translating RNA into proteins and vice versa. The code that prevailed was the one for which simultaneous back-and-forth molecular reading machines evolved first. Thus, there is a universal genetic code that is used by all living organisms today.
If you look at the code table to see which triplets of nucleic bases code for which amino acid, you are more likely to get the impression that the code was randomly thrown together than to see confirmation of the view that a divine architect systematically planned the code. An engineer would probably have designed a more systematically structured code on the drawing board. At first glance, therefore, it seems that any other code could have prevailed in evolutionary history, in which the triplets code for other amino acids. However, statistical analyses comparing different hypothetical codes have shown that the different codes have different tolerances to typical sources of error, such as mutations and reading errors. The genetic code as we find it today in living organisms on Earth is characterized by an above-average error tolerance compared to other hypothetical variants. On closer inspection, this finding is not surprising, since particularly error-tolerant coding variants had a higher chance of first realizing error-free back-and-forth translation.
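The kind of comparison described above can be illustrated with a deliberately simplified sketch. It is not a reproduction of any published analysis: the measure used here (the share of single-base substitutions that leave the encoded amino acid or stop signal unchanged), the fully random reshuffling of codon meanings, and the names `synonymous_fraction` and `shuffled_code` are all assumptions chosen for brevity. More careful analyses would also weigh how chemically similar the exchanged amino acids are.

```python
import random

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD_CODE = dict(zip(CODONS, AMINO_ACIDS))


def synonymous_fraction(code):
    """Share of single-base substitutions that do not change the meaning."""
    unchanged = total = 0
    for codon in CODONS:
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                unchanged += code[codon] == code[mutant]
    return unchanged / total


def shuffled_code(rng):
    """Hypothetical code: the same 64 meanings, randomly reassigned to codons."""
    meanings = list(AMINO_ACIDS)
    rng.shuffle(meanings)
    return dict(zip(CODONS, meanings))


rng = random.Random(0)  # fixed seed so the run is repeatable
standard_score = synonymous_fraction(STANDARD_CODE)
random_scores = [synonymous_fraction(shuffled_code(rng)) for _ in range(1000)]
at_least_as_tolerant = sum(score >= standard_score for score in random_scores)

print(f"standard code: {standard_score:.3f}")
print(f"mean of random codes: {sum(random_scores) / len(random_scores):.3f}")
print(f"random codes at least as tolerant: {at_least_as_tolerant} of 1000")
```

Under this crude measure the standard code scores well above the average of the randomly reshuffled codes, because its synonymous codons are grouped into blocks that mostly differ only in the third base.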
The genetic code is thus apparently an optimized product of prebiotic evolution - which in turn supports our hypothesis that there were once many different coding and decoding machines that differed in the codes they used, among which the pair that first had a uniform translation prevailed.
The random simultaneous creation of two opposing machines that use the same code for amino acids and nucleic bases is physico-chemically possible, but so improbable that one could be forgiven for thinking that a miracle has occurred. If one does not want to believe that a divine architect has intervened here, the only possibility is that a long time and a great many experiments were available to help evolution find this solution. In fact, evolution on the early Earth had perhaps several hundred million years to do so. There can only have been a large number of experiments during this long period if the required materials - organic molecules such as amino acids, ribonucleic acids and sugars - were available locally in large numbers and high concentrations. Thus, there must have been abundant and reliable sources of these molecules on the early Earth for a long time.
The article “The origin of life” deals with where these sources were probably located.