Lesson 4 (Genetic Code and Central Dogma)

The Genetic Code

So we've talked about how our genes contribute to our phenotype. But what exactly are genes? Genes, as we've mentioned, are made up of DNA and provide the information our cells need to carry out the biological processes that ultimately lead to phenotype. Those processes depend on molecules called proteins. DNA provides the instructions for building proteins, which are made up of building blocks called amino acids. Almost all proteins are built from different sequences of the same 20 amino acids. This is where the code of DNA comes in. DNA is composed of 4 different molecules called nucleotides or bases: adenine, thymine, cytosine, and guanine. We abbreviate these as A, T, C, and G, respectively. An important question is immediately apparent: how can 4 bases code for 20 unique amino acids? The answer is in combinations. Let's do some quick math to figure out how many letters in a row you need to code for 20 amino acids. Note that in Python, we use "**" as the exponent symbol (e.g. to calculate 3 to the 8th power [3^8], we would type 3**8).
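Here is one way to do that math, as a minimal sketch. The function name count_sequences is our own choice for illustration; the only thing it relies on is that each position in a sequence can hold any one of the 4 bases.

```python
def count_sequences(length, num_bases=4):
    """Number of distinct sequences of `length` letters drawn from `num_bases` bases."""
    return num_bases ** length

# Compare sequences of 1, 2, and 3 letters against the 20 amino acids we need
for length in (1, 2, 3):
    total = count_sequences(length)
    print(length, "letter(s):", total, "combinations; enough for 20?", total >= 20)
```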

Two letters allow only 16 unique combinations (4**2), which is insufficient for 20 different amino acids. With 3 letters, however, we can create 64 unique combinations (4**3), well beyond what is necessary for 20 different amino acids. This is in fact how nature works: sequences of three letters (which we call triplets) correspond to particular amino acids. We call this the genetic code. Because there are more possible triplets of nucleotides (e.g. ATG or GGC) than there are amino acids, there is some redundancy (or degeneracy) in the genetic code. That is to say, several triplets can encode the same amino acid: CAC and CAT, for example, both encode histidine. The discovery of the genetic code was an enormous breakthrough in biology and is summarized in this famous diagram:

TheGeneticCode.jpg
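To see the redundancy for yourself, the sketch below builds the standard genetic code as a Python dictionary and groups the triplets by the amino acid they encode. The variable names are our own, amino acids are written with their one-letter abbreviations (H is histidine), and "*" marks the triplets that signal "stop" rather than an amino acid. The 64-letter string is just a compact way of writing out the table, listed in the order TTT, TTC, TTA, TTG, TCT, and so on.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# One-letter amino acid for each triplet, in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet (e.g. "CAC") to its amino acid (e.g. "H")
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the triplets by the amino acid they encode
triplets_for = defaultdict(list)
for triplet, amino_acid in CODON_TABLE.items():
    triplets_for[amino_acid].append(triplet)

print("Total triplets:", len(CODON_TABLE))
print("Amino acids encoded:", len(triplets_for) - 1)  # subtract 1 for the stop signal "*"
print("Triplets for histidine (H):", triplets_for["H"])
```

Try looking up a few other amino acids in triplets_for to see which ones have the most triplets.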

The Central Dogma

Cells put this triplet genetic code to use when they make proteins from DNA through an intermediate, RNA. RNA, or ribonucleic acid, is composed of three of the same nucleotides as DNA (adenine, cytosine, and guanine), with uracil (U) replacing thymine. DNA is copied into RNA in a process called transcription. This is easy to remember, since DNA and RNA are "written" in the same "language" of nucleotides. RNA is then used as a template to make a protein in a process called translation. This makes sense, as RNA and proteins are "written" in different "languages" (nucleotides and amino acids, respectively). This flow of information, in which DNA is transcribed into RNA and RNA is then translated into protein, is called the Central Dogma of biology.

TheCentralDogma.jpg
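The two steps of the Central Dogma can be sketched in a few lines of Python. This is a simplified illustration, not a full model of the biology: the function names and the short DNA sequence are made up for this example, transcription is treated as simply swapping T for U, and translation reads triplets from the very first letter until it reaches a stop signal ("*").

```python
from itertools import product

RNA_BASES = "UCAG"
# One-letter amino acid for each RNA triplet, in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(dna):
    """DNA -> RNA: same 'language' of nucleotides, with U in place of T (simplified)."""
    return dna.upper().replace("T", "U")

def translate(rna):
    """RNA -> protein: read one triplet at a time until a stop signal appears."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = RNA_CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop signal: the protein is finished
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGCACGGCTAA"  # made-up example sequence
rna = transcribe(dna)
protein = translate(rna)
print("DNA:    ", dna)
print("RNA:    ", rna)
print("Protein:", protein)
```

Notice that transcribe only swaps one letter for another, while translate needs a lookup table to move between the two "languages".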