Lesson 4 (Build Your Own Protein)
(Finally) Build your own protein!
For the last example, we started with a DNA sequence already grouped into codon triplets. This is pretty unrealistic, though. Generally, a DNA sequence that you would work with when coding is just a string of As, Cs, Ts, and Gs. The good news is that we can easily convert a string of DNA nucleotides into a list of codons. To do this, we need one more tool in our toolbox: modulo.
Modulo, although it sounds complicated, is just another word for the remainder left after division. If one number divides evenly into another, with nothing left over, the result of the modulo operation is 0. For example, 4 modulo 2 is 0 because 4 is divisible by 2 with no remainder. 5 modulo 2, however, is 1, because 5 divided by 2 is 2 with a remainder of 1.
We can calculate modulo in Python with the % symbol:
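As a minimal sketch, here are the two examples from the previous paragraph written with `%`, plus one extra line (the variable name `remainders` is only illustrative) showing how the remainder cycles through 0, 1, 2 when we divide successive numbers by 3:

```python
# The % operator returns the remainder left after division.
print(4 % 2)  # 0, because 4 divides evenly by 2
print(5 % 2)  # 1, because 5 divided by 2 is 2 with 1 left over

# Remainders after dividing 0 through 6 by 3: the pattern repeats every 3 numbers.
remainders = [i % 3 for i in range(7)]
print(remainders)  # [0, 1, 2, 0, 1, 2, 0]
```

Notice that the remainder is 0 at positions 0, 3, and 6, which is every third position.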
So why are we on this modulo tangent? Because we want to divide a string of DNA nucleotides into a list of triplets! As we read through the DNA string, we will use modulo 3 to check whether we've reached the beginning of a new codon: a position starts a new codon exactly when its index modulo 3 is 0 (positions 0, 3, 6, and so on). Remember that a string can be treated just like a list of characters; we can index into the string and slice pieces out of it. So, we can convert a string of nucleotides into a list of codons:
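One way this could look is sketched below. The sequence in `dna` is a made-up example chosen for illustration, and the names `dna` and `codons` are assumptions rather than names from earlier lessons:

```python
# A hypothetical DNA sequence, 12 nucleotides long (4 codons).
dna = "ATGGCCTTTTGA"

codons = []
for i in range(len(dna)):
    # A new codon begins wherever the index is divisible by 3.
    if i % 3 == 0:
        # Slice out the three nucleotides starting at position i.
        codons.append(dna[i:i + 3])

print(codons)  # ['ATG', 'GCC', 'TTT', 'TGA']
```

If the length of the sequence is not a multiple of 3, the last slice in this sketch will be shorter than three nucleotides, so it is worth checking the final entry before translating it.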
What would happen if we didn't use the modulo check? We would start a new codon at every position, so most nucleotides would be counted three times, and we would end up with the wrong protein!
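To see the problem concretely, here is a sketch of the same loop with the modulo check removed, again using a made-up sequence. The names `dna` and `overlapping` are illustrative, and the loop stops two positions early (an assumption of this sketch) so that every slice is a full triplet:

```python
# The same hypothetical 12-nucleotide sequence, which should give 4 codons.
dna = "ATGGCCTTTTGA"

overlapping = []
# No modulo check: a "codon" is started at every position.
for i in range(len(dna) - 2):
    overlapping.append(dna[i:i + 3])

print(overlapping)
# ['ATG', 'TGG', 'GGC', 'GCC', 'CCT', 'CTT', 'TTT', 'TTT', 'TTG', 'TGA']
print(len(overlapping))  # 10 triplets instead of 4
```

The triplets overlap one another, so the list contains reading frames that were never meant to be translated.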
Clearly, this isn't what we were trying to do: there are far too many codons in our solution to have come from the original DNA sequence. Thank goodness for the modulo operator!
Now it's your turn: Challenge question!
Combine our previous two calculations (creating a protein from a list of codons and creating a list of codons from a DNA nucleotide string) to generate a protein from a DNA nucleotide string. If you really want to challenge yourself, expand our genetic code dictionary to encompass more of the genetic code, and then build a protein that is entirely your own! (Hint: If you're having trouble, add print statements to see what values your variables are taking on!)