n0mn0m's solution

to Protein Translation in the Python Track

Published at Jan 28 2021 · 0 comments

Translate RNA sequences into proteins.

RNA can be broken into three-nucleotide sequences called codons, and then translated to a polypeptide like so:

RNA: "AUGUUUUCU" => translates to

Codons: "AUG", "UUU", "UCU" => which become a polypeptide with the following sequence =>

Protein: "Methionine", "Phenylalanine", "Serine"
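
The splitting step in the example above can be sketched with plain string slicing. This is only an illustration: the helper name split_codons is hypothetical and not part of the exercise, and the sketch assumes the strand contains no whitespace.

```python
def split_codons(strand):
    """Return consecutive three-character slices of an RNA strand.

    Hypothetical helper for illustration only. If the strand length is
    not a multiple of three, the last slice is shorter than three
    characters.
    """
    return [strand[i:i + 3] for i in range(0, len(strand), 3)]


print(split_codons("AUGUUUUCU"))
```

Each slice can then be looked up individually to find its amino acid.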

There are 64 codons which in turn correspond to 20 amino acids; however, not all of the codon sequences and resulting amino acids are important in this exercise. If it works for one codon, the program should work for all of them. However, feel free to expand the list in the test suite to include them all.

There are also three terminating codons (also known as 'STOP' codons); if any of these codons are encountered (by the ribosome), all translation ends and the protein is terminated.

All subsequent codons are ignored, like this:

RNA: "AUGUUUUCUUAAAUG" =>

Codons: "AUG", "UUU", "UCU", "UAA", "AUG" =>

Protein: "Methionine", "Phenylalanine", "Serine"

Note the stop codon "UAA" terminates the translation and the final methionine is not translated into the protein sequence.
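
One way to picture the stop rule is a loop that collects codons until it meets a terminating one. The sketch below assumes the strand has already been split into a list of codons; the names STOP_CODONS and codons_before_stop are hypothetical and chosen only for this illustration.

```python
# The three terminating codons named in this exercise.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons_before_stop(codons):
    """Return the codons that come before the first STOP codon.

    Hypothetical helper for illustration only.
    """
    kept = []
    for codon in codons:
        if codon in STOP_CODONS:
            # Translation ends here; everything after is ignored.
            break
        kept.append(codon)
    return kept


print(codons_before_stop(["AUG", "UUU", "UCU", "UAA", "AUG"]))
```

The trailing "AUG" never reaches the result because the loop exits at "UAA".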

Below are the codons and resulting Amino Acids needed for the exercise.

Codon                 Protein
AUG                   Methionine
UUU, UUC              Phenylalanine
UUA, UUG              Leucine
UCU, UCC, UCA, UCG    Serine
UAU, UAC              Tyrosine
UGU, UGC              Cysteine
UGG                   Tryptophan
UAA, UAG, UGA         STOP
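
The table groups several codons per amino acid, while a lookup usually needs one key per codon. A possible way to bridge the two, shown here only as a sketch, is to transcribe the rows and flatten them with a dictionary comprehension. The names TABLE_ROWS, STOP_CODONS and CODON_TO_AMINO_ACID are hypothetical.

```python
# Rows transcribed from the table above: (codons, amino acid name).
TABLE_ROWS = [
    (("AUG",), "Methionine"),
    (("UUU", "UUC"), "Phenylalanine"),
    (("UUA", "UUG"), "Leucine"),
    (("UCU", "UCC", "UCA", "UCG"), "Serine"),
    (("UAU", "UAC"), "Tyrosine"),
    (("UGU", "UGC"), "Cysteine"),
    (("UGG",), "Tryptophan"),
]

# STOP codons are kept apart because they end translation
# instead of producing an amino acid.
STOP_CODONS = frozenset({"UAA", "UAG", "UGA"})

# Flatten the grouped rows into a one-codon-per-key dictionary.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for codons, amino_acid in TABLE_ROWS
    for codon in codons
}

print(CODON_TO_AMINO_ACID["UCA"])
```

Writing the dictionary out by hand, one codon per line, is equally valid; the grouped form simply stays closer to the table.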

Learn more about protein translation on Wikipedia

Exception messages

Sometimes it is necessary to raise an exception. When you do this, you should include a meaningful error message to indicate what the source of the error is. This makes your code more readable and helps significantly with debugging. Not every exercise will require you to raise an exception, but for those that do, the tests will only pass if you include a message.

To raise a message with an exception, just write it as an argument to the exception type. For example, instead of raise Exception, you should write:

raise Exception("Meaningful message indicating the source of the error")
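
As an illustration of the same idea with a more specific exception type, the sketch below raises ValueError with a message that names the offending input. The function translate_codon and its two-entry table are hypothetical; the tests for this exercise do not require any exception.

```python
def translate_codon(codon):
    """Look up a single codon, raising a descriptive error if unknown.

    Hypothetical helper for illustration only.
    """
    # Deliberately tiny table, just enough for the example.
    known = {"AUG": "Methionine", "UGG": "Tryptophan"}
    if codon not in known:
        # The message states what went wrong and includes the bad value.
        raise ValueError(f"Unknown codon: {codon}")
    return known[codon]


try:
    translate_codon("XYZ")
except ValueError as error:
    print(error)
```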

Running the tests

To run the tests, run pytest protein_translation_test.py

Alternatively, you can tell Python to run the pytest module: python -m pytest protein_translation_test.py

Common pytest options

  • -v : enable verbose output
  • -x : stop running tests on first failure
  • --ff : run failures from previous test before running other test cases

For other options, see python -m pytest -h

Submitting Exercises

When submitting an exercise, make sure the solution is in the $EXERCISM_WORKSPACE/python/protein-translation directory.

You can find your Exercism workspace by running exercism debug and looking for the line that starts with Workspace.

For more detailed information about running tests, code style and linting, please see Running the Tests.

Source

Tyler Long

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

protein_translation_test.py

import unittest

from protein_translation import proteins

# Tests adapted from `problem-specifications//canonical-data.json`


class ProteinTranslationTest(unittest.TestCase):
    def test_methionine_rna_sequence(self):
        value = "AUG"
        expected = ["Methionine"]
        self.assertEqual(proteins(value), expected)

    def test_phenylalanine_rna_sequence_1(self):
        value = "UUU"
        expected = ["Phenylalanine"]
        self.assertEqual(proteins(value), expected)

    def test_phenylalanine_rna_sequence_2(self):
        value = "UUC"
        expected = ["Phenylalanine"]
        self.assertEqual(proteins(value), expected)

    def test_leucine_rna_sequence_1(self):
        value = "UUA"
        expected = ["Leucine"]
        self.assertEqual(proteins(value), expected)

    def test_leucine_rna_sequence_2(self):
        value = "UUG"
        expected = ["Leucine"]
        self.assertEqual(proteins(value), expected)

    def test_serine_rna_sequence_1(self):
        value = "UCU"
        expected = ["Serine"]
        self.assertEqual(proteins(value), expected)

    def test_serine_rna_sequence_2(self):
        value = "UCC"
        expected = ["Serine"]
        self.assertEqual(proteins(value), expected)

    def test_serine_rna_sequence_3(self):
        value = "UCA"
        expected = ["Serine"]
        self.assertEqual(proteins(value), expected)

    def test_serine_rna_sequence_4(self):
        value = "UCG"
        expected = ["Serine"]
        self.assertEqual(proteins(value), expected)

    def test_tyrosine_rna_sequence_1(self):
        value = "UAU"
        expected = ["Tyrosine"]
        self.assertEqual(proteins(value), expected)

    def test_tyrosine_rna_sequence_2(self):
        value = "UAC"
        expected = ["Tyrosine"]
        self.assertEqual(proteins(value), expected)

    def test_cysteine_rna_sequence_1(self):
        value = "UGU"
        expected = ["Cysteine"]
        self.assertEqual(proteins(value), expected)

    def test_cysteine_rna_sequence_2(self):
        value = "UGC"
        expected = ["Cysteine"]
        self.assertEqual(proteins(value), expected)

    def test_tryptophan_rna_sequence(self):
        value = "UGG"
        expected = ["Tryptophan"]
        self.assertEqual(proteins(value), expected)

    def test_stop_codon_rna_sequence_1(self):
        value = "UAA"
        expected = []
        self.assertEqual(proteins(value), expected)

    def test_stop_codon_rna_sequence_2(self):
        value = "UAG"
        expected = []
        self.assertEqual(proteins(value), expected)

    def test_stop_codon_rna_sequence_3(self):
        value = "UGA"
        expected = []
        self.assertEqual(proteins(value), expected)

    def test_translate_rna_strand_into_correct_protein_list(self):
        value = "AUGUUUUGG"
        expected = ["Methionine", "Phenylalanine", "Tryptophan"]
        self.assertEqual(proteins(value), expected)

    def test_translation_stops_if_stop_codon_at_beginning_of_sequence(self):
        value = "UAGUGG"
        expected = []
        self.assertEqual(proteins(value), expected)

    def test_translation_stops_if_stop_codon_at_end_of_two_codon_sequence(self):
        value = "UGGUAG"
        expected = ["Tryptophan"]
        self.assertEqual(proteins(value), expected)

    def test_translation_stops_if_stop_codon_at_end_of_three_codon_sequence(self):
        value = "AUGUUUUAA"
        expected = ["Methionine", "Phenylalanine"]
        self.assertEqual(proteins(value), expected)

    def test_translation_stops_if_stop_codon_in_middle_of_three_codon_sequence(self):
        value = "UGGUAGUGG"
        expected = ["Tryptophan"]
        self.assertEqual(proteins(value), expected)

    def test_translation_stops_if_stop_codon_in_middle_of_six_codon_sequence(self):
        value = "UGGUGUUAUUAAUGGUUU"
        expected = ["Tryptophan", "Cysteine", "Tyrosine"]
        self.assertEqual(proteins(value), expected)


if __name__ == "__main__":
    unittest.main()

protein_translation.py

from itertools import takewhile
from textwrap import wrap


codon_protein_map = {
    "AUG": "Methionine",
    "UUC": "Phenylalanine",
    "UUU": "Phenylalanine",
    "UUA": "Leucine",
    "UUG": "Leucine",
    "UCU": "Serine",
    "UCC": "Serine",
    "UCA": "Serine",
    "UCG": "Serine",
    "UAC": "Tyrosine",
    "UAU": "Tyrosine",
    "UGC": "Cysteine",
    "UGU": "Cysteine",
    "UGG": "Tryptophan",
}

terminating_codons = {"UAG", "UAA", "UGA"}


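def is_not_terminator(pattern):
    """Return True while the codon is not one of the STOP codons."""
    return pattern not in terminating_codons


def proteins(strand):
    """Translate an RNA strand into amino acid names, stopping at a STOP codon."""
    # wrap() splits the whitespace-free strand into three-character chunks;
    # takewhile() drops the first STOP codon and everything after it.
    # Codons missing from the map translate to None.
    return [
        codon_protein_map.get(pattern)
        for pattern in takewhile(is_not_terminator, wrap(strand, 3))
    ]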

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?
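
As one possible starting point for the first question, the sketch below contrasts a dictionary lookup that returns None for codons outside the table with a stricter variant that raises. Both functions are hypothetical and use an abbreviated table; the lenient one mirrors the general approach of combining textwrap.wrap and itertools.takewhile, and neither is presented as the preferred design.

```python
from itertools import takewhile
from textwrap import wrap

# Abbreviated lookup table, kept short for illustration.
CODON_PROTEIN_MAP = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine",
    "UGG": "Tryptophan",
}
TERMINATING_CODONS = {"UAG", "UAA", "UGA"}


def proteins_lenient(strand):
    """Unknown or incomplete codons silently become None."""
    codons = takewhile(lambda c: c not in TERMINATING_CODONS, wrap(strand, 3))
    return [CODON_PROTEIN_MAP.get(codon) for codon in codons]


def proteins_strict(strand):
    """Hypothetical variant: raise on any codon missing from the table."""
    result = []
    for codon in wrap(strand, 3):
        if codon in TERMINATING_CODONS:
            break
        if codon not in CODON_PROTEIN_MAP:
            raise ValueError(f"Unknown codon: {codon!r}")
        result.append(CODON_PROTEIN_MAP[codon])
    return result


print(proteins_lenient("AUGXYZ"))
print(proteins_strict("AUGUUUUAAUGG"))
```

The lenient form is shorter and satisfies the test suite, while the strict form surfaces malformed input earlier; which trade-off is preferable depends on how the function is meant to be used.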