SergiiVlasiuk's solution

to Protein Translation in the Scala Track

Published at Aug 23 2019 · 0 comments

Translate RNA sequences into proteins.

RNA can be broken into three nucleotide sequences called codons, and then translated to a polypeptide like so:

RNA: "AUGUUUUCU" => translates to

Codons: "AUG", "UUU", "UCU" => which become a polypeptide with the following sequence =>

Protein: "Methionine", "Phenylalanine", "Serine"
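
The codon-splitting step above can be sketched with the standard library's `grouped` method. This is a minimal illustration, not part of the exercise; the object name `CodonSplit` and the function name `codons` are hypothetical.

```scala
object CodonSplit {
  // Split an RNA strand into consecutive three-letter codons.
  // If the strand length is not a multiple of three, `grouped`
  // keeps the shorter trailing group as the last element.
  def codons(rna: String): List[String] =
    rna.grouped(3).toList
}
```

For the strand in the example, `CodonSplit.codons("AUGUUUUCU")` yields `List("AUG", "UUU", "UCU")`.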

There are 64 codons, which in turn correspond to 20 amino acids; however, not all of the codon sequences and resulting amino acids are important in this exercise. If it works for one codon, the program should work for all of them. However, feel free to expand the list in the test suite to include them all.

There are also three terminating codons (also known as 'STOP' codons); if any of these codons is encountered (by the ribosome), all translation ends and the protein is terminated.

All subsequent codons are ignored, like this:

RNA: "AUGUUUUCUUAAAUG" =>

Codons: "AUG", "UUU", "UCU", "UAA", "AUG" =>

Protein: "Methionine", "Phenylalanine", "Serine"

Note the stop codon "UAA" terminates the translation and the final methionine is not translated into the protein sequence.
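
The termination rule can be sketched with `takeWhile`, which keeps elements only until a predicate first fails. This is a minimal sketch, assuming the codons have already been split; the names `StopDemo`, `stopCodons` and `beforeStop` are hypothetical.

```scala
object StopDemo {
  // The three terminating codons listed in the exercise.
  val stopCodons: Set[String] = Set("UAA", "UAG", "UGA")

  // Keep codons up to, but not including, the first STOP codon.
  // Everything after the first STOP codon is dropped.
  def beforeStop(codons: Seq[String]): Seq[String] =
    codons.takeWhile(codon => !stopCodons.contains(codon))
}
```

Applied to the example, `StopDemo.beforeStop(Seq("AUG", "UUU", "UCU", "UAA", "AUG"))` keeps only `"AUG", "UUU", "UCU"`, so the final `"AUG"` never reaches translation.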

Below are the codons and resulting Amino Acids needed for the exercise.

Codon                 Protein
AUG                   Methionine
UUU, UUC              Phenylalanine
UUA, UUG              Leucine
UCU, UCC, UCA, UCG    Serine
UAU, UAC              Tyrosine
UGU, UGC              Cysteine
UGG                   Tryptophan
UAA, UAG, UGA         STOP
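
One possible way to hold this table in code is an immutable `Map` with one entry per codon. This is only a sketch of the data structure, not the approach the exercise prescribes; `CodonTable`, `table` and `lookup` are hypothetical names.

```scala
object CodonTable {
  // Each table row expanded to one entry per codon.
  val table: Map[String, String] = Map(
    "AUG" -> "Methionine",
    "UUU" -> "Phenylalanine", "UUC" -> "Phenylalanine",
    "UUA" -> "Leucine", "UUG" -> "Leucine",
    "UCU" -> "Serine", "UCC" -> "Serine", "UCA" -> "Serine", "UCG" -> "Serine",
    "UAU" -> "Tyrosine", "UAC" -> "Tyrosine",
    "UGU" -> "Cysteine", "UGC" -> "Cysteine",
    "UGG" -> "Tryptophan",
    "UAA" -> "STOP", "UAG" -> "STOP", "UGA" -> "STOP"
  )

  // Returns None for any codon that is not in the table.
  def lookup(codon: String): Option[String] = table.get(codon)
}
```

Returning an `Option` makes the "codon not in the table" case explicit for the caller, at the cost of one extra step when unwrapping the result.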

Learn more about protein translation on Wikipedia

The Scala exercises assume an SBT project scheme. The exercise solution source should be placed in src/main/scala within the exercise directory. The exercise unit tests can be found in src/test/scala within the exercise directory.

To run the tests, run the command sbt test in the exercise directory.

For more detailed info about the Scala track see the help page.

Source

Tyler Long

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

ProteinTranslationTest.scala

import org.scalatest.{Matchers, FunSuite}

/** @version 1.1.0 */
class ProteinTranslationTest extends FunSuite with Matchers {

  test("Methionine RNA sequence") {
    ProteinTranslation.proteins("AUG") should be(Seq("Methionine"))
  }

  test("Phenylalanine RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UUU") should be(Seq("Phenylalanine"))
  }

  test("Phenylalanine RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UUC") should be(Seq("Phenylalanine"))
  }

  test("Leucine RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UUA") should be(Seq("Leucine"))
  }

  test("Leucine RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UUG") should be(Seq("Leucine"))
  }

  test("Serine RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UCU") should be(Seq("Serine"))
  }

  test("Serine RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UCC") should be(Seq("Serine"))
  }

  test("Serine RNA sequence 3") {
    pending
    ProteinTranslation.proteins("UCA") should be(Seq("Serine"))
  }

  test("Serine RNA sequence 4") {
    pending
    ProteinTranslation.proteins("UCG") should be(Seq("Serine"))
  }

  test("Tyrosine RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UAU") should be(Seq("Tyrosine"))
  }

  test("Tyrosine RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UAC") should be(Seq("Tyrosine"))
  }

  test("Cysteine RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UGU") should be(Seq("Cysteine"))
  }

  test("Cysteine RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UGC") should be(Seq("Cysteine"))
  }

  test("Tryptophan RNA sequence") {
    pending
    ProteinTranslation.proteins("UGG") should be(Seq("Tryptophan"))
  }

  test("STOP codon RNA sequence 1") {
    pending
    ProteinTranslation.proteins("UAA") should be(Seq())
  }

  test("STOP codon RNA sequence 2") {
    pending
    ProteinTranslation.proteins("UAG") should be(Seq())
  }

  test("STOP codon RNA sequence 3") {
    pending
    ProteinTranslation.proteins("UGA") should be(Seq())
  }

  test("Translate RNA strand into correct protein list") {
    pending
    ProteinTranslation.proteins("AUGUUUUGG") should be(
      Seq("Methionine", "Phenylalanine", "Tryptophan"))
  }

  test("Translation stops if STOP codon at beginning of sequence") {
    pending
    ProteinTranslation.proteins("UAGUGG") should be(Seq())
  }

  test("Translation stops if STOP codon at end of two-codon sequence") {
    pending
    ProteinTranslation.proteins("UGGUAG") should be(Seq("Tryptophan"))
  }

  test("Translation stops if STOP codon at end of three-codon sequence") {
    pending
    ProteinTranslation.proteins("AUGUUUUAA") should be(
      Seq("Methionine", "Phenylalanine"))
  }

  test("Translation stops if STOP codon in middle of three-codon sequence") {
    pending
    ProteinTranslation.proteins("UGGUAGUGG") should be(Seq("Tryptophan"))
  }

  test("Translation stops if STOP codon in middle of six-codon sequence") {
    pending
    ProteinTranslation.proteins("UGGUGUUAUUAAUGGUUU") should be(
      Seq("Tryptophan", "Cysteine", "Tyrosine"))
  }
}
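// SergiiVlasiuk's solution: translates an RNA strand into its protein sequence.
object ProteinTranslation {
  def proteins(input: String): Seq[String] = {
    // Map a single codon to its amino acid name, or to the "STOP" marker.
    // The match is not exhaustive: an unlisted or incomplete codon throws a MatchError.
    def decryptCodon(codon: String): String = {
      codon match {
        case "AUG" => "Methionine"
        case "UUU" | "UUC" => "Phenylalanine"
        case "UUA" | "UUG" => "Leucine"
        case "UCU" | "UCC" | "UCA" | "UCG" => "Serine"
        case "UAU" | "UAC" => "Tyrosine"
        case "UGU" | "UGC" => "Cysteine"
        case "UGG" => "Tryptophan"
        case "UAA" | "UAG" | "UGA" => "STOP"
      }
    }

    // Split into three-letter codons, translate each lazily, and stop at the first STOP codon.
    input.grouped(3).map(decryptCodon).takeWhile(_ != "STOP").toSeq
  }

}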
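
The solution above covers only the codons in the exercise table, so a strand containing any other codon, or a trailing fragment shorter than three letters, would fail at runtime with a `MatchError`. The following is a hedged sketch of one alternative that reports such input as a value instead. It is not the author's method; `SafeProteinTranslation`, `decode` and the error message format are hypothetical.

```scala
object SafeProteinTranslation {
  private val Stop = "STOP"

  // Decode one codon; anything outside the exercise table becomes a Left.
  private def decode(codon: String): Either[String, String] = codon match {
    case "AUG"                         => Right("Methionine")
    case "UUU" | "UUC"                 => Right("Phenylalanine")
    case "UUA" | "UUG"                 => Right("Leucine")
    case "UCU" | "UCC" | "UCA" | "UCG" => Right("Serine")
    case "UAU" | "UAC"                 => Right("Tyrosine")
    case "UGU" | "UGC"                 => Right("Cysteine")
    case "UGG"                         => Right("Tryptophan")
    case "UAA" | "UAG" | "UGA"         => Right(Stop)
    case other                         => Left(s"Unknown codon: $other")
  }

  // Translate until the first STOP codon; fail on the first unknown codon
  // that appears before a STOP. Codons after a STOP are never inspected.
  def proteins(input: String): Either[String, List[String]] = {
    @annotation.tailrec
    def loop(codons: List[String], acc: List[String]): Either[String, List[String]] =
      codons match {
        case Nil => Right(acc.reverse)
        case head :: tail =>
          decode(head) match {
            case Left(error)    => Left(error)
            case Right(Stop)    => Right(acc.reverse)
            case Right(protein) => loop(tail, protein :: acc)
          }
      }

    loop(input.grouped(3).toList, Nil)
  }
}
```

With this variant, `SafeProteinTranslation.proteins("AUGXYZ")` returns `Left("Unknown codon: XYZ")` rather than throwing. The trade-off is a different return type, so it would not pass the test suite above unchanged.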

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?