# SergiiVlasiuk's solution

## to Nucleotide Count in the Scala Track

Published at Aug 24 2019 · 0 comments

Given a single-stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule that is built from an extremely long sequence of individual elements called nucleotides. Four types of nucleotide exist in DNA; they differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.

Here is an analogy:

• twigs are to birds nests as
• nucleotides are to DNA as
• legos are to lego houses as
• words are to sentences as...

## Hints

A common use of `Either` is to indicate a computation that may result in an error (if the actual error is of no interest, the simpler `Option` type might be a better choice). In the absence of an error the result is usually a `Right` (mnemonic: the "right" value), whereas an error is a `Left`, for example a `Left[String]` containing an error message.

Note that as of Scala 2.12 `Either` is right-biased by default, so it works as expected for operations like `map`, `flatMap` and `filterOrElse`, or in a for-comprehension. If you are unfamiliar with `Either` you may read this tutorial. But be aware that this tutorial is about Scala versions prior to 2.12. For Scala 2.12 you can safely ignore `RightProjection` and omit `.right`.

`Either` is a so-called Monad which covers a "computational aspect", in this case error handling. Proper use of Monads can result in very concise yet elegant and readable code. Improper use can easily result in the contrary. Watch this video to learn more.
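As a minimal sketch of the convention described above (errors in `Left`, results in `Right`), assuming Scala 2.12 or later: the object `EitherBasics` and the helpers `parseNucleotide`, `lowercase` and `pair` are hypothetical names chosen for illustration and are not part of the exercise.

```scala
object EitherBasics {
  // Hypothetical helper: accepts one nucleotide symbol or fails with a message.
  def parseNucleotide(c: Char): Either[String, Char] =
    if ("ACGT".contains(c)) Right(c) else Left(s"Invalid nucleotide: $c")

  // Right-biased map: transforms a Right and passes a Left through unchanged.
  def lowercase(c: Char): Either[String, Char] =
    parseNucleotide(c).map(_.toLower)

  // Right-biased flatMap: the second parse only runs if the first succeeded.
  def pair(a: Char, b: Char): Either[String, String] =
    parseNucleotide(a).flatMap(x => parseNucleotide(b).map(y => s"$x$y"))
}
```

Here no `.right` projection is needed; the first `Left` encountered becomes the result of the whole chain.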

#### Common pitfalls that you should avoid

There are a few rules of thumb for `Either`:

1. If you don't need it don't use it. Instead of

``````
def add1(x: Int): Either[String, Int] = Right(x + 1)
``````

better have

``````
def add1(x: Int): Int = x + 1
``````

(there is `Either.map` to apply such simple functions, so you don't have to clutter them with `Either`).

2. Don't "unwrap" if you don't really need to. Often there are built-in functions for your purpose. Indicators of premature unwrapping are `isLeft/isRight` or pattern matching. For example, instead of

``````
val x: Either[String, Int] = ...

if (x.isRight) x.right.get + 1 else x.left.get
// or
x match {
  case Right(n) => n + 1
  case Left(s) => s
}
``````

better have

``````
x fold (identity, _ + 1)
``````
3. Monads can be used inside a for-comprehension FTW. This is advisable when you want to "compose" several `Either` instances. Instead of

``````
val xo: Either[String, Int] = ...
val yo: Either[String, Int] = ...
val zo: Either[String, Int] = ...

xo.flatMap(x =>
  yo.flatMap(y =>
    zo.map(z =>
      x + y + z)))
``````

better have

``````
for {
  x <- xo
  y <- yo
  z <- zo
} yield x + y + z
``````
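The three rules of thumb can be tried out together in one compilable sketch. The object name `EitherRules` and the helpers `lifted`, `describe` and `sum` are hypothetical and only serve the illustration; Scala 2.12 or later is assumed.

```scala
object EitherRules {
  // Rule 1: keep simple functions free of Either and lift them with map.
  def add1(x: Int): Int = x + 1
  def lifted(e: Either[String, Int]): Either[String, Int] = e.map(add1)

  // Rule 2: fold handles both cases without isRight/get or pattern matching.
  def describe(e: Either[String, Int]): String =
    e.fold(err => s"error: $err", n => s"value: ${n + 1}")

  // Rule 3: a for-comprehension composes several Either values;
  // the result is the first Left, if there is one.
  def sum(xo: Either[String, Int],
          yo: Either[String, Int],
          zo: Either[String, Int]): Either[String, Int] =
    for {
      x <- xo
      y <- yo
      z <- zo
    } yield x + y + z
}
```

For instance, `EitherRules.sum(Right(1), Left("bad y"), Left("bad z"))` evaluates to `Left("bad y")`, because composition stops at the first error.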

The Scala exercises assume an SBT project layout. Place the exercise solution source in `src/main/scala` within the exercise directory. The exercise unit tests can be found in `src/test/scala` within the exercise directory.

To run the tests simply run the command `sbt test` in the exercise directory.
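The build definition shipped with the exercise is not reproduced on this page. As a rough sketch only, a `build.sbt` for such a layout might look like the following. The project name and both version numbers are assumptions; the ScalaTest 3.0.x line is picked because the test suite imports `org.scalatest.Matchers`.

```scala
// build.sbt (hypothetical sketch; name and versions are assumptions)
name := "nucleotide-count"

scalaVersion := "2.12.8"

// ScalaTest is only needed on the test classpath.
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.8" % Test
```

With a file like this in the exercise directory, `sbt test` compiles `src/main/scala` and runs the suites under `src/test/scala`.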

For more detailed info about the Scala track see the help page.

## Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

## Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

### NucleotideCountTest.scala

``````
import org.scalatest.{Matchers, FunSuite}

/** @version 1.3.0 */
class NucleotideCountTest extends FunSuite with Matchers {

  test("empty strand") {
    new DNA("").nucleotideCounts should be(
      Right(Map('A' -> 0, 'C' -> 0, 'G' -> 0, 'T' -> 0)))
  }

  test("can count one nucleotide in single-character input") {
    pending
    new DNA("G").nucleotideCounts should be(
      Right(Map('A' -> 0, 'C' -> 0, 'G' -> 1, 'T' -> 0)))
  }

  test("strand with repeated nucleotide") {
    pending
    new DNA("GGGGGGG").nucleotideCounts should be(
      Right(Map('A' -> 0, 'C' -> 0, 'G' -> 7, 'T' -> 0)))
  }

  test("strand with multiple nucleotides") {
    pending
    new DNA(
      "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC").nucleotideCounts should be(
      Right(Map('A' -> 20, 'C' -> 12, 'G' -> 17, 'T' -> 21)))
  }

  test("strand with invalid nucleotides") {
    pending
    new DNA("AGXXACT").nucleotideCounts.isLeft should be(true)
  }
}
``````
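``````
case class DNA(dna: String) {
  val dnaTypes: String = "ACGT"
  // Matches any character that is not one of the four valid nucleotides.
  val dnaRegex = s"[^$dnaTypes]".r

  def nucleotideCounts: Either[String, Map[Char, Int]] =
    // Reject the strand if it contains at least one invalid character.
    if (dnaRegex.findAllMatchIn(dna).hasNext) Left("Unknown nucleotide!")
    // Otherwise count each valid nucleotide, including those that occur zero times.
    else Right(Map(dnaTypes.toCharArray.map(x => (x, dna.count(_ == x))).toSeq: _*))
}
``````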
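
For comparison, here is a sketch of a different approach that avoids the regex and folds over the strand once, carrying the `Either` through the fold. This is not the submitted solution; the object name `SinglePassCount` and the error message format are assumptions made for this illustration.

```scala
object SinglePassCount {
  private val Valid = "ACGT"

  // Fold over the strand once, carrying either an error or the running counts.
  def nucleotideCounts(strand: String): Either[String, Map[Char, Int]] = {
    // Start with a count of zero for every valid nucleotide.
    val zero: Either[String, Map[Char, Int]] = Right(Valid.map(_ -> 0).toMap)
    strand.foldLeft(zero) { (acc, c) =>
      // Once acc is a Left, flatMap leaves it untouched for the remaining characters.
      acc.flatMap { counts =>
        if (counts.contains(c)) Right(counts.updated(c, counts(c) + 1))
        else Left(s"Unknown nucleotide: $c")
      }
    }
  }
}
```

One design trade-off: this version can report which character was invalid, while the regex-based solution above checks validity up front and then counts each nucleotide in a separate pass.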