
javierdallamore's solution

to Nucleotide Count in the Elixir Track

Published at Apr 07 2020 · 0 comments
Test suite

Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule that is built from an extremely long sequence of individual elements called nucleotides. Four types exist in DNA; they differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.
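As a small illustration of how these symbols can be handled in Elixir, the sketch below assumes (as the test suite further down does) that a strand is a charlist and a nucleotide is a code point such as ?A. The module name StrandSketch and the strand GATTACA are hypothetical and not part of the exercise; the ~c sigil is used here as an equivalent of the single-quoted charlists in the tests.

```elixir
defmodule StrandSketch do
  # Hypothetical helper for illustration: counts how often one
  # nucleotide (a code point) occurs in a strand (a charlist).
  def occurrences(strand, nucleotide) do
    Enum.count(strand, fn base -> base == nucleotide end)
  end
end

# ?A is the integer code point of the character A, so a charlist
# is simply a list of such integers.
IO.inspect(?A)
# prints 65

IO.inspect(StrandSketch.occurrences(~c"GATTACA", ?A))
# prints 3

IO.inspect(StrandSketch.occurrences(~c"GATTACA", ?T))
# prints 2
```

Because a charlist is an ordinary list, any Enum function works on it directly, which is why no string conversion is needed.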

Here is an analogy:

  • twigs are to birds nests as
  • nucleotides are to DNA as
  • legos are to lego houses as
  • words are to sentences as...

Running tests

Execute the tests with:

$ mix test

Pending tests

In the test suites, all but the first test have been skipped.

Once you get a test passing, you can unskip the next one by commenting out the relevant @tag :pending with a # symbol.

For example:

# @tag :pending
test "shouting" do
  assert Bob.hey("WATCH OUT!") == "Whoa, chill out!"
end

Or, you can enable all the tests by commenting out the ExUnit.configure line in the test suite.

# ExUnit.configure exclude: :pending, trace: true

If you're stuck on something, it may help to look at some of the available resources out there where answers might be found.


The Calculating DNA Nucleotides problem at Rosalind: http://rosalind.info/problems/dna/

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.


ExUnit.configure(exclude: :pending, trace: true)

defmodule NucleotideCountTest do
  use ExUnit.Case

  # @tag :pending
  test "empty dna string has no adenine" do
    assert NucleotideCount.count('', ?A) == 0
  end

  @tag :pending
  test "repetitive cytosine gets counted" do
    assert NucleotideCount.count('CCCCC', ?C) == 5
  end

  @tag :pending
  test "counts only thymine" do
    assert NucleotideCount.count('GGGGGTAACCCGG', ?T) == 1
  end

  @tag :pending
  test "empty dna string has no nucleotides" do
    expected = %{?A => 0, ?T => 0, ?C => 0, ?G => 0}
    assert NucleotideCount.histogram('') == expected
  end

  @tag :pending
  test "repetitive sequence has only guanine" do
    expected = %{?A => 0, ?T => 0, ?C => 0, ?G => 8}
    assert NucleotideCount.histogram('GGGGGGGG') == expected
  end

  @tag :pending
  test "counts all nucleotides" do
    # Sample dataset from the Rosalind problem cited above (70 nucleotides).
    s = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
    expected = %{?A => 20, ?T => 21, ?C => 12, ?G => 17}
    assert NucleotideCount.histogram(s) == expected
  end
end

defmodule NucleotideCount do
  @nucleotides [?A, ?C, ?G, ?T]

  @doc """
  Counts individual nucleotides in a DNA strand.

  ## Examples

  iex> NucleotideCount.count('AATAA', ?A)
  4

  iex> NucleotideCount.count('AATAA', ?T)
  1
  """
  @spec count(charlist(), char()) :: non_neg_integer()
  def count(strand, nucleotide) do
    # Keep only the elements equal to the requested nucleotide, then count them.
    strand
    |> Enum.filter(&(&1 === nucleotide))
    |> length()
  end

  @doc """
  Returns a summary of counts by nucleotide.

  ## Examples

  iex> NucleotideCount.histogram('AATAA')
  %{?A => 4, ?T => 1, ?C => 0, ?G => 0}
  """
  @spec histogram(charlist()) :: map()
  def histogram(strand) do
    # Start every nucleotide at zero so absent ones still appear in the result.
    initial_status = Map.new(for v <- @nucleotides, do: {v, 0})

    increment_key = fn x, acc -> Map.update!(acc, x, &(&1 + 1)) end
    Enum.reduce(strand, initial_status, increment_key)
  end
end
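
One consequence of the reduce-based histogram is worth spelling out: Map.update!/3 raises a KeyError when the key is not already in the map, so a strand containing anything other than A, C, G or T fails instead of being counted. The sketch below reproduces the same logic in a hypothetical module named HistogramSketch (the name and the input AAX are illustrative assumptions, not part of the submitted solution).

```elixir
defmodule HistogramSketch do
  @nucleotides [?A, ?C, ?G, ?T]

  # Same approach as the solution: start every count at zero,
  # then increment the entry for each element of the strand.
  def histogram(strand) do
    zeros = Map.new(@nucleotides, fn n -> {n, 0} end)
    Enum.reduce(strand, zeros, fn n, acc -> Map.update!(acc, n, &(&1 + 1)) end)
  end
end

# A valid strand yields a count for all four nucleotides.
IO.inspect(HistogramSketch.histogram(~c"AATAA") == %{?A => 4, ?C => 0, ?G => 0, ?T => 1})
# prints true

# ?X is not a key of the initial map, so Map.update!/3 raises KeyError.
try do
  HistogramSketch.histogram(~c"AAX")
rescue
  e in KeyError -> IO.inspect(e.key == ?X)
end
```

Whether raising is the desired behavior depends on the exercise; the test suite shown here only uses valid strands, so it does not settle the question either way.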


javierdallamore's Reflection

I thought of two ways to initialize the histogram:

initial_status = Map.new(for v