Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.
The genetic language of every living thing on the planet is DNA. DNA is a large molecule that is built from an extremely long sequence of individual elements called nucleotides. 4 types exist in DNA and these differ only slightly and can be represented as the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' thymine.
Here is an analogy:
Execute the tests with:
$ elixir nucleotide_count_test.exs
In the test suites, all but the first test have been skipped.
Once you get a test passing, you can unskip the next one by
commenting out the relevant @tag :pending
with a #
symbol.
For example:
# @tag :pending
test "shouting" do
assert Bob.hey("WATCH OUT!") == "Whoa, chill out!"
end
Or, you can enable all the tests by commenting out the
ExUnit.configure
line in the test suite.
# ExUnit.configure exclude: :pending, trace: true
For more detailed information about the Elixir track, please see the help page.
The Calculating DNA Nucleotides_problem at Rosalind http://rosalind.info/problems/dna/
It's possible to submit an incomplete solution so you can see how others have completed the exercise.
if !System.get_env("EXERCISM_TEST_EXAMPLES") do
Code.load_file("nucleotide_count.exs", __DIR__)
end
ExUnit.start()
ExUnit.configure(exclude: :pending, trace: true)
defmodule NucleotideCountTest do
use ExUnit.Case
# @tag :pending
test "empty dna string has no adenine" do
assert NucleotideCount.count('', ?A) == 0
end
@tag :pending
test "repetitive cytosine gets counted" do
assert NucleotideCount.count('CCCCC', ?C) == 5
end
@tag :pending
test "counts only thymine" do
assert NucleotideCount.count('GGGGGTAACCCGG', ?T) == 1
end
@tag :pending
test "empty dna string has no nucleotides" do
expected = %{?A => 0, ?T => 0, ?C => 0, ?G => 0}
assert NucleotideCount.histogram('') == expected
end
@tag :pending
test "repetitive sequence has only guanine" do
expected = %{?A => 0, ?T => 0, ?C => 0, ?G => 8}
assert NucleotideCount.histogram('GGGGGGGG') == expected
end
@tag :pending
test "counts all nucleotides" do
s = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
expected = %{?A => 20, ?T => 21, ?C => 12, ?G => 17}
assert NucleotideCount.histogram(s) == expected
end
end
defmodule DNA do
@nucleotides [?A, ?C, ?G, ?T]
@doc """
Counts individual nucleotides in a DNA strand.
## Examples
iex> DNA.count('AATAA', ?A)
4
iex> DNA.count('AATAA', ?T)
1
"""
@spec count([char], char) :: non_neg_integer
def count(strand, nucleotide) do
cond do
Enum.member?(@nucleotides, nucleotide) == false ->
raise ArgumentError, message: "invalid nucleotide"
true ->
Enum.count strand, &(
cond do
Enum.member?(@nucleotides, &1) == false ->
raise ArgumentError, message: "invalid strand"
true ->
&1 == nucleotide
end
)
end
end
@doc """
Returns a summary of counts by nucleotide.
## Examples
iex> DNA.histogram('AATAA')
%{?A => 4, ?T => 1, ?C => 0, ?G => 0}
"""
@spec histogram([char]) :: map
def histogram(strand) do
@nucleotides
|> Enum.map(&( {&1, DNA.count(strand, &1)} ))
|> Map.new()
end
end
A huge amount can be learned from reading other people’s code. This is why we wanted to give exercism users the option of making their solutions public.
Here are some questions to help you reflect on this solution and learn the most from it.
Community comments