# angelikatyborska's solution

## to Nucleotide Count in the Ruby Track

Published at Jul 13 2018 · 1 comment

Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule built from an extremely long sequence of individual elements called nucleotides. Four types exist in DNA; they differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.
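
As a rough illustration of the task, the sketch below counts each symbol with Ruby's built-in `String#count`. The constant `NUCLEOTIDES` and the method `nucleotide_counts` are hypothetical names chosen for this example, and the sketch assumes the strand contains only valid symbols (it does no validation).

```ruby
# Hypothetical helper, not part of the exercise: count each DNA symbol.
NUCLEOTIDES = %w[A C G T].freeze

def nucleotide_counts(strand)
  # String#count returns how many characters of the strand are in the given set.
  NUCLEOTIDES.map { |symbol| [symbol, strand.count(symbol)] }.to_h
end

counts = nucleotide_counts('GATTACA')
counts.each { |symbol, count| puts "#{symbol}: #{count}" }
# counts: A=3, C=1, G=1, T=2
```

Calling `String#count` once per symbol scans the strand four times, which is fine for short inputs; a single pass over the characters does the same work in one traversal.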

Here is an analogy:

• twigs are to birds nests as
• nucleotides are to DNA as
• legos are to lego houses as
• words are to sentences as...

For installation and learning resources, refer to the Exercism help page.

For running the tests provided, you will need the Minitest gem. Open a terminal window and run the following command to install minitest:

```
gem install minitest
```

If you would like color output, you can `require 'minitest/pride'` in the test file, or note the alternative instruction, below, for running the test file.

Run the tests from the exercise directory using the following command:

```
ruby nucleotide_count_test.rb
```

To include color from the command line:

```
ruby -r minitest/pride nucleotide_count_test.rb
```

## Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

## Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

### nucleotide_count_test.rb

```ruby
require 'minitest/autorun'
require_relative 'nucleotide_count'

class NucleotideTest < Minitest::Test
  def test_empty_dna_strand_has_no_adenosine
    assert_equal 0, Nucleotide.from_dna('').count('A')
  end

  def test_repetitive_cytidine_gets_counted
    skip
    assert_equal 5, Nucleotide.from_dna('CCCCC').count('C')
  end

  def test_counts_only_thymidine
    skip
    assert_equal 1, Nucleotide.from_dna('GGGGGTAACCCGG').count('T')
  end

  def test_counts_a_nucleotide_only_once
    skip
    dna = Nucleotide.from_dna('CGATTGGG')
    dna.count('T')
    dna.count('T')
    assert_equal 2, dna.count('T')
  end

  def test_empty_dna_strand_has_no_nucleotides
    skip
    expected = { 'A' => 0, 'T' => 0, 'C' => 0, 'G' => 0 }
    assert_equal expected, Nucleotide.from_dna('').histogram
  end

  def test_repetitive_sequence_has_only_guanosine
    skip
    expected = { 'A' => 0, 'T' => 0, 'C' => 0, 'G' => 8 }
    assert_equal expected, Nucleotide.from_dna('GGGGGGGG').histogram
  end

  def test_counts_all_nucleotides
    skip
    s = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
    dna = Nucleotide.from_dna(s)
    expected = { 'A' => 20, 'T' => 21, 'G' => 17, 'C' => 12 }
    assert_equal expected, dna.histogram
  end

  def test_validates_dna
    skip
    assert_raises ArgumentError do
      Nucleotide.from_dna('JOHNNYAPPLESEED')
    end
  end
end
```

### nucleotide_count.rb

```ruby
class Nucleotide
  DNA_NUCLEOTIDES = %w(A C G T)
  REGEXP = Regexp.new('\A(' + DNA_NUCLEOTIDES.join('|') + ')*\z')

  def initialize(dna)
    if valid_dna?(dna)
      @dna = dna
    else
      raise ArgumentError, "#{ dna } is not a valid DNA string"
    end
  end

  def self.from_dna(dna)
    new(dna)
  end

  def histogram
    @histogram ||= @dna.chars.each_with_object(empty_histogram) do |nucleotide, histogram|
      histogram[nucleotide] += 1
    end
  end

  def count(nucleotide)
    histogram[nucleotide]
  end

  private

  def valid_dna?(string)
    string =~ REGEXP
  end

  def empty_histogram
    Hash[DNA_NUCLEOTIDES.collect { |nucleotide| [nucleotide, 0] }]
  end
end
```
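
For comparison, here is a minimal sketch of a more compact variant of the same idea. It is not the published solution: the class name `CompactNucleotide` and the constant `SYMBOLS` are hypothetical, and it assumes Ruby 2.7 or later for `Enumerable#tally`. It keeps the same public interface (`from_dna`, `histogram`, `count`) and likewise raises `ArgumentError` on invalid input.

```ruby
# Hypothetical compact variant; assumes Ruby 2.7+ (Enumerable#tally).
class CompactNucleotide
  SYMBOLS = %w[A C G T].freeze

  def self.from_dna(dna)
    new(dna)
  end

  def initialize(dna)
    # Reject any strand containing a character other than A, C, G or T.
    unless dna.match?(/\A[ACGT]*\z/)
      raise ArgumentError, "#{dna} is not a valid DNA string"
    end

    @dna = dna
  end

  def histogram
    # Start from zero counts so absent nucleotides still appear in the result,
    # then overlay the actual per-character tallies. Memoized after first call.
    @histogram ||= SYMBOLS.to_h { |symbol| [symbol, 0] }.merge(@dna.chars.tally)
  end

  def count(nucleotide)
    histogram[nucleotide]
  end
end

dna = CompactNucleotide.from_dna('CGATTGGG')
puts dna.count('T') # prints 2
```

The character class `[ACGT]` plays the same role as the alternation built from `DNA_NUCLEOTIDES` in the solution above, and `tally` replaces the explicit `each_with_object` loop; which reads better is largely a matter of taste.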