# remcopeereboom's solution

## to Nucleotide Count in the Ruby Track

Published at Jul 13 2018 · 3 comments

Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule that is built from an extremely long sequence of individual elements called nucleotides. Four types exist in DNA; they differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.
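
As a small illustration of the task (a sketch, not the exercise's reference solution), Ruby's built-in `String#count` can tally each symbol in a strand. The helper name `nucleotide_counts` and the sample strand `'GATTACA'` are made up for this example.

```ruby
# Hypothetical helper for illustration only: tally the four DNA symbols.
def nucleotide_counts(strand)
  %w[A C G T].map { |symbol| [symbol, strand.count(symbol)] }.to_h
end

counts = nucleotide_counts('GATTACA')
# counts now maps 'A' to 3, 'C' to 1, 'G' to 1 and 'T' to 2
```

This sketch does no input validation; the exercise's tests additionally expect an `ArgumentError` for strands containing other characters.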

Here is an analogy:

• twigs are to birds nests as
• nucleotides are to DNA as
• legos are to lego houses as
• words are to sentences as...

For installation and learning resources, refer to the Exercism help page.

For running the tests provided, you will need the Minitest gem. Open a terminal window and run the following command to install minitest:

``````
gem install minitest
``````

If you would like color output, you can `require 'minitest/pride'` in the test file, or use the alternative command below when running the test file.

Run the tests from the exercise directory using the following command:

``````
ruby nucleotide_count_test.rb
``````

To include color from the command line:

``````
ruby -r minitest/pride nucleotide_count_test.rb
``````

## Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

## Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

### nucleotide_count_test.rb

``````
require 'minitest/autorun'
require_relative 'nucleotide_count'

class NucleotideTest < Minitest::Test
  def test_empty_dna_strand_has_no_adenosine
    assert_equal 0, Nucleotide.from_dna('').count('A')
  end

  def test_repetitive_cytidine_gets_counted
    skip
    assert_equal 5, Nucleotide.from_dna('CCCCC').count('C')
  end

  def test_counts_only_thymidine
    skip
    assert_equal 1, Nucleotide.from_dna('GGGGGTAACCCGG').count('T')
  end

  def test_counts_a_nucleotide_only_once
    skip
    dna = Nucleotide.from_dna('CGATTGGG')
    dna.count('T')
    dna.count('T')
    assert_equal 2, dna.count('T')
  end

  def test_empty_dna_strand_has_no_nucleotides
    skip
    expected = { 'A' => 0, 'T' => 0, 'C' => 0, 'G' => 0 }
    assert_equal expected, Nucleotide.from_dna('').histogram
  end

  def test_repetitive_sequence_has_only_guanosine
    skip
    expected = { 'A' => 0, 'T' => 0, 'C' => 0, 'G' => 8 }
    assert_equal expected, Nucleotide.from_dna('GGGGGGGG').histogram
  end

  def test_counts_all_nucleotides
    skip
    s = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
    dna = Nucleotide.from_dna(s)
    expected = { 'A' => 20, 'T' => 21, 'G' => 17, 'C' => 12 }
    assert_equal expected, dna.histogram
  end

  def test_validates_dna
    skip
    assert_raises ArgumentError do
      Nucleotide.from_dna('JOHNNYAPPLESEED')
    end
  end
end
``````

### nucleotide_count.rb

``````
class Nucleotide
  def initialize(string)
    fail ArgumentError if string =~ /[^ACTG]/

    @string = string
    # Count each nucleotide lazily, on first lookup, and cache the result.
    @counts = Hash.new { |h, key| h[key] = string.count(key) }
  end

  def count(nucleotide)
    @counts[nucleotide]
  end

  def self.from_dna(string)
    Nucleotide.new(string)
  end

  def histogram
    # Touch every nucleotide so each one has an entry in the cache.
    count 'A'
    count 'T'
    count 'C'
    count 'G'

    @counts
  end
end
``````

Any way to avoid the duplication of the list of valid nucleotides (ACTG)?

Solution Author
commented over 5 years ago

@monkbroc Yeah, I should really put it in a data-structure of some sort. I could then also use it in the regexp. Thanks for the reminder. I'll try to remember to update it.

Solution Author
commented over 5 years ago

I also should be using new instead of Nucleotide.new - give derived classes a chance to do their magic.
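
The two follow-ups in this thread could look roughly like the sketch below. It is one possible revision, not the author's published update: the constant names `NUCLEOTIDES` and `INVALID` and the `LoudNucleotide` subclass are assumptions introduced purely for illustration.

```ruby
class Nucleotide
  # Single list of valid symbols (assumed name), used everywhere below.
  NUCLEOTIDES = %w[A T C G].freeze
  # Character class built from that list, so the regexp cannot drift from it.
  INVALID = /[^#{NUCLEOTIDES.join}]/

  def self.from_dna(string)
    # Bare `new` is sent to the receiving class, so subclasses build themselves.
    new(string)
  end

  def initialize(string)
    fail ArgumentError if string =~ INVALID

    @counts = Hash.new { |h, key| h[key] = string.count(key) }
  end

  def count(nucleotide)
    @counts[nucleotide]
  end

  def histogram
    # Touch every valid symbol so each one has an entry, then return the cache.
    NUCLEOTIDES.each { |nucleotide| count(nucleotide) }
    @counts
  end
end

# Hypothetical subclass, only to show that from_dna now respects inheritance.
class LoudNucleotide < Nucleotide; end

LoudNucleotide.from_dna('GATTACA').class # LoudNucleotide
```

With the original `Nucleotide.new(string)`, the last line would return a plain `Nucleotide` even when called on the subclass.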

### What can you learn from this solution?

A huge amount can be learned from reading other people's code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

• What compromises have been made?