zioufang's solution

to Nucleotide Count in the Go Track

Published at Jul 10 2020

Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule built from an extremely long sequence of individual elements called nucleotides. Four types exist in DNA; they differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.

Here is an analogy:

  • twigs are to birds nests as
  • nucleotides are to DNA as
  • legos are to lego houses as
  • words are to sentences as...

Implementation

You should define a custom type 'DNA' with a function 'Counts' that outputs two values:

  • a frequency count for the given DNA strand
  • an error (if there are invalid nucleotides)

Which is a good type for a DNA strand?

Which are the best Go types to represent the output values?

Take a look at the test cases to get a hint about what could be the possible inputs.

Note about the tests

You may be wondering about the cases_test.go file. We explain it in the leap exercise.

Coding the solution

Look for a stub file having the name nucleotide_count.go and place your solution code in that file.

Running the tests

To run the tests, run the command go test from within the exercise directory.

If the test suite contains benchmarks, you can run these with the --bench and --benchmem flags:

go test -v --bench . --benchmem

Keep in mind that each reviewer will run benchmarks on a different machine, with different specs, so the results from these benchmark tests may vary.
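The test suite shown on this page does not include a benchmark, so the following is only a sketch of what one might look like. The name BenchmarkCounts, the Counts stand-in, and the choice of input strand are assumptions made for illustration. A benchmark normally lives in a _test.go file and is run by go test; here main drives it through testing.Benchmark so the sketch runs on its own.

```go
package main

import (
	"fmt"
	"testing"
)

// Histogram and DNA mirror the types the exercise tests expect.
type Histogram map[byte]int
type DNA string

// Counts is a stand-in implementation so this sketch compiles by itself;
// it is not necessarily the solution under review.
func (d DNA) Counts() (Histogram, error) {
	h := Histogram{'A': 0, 'C': 0, 'G': 0, 'T': 0}
	for i := 0; i < len(d); i++ {
		switch d[i] {
		case 'A', 'C', 'G', 'T':
			h[d[i]]++
		default:
			return nil, fmt.Errorf("%q is not a valid nucleotide", d[i])
		}
	}
	return h, nil
}

// BenchmarkCounts (hypothetical name) times Counts on one valid strand.
func BenchmarkCounts(b *testing.B) {
	strand := DNA("AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC")
	b.ReportAllocs() // report allocations, like the --benchmem flag
	for i := 0; i < b.N; i++ {
		if _, err := strand.Counts(); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// Init registers the testing flags when running outside "go test".
	testing.Init()
	result := testing.Benchmark(BenchmarkCounts)
	// Timings and allocation figures depend on the machine.
	fmt.Println(result, result.MemString())
}
```

Placed in a _test.go file of package dna (without main and the stand-in Counts), the same BenchmarkCounts function would be picked up by go test -v --bench . --benchmem.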

Further information

For more detailed information about the Go track, including how to get help if you're having trouble, please visit the exercism.io Go language page.

Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

cases_test.go

package dna

// Source: exercism/problem-specifications
// Commit: 879a096 nucleotide-count: Apply new "input" policy
// Problem Specifications Version: 1.3.0

// count all nucleotides in a strand
var testCases = []struct {
	description   string
	strand        string
	expected      Histogram
	errorExpected bool
}{
	{
		description: "empty strand",
		strand:      "",
		expected:    Histogram{'A': 0, 'C': 0, 'G': 0, 'T': 0},
	},
	{
		description: "can count one nucleotide in single-character input",
		strand:      "G",
		expected:    Histogram{'A': 0, 'C': 0, 'G': 1, 'T': 0},
	},
	{
		description: "strand with repeated nucleotide",
		strand:      "GGGGGGG",
		expected:    Histogram{'A': 0, 'C': 0, 'G': 7, 'T': 0},
	},
	{
		description: "strand with multiple nucleotides",
		strand:      "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC",
		expected:    Histogram{'A': 20, 'C': 12, 'G': 17, 'T': 21},
	},
	{
		description:   "strand with invalid nucleotides",
		strand:        "AGXXACT",
		errorExpected: true,
	},
}

nucleotide_count_test.go

package dna

import (
	"reflect"
	"testing"
)

func TestCounts(t *testing.T) {
	for _, tc := range testCases {
		dna := DNA(tc.strand)
		s, err := dna.Counts()
		switch {
		case tc.errorExpected:
			if err == nil {
				t.Fatalf("FAIL: %s\nCounts(%q)\nExpected error\nActual: %#v",
					tc.description, tc.strand, s)
			}
		case err != nil:
			t.Fatalf("FAIL: %s\nCounts(%q)\nExpected: %#v\nGot error: %q",
				tc.description, tc.strand, tc.expected, err)
		case !reflect.DeepEqual(s, tc.expected):
			t.Fatalf("FAIL: %s\nCounts(%q)\nExpected: %#v\nActual: %#v",
				tc.description, tc.strand, tc.expected, s)
		}
		t.Logf("PASS: %s", tc.description)
	}
}

nucleotide_count.go

package dna

import "fmt"

type Histogram map[byte]int

type DNA string

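// Counts returns how many times each nucleotide occurs in the strand,
// or an error if the strand contains a byte other than A, C, G, or T.
func (d DNA) Counts() (Histogram, error) {
	// Pre-populate all four keys so absent nucleotides are reported as 0.
	nuclCnt := Histogram{'A': 0, 'C': 0, 'G': 0, 'T': 0}
	dbyte := []byte(d)
	for _, b := range dbyte {
		// A byte that is not already a key is not a valid nucleotide.
		if _, ok := nuclCnt[b]; !ok {
			// %q prints the offending byte as a quoted character, e.g. 'X'.
			return nuclCnt, fmt.Errorf("%q is not a valid nucleotide", b)
		}
		nuclCnt[b]++
	}
	return nuclCnt, nil
}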
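
To see the map-lookup approach in action outside the test suite, here is a minimal, self-contained sketch. It restates the types in package main so it can be run directly; the sample strands and the printing loop are assumptions added for illustration, and this copy formats the invalid byte with %q.

```go
package main

import "fmt"

type Histogram map[byte]int

type DNA string

// Counts tallies each nucleotide, using the pre-populated map keys
// as the set of valid nucleotides.
func (d DNA) Counts() (Histogram, error) {
	counts := Histogram{'A': 0, 'C': 0, 'G': 0, 'T': 0}
	for _, b := range []byte(d) {
		if _, ok := counts[b]; !ok {
			return counts, fmt.Errorf("%q is not a valid nucleotide", b)
		}
		counts[b]++
	}
	return counts, nil
}

func main() {
	// A valid strand: every byte is one of A, C, G, T.
	h, err := DNA("GATTACA").Counts()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print in a fixed order, since map iteration order is not defined.
	for _, n := range []byte("ACGT") {
		fmt.Printf("%c=%d ", n, h[n])
	}
	fmt.Println() // prints: A=3 C=1 G=1 T=2

	// An invalid strand: the first unknown byte stops the count.
	if _, err := DNA("AGXXACT").Counts(); err != nil {
		fmt.Println("error:", err) // prints: error: 'X' is not a valid nucleotide
	}
}
```

Because the keys double as the validity check, adding or removing a permitted symbol only requires changing the map literal.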

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?