
vlzware's solution

to Nucleotide Count in the C Track

Published at Jul 13 2018 · 0 comments

Note:

This solution was written on an old version of Exercism. The tests below might not correspond to the solution code, and the exercise may have changed since this code was written.

Given a single stranded DNA string, compute how many times each nucleotide occurs in the string.

The genetic language of every living thing on the planet is DNA. DNA is a large molecule built from an extremely long sequence of individual elements called nucleotides. Four types exist in DNA; they differ only slightly and are represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.
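As a minimal sketch of the counting step described above (not the solution shown further down), the four symbols can be tallied with a lookup in the string "ACGT". The helper name tally_nucleotides and the A, C, G, T ordering of the counts array are assumptions made for this illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the exercise: tallies each symbol of
 * `strand` into `counts`, ordered A, C, G, T. Returns false as soon as a
 * character other than those four symbols is found. */
bool tally_nucleotides(const char *strand, size_t counts[4])
{
    static const char symbols[] = "ACGT";

    memset(counts, 0, 4 * sizeof counts[0]);
    for (const char *p = strand; *p != '\0'; p++) {
        /* strchr gives the position of the symbol within "ACGT". */
        const char *hit = strchr(symbols, *p);
        if (hit == NULL)
            return false;           /* not one of the four nucleotides */
        counts[hit - symbols]++;
    }
    return true;
}
```

For the strand "GATTACA" this fills counts with 3, 1, 1 and 2 for A, C, G and T respectively.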

Here is an analogy:

  • twigs are to birds' nests as
  • nucleotides are to DNA as
  • legos are to lego houses as
  • words are to sentences as...

Getting Started

Make sure you have read the C page on the Exercism site. This covers the basic information on setting up the development environment expected by the exercises.

Passing the Tests

Get the first test compiling, linking and passing by following the three rules of test-driven development.

The included makefile can be used to create and run the tests using the test task.

make test

Create just the functions you need to satisfy any compiler errors and get the test to fail. Then write just enough code to get the test to pass. Once you've done that, move on to the next test.
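A first step of that cycle might look like the stub below. It is only a sketch: the signature is taken from how the test file calls count() and frees the result, while returning an empty heap string is an assumption chosen so that the tests compile, link and fail cleanly.

```c
#include <stdlib.h>

/* First TDD step (a sketch): the signature matches what the tests call,
 * but the body only returns an empty string, so any test that expects
 * real counts fails until the logic is written. */
char *count(const char *dna_strand)
{
    (void) dna_strand;               /* deliberately unused for now */

    /* calloc zero-fills, so this is "" on the heap; the tests free() it.
     * It may be NULL if the allocation fails. */
    return calloc(1, sizeof(char));
}
```

With this stub in place, the empty-strand test fails because it expects "A:0 C:0 G:0 T:0", which is the failing state the next change should turn green.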

As you progress through the tests, take the time to refactor your implementation for readability and expressiveness and then go on to the next test.

Try to use standard C99 facilities in preference to writing your own low-level algorithms or facilities by hand.
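One C99 facility that fits this exercise is snprintf, which returns the length the formatted output would need when it is given a NULL buffer and a size of 0. The sketch below uses that to size an allocation exactly; the function name format_counts and the COUNT_FORMAT macro are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical format, matching the strings the tests expect. */
#define COUNT_FORMAT "A:%d C:%d G:%d T:%d"

/* Hypothetical helper: returns a heap-allocated "A:n C:n G:n T:n" string,
 * or NULL on failure. The caller owns the result and must free() it. */
char *format_counts(int a, int c, int g, int t)
{
    /* First pass: no buffer, only measure the required length. */
    int len = snprintf(NULL, 0, COUNT_FORMAT, a, c, g, t);
    if (len < 0)
        return NULL;

    char *out = malloc((size_t) len + 1);   /* +1 for the terminator */
    if (out == NULL)
        return NULL;

    /* Second pass: snprintf writes the text and the trailing '\0'. */
    snprintf(out, (size_t) len + 1, COUNT_FORMAT, a, c, g, t);
    return out;
}
```

Measuring first avoids guessing a buffer size, at the cost of formatting the string twice.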

Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

test_nucleotide_count.c

#include "vendor/unity.h"
#include "../src/nucleotide_count.h"
#include <stdlib.h>
#include <string.h>

void setUp(void)
{
}

void tearDown(void)
{
}

void test_strand_count(const char *dna_strand, const char *expected)
{
   char *actual_count = count(dna_strand);

   TEST_ASSERT_TRUE(strcmp(actual_count, expected) == 0);
   free(actual_count);
}

void test_empty_strand(void)
{
   TEST_IGNORE();               // delete this line to run test
   const char *dna_strand = "";
   const char *expected = "A:0 C:0 G:0 T:0";

   test_strand_count(dna_strand, expected);
}

void test_repeated_nucleotide(void)
{
   TEST_IGNORE();
   const char *dna_strand = "GGGGGGG";
   const char *expected = "A:0 C:0 G:7 T:0";

   test_strand_count(dna_strand, expected);
}

void test_multiple_nucleotides(void)
{
   TEST_IGNORE();
   const char *dna_strand =
       "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";
   const char *expected = "A:20 C:12 G:17 T:21";

   test_strand_count(dna_strand, expected);
}

void test_invalid_nucleotide(void)
{
   TEST_IGNORE();
   const char *dna_strand = "AGXXACT";
   const char *expected = "";

   test_strand_count(dna_strand, expected);
}

int main(void)
{
   UnityBegin("test/test_nucleotide_count.c");

   RUN_TEST(test_empty_strand);
   RUN_TEST(test_repeated_nucleotide);
   RUN_TEST(test_multiple_nucleotides);
   RUN_TEST(test_invalid_nucleotide);

   UnityEnd();
   return 0;
}

src/nucleotide_count.c

#include "nucleotide_count.h"
#include <stdlib.h>
#include <stdio.h>

char *count(const char *dna_strand)
{
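	/* Empty string returned for NULL or invalid input; the caller frees it. */
	char *error = malloc(1);
	if (error == NULL)
		return NULL;
	*error = '\0';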

	if (dna_strand == NULL)
		return error;

	int a, c, g, t;
	a = c = g = t = 0;
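	/* Walk the strand read-only; no need to cast away const. */
	const char *tmp = dna_strand;
	while (*tmp) {
		switch (*tmp++) {
		case 'A':
			a++;
			break;
		case 'C':
			c++;
			break;
		case 'G':
			g++;
			break;
		case 'T':
			t++;
			break;
		default:
			/* Any other character makes the whole strand invalid. */
			return error;
		}
	}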

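	/* https://stackoverflow.com/a/3923207/6049386 */
	/* With a NULL buffer, snprintf only reports the length needed. */
	int size = snprintf(NULL, 0, "A:%i C:%i G:%i T:%i", a, c, g, t);
	if (size < 0)
		return error;

	char *res = malloc((size_t) size + 1);
	if (res == NULL)
		return error;

	/* snprintf writes the terminating '\0' itself. */
	snprintf(res, (size_t) size + 1, "A:%i C:%i G:%i T:%i", a, c, g, t);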

	free(error);
	return res;
}

src/nucleotide_count.h

#ifndef NUCLEOTIDE_COUNT
#define NUCLEOTIDE_COUNT

char *count(const char *dna_strand);

#endif


What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?