joshuavvv's solution

to Nucleotide Count in the Prolog Track

Published at Jan 05 2020 · 0 comments

Given a DNA string, compute how many times each nucleotide occurs in the string.

DNA is represented by an alphabet of the following symbols: 'A', 'C', 'G', and 'T'.
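As a rough illustration of the task (not the published solution further down), the tally can be sketched in SWI-Prolog with aggregate_all/3. The predicate name strand_tally/2 is hypothetical, and this sketch assumes the strand is given as an atom; it does not reject symbols outside the alphabet.

```prolog
:- use_module(library(aggregate)).  % aggregate_all/3
:- use_module(library(lists)).      % member/2

% strand_tally(+Strand, -Tally): hypothetical helper for illustration only.
% Tally pairs each of 'A', 'C', 'G', 'T' with its number of occurrences.
% Symbols outside the alphabet are silently ignored here.
strand_tally(Strand, Tally) :-
    atom_chars(Strand, Chars),
    findall((N, K),
            ( member(N, ['A', 'C', 'G', 'T']),
              aggregate_all(count, member(N, Chars), K) ),
            Tally).
```

For the strand 'GATTACA', the query strand_tally('GATTACA', Tally) binds Tally to a list pairing 'A' with 3, 'C' with 1, 'G' with 1, and 'T' with 2.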

Each symbol represents a nucleotide, which is a fancy name for the particular molecules that happen to make up a large part of DNA.

Shortest intro to biochemistry EVAR:

  • twigs are to birds nests as
  • nucleotides are to DNA and RNA as
  • amino acids are to proteins as
  • sugar is to starch as
  • oh crap lipids

I'm not going to talk about lipids because they're crazy complex.

So back to nucleotides.

DNA contains four types of them: adenine (A), cytosine (C), guanine (G), and thymine (T).
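The four-letter alphabet maps naturally onto Prolog facts. The following is only a sketch: the predicate names nucleotide/2, dna_symbol/1, and valid_strand/1 are assumptions made for illustration and are not part of the exercise or the solution below.

```prolog
:- use_module(library(apply)).  % maplist/2

% nucleotide(?Symbol, ?Name): the DNA alphabet as listed in the description.
nucleotide('A', adenine).
nucleotide('C', cytosine).
nucleotide('G', guanine).
nucleotide('T', thymine).

% dna_symbol(+Char): true if Char is one of the four DNA symbols.
dna_symbol(Char) :-
    nucleotide(Char, _).

% valid_strand(+Strand): true if every character of the atom Strand
% is a DNA symbol. The empty strand counts as valid.
valid_strand(Strand) :-
    atom_chars(Strand, Chars),
    maplist(dna_symbol, Chars).
```

With these facts, valid_strand('GATTACA') succeeds, while valid_strand('JOHNNYAPPLESEED') fails.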

RNA contains a slightly different set of nucleotides, but we don't care about that for now.

Source

The Calculating DNA Nucleotides problem at Rosalind: http://rosalind.info/problems/dna/

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

nucleotide_count_tests.plt

pending :-
    current_prolog_flag(argv, ['--all'|_]).
pending :-
    write('\nA TEST IS PENDING!\n'),
    fail.

:- begin_tests(nucleotide_counting).

    test(empty_dna_strand_has_no_adenosine, condition(true)) :-
        nucleotide_count('', [('A', 0) | _ ]), !.

    test(repetitive_cytidine_gets_counted, condition(pending)) :-
        nucleotide_count('CCCCC', Counts),
        member(('C', 5), Counts), !.

    test(counts_only_thymidine, condition(pending)) :-
        nucleotide_count('GGGGGTAACCCGG', Counts),
        member(('T', 1), Counts), !.

    test(counts_all_nucleotides, condition(pending)) :-
        nucleotide_count(
            'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC',
            [('A', 20), ('C', 12), ('G', 17), ('T', 21)]), !.

    test(fails_when_not_dna, [fail, condition(pending)]) :-
        nucleotide_count('JOHNNYAPPLESEED', _), !.

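:- end_tests(nucleotide_counting).

% nucleotide_count(+Str, -Counts): Counts pairs each of 'A', 'C', 'G', 'T'
% with its number of occurrences in Str.
nucleotide_count(Str, Counts) :-
    string_chars(Str, Chars),
    nucleotide_count(Chars, [('A', 0), ('C', 0), ('G', 0), ('T', 0)], Counts).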

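% count(+Nucleotide, +Counts0, -Counts): increment the matching counter.
% Fails for any symbol other than 'A', 'C', 'G', or 'T'.
count('A', [('A', N) | Rest], [('A', N1) | Rest]) :-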
    N1 is N + 1.
count('C', [As, ('C', N) | Rest], [As, ('C', N1) | Rest]) :-
    N1 is N + 1.
count('G', [As, Cs, ('G', N), Ts], [As, Cs, ('G', N1), Ts]) :-
    N1 is N + 1.
count('T', [As, Cs, Gs, ('T', N)], [As, Cs, Gs, ('T', N1)]) :-
    N1 is N + 1.

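% nucleotide_count(+Chars, +Acc, -Counts): thread the counter list through
% the characters, one count/3 step per character.
nucleotide_count([], Counts, Counts).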
nucleotide_count([X|Xs], Acc, Counts) :-
    count(X, Acc, Acc1),
    nucleotide_count(Xs, Acc1, Counts).
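The accumulator walk above can also be expressed with foldl/4. The following is a sketch of that alternative, not the author's code: the names bump/3 and count_strand/2 are hypothetical, and it assumes SWI-Prolog's select/4 from library(lists) to replace one pair in the counter list.

```prolog
:- use_module(library(apply)).  % foldl/4
:- use_module(library(lists)).  % select/4

% bump(+Nucleotide, +Counts0, -Counts): add one to the pair for Nucleotide.
% Fails if Nucleotide has no pair in Counts0, so invalid symbols are rejected.
bump(N, Counts0, Counts) :-
    select((N, K0), Counts0, (N, K), Counts),  % same position, new count
    K is K0 + 1.

% count_strand(+Strand, -Counts): fold bump/3 over the characters of Strand,
% starting from a zeroed counter list.
count_strand(Strand, Counts) :-
    atom_chars(Strand, Chars),
    foldl(bump, Chars, [('A', 0), ('C', 0), ('G', 0), ('T', 0)], Counts).
```

Here count_strand('GATTACA', Counts) pairs 'A' with 3, 'C' with 1, 'G' with 1, and 'T' with 2, and count_strand('JOHNNYAPPLESEED', _) fails. Unlike the clause-per-symbol count/3 above, select/4 may leave choice points behind, so wrapping the call in once/1 is a reasonable precaution.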

