JDygert's solution to Nucleotide Count in the Prolog Track

Published at May 16 2020

Given a DNA string, compute how many times each nucleotide occurs in the string.

DNA is represented by an alphabet of the following symbols: 'A', 'C', 'G', and 'T'.
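To make the task concrete, here is a minimal sketch of the counting in SWI-Prolog. It is an illustration only, not the solution published on this page: `nucleotides/1` and `naive_nucleotide_count/2` are hypothetical names, and the `(Symbol, Count)` result shape is assumed from the test suite shown further down. It simply counts each symbol separately with `aggregate_all/3`.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(lists)).

% The four DNA symbols, in the order the counts are reported.
nucleotides(['A', 'C', 'G', 'T']).

% naive_nucleotide_count(+Strand, -Counts)
% Hypothetical helper: Counts is a list of (Symbol, Count) terms.
% Fails if Strand contains anything other than the four symbols.
naive_nucleotide_count(Strand, Counts) :-
    atom_chars(Strand, Chars),
    nucleotides(Symbols),
    forall(member(C, Chars), memberchk(C, Symbols)),
    findall((S, N),
            ( member(S, Symbols),
              aggregate_all(count, member(S, Chars), N)
            ),
            Counts).
```

For the strand `'GATTACA'` this yields 3 for `A`, 1 for `C`, 1 for `G`, and 2 for `T`. The sketch walks the strand once per symbol, which is fine for four symbols but is not the only way to do it.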

Each symbol represents a nucleotide, which is a fancy name for the particular molecules that happen to make up a large part of DNA.

Shortest intro to biochemistry EVAR:

• twigs are to birds nests as
• nucleotides are to DNA and RNA as
• amino acids are to proteins as
• sugar is to starch as
• oh crap lipids

I'm not going to talk about lipids because they're crazy complex.

So back to nucleotides.

DNA contains four types of them: adenine (`A`), cytosine (`C`), guanine (`G`), and thymine (`T`).

RNA contains a slightly different set of nucleotides, but we don't care about that for now.

Source

The Calculating DNA Nucleotides problem at Rosalind http://rosalind.info/problems/dna/

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

nucleotide_count_tests.plt

```prolog
pending :-
    current_prolog_flag(argv, ['--all'|_]).
pending :-
    write('\nA TEST IS PENDING!\n'),
    fail.

:- begin_tests(nucleotide_counting).

    % Test name assumed; the original head of this test is not shown.
    test(empty_dna_string_has_no_adenosine) :-
        nucleotide_count('', [('A', 0) | _ ]), !.

    test(repetitive_cytidine_gets_counted, condition(pending)) :-
        nucleotide_count('CCCCC', Counts),
        member(('C', 5), Counts), !.

    test(counts_only_thymidine, condition(pending)) :-
        nucleotide_count('GGGGGTAACCCGG', Counts),
        member(('T', 1), Counts), !.

    test(counts_only_thymidine, condition(pending)) :-
        nucleotide_count(
            'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC',
            [ ('A' ,20), ('C' , 12), ('G' , 17), ('T', 21) ]), !.

    test(fails_when_not_dns, [fail, condition(pending)]) :-
        nucleotide_count('JOHNNYAPPLESEED', _), !.

:- end_tests(nucleotide_counting).
```
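
```prolog
:- use_module(library(assoc)).
:- use_module(library(apply)).

% nucleotide_count(+Strand, -Counts): Counts is a list of (Nucleotide, Count)
% terms in key order; fails if Strand contains any other symbol.
nucleotide_count(Strand, Counts) :-
    atom_chars(Strand, Chars),
    list_to_assoc(['A'-0, 'C'-0, 'G'-0, 'T'-0], Empty),
    foldl(increment, Chars, Empty, CountsAssoc),
    assoc_to_list(CountsAssoc, CountsPairs),
    maplist(pair_to_compound, CountsPairs, Counts).

% Swap the count N stored under X for its successor M.
increment(X, In, Out) :-
    get_assoc(X, In, N, Out, M),
    succ(N, M).

% Convert a Key-Value pair to the (Key, Value) form the tests expect.
pair_to_compound(K-V, (K, V)).
```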
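
The fold over an association list can be spelled out in smaller steps. The following is a sketch, not the author's code: `bump/3` and `count_with_bump/2` are hypothetical names, and the update is written with `get_assoc/3` and `put_assoc/4` from SWI-Prolog's `library(assoc)` rather than the five-argument `get_assoc` call used in the solution. It stops at the `Key-Count` pair list, before the conversion to `(Key, Count)` terms.

```prolog
:- use_module(library(assoc)).
:- use_module(library(apply)).

% bump(+Key, +Assoc0, -Assoc): add one to the count stored under Key.
% Fails when Key is absent, which is how non-DNA symbols get rejected.
bump(Key, Assoc0, Assoc) :-
    get_assoc(Key, Assoc0, N0),
    N is N0 + 1,
    put_assoc(Key, Assoc0, N, Assoc).

% count_with_bump(+Strand, -Pairs): Pairs is a Key-Count list sorted by key.
count_with_bump(Strand, Pairs) :-
    atom_chars(Strand, Chars),
    % Start every nucleotide at zero so absent symbols still appear.
    list_to_assoc(['A'-0, 'C'-0, 'G'-0, 'T'-0], Zero),
    % Thread the table through the strand, one character at a time.
    foldl(bump, Chars, Zero, Final),
    assoc_to_list(Final, Pairs).
```

Because the table is seeded with exactly four keys, a lookup for any other character fails, the fold fails with it, and so the whole count fails for a strand such as `'JOHNNYAPPLESEED'`. For `'GATTACA'` the pair list holds 3 for `A`, 1 for `C`, 1 for `G`, and 2 for `T`.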