Given a single-stranded DNA string, compute how many times each nucleotide occurs in the string.
The genetic language of every living thing on the planet is DNA. DNA is a large molecule built from an extremely long sequence of individual elements called nucleotides. Four types of nucleotide exist in DNA. They differ only slightly and can be represented by the following symbols: 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.
While Common Lisp doesn't care about indentation and layout of code, nor whether you use spaces or tabs, this is an important consideration for submissions to exercism.io. Exercism.io's code widget does not handle a mix of tab and space characters well, so using only spaces is recommended to make the code more readable to human reviewers. Please review your editor's settings on how to accomplish this. Below are instructions for popular editors for Common Lisp.
Use the following commands to ensure Vim uses only spaces for indentation:
    :set tabstop=2
    :set shiftwidth=2
    :set expandtab
(or as a one-liner, :set tabstop=2 shiftwidth=2 expandtab). This can be added to your ~/.vimrc file to use it all the time.
Emacs is very well suited for editing Common Lisp and has many powerful add-on packages available. The only thing needed to make a stock Emacs work well with exercism.io is to evaluate the following code:
(setq-default indent-tabs-mode nil)
This can be placed in your Emacs init file (such as ~/.emacs) in order to have it set whenever Emacs is launched.
One suggested add-on for Emacs and Common Lisp is SLIME, which offers tight integration with the REPL, making iterative coding and testing very easy.
The Calculating DNA Nucleotides problem at Rosalind: http://rosalind.info/problems/dna/
It's possible to submit an incomplete solution so you can see how others have completed the exercise.
(ql:quickload "lisp-unit")

#-xlisp-test (load "nucleotide-count")

(defpackage #:nucleotide-count-test
  (:use #:common-lisp #:lisp-unit))

(in-package #:nucleotide-count-test)

(defun make-hash (kvs)
  (reduce #'(lambda (h kv)
              (setf (gethash (first kv) h) (second kv))
              h)
          kvs
          :initial-value (make-hash-table)))

(define-test empty-dna-strand-has-no-adenine
  (assert-equal 0 (dna:dna-count #\A "")))

(define-test empty-dna-strand-has-no-nucleotides
  (assert-equalp (make-hash '((#\A 0) (#\T 0) (#\C 0) (#\G 0)))
                 (dna:nucleotide-counts "")))

(define-test repetitive-cytosine-gets-counted
  (assert-equal 5 (dna:dna-count #\C "CCCCC")))

(define-test repetitive-sequence-has-only-guanine
  (assert-equalp (make-hash '((#\A 0) (#\T 0) (#\C 0) (#\G 8)))
                 (dna:nucleotide-counts "GGGGGGGG")))

(define-test counts-only-thymine
  (assert-equal 1 (dna:dna-count #\T "GGGGGTAACCCGG")))

(define-test validates-nucleotides
  (assert-error 'dna:invalid-nucleotide (dna:dna-count #\X "GACT")))

(define-test counts-all-nucleotides
  (assert-equalp (make-hash '((#\A 20) (#\T 21) (#\G 17) (#\C 12)))
                 (dna:nucleotide-counts
                  "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC")))

#-xlisp-test
(let ((*print-errors* t)
      (*print-failures* t))
  (run-tests :all :nucleotide-count-test))
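(in-package #:cl-user)

(defpackage #:dna
  (:use #:common-lisp)
  (:export #:dna-count #:nucleotide-counts #:invalid-nucleotide))

(in-package #:dna)

;; Signaled when DNA-COUNT is asked about a character that is not A, C, G or T.
(define-condition invalid-nucleotide (error) ())

(defun dna-count (nucleotide sequence)
  "Return how many times NUCLEOTIDE occurs in SEQUENCE."
  ;; Only the requested nucleotide is validated, not the contents of SEQUENCE.
  (unless (member nucleotide '(#\A #\C #\G #\T))
    (error 'invalid-nucleotide))
  (count nucleotide sequence))

(defun nucleotide-counts (sequence)
  "Return a hash table mapping each nucleotide to its count in SEQUENCE."
  (let ((counts (make-hash-table)))
    ;; One COUNT pass over SEQUENCE per nucleotide.
    (loop for c across "ACGT"
          do (setf (gethash c counts) (dna-count c sequence)))
    counts))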
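The solution above walks the strand once per nucleotide, four passes in total. As a point of comparison, here is a minimal single-pass sketch. The function name count-nucleotides-single-pass is hypothetical and not part of the exercise's API, and rejecting unknown characters in the strand with a plain error is an assumption of this sketch rather than something the test suite requires.

```lisp
;; Hypothetical helper for illustration; not part of the exercise's API.
(defun count-nucleotides-single-pass (strand)
  "Return a hash table mapping the characters A, C, G and T to their counts
in STRAND. Signals an error if STRAND contains any other character
(an assumption of this sketch)."
  (let ((counts (make-hash-table)))
    ;; Start every nucleotide at zero so absent ones still appear in the table.
    (loop for c across "ACGT"
          do (setf (gethash c counts) 0))
    ;; Walk the strand once, incrementing the matching counter.
    (loop for c across strand
          do (multiple-value-bind (n present-p) (gethash c counts)
               (declare (ignore n))
               (unless present-p
                 (error "Invalid nucleotide ~S in strand" c))
               (incf (gethash c counts))))
    counts))
```

For example, (gethash #\A (count-nucleotides-single-pass "GATTACA")) returns 3 as its primary value. For strands as short as those in the tests the difference from the four-pass version is negligible; the single pass mainly matters for very long inputs.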
A huge amount can be learned from reading other people’s code. This is why we wanted to give exercism users the option of making their solutions public.
Here are some questions to help you reflect on this solution and learn the most from it.