hyphenrf's solution

to RNA Transcription in the MIPS Assembly Track

Published on Jul 30, 2020

RNA Transcription

Given a DNA strand, return its RNA complement (per RNA transcription).

Both DNA and RNA strands are a sequence of nucleotides.

The four nucleotides found in DNA are adenine (A), cytosine (C), guanine (G) and thymine (T).

The four nucleotides found in RNA are adenine (A), cytosine (C), guanine (G) and uracil (U).

Given a DNA strand, its transcribed RNA strand is formed by replacing each nucleotide with its complement:

  • G -> C
  • C -> G
  • T -> A
  • A -> U
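
Taken literally, the mapping above is a four-way case on the input byte. The following is a minimal sketch of that direct approach in MARS-style MIPS. It assumes a hypothetical helper named `complement_of` (not part of the exercise or its runner) that takes one ASCII byte in `$a0` and returns its complement in `$v0`. Passing other bytes through unchanged is a choice made for this sketch, not something the exercise specifies.

```mips
# complement_of, main and the co_* labels are hypothetical, for illustration only.
#   in:  $a0 = one DNA nucleotide as an ASCII byte
#   out: $v0 = its RNA complement, or the input byte if it is not A, C, G or T
.text
.globl main
main:
        li      $a0, 0x41               # 'A'
        jal     complement_of
        move    $a0, $v0
        li      $v0, 11                 # 11 is print character
        syscall
        li      $v0, 10                 # 10 is exit
        syscall

complement_of:
        move    $v0, $a0                # default: return the byte unchanged
        li      $t0, 0x47               # 'G'
        beq     $a0, $t0, co_to_c
        li      $t0, 0x43               # 'C'
        beq     $a0, $t0, co_to_g
        li      $t0, 0x54               # 'T'
        beq     $a0, $t0, co_to_a
        li      $t0, 0x41               # 'A'
        beq     $a0, $t0, co_to_u
        jr      $ra
co_to_c:
        li      $v0, 0x43               # G -> C
        jr      $ra
co_to_g:
        li      $v0, 0x47               # C -> G
        jr      $ra
co_to_a:
        li      $v0, 0x41               # T -> A
        jr      $ra
co_to_u:
        li      $v0, 0x55               # A -> U
        jr      $ra
```

Run on its own in MARS with delayed branching off, the driver prints `U` for the input `A`. A chain like this can take up to four comparisons per character.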

Source

Rosalind http://rosalind.info/problems/rna

runner.mips

#
# Test transcribe_rna with some examples
#
# a0 - pointer to input string, for callee
# a1 - pointer to output string, for callee
# s0 - num of tests left to run
# s1 - address of input string
# s2 - address of expected output string
# s3 - char byte of input
# s4 - char byte of output
# s5 - counter for clearing output
#
# transcribe_rna must:
# - be named transcribe_rna and declared as global
# - read input address of string from a0
# - follow the convention of using the t0-9 registers for temporary storage
# - (if it uses s0-7 then it is responsible for pushing existing values to the stack then popping them back off before returning)
# - write a zero-terminated string representing the return value to address given in a1

.data

# number of test cases
n: .word 5
# input values and expected output values (all null terminated)
ins:  .asciiz "C", "G", "T", "A", "ACGTGGTCTTAA"
outs: .asciiz "G", "C", "A", "U", "UGCACCAGAAUU"

failmsg: .asciiz "failed for test input: "
expectedmsg: .asciiz ". expected "
tobemsg: .asciiz " to be "
okmsg: .asciiz "all tests passed"


.text

runner:
        lw      $s0, n
        la      $s1, ins
        la      $s2, outs

        li      $v0, 9                  # code for allocating heap memory
        li      $a0, 16                 # specify 16 bytes - length of longest expected output
        syscall
        move    $a1, $v0                # location of allocated memory is where callee writes result

run_test:
        jal     clear_output            # zero out output location
        move    $a0, $s1                # move address of input str to a0
        jal     transcribe_rna          # call subroutine under test
        move    $v1, $a1                # retain a copy of response from callee

scan:
        lb      $s3, 0($s2)             # load one byte of the expectation
        lb      $s4, 0($v1)             # load one byte of the actual
        bne     $s3, $s4, exit_fail     # if the two differ, the test has failed
        addi    $s2, $s2, 1             # point to next expectation byte
        addi    $v1, $v1, 1             # point to next actual byte
        addi    $s1, $s1, 1             # point to next input byte
        bne     $s3, $zero, scan        # if one char (and therefore the other) was not null, loop

done_scan:
        sub     $s0, $s0, 1             # decrement num of tests left to run
        bgt     $s0, $zero, run_test    # if more than zero tests to run, jump to run_test

exit_ok:
        la      $a0, okmsg              # put address of okmsg into a0
        li      $v0, 4                  # 4 is print string
        syscall

        li      $v0, 10                 # 10 is exit with zero status (clean exit)
        syscall

exit_fail:
        la      $a0, failmsg            # put address of failmsg into a0
        li      $v0, 4                  # 4 is print string
        syscall

        move    $a0, $s1                # print input that failed on
        li      $v0, 4
        syscall

        la      $a0, expectedmsg
        li      $v0, 4
        syscall

        move    $a0, $v1                # print actual that failed on
        li      $v0, 4
        syscall

        la      $a0, tobemsg
        li      $v0, 4
        syscall

        move    $a0, $s2                # print expected value that failed on
        li      $v0, 4
        syscall

        li      $a0, 1                  # set error code to 1
        li      $v0, 17                 # 17 is exit with error
        syscall

clear_output:
        sw      $zero, 0($a1)           # zero out output by storing 4 words (16 bytes) of zeros
        sw      $zero, 4($a1)
        sw      $zero, 8($a1)
        sw      $zero, 12($a1)
        jr      $ra

# # Include your implementation here if you wish to run this from the MARS GUI.
# .include "impl.mips"

impl.mips

.globl transcribe_rna

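# I noticed that:
# - the ASCII codes of A, C, G and U are all odd (0x41, 0x43, 0x47, 0x55);
#   T (0x54) is the only even one
# - for codes of the form 0x4[odd], transcribing only requires toggling
#   bit 2 (0x04): C (0x43) <-> G (0x47), and A (0x41) <-> E (0x45)
# - E stands in for U: setting bit 4 (0x10) turns 0x45 into 0x55

# To make good use of these observations, the input first has to be normalised
# so that A, C, G and T are all in the 0x4[odd] form (T becomes 0x45). The loop
# can then pick the output with a single branch instead of three.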

.text
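transcribe_rna:
	move  $a2, $a1       # incrementable copy of the output pointer in a1
LOOP:
	lb    $t0, ($a0)     # next input byte
	beqz  $t0, STOP      # the null terminator ends the input

	or    $t0, $t0, 0x01 # T: 0x54 -> 0x55; A, C and G are already odd
	and   $t0, $t0, 0xef # clear bit 4: 0x55 -> 0x45; A, C and G unchanged
	xor   $t0, $t0, 0x04 # toggle bit 2: 41 -> 45, 43 -> 47, 45 -> 41, 47 -> 43

	beq   $t0, 0x45, fix_u # input was A: needs U (0x55) instead of E (0x45)
	j     store_t0
fix_u:
	or    $t0, $t0, 0x10 # set bit 4: 0x45 -> 0x55
store_t0:
	sb    $t0, ($a2)
	addi  $a0, $a0, 1
	addi  $a2, $a2, 1
	j     LOOP
STOP:
	sb    $zero, ($a2)   # zero-terminate the output, as the runner's contract asks
	jr    $ra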
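
The normalise, toggle and fix-up steps used by `transcribe_rna` can be traced on a four-letter input. The following is a standalone sketch, not part of the submitted solution. It assumes MARS with its default settings (pseudo-instructions enabled, delayed branching off). The labels `demo_in`, `demo_out`, `demo_loop`, `demo_store` and `demo_done` are made up for illustration.

```mips
# Standalone sketch: apply the three bit operations to "ACGT" and print the result.
# All demo_* labels are hypothetical. The comments list the hex value of each
# nucleotide's byte after the instruction on that line.
.data
demo_in:  .asciiz "ACGT"
demo_out: .space 8

.text
.globl main
main:
        la      $t1, demo_in
        la      $t2, demo_out
demo_loop:
        lb      $t0, 0($t1)
        beqz    $t0, demo_done
        ori     $t0, $t0, 0x01          # A 41   C 43   G 47   T 54 -> 55
        andi    $t0, $t0, 0xef          # A 41   C 43   G 47   T 55 -> 45
        xori    $t0, $t0, 0x04          # A 45   C 47   G 43   T 41
        bne     $t0, 0x45, demo_store   # only input A leaves 0x45 here
        ori     $t0, $t0, 0x10          # A 45 -> 55, which is 'U'
demo_store:
        sb      $t0, 0($t2)
        addi    $t1, $t1, 1
        addi    $t2, $t2, 1
        j       demo_loop
demo_done:
        sb      $zero, 0($t2)           # terminate the output string
        la      $a0, demo_out
        li      $v0, 4                  # 4 is print string
        syscall
        li      $v0, 10                 # 10 is exit
        syscall
```

Assembled and run on its own, this prints `UGCA`. The final column of the trace shows why T has to be normalised first: without the `ori`/`andi` pair, 0x54 would toggle to 0x50 (`P`) rather than 0x41 (`A`).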
