🎉 Exercism Research is now launched. Help Exercism, help science and have some fun at research.exercism.io 🎉
Avatar of SergiiVlasiuk

SergiiVlasiuk's solution

to Run Length Encoding in the Scala Track

Published at Aug 25 2019 · 0 comments
Instructions
Test suite
Solution

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression, where runs (consecutive data elements) are replaced by just one data value and count.

For example we can represent the original 53 characters with only 13.

"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"

RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a lossless data compression.

"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"

For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way data to be encoded will never contain any numbers and numbers inside data to be decoded always represent the count for the following character.

The Scala exercises assume an SBT project scheme. The exercise solution source should be placed within the exercise directory/src/main/scala. The exercise unit tests can be found within the exercise directory/src/test/scala.

To run the tests simply run the command sbt test in the exercise directory.

For more detailed info about the Scala track see the help page.

Source

Wikipedia https://en.wikipedia.org/wiki/Run-length_encoding

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

RunLengthEncodingTests.scala

import org.scalatest.{Matchers, FunSuite}

/** @version 1.0.0 */
class RunLengthEncodingTest extends FunSuite with Matchers {

  test("encode - empty string") {
    RunLengthEncoding.encode("") should be ("")
  }

  test("encode - single characters only are encoded without count") {
    pending
    RunLengthEncoding.encode("XYZ") should be ("XYZ")
  }

  test("encode - string with no single characters") {
    pending
    RunLengthEncoding.encode("AABBBCCCC") should be ("2A3B4C")
  }

  test("encode - single characters mixed with repeated characters") {
    pending
    RunLengthEncoding.encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB") should be ("12WB12W3B24WB")
  }

  test("encode - multiple whitespace mixed in string") {
    pending
    RunLengthEncoding.encode("  hsqq qww  ") should be ("2 hs2q q2w2 ")
  }

  test("encode - lowercase characters") {
    pending
    RunLengthEncoding.encode("aabbbcccc") should be ("2a3b4c")
  }

  test("decode - empty string") {
    pending
    RunLengthEncoding.decode("") should be ("")
  }

  test("decode - single characters only") {
    pending
    RunLengthEncoding.decode("XYZ") should be ("XYZ")
  }

  test("decode - string with no single characters") {
    pending
    RunLengthEncoding.decode("2A3B4C") should be ("AABBBCCCC")
  }

  test("decode - single characters with repeated characters") {
    pending
    RunLengthEncoding.decode("12WB12W3B24WB") should be ("WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB")
  }

  test("decode - multiple whitespace mixed in string") {
    pending
    RunLengthEncoding.decode("2 hs2q q2w2 ") should be ("  hsqq qww  ")
  }

  test("decode - lower case string") {
    pending
    RunLengthEncoding.decode("2a3b4c") should be ("aabbbcccc")
  }

  test("consistency - encode followed by decode gives original string") {
    pending
    RunLengthEncoding.decode(RunLengthEncoding.encode("zzz ZZ  zZ")) should be ("zzz ZZ  zZ")
  }
}
object RunLengthEncoding {
  def decode(str: String): String = str match {
    case "" => ""
    case s => {
      val (h, t) = s.span(_.isDigit)
      if (h.length > 0) t.head.toString * h.toInt + decode(t.tail)
      else t.head + decode(t.tail)
    }
  }

  def encode(str: String): String = {
    def enc_(list: List[(Char, Int)]): List[(Char, Int)] = list match {
      case h1 :: h2 :: tail if h1._1 == h2._1 => enc_((h1._1, h1._2 + 1) :: tail)
      case h1 :: h2 :: tail => h1 :: enc_((h2._1, 1) :: tail)
      case head :: Nil => list
      case Nil => List.empty
    }

    enc_(str.toList.zip(Stream from 1)).map {
      case (ch, 1) => ch.toString
      case (ch, dig) => dig.toString + ch
    }.mkString
  }
}

Community comments

Find this solution interesting? Ask the author a question to learn more.

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?