shmibs's solution

to Run Length Encoding in the Elixir Track

Published at Jul 13 2018

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression in which each run (a sequence of consecutive, identical data elements) is replaced by a single data value and a count.

For example, we can represent the original 53 characters with only 13.

"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"
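
One way to picture the encoding step is to group the string into runs and write each run as a count plus its character. The following is only a minimal sketch under that reading; the module name RleSketch is hypothetical and not part of the exercise files, and it assumes that a run of length one is written without a count, as in the example above.

```elixir
defmodule RleSketch do
  # Hypothetical helper module for illustration; not part of the exercise.
  def encode(string) do
    string
    # Split into single characters, e.g. "AAB" -> ["A", "A", "B"].
    |> String.graphemes()
    # Group consecutive equal characters into runs, e.g. [["A", "A"], ["B"]].
    |> Enum.chunk_by(& &1)
    # A run of one is written as-is; longer runs get a count prefix.
    |> Enum.map_join(fn
      [char] -> char
      [char | _] = run -> "#{length(run)}#{char}"
    end)
  end
end

IO.puts(RleSketch.encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"))
# prints "12WB12W3B24WB"
```

Enum.chunk_by/2 does the run detection here, so no regular expression is needed for this direction.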

RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a lossless form of data compression.

"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"

For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way, data to be encoded will never contain any numbers, and numbers inside data to be decoded always represent the count for the following character.
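
Because digits can only ever be counts, decoding can be read as "an optional number followed by exactly one character". The sketch below illustrates that idea; the module name RleDecodeSketch is hypothetical, and it assumes input that follows the rules above (letters and whitespace, with a missing count meaning a run of one).

```elixir
defmodule RleDecodeSketch do
  # Hypothetical helper module for illustration; not part of the exercise.
  def decode(string) do
    # Each match is [whole_match, count, char]; count is "" when absent.
    # The s modifier lets "." match newlines as well.
    Regex.scan(~r/(\d*)(.)/s, string)
    |> Enum.map_join(fn
      [_, "", char] -> char
      [_, count, char] -> String.duplicate(char, String.to_integer(count))
    end)
  end
end

IO.puts(RleDecodeSketch.decode("2AB3CD4E"))
# prints "AABCCCDEEEE"
```

If the data to be encoded could itself contain digits, this reading would become ambiguous, which is why the exercise rules them out.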

Running tests

Execute the tests with:

$ elixir rle_test.exs

Pending tests

In the test suite, all but the first test have been skipped.

Once you get a test passing, you can unskip the next one by commenting out the relevant @tag :pending with a # symbol.

For example:

# @tag :pending
test "shouting" do
  assert Bob.hey("WATCH OUT!") == "Whoa, chill out!"
end

Or, you can enable all the tests by commenting out the ExUnit.configure line in the test suite.

# ExUnit.configure exclude: :pending, trace: true

For more detailed information about the Elixir track, please see the help page.

Source

Wikipedia https://en.wikipedia.org/wiki/Run-length_encoding

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

rle_test.exs

if !System.get_env("EXERCISM_TEST_EXAMPLES") do
  Code.load_file("rle.exs", __DIR__)
end

ExUnit.start()
ExUnit.configure(exclude: :pending, trace: true)

defmodule RunLengthEncoderTest do
  use ExUnit.Case

  test "encode empty string" do
    assert RunLengthEncoder.encode("") === ""
  end

  @tag :pending
  test "encode single characters only are encoded without count" do
    assert RunLengthEncoder.encode("XYZ") === "XYZ"
  end

  @tag :pending
  test "encode string with no single characters" do
    assert RunLengthEncoder.encode("AABBBCCCC") == "2A3B4C"
  end

  @tag :pending
  test "encode single characters mixed with repeated characters" do
    assert RunLengthEncoder.encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB") ===
             "12WB12W3B24WB"
  end

  @tag :pending
  test "encode multiple whitespace mixed in string" do
    assert RunLengthEncoder.encode("  hsqq qww  ") === "2 hs2q q2w2 "
  end

  @tag :pending
  test "encode lowercase characters" do
    assert RunLengthEncoder.encode("aabbbcccc") === "2a3b4c"
  end

  @tag :pending
  test "decode empty string" do
    assert RunLengthEncoder.decode("") === ""
  end

  @tag :pending
  test "decode single characters only" do
    assert RunLengthEncoder.decode("XYZ") === "XYZ"
  end

  @tag :pending
  test "decode string with no single characters" do
    assert RunLengthEncoder.decode("2A3B4C") == "AABBBCCCC"
  end

  @tag :pending
  test "decode single characters with repeated characters" do
    assert RunLengthEncoder.decode("12WB12W3B24WB") ===
             "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"
  end

  @tag :pending
  test "decode multiple whitespace mixed in string" do
    assert RunLengthEncoder.decode("2 hs2q q2w2 ") === "  hsqq qww  "
  end

  @tag :pending
  test "decode lower case string" do
    assert RunLengthEncoder.decode("2a3b4c") === "aabbbcccc"
  end

  @tag :pending
  test "encode followed by decode gives original string" do
    original = "zzz ZZ  zZ"
    encoded = RunLengthEncoder.encode(original)
    assert RunLengthEncoder.decode(encoded) === original
  end
end

rle.exs

defmodule RunLengthEncoder do
  @doc """
  Generates a string where consecutive elements are represented as a count and data value.
  Runs of a single character are written without a count.
  "AABCCCDEEEE" => "2AB3CD4E"
  For this example, assume all input are strings that contain only letters and whitespace.
  It should also be able to reconstruct the data into its original form.
  "2AB3CD4E" => "AABCCCDEEEE"
  """
  @spec encode(String.t()) :: String.t()
  def encode(string) do
    # Appends one run to the accumulator, omitting the count for runs of length one.
    f = fn run, acc ->
      case String.length(run) do
        1 -> acc <> run
        n -> acc <> Integer.to_string(n) <> String.first(run)
      end
    end

    # With include_captures, each "separator" is one run of a repeated character.
    Regex.split(~r/(.)\1*/s, string, trim: true, include_captures: true)
    |> Enum.reduce("", f)
  end

  @spec decode(String.t()) :: String.t()
  def decode(string) do
    # An empty count means the character stands for itself.
    f = fn
      [_, "", c] -> c
      [_, n, c] -> String.duplicate(c, String.to_integer(n))
    end

    # Each match is an optional count followed by exactly one character.
    Regex.scan(~r/([[:digit:]]*)(.)/s, string)
    |> Enum.map_join(f)
  end
end

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?