paulfioravanti's solution

to Run Length Encoding in the Ruby Track

Published at Jul 13 2018

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression, where runs (consecutive data elements) are replaced by just one data value and count.

For example, the original 53 characters below can be represented with only 13:

"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"

RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a lossless form of data compression.

"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"

For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way, data to be encoded will never contain any numbers, and numbers inside data to be decoded always represent the count for the following character.
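
The rules above can be sketched directly in Ruby. This is only an illustrative sketch, not the solution published further down: the module name `RleSketch` is hypothetical, and it assumes the input obeys the constraint that unencoded data never contains digits.

```ruby
# Hypothetical module name, chosen so it does not clash with the
# exercise's RunLengthEncoding module.
module RleSketch
  module_function

  # Group consecutive identical characters, then emit "<count><char>",
  # leaving out the count when the run has length 1.
  def encode(input)
    input.each_char.chunk_while { |a, b| a == b }.map do |run|
      run.length > 1 ? "#{run.length}#{run.first}" : run.first
    end.join
  end

  # Read optional digits followed by one non-digit character and repeat
  # that character; a missing count means the character appears once.
  def decode(input)
    input.scan(/(\d*)(\D)/).map do |count, char|
      char * (count.empty? ? 1 : count.to_i)
    end.join
  end
end
```

Under these assumptions, `RleSketch.encode("AABCCCDEEEE")` gives `"2AB3CD4E"`, and decoding that string gives back the original.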


For installation and learning resources, refer to the Ruby resources page.

For running the tests provided, you will need the Minitest gem. Open a terminal window and run the following command to install minitest:

gem install minitest

If you would like color output, you can require 'minitest/pride' in the test file, or use the alternative command shown below when running the test file.

Run the tests from the exercise directory using the following command:

ruby run_length_encoding_test.rb

To include color from the command line:

ruby -r minitest/pride run_length_encoding_test.rb

Source

Wikipedia https://en.wikipedia.org/wiki/Run-length_encoding

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

run_length_encoding_test.rb

require 'minitest/autorun'
require_relative 'run_length_encoding'

# Common test data version: 1.1.0 1b7900e
class RunLengthEncodingTest < Minitest::Test
  def test_encode_empty_string
    # skip
    input = ''
    output = ''
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_single_characters_only_are_encoded_without_count
    skip
    input = 'XYZ'
    output = 'XYZ'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_string_with_no_single_characters
    skip
    input = 'AABBBCCCC'
    output = '2A3B4C'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_single_characters_mixed_with_repeated_characters
    skip
    input = 'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB'
    output = '12WB12W3B24WB'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_multiple_whitespace_mixed_in_string
    skip
    input = '  hsqq qww  '
    output = '2 hs2q q2w2 '
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_lowercase_characters
    skip
    input = 'aabbbcccc'
    output = '2a3b4c'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_decode_empty_string
    skip
    input = ''
    output = ''
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_single_characters_only
    skip
    input = 'XYZ'
    output = 'XYZ'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_string_with_no_single_characters
    skip
    input = '2A3B4C'
    output = 'AABBBCCCC'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_single_characters_with_repeated_characters
    skip
    input = '12WB12W3B24WB'
    output = 'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_multiple_whitespace_mixed_in_string
    skip
    input = '2 hs2q q2w2 '
    output = '  hsqq qww  '
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_lower_case_string
    skip
    input = '2a3b4c'
    output = 'aabbbcccc'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_consistency_encode_followed_by_decode_gives_original_string
    skip
    input = 'zzz ZZ  zZ'
    encoded = RunLengthEncoding.encode(input)
    assert_equal input, RunLengthEncoding.decode(encoded)
  end
end

run_length_encoding.rb

# frozen_string_literal: true

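# The standard library file is English.rb; the lowercase spelling raises
# LoadError on case-sensitive file systems.
require "English"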

module RunLengthEncoding
  CONSECUTIVE_DATA_ELEMENTS = /([A-Za-z\s])\1+/.freeze
  private_constant :CONSECUTIVE_DATA_ELEMENTS
  RUN_LENGTH_ENCODING = /(?<count>\d+)(?<character>\D)/.freeze
  private_constant :RUN_LENGTH_ENCODING

  module_function

  def encode(input)
    return "" if input.empty?

    input.gsub(CONSECUTIVE_DATA_ELEMENTS) do
      "#{$MATCH.length}#{$LAST_PAREN_MATCH}"
    end
  end

  def decode(input)
    return "" if input.empty?

    input.gsub(RUN_LENGTH_ENCODING) do
      $LAST_MATCH_INFO[:character] * $LAST_MATCH_INFO[:count].to_i
    end
  end
end
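
The solution leans on the readable aliases that Ruby's `English` library gives to the regexp special globals. As a rough illustration of what each alias holds inside a `gsub` block, here is a small sketch; the helper name `inspect_match_aliases` is hypothetical and not part of the solution.

```ruby
require "English"

# Hypothetical helper: records what each alias holds for every run that
# the pattern matches, leaving the input itself unchanged.
def inspect_match_aliases(input)
  seen = []
  input.gsub(/([A-Za-z\s])\1+/) do
    seen << {
      match: $MATCH,                      # alias of $&: the whole run
      last_paren: $LAST_PAREN_MATCH,      # alias of $+: the last group matched
      info_class: $LAST_MATCH_INFO.class  # alias of $~: a MatchData object
    }
    $MATCH # put the matched text back unchanged
  end
  seen
end
```

For the input `"AABCCC"`, the helper records the runs `"AA"` and `"CCC"`, with `"A"` and `"C"` as the captured characters. This is why `encode` can build its replacement from `$MATCH.length` and `$LAST_PAREN_MATCH`.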

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?