katrinleinweber's solution

to Run Length Encoding in the Ruby Track

Published at Jan 23 2021 · 1 comment
Instructions

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression, where runs (consecutive data elements) are replaced by just one data value and count.

For example, the 53 characters below can be represented with only 13.

"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"
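
The encoding step can be sketched in Ruby as follows. This is only a minimal illustration of the idea, not the exercise's reference implementation; the method name `rle_encode` is hypothetical.

```ruby
# Hypothetical helper, not part of the exercise's required API.
def rle_encode(text)
  # Group consecutive identical characters into runs.
  runs = text.each_char.chunk_while { |a, b| a == b }
  # Emit "<count><char>" for each run, omitting the count for runs of length 1.
  runs.map { |run| run.size > 1 ? "#{run.size}#{run.first}" : run.first }.join
end

puts rle_encode("AABCCCDEEEE")
```

Runs of length 1 are written without a count, which is why a lone "B" stays "B" rather than becoming "1B".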

RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a lossless form of data compression.

"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"
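
Decoding reverses the process. The sketch below assumes that every number in the encoded string is followed by exactly one non-digit character; the method name `rle_decode` is hypothetical.

```ruby
# Hypothetical helper, assuming each count is followed by one non-digit character.
def rle_decode(encoded)
  # Expand every "<count><char>" pair; characters without a count
  # are not matched by the pattern and pass through unchanged.
  encoded.gsub(/(\d+)(\D)/) { $2 * $1.to_i }
end

puts rle_decode("2AB3CD4E")
```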

For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way, data to be encoded will never contain any digits, and any number inside data to be decoded always represents the count for the following character.
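
If you want to make this assumption explicit in your own code, a small guard is one option. The exercise does not require it, and the method name `encodable?` is hypothetical.

```ruby
# Hypothetical guard, not required by the exercise.
def encodable?(text)
  # Only ASCII letters (either case) and whitespace are allowed; an empty string is fine.
  /\A[A-Za-z\s]*\z/.match?(text)
end

puts encodable?("zzz ZZ  zZ")
puts encodable?("A1")
```

Rejecting digits up front avoids ambiguous output: a digit in the input would be indistinguishable from a run count once encoded.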


For installation and learning resources, refer to the Ruby resources page.

For running the tests provided, you will need the Minitest gem. Open a terminal window and run the following command to install minitest:

gem install minitest

If you would like color output, you can require 'minitest/pride' in the test file, or use the alternative command below when running the test file.

Run the tests from the exercise directory using the following command:

ruby run_length_encoding_test.rb

To include color from the command line:

ruby -r minitest/pride run_length_encoding_test.rb

Source

Wikipedia https://en.wikipedia.org/wiki/Run-length_encoding

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

run_length_encoding_test.rb

require 'minitest/autorun'
require_relative 'run_length_encoding'

# Common test data version: 1.1.0 1b7900e
class RunLengthEncodingTest < Minitest::Test
  def test_encode_empty_string
    # skip
    input = ''
    output = ''
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_single_characters_only_are_encoded_without_count
    skip
    input = 'XYZ'
    output = 'XYZ'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_string_with_no_single_characters
    skip
    input = 'AABBBCCCC'
    output = '2A3B4C'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_single_characters_mixed_with_repeated_characters
    skip
    input = 'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB'
    output = '12WB12W3B24WB'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_multiple_whitespace_mixed_in_string
    skip
    input = '  hsqq qww  '
    output = '2 hs2q q2w2 '
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_encode_lowercase_characters
    skip
    input = 'aabbbcccc'
    output = '2a3b4c'
    assert_equal output, RunLengthEncoding.encode(input)
  end

  def test_decode_empty_string
    skip
    input = ''
    output = ''
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_single_characters_only
    skip
    input = 'XYZ'
    output = 'XYZ'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_string_with_no_single_characters
    skip
    input = '2A3B4C'
    output = 'AABBBCCCC'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_single_characters_with_repeated_characters
    skip
    input = '12WB12W3B24WB'
    output = 'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_multiple_whitespace_mixed_in_string
    skip
    input = '2 hs2q q2w2 '
    output = '  hsqq qww  '
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_decode_lower_case_string
    skip
    input = '2a3b4c'
    output = 'aabbbcccc'
    assert_equal output, RunLengthEncoding.decode(input)
  end

  def test_consistency_encode_followed_by_decode_gives_original_string
    skip
    input = 'zzz ZZ  zZ'
    encoded = RunLengthEncoding.encode(input)
    assert_equal input, RunLengthEncoding.decode(encoded)
  end
end

run_length_encoding.rb

class RunLengthEncoding
  attr_reader :string

  SEGMENT = /(.)\1+/
  TUPLE   = /\d+\D/

  def initialize(string)
    @string = string
  end

  def self.encode(string) = new(string).encode
  def self.decode(string) = new(string).decode
  #   self.new… & RunLengthEncoding.new… would also work

  def encode = string.convert_each(SEGMENT, &method(:compress))
  def decode = string.convert_each(TUPLE, &method(:decompress))

  def compress(segment) = "#{segment.length}#{segment.squeeze}"
  def decompress(tuple) = tuple.letter * tuple.number
end

class String
  def convert_each(regex, &block) = gsub(regex, &block)
  def letter = self[-1]
  def number = self.to_i
end
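
The solution above adds `convert_each`, `letter`, and `number` to the core String class. As a point of comparison, here is a sketch that keeps the same regex-based approach without reopening String. The module name `PlainRunLengthEncoding` is hypothetical, and the TUPLE pattern is altered to use capture groups.

```ruby
# Alternative sketch: same idea as the solution above, but without
# adding methods to String. The module name is hypothetical.
module PlainRunLengthEncoding
  SEGMENT = /(.)\1+/     # a character followed by one or more repeats of itself
  TUPLE   = /(\d+)(\D)/  # a count followed by the character it applies to

  def self.encode(string)
    # Replace each run with its length and its first character.
    string.gsub(SEGMENT) { |run| "#{run.length}#{run[0]}" }
  end

  def self.decode(string)
    # Repeat the captured character as many times as the captured count.
    string.gsub(TUPLE) { $2 * $1.to_i }
  end
end

puts PlainRunLengthEncoding.encode("AABBBCCCC")
puts PlainRunLengthEncoding.decode("2A3B4C")
```

The trade-off is readability versus isolation: the original reads fluently at the call site, while this version leaves String untouched for the rest of the program.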

Community comments


This is rather procedural. Would not recommend.

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?