coltontcrowe's solution

to Run Length Encoding in the Rust Track

Published at Oct 08 2019 · 0 comments

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression, where runs (consecutive data elements) are replaced by just one data value and count.

For example, the original 53 characters below can be represented with only 13.

"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"

RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a form of lossless data compression.

"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"

For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way, data to be encoded will never contain any numbers, and numbers inside data to be decoded always represent the count for the following character.
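As a rough illustration of the scheme described above, here is a minimal sketch that uses only the standard library. The names `rle_encode` and `rle_decode` are hypothetical and are not the exercise's required `encode` and `decode`; the sketch also assumes well-formed input as described in the instructions (no digits in unencoded data).

```rust
/// Hypothetical helper: replaces each run of identical characters with
/// its length followed by the character, omitting the length for runs of 1.
pub fn rle_encode(source: &str) -> String {
    let mut out = String::new();
    let mut chars = source.chars().peekable();
    while let Some(ch) = chars.next() {
        // Consume the rest of the run that starts at `ch`.
        let mut count = 1usize;
        while chars.peek() == Some(&ch) {
            chars.next();
            count += 1;
        }
        if count > 1 {
            out.push_str(&count.to_string());
        }
        out.push(ch);
    }
    out
}

/// Hypothetical helper: expands "<count><char>" pairs back into runs.
/// A character with no preceding digits stands for a run of length 1.
/// Assumption: the input never contains an explicit count of 0.
pub fn rle_decode(source: &str) -> String {
    let mut out = String::new();
    let mut count = 0usize;
    for ch in source.chars() {
        if let Some(digit) = ch.to_digit(10) {
            // Accumulate multi-digit counts such as "12".
            count = count * 10 + digit as usize;
        } else {
            let run = if count == 0 { 1 } else { count };
            out.extend(std::iter::repeat(ch).take(run));
            count = 0;
        }
    }
    out
}
```

Decoding the output of `rle_encode` should give back the original string, which is the lossless property described above.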

Rust Installation

Refer to the exercism help page for Rust installation and learning resources.

Writing the Code

Execute the tests with:

$ cargo test

All but the first test have been ignored. After you get the first test to pass, open the tests source file, which is located in the tests directory, remove the #[ignore] attribute from the next test, and get the tests to pass again. Each separate test is a function with a #[test] attribute above it. Continue until you pass every test.

If you wish to run all ignored tests without editing the tests source file, use:

$ cargo test -- --ignored

To run a specific test, for example some_test, you can use:

$ cargo test some_test

If the specific test is ignored use:

$ cargo test some_test -- --ignored

To learn more about Rust tests, refer to the online test documentation.

Make sure to read the Modules chapter if you haven't already; it will help you with organizing your files.

Further improvements

After you have solved the exercise, please consider using the additional utilities, described in the installation guide, to further refine your final solution.

To format your solution, run the following inside the solution directory:

cargo fmt

To check whether your solution contains common inefficient or unidiomatic patterns, run the following inside the solution directory:

cargo clippy --all-targets

Submitting the solution

Generally you should submit all files in which you implemented your solution (src/lib.rs in most cases). If you are using any external crates, please consider submitting the Cargo.toml file. This will make the review process faster and clearer.

Feedback, Issues, Pull Requests

The exercism/rust repository on GitHub is the home for all of the Rust exercises. If you have feedback about an exercise, or want to help implement new exercises, head over there and create an issue. Members of the Rust track team are happy to help!

If you want to know more about Exercism, take a look at the contribution guide.

Source

Wikipedia https://en.wikipedia.org/wiki/Run-length_encoding

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

run-length-encoding.rs

use run_length_encoding as rle;

// encoding tests

#[test]
fn test_encode_empty_string() {
    assert_eq!("", rle::encode(""));
}

#[test]
#[ignore]
fn test_encode_single_characters() {
    assert_eq!("XYZ", rle::encode("XYZ"));
}

#[test]
#[ignore]
fn test_encode_string_with_no_single_characters() {
    assert_eq!("2A3B4C", rle::encode("AABBBCCCC"));
}

#[test]
#[ignore]
fn test_encode_single_characters_mixed_with_repeated_characters() {
    assert_eq!(
        "12WB12W3B24WB",
        rle::encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB")
    );
}

#[test]
#[ignore]
fn test_encode_multiple_whitespace_mixed_in_string() {
    assert_eq!("2 hs2q q2w2 ", rle::encode("  hsqq qww  "));
}

#[test]
#[ignore]
fn test_encode_lowercase_characters() {
    assert_eq!("2a3b4c", rle::encode("aabbbcccc"));
}

// decoding tests

#[test]
#[ignore]
fn test_decode_empty_string() {
    assert_eq!("", rle::decode(""));
}

#[test]
#[ignore]
fn test_decode_single_characters_only() {
    assert_eq!("XYZ", rle::decode("XYZ"));
}

#[test]
#[ignore]
fn test_decode_string_with_no_single_characters() {
    assert_eq!("AABBBCCCC", rle::decode("2A3B4C"));
}

#[test]
#[ignore]
fn test_decode_single_characters_with_repeated_characters() {
    assert_eq!(
        "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB",
        rle::decode("12WB12W3B24WB")
    );
}

#[test]
#[ignore]
fn test_decode_multiple_whitespace_mixed_in_string() {
    assert_eq!("  hsqq qww  ", rle::decode("2 hs2q q2w2 "));
}

#[test]
#[ignore]
fn test_decode_lower_case_string() {
    assert_eq!("aabbbcccc", rle::decode("2a3b4c"));
}

// consistency test

#[test]
#[ignore]
fn test_consistency() {
    assert_eq!(
        "zzz ZZ  zZ",
        rle::decode(rle::encode("zzz ZZ  zZ").as_str())
    );
}

src/lib.rs

use itertools::Itertools;
use regex::Regex;

/// Encodes a given string slice into run length encoding
///
/// # Arguments
///
/// * `source` - The string slice to be encoded
///
/// # Examples
///
/// ```
/// use run_length_encoding::encode;
/// assert_eq!(encode("AABCCCDEEEE"), "2AB3CD4E".to_string())
/// ```
pub fn encode(source: &str) -> String {
    // Group runs of identical characters, count each run, and omit the count when it is 1
    source
        .chars()
        .group_by(|&ch| ch)
        .into_iter()
        .map(|(ch, group)| {
            let len = group.count();
            let num = match len {
                1 => "".to_string(),
                _ => len.to_string(),
            };
            format!("{}{}", num, ch)
        })
        .collect()
}

/// Decodes a run length encoding into the original text
///
/// # Arguments
///
/// * `source` - The string slice containing the encoded message
///
/// # Examples
///
/// ```
/// use run_length_encoding::decode;
/// assert_eq!(decode("2AB3CD4E"), "AABCCCDEEEE".to_string())
/// ```
pub fn decode(source: &str) -> String {
    let re = Regex::new(r"\d*\D").unwrap();
    // Each match is an optional run of digits followed by one non-digit character.
    // Split the match into its count and its character, then repeat the character
    // that many times (once if there is no count).
    re.captures_iter(source)
        .map(|x| x.get(0).map_or("", |y| y.as_str()))
        .map(|x| {
            let (num, ch): (String, String) = x.chars().partition(|y| y.is_digit(10));
            match num.as_str() {
                "" => ch,
                _ => ch.repeat(num.parse().unwrap()),
            }
        })
        .collect()
}

Cargo.toml

[package]
edition = "2018"
name = "run-length-encoding"
version = "1.1.0"

[dependencies]
itertools = "0.8.0"
regex = "1"

coltontcrowe's Reflection

Admittedly not my best solution; it really needs proper error handling. In any case, this gave me exposure to itertools' group_by method and some of the regex methods as well.
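On the error-handling point: in the posted `decode`, a count too large for `usize` would hit the `unwrap` on `parse` and panic, and trailing digits with no character after them are silently skipped by the regex. One possible direction, sketched here under assumptions, is a decoder that returns a `Result`. The names `try_decode` and `DecodeError` are hypothetical; the exercise's tests expect `decode` to return a plain `String`, so this would not be a drop-in replacement.

```rust
/// Hypothetical error type for malformed run-length encoded input.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// The input ends with digits that no character follows, e.g. "3".
    DanglingCount,
    /// A count does not fit in `usize`.
    CountOverflow,
}

/// Hypothetical fallible decoder: same output as the happy path of
/// `decode`, but reports malformed input instead of panicking or
/// silently dropping it.
pub fn try_decode(source: &str) -> Result<String, DecodeError> {
    let mut out = String::new();
    // `None` means no digits have been seen since the last character.
    let mut count: Option<usize> = None;
    for ch in source.chars() {
        if let Some(digit) = ch.to_digit(10) {
            // Accumulate the count with overflow checks.
            let next = count
                .unwrap_or(0)
                .checked_mul(10)
                .and_then(|n| n.checked_add(digit as usize))
                .ok_or(DecodeError::CountOverflow)?;
            count = Some(next);
        } else {
            // A character without a count stands for a run of length 1.
            let run = count.take().unwrap_or(1);
            out.extend(std::iter::repeat(ch).take(run));
        }
    }
    match count {
        Some(_) => Err(DecodeError::DanglingCount),
        None => Ok(out),
    }
}
```

A caller could then match on the result, for example treating `Err(DecodeError::DanglingCount)` as corrupted input. This sketch also avoids the regex dependency, which is a design choice of the sketch rather than a requirement of the exercise.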