Implement run-length encoding and decoding.
Run-length encoding (RLE) is a simple form of data compression, where runs (consecutive data elements) are replaced by just one data value and count.
For example, the original 53 characters below can be represented with only 13:
"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB" -> "12WB12W3B24WB"
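The encoding step can be sketched as follows. This is a minimal illustration that uses only the OCaml standard library rather than Base; the name `rle_encode` is hypothetical and is not the `encode` function the exercise's tests expect.

```ocaml
(* Hypothetical helper, not part of the exercise: run-length encode a
   string using only the OCaml standard library. *)
let rle_encode (s : string) : string =
  let buf = Buffer.create (String.length s) in
  (* Append one finished run; the count is omitted when it is 1. *)
  let flush c n =
    if n > 1 then Buffer.add_string buf (string_of_int n);
    Buffer.add_char buf c
  in
  let len = String.length s in
  if len > 0 then begin
    let cur = ref s.[0] and count = ref 1 in
    for i = 1 to len - 1 do
      if Char.equal s.[i] !cur then incr count
      else begin
        (* The run ended: emit it and start a new one. *)
        flush !cur !count;
        cur := s.[i];
        count := 1
      end
    done;
    (* Emit the final run. *)
    flush !cur !count
  end;
  Buffer.contents buf

let () =
  assert (rle_encode "AABCCCDEEEE" = "2AB3CD4E");
  assert (rle_encode "XYZ" = "XYZ")
```

Runs of length one are written without a count, which is why "XYZ" encodes to itself.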
RLE allows the original data to be perfectly reconstructed from the compressed data, which makes it a lossless form of data compression.
"AABCCCDEEEE" -> "2AB3CD4E" -> "AABCCCDEEEE"
For simplicity, you can assume that the unencoded string will only contain the letters A through Z (either lower or upper case) and whitespace. This way data to be encoded will never contain any numbers and numbers inside data to be decoded always represent the count for the following character.
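Because digits never appear in the unencoded data, a decoder can treat every run of digits as the count for the next character. The sketch below illustrates that idea with the OCaml standard library only; `rle_decode` is a hypothetical name, not the `decode` function the tests expect, and it assumes well-formed input (an encoder never emits a count of 0).

```ocaml
(* Hypothetical helper, not part of the exercise: decode a run-length
   encoded string using only the OCaml standard library. *)
let rle_decode (s : string) : string =
  let buf = Buffer.create (String.length s) in
  (* [count] accumulates the digits seen since the last emitted character.
     0 means "no explicit count", which stands for a single character. *)
  let count = ref 0 in
  String.iter
    (fun c ->
      match c with
      | '0' .. '9' -> count := (!count * 10) + (Char.code c - Char.code '0')
      | _ ->
        (* A non-digit closes the pending count: emit the run and reset. *)
        Buffer.add_string buf (String.make (max 1 !count) c);
        count := 0)
    s;
  Buffer.contents buf

let () =
  assert (rle_decode "2AB3CD4E" = "AABCCCDEEEE");
  assert (rle_decode "12WB" = "WWWWWWWWWWWWB")
```

Multi-digit counts such as "12" work because each digit multiplies the pending count by ten before adding itself.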
For installation and learning resources, refer to the exercism help page.
To work on the exercises, you will need Base. Consult the opam website for instructions on how to install opam for your OS. Once opam is installed, open a terminal window and run the following command to install Base:
opam install base
To run the tests, you will need OUnit. Install it using:
opam install ounit
A Makefile is provided with a default target to compile your solution and run the tests. At the command line, type:
make
utop is a command-line program that allows you to run OCaml code interactively. The easiest way to install it is via opam:
opam install utop
Consult the utop documentation for more detail.
The exercism/ocaml repository on GitHub is the home for all of the OCaml exercises.
If you have feedback about an exercise, or want to help implementing a new one, head over there and create an issue. We'll do our best to help you!
It's possible to submit an incomplete solution so you can see how others have completed the exercise.
open Base
open OUnit2
open Run_length_encoding

(* Build a test case that compares the expected and actual strings. *)
let ae exp got _test_ctxt = assert_equal exp got ~printer:Fn.id

let encode_tests = [
  "empty string" >::
    ae "" (encode "");
  "single characters only are encoded without count" >::
    ae "XYZ" (encode "XYZ");
  "string with no single characters" >::
    ae "2A3B4C" (encode "AABBBCCCC");
  "single characters mixed with repeated characters" >::
    ae "12WB12W3B24WB" (encode "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB");
  "multiple whitespace mixed in string" >::
    ae "2 hs2q q2w2 " (encode "  hsqq qww  ");
  "lowercase characters" >::
    ae "2a3b4c" (encode "aabbbcccc");
]

let decode_tests = [
  "empty string" >::
    ae "" (decode "");
  "single characters only" >::
    ae "XYZ" (decode "XYZ");
  "string with no single characters" >::
    ae "AABBBCCCC" (decode "2A3B4C");
  "single characters with repeated characters" >::
    ae "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB" (decode "12WB12W3B24WB");
  "multiple whitespace mixed in string" >::
    ae "  hsqq qww  " (decode "2 hs2q q2w2 ");
  "lower case string" >::
    ae "aabbbcccc" (decode "2a3b4c");
]

let encode_and_then_decode_tests = [
  "encode followed by decode gives original string" >::
    ae "zzz ZZ zZ" (encode "zzz ZZ zZ" |> decode);
]

let () =
  run_test_tt_main (
    "run length encoding tests" >:::
      List.concat [encode_tests; decode_tests; encode_and_then_decode_tests]
  )
open Core

let encode (s : string) =
  let argerr = Invalid_argument "plaintext cannot contain digit characters" in
  (* Reject digits, which would be ambiguous with run counts. *)
  let dcheck c = if Char.is_digit c then raise argerr else c in
  String.to_list s
  (* chunk into runs of equal characters *)
  |> List.group ~break:Char.( <> )
  (* prefix each run longer than one character with its length *)
  |> List.map ~f:(fun l ->
       match l with
       | [c] -> dcheck c |> Char.to_string
       | h :: _ -> dcheck h |> Printf.sprintf "%d%c" (List.length l)
       | [] -> raise (Failure "your standard library is broken :c"))
  |> String.concat

let decode (s : string) =
  String.to_list s
  (* group consecutive digits; every other character forms its own group *)
  |> List.group ~break:(fun l r -> not (Char.is_digit l && Char.is_digit r))
  (* carry the pending count in the accumulator and expand each character *)
  |> List.fold_map ~init:"" ~f:(fun acc l ->
       let h = List.hd_exn l in
       if Char.is_digit h then (String.of_char_list l, "")
       else if String.is_empty acc then ("", Char.to_string h)
       else ("", String.make (Int.of_string acc) h))
  |> fun (_, l) -> String.concat l