run_length.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
# -*- coding: utf-8 -*-
'''
    run_length.py
    -------------

    Run Length Encoding allows the original data to be perfectly reconstructed
    from the compressed data, which makes it a lossless data compression.
'''


from itertools import groupby
import re


def encode(phrase):
    '''
    Given a string, encodes the string into a compressed, Run Length
    Encoded (RLE) format, and returns this encoded string.
    '''

    # Stole this generator function idea from @hop
    def rle():
        for key, group in groupby(phrase):
            count = len(list(group))
            yield key if count == 1 else str(count) + key

    return ''.join(rle())


def decode(encoded):
    '''
    Given an Run Length Encoded (RLE) string, decodes it
    into its original format, and returns this decoded string.
    '''

    def rle():
        for n, char in re.findall(r'(\d*)(\D)', encoded):
            count = 1 if n == '' else int(n)
            yield char * count

    return ''.join(rle())

@Leockard thinks this looks great

Comments

Thinking out loud incoming.

rle could be written as a one-liner as such:

1
2
(key if len(list(group)) == 1 else str(len(list(group))) + key
 for key, group in groupby(phrase))

Admittedly, this looks hideous and is not very pythonic.

A slightly clearer way to write it would be

1
2
3
(key if len(count) == 1 else str(count) + key
 for key, group in groupby(phrase)
 for count in [list(group)])

Or similar (haven't tested it). This still looks kind of weird, and you'd be iterating over a one-element list every time just for the sake of not defining a whole different function.

This makes me thin about languages that have ways to define lexical scoping anywhere, something like let in lisp or With in Mathematica. It would be cool to write something like

1
2
3
(let([("count", list(group))],
    key if len(count) == 1 else str(count) + key)
    for key, group in groupby(phrase))

@reeddunkle, maybe we should implement something similar at some point, just because.

Leockard commented 20 July 2016 at 19:39 UTC

Yeah I like where you're going with this. Admittedly, every example I come up with sacrifices a lot of readability, but I think you're right: we start with doing it just because, and then than refactor it to try to make it pretty.

reeddunkle commented 20 July 2016 at 20:27 UTC

This works:

1
2
3
4
5
def encode2(phrase):
    return ''.join(((lambda count:
                     key if count == 1 else str(count) + key)
                    (len(list(group)))
                   for key, group in groupby(phrase)))

I like it because it's halfway between a let expression and writing a whole separate function. But I hate it because it looks ugly. And because I'm creating the same lambda with every element in the generator. Yikes.

Leockard commented 20 July 2016 at 22:13 UTC

You're not logged in right now. Please login via GitHub to comment