mike-cramblett's solution

to Grep in the Python Track

Published at Jan 22 2019 · 0 comments

Note:

This exercise has changed since this solution was written.

Search a file for lines matching a regular expression pattern. Return the line number and contents of each matching line.

The Unix grep command can be used to search for lines in one or more files that match a user-provided search query (known as the pattern).

The grep command takes three arguments:

  1. The pattern used to match lines in a file.
  2. Zero or more flags to customize the matching behavior.
  3. One or more files in which to search for matching lines.

Your task is to implement the grep function, which should read the contents of the specified files, find the lines that match the specified pattern and then output those lines as a single string. Note that the lines should be output in the order in which they were found, with the first matching line in the first file being output first.

As an example, suppose there is a file named "input.txt" with the following contents:

hello
world
hello again

If we were to call grep "hello" input.txt, the returned string should be:

hello
hello again
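
The no-flag behaviour above can be sketched in a few lines. This is only an illustration: the helper name `matching_lines` is hypothetical, and the file contents are held in a string rather than read from disk.

```python
def matching_lines(pattern, text):
    """Return the lines of `text` that contain `pattern`, newline-terminated."""
    # splitlines() drops the line endings, so one is added back per match.
    return ''.join(
        line + '\n' for line in text.splitlines() if pattern in line
    )


# Stands in for the contents of input.txt from the example above.
INPUT_TXT = "hello\nworld\nhello again\n"

result = matching_lines("hello", INPUT_TXT)
print(result, end='')
```

Lines are emitted in the order they appear, which is what the exercise asks for.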

Flags

As mentioned earlier, the grep command should also support the following flags:

  • -n Print the line numbers of each matching line.
  • -l Print only the names of files that contain at least one matching line.
  • -i Match line using a case-insensitive comparison.
  • -v Invert the program -- collect all lines that fail to match the pattern.
  • -x Only match entire lines, instead of lines that contain a match.
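
One way to handle these flags is to parse them once into named booleans before any file is read. The sketch below assumes the flags arrive as a single space-separated string (as in the test suite, e.g. "-n -i -x"); the names `Options` and `parse_flags` are hypothetical and not part of the exercise.

```python
from collections import namedtuple

# Hypothetical container for the five supported flags.
Options = namedtuple(
    "Options", "line_numbers names_only ignore_case invert whole_line"
)


def parse_flags(flags=""):
    """Turn a string such as "-n -i -x" into an Options tuple."""
    # Splitting first avoids substring surprises that `'-n' in flags` allows.
    given = set(flags.split())
    return Options(
        line_numbers="-n" in given,
        names_only="-l" in given,
        ignore_case="-i" in given,
        invert="-v" in given,
        whole_line="-x" in given,
    )


opts = parse_flags("-n -i -x")
print(opts)
```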

If we run grep -n "hello" input.txt, the -n flag requires each matching line to be prefixed with its line number:

1:hello
3:hello again
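
Line numbers in grep output are 1-based, which maps naturally onto `enumerate` with `start=1`. A minimal sketch, assuming the text is already in memory; the function name `numbered_matches` is hypothetical.

```python
def numbered_matches(pattern, text):
    """Return matching lines of `text`, each prefixed with "<number>:"."""
    return ''.join(
        '{}:{}\n'.format(number, line)
        # start=1 makes the first line number 1 rather than 0.
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern in line
    )


numbered = numbered_matches("hello", "hello\nworld\nhello again\n")
print(numbered, end='')
```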

And if we run grep -i "HELLO" input.txt, we'll do a case-insensitive match, and the output will be:

hello
hello again

The grep command should support multiple flags at once.

For example, running grep -l -v "hello" file1.txt file2.txt should print the names of files that contain at least one line that does not contain the string "hello".
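
With the real grep command, combining -l and -v lists every file that has at least one line not matching the pattern. The sketch below follows that reading; the function name and the in-memory sample data are assumptions made for illustration only.

```python
def files_with_nonmatching_line(pattern, named_texts):
    """Return the names of texts having a line that lacks `pattern`.

    `named_texts` is a list of (name, contents) pairs, so the output
    follows the order in which the files were given.
    """
    names = []
    for name, text in named_texts:
        # any() stops at the first line that does not contain the pattern.
        if any(pattern not in line for line in text.splitlines()):
            names.append(name)
    return ''.join(name + '\n' for name in names)


# Hypothetical contents: every line of file1.txt contains "hello".
SAMPLE = [
    ("file1.txt", "hello\nhello again\n"),
    ("file2.txt", "hello\nworld\n"),
]

listed = files_with_nonmatching_line("hello", SAMPLE)
print(listed, end='')
```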

Exception messages

Sometimes it is necessary to raise an exception. When you do this, you should include a meaningful error message to indicate what the source of the error is. This makes your code more readable and helps significantly with debugging. Not every exercise will require you to raise an exception, but for those that do, the tests will only pass if you include a message.

To raise a message with an exception, just write it as an argument to the exception type. For example, instead of raise Exception, you should write:

raise Exception("Meaningful message indicating the source of the error")
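
As an illustration of how such a message surfaces when the exception is caught, here is a small sketch. The helper `check_flag` and its validation rule are hypothetical; the grep exercise itself does not require rejecting unknown flags.

```python
SUPPORTED_FLAGS = {"-n", "-l", "-i", "-v", "-x"}


def check_flag(flag):
    """Return `flag` unchanged, or raise ValueError naming the bad flag."""
    if flag not in SUPPORTED_FLAGS:
        # The message states what went wrong and which value caused it.
        raise ValueError("unsupported flag: {}".format(flag))
    return flag


try:
    check_flag("-z")
except ValueError as error:
    message = str(error)
    print(message)
```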

Running the tests

To run the tests, run the appropriate command for your Python version:

  • Python 2.7: py.test grep_test.py
  • Python 3.4+: pytest grep_test.py

Alternatively, you can tell Python to run the pytest module (allowing the same command to be used regardless of Python version): python -m pytest grep_test.py

Common pytest options

  • -v : enable verbose output
  • -x : stop running tests on first failure
  • --ff : run failures from previous test before running other test cases

For other options, see python -m pytest -h

Submitting Exercises

When submitting an exercise, make sure the solution is in the $EXERCISM_WORKSPACE/python/grep directory.

You can find your Exercism workspace by running exercism debug and looking for the line that starts with Workspace.

For more detailed information about running tests, code style and linting, please see Running the Tests.

Source

Conversation with Nate Foster. http://www.cs.cornell.edu/Courses/cs3110/2014sp/hw/0/ps0.pdf

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

grep_test.py

import unittest
try:
    import builtins
except ImportError:
    import __builtin__ as builtins

from grep import grep


# Tests adapted from `problem-specifications//canonical-data.json` @ v1.1.0

ILIADFILENAME = 'iliad.txt'
ILIADCONTENTS = '''Achilles sing, O Goddess! Peleus' son;
His wrath pernicious, who ten thousand woes
Caused to Achaia's host, sent many a soul
Illustrious into Ades premature,
And Heroes gave (so stood the will of Jove)
To dogs and to all ravening fowls a prey,
When fierce dispute had separated once
The noble Chief Achilles from the son
Of Atreus, Agamemnon, King of men.'''

MIDSUMMERNIGHTFILENAME = 'midsummer-night.txt'
MIDSUMMERNIGHTCONTENTS = '''I do entreat your grace to pardon me.
I know not by what power I am made bold,
Nor how it may concern my modesty,
In such a presence here to plead my thoughts;
But I beseech your grace that I may know
The worst that may befall me in this case,
If I refuse to wed Demetrius.'''

PARADISELOSTFILENAME = 'paradise-lost.txt'
PARADISELOSTCONTENTS = '''Of Mans First Disobedience, and the Fruit
Of that Forbidden Tree, whose mortal tast
Brought Death into the World, and all our woe,
With loss of Eden, till one greater Man
Restore us, and regain the blissful Seat,
Sing Heav'nly Muse, that on the secret top
Of Oreb, or of Sinai, didst inspire
That Shepherd, who first taught the chosen Seed'''

FILENAMES = [
    ILIADFILENAME,
    MIDSUMMERNIGHTFILENAME,
    PARADISELOSTFILENAME,
]
FILES = {}


class File(object):
    def __init__(self, name=''):
        self.name = name
        self.contents = ''

    def read(self):
        return self.contents

    def readlines(self):
        return [line + '\n' for line in self.read().split('\n') if line]

    def write(self, data):
        self.contents += data

    def __enter__(self):
        return self

    def __exit__(self, *args):
        pass


# Store builtin definition of open()
builtins.oldopen = builtins.open


def open(name, mode='r', *args, **kwargs):
    # if name is a mocked file name, lookup corresponding mocked file
    if name in FILENAMES:
        if mode == 'w' or name not in FILES:
            FILES[name] = File(name)
        return FILES[name]
    # else call builtin open()
    else:
        return builtins.oldopen(name, mode, *args, **kwargs)


builtins.open = open


# remove mocked file contents
def remove_file(file_name):
    del FILES[file_name]


def create_file(name, contents):
    with open(name, 'w') as f:
        f.write(contents)


class GrepTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Override builtin open() with mock-file-enabled one
        builtins.open = open
        create_file(ILIADFILENAME, ILIADCONTENTS)
        create_file(MIDSUMMERNIGHTFILENAME, MIDSUMMERNIGHTCONTENTS)
        create_file(PARADISELOSTFILENAME, PARADISELOSTCONTENTS)

    @classmethod
    def tearDownClass(cls):
        remove_file(ILIADFILENAME)
        remove_file(MIDSUMMERNIGHTFILENAME)
        remove_file(PARADISELOSTFILENAME)
        # Restore builtin open()
        builtins.open = builtins.oldopen

    def test_one_file_one_match_no_flags(self):
        self.assertMultiLineEqual(
            grep("Agamemnon", [ILIADFILENAME]),
            "Of Atreus, Agamemnon, King of men.\n"
        )

    def test_one_file_one_match_print_line_numbers_flag(self):
        self.assertMultiLineEqual(
            grep("Forbidden", [PARADISELOSTFILENAME], "-n"),
            "2:Of that Forbidden Tree, whose mortal tast\n"
        )

    def test_one_file_one_match_case_insensitive_flag(self):
        self.assertMultiLineEqual(
            grep("FORBIDDEN", [PARADISELOSTFILENAME], "-i"),
            "Of that Forbidden Tree, whose mortal tast\n"
        )

    def test_one_file_one_match_print_file_names_flag(self):
        self.assertMultiLineEqual(
            grep("Forbidden", [PARADISELOSTFILENAME], "-l"),
            PARADISELOSTFILENAME + '\n'
        )

    def test_one_file_one_match_match_entire_lines_flag(self):
        self.assertMultiLineEqual(
            grep("With loss of Eden, till one greater Man",
                 [PARADISELOSTFILENAME], "-x"),
            "With loss of Eden, till one greater Man\n"
        )

    def test_one_file_one_match_multiple_flags(self):
        self.assertMultiLineEqual(
            grep(
                "OF ATREUS, Agamemnon, KIng of MEN.",
                [ILIADFILENAME],
                "-n -i -x"
            ),
            "9:Of Atreus, Agamemnon, King of men.\n"
        )

    def test_one_file_several_matches_no_flags(self):
        self.assertMultiLineEqual(
            grep("may", [MIDSUMMERNIGHTFILENAME]),
            "Nor how it may concern my modesty,\n"
            "But I beseech your grace that I may know\n"
            "The worst that may befall me in this case,\n"
        )

    def test_one_file_several_matches_print_line_numbers_flag(self):
        self.assertMultiLineEqual(
            grep("may", [MIDSUMMERNIGHTFILENAME], "-n"),
            "3:Nor how it may concern my modesty,\n"
            "5:But I beseech your grace that I may know\n"
            "6:The worst that may befall me in this case,\n"
        )

    def test_one_file_several_matches_match_entire_lines_flag(self):
        self.assertMultiLineEqual(
            grep("may", [MIDSUMMERNIGHTFILENAME], "-x"),
            ""
        )

    def test_one_file_several_matches_case_insensitive_flag(self):
        self.assertMultiLineEqual(
            grep("ACHILLES", [ILIADFILENAME], "-i"),
            "Achilles sing, O Goddess! Peleus' son;\n"
            "The noble Chief Achilles from the son\n"
        )

    def test_one_file_several_matches_inverted_flag(self):
        self.assertMultiLineEqual(
            grep("Of", [PARADISELOSTFILENAME], "-v"),
            "Brought Death into the World, and all our woe,\n"
            "With loss of Eden, till one greater Man\n"
            "Restore us, and regain the blissful Seat,\n"
            "Sing Heav'nly Muse, that on the secret top\n"
            "That Shepherd, who first taught the chosen Seed\n"
        )

    def test_one_file_no_matches_various_flags(self):
        self.assertMultiLineEqual(
            grep("Gandalf", [ILIADFILENAME], "-n -l -x -i"),
            ""
        )

    def test_multiple_files_one_match_no_flags(self):
        self.assertMultiLineEqual(
            grep("Agamemnon", FILENAMES),
            "iliad.txt:Of Atreus, Agamemnon, King of men.\n"
        )

    def test_multiple_files_several_matches_no_flags(self):
        self.assertMultiLineEqual(
            grep("may", FILENAMES),
            "midsummer-night.txt:Nor how it may concern my modesty,\n"
            "midsummer-night.txt:But I beseech your grace that I may know\n"
            "midsummer-night.txt:The worst that may befall me in this case,\n"
        )

    def test_multiple_files_several_matches_print_line_numbers_flag(self):
        expected = (
            "midsummer-night.txt:5:But I beseech your grace that I may know\n"
            "midsummer-night.txt:6:The worst that may befall me in this case,"
            "\nparadise-lost.txt:2:Of that Forbidden Tree, whose mortal tast\n"
            "paradise-lost.txt:6:Sing Heav'nly Muse, that on the secret top\n"
        )
        self.assertMultiLineEqual(
            grep("that", FILENAMES, "-n"),
            expected
        )

    def test_multiple_files_one_match_print_file_names_flag(self):
        self.assertMultiLineEqual(
            grep("who", FILENAMES, "-l"),
            ILIADFILENAME + '\n' + PARADISELOSTFILENAME + '\n')

    def test_multiple_files_several_matches_case_insensitive_flag(self):
        expected = (
            "iliad.txt:Caused to Achaia's host, sent many a soul\n"
            "iliad.txt:Illustrious into Ades premature,\n"
            "iliad.txt:And Heroes gave (so stood the will of Jove)\n"
            "iliad.txt:To dogs and to all ravening fowls a prey,\n"
            "midsummer-night.txt:I do entreat your grace to pardon me.\n"
            "midsummer-night.txt:In such a presence here to plead my thoughts;"
            "\nmidsummer-night.txt:If I refuse to wed Demetrius.\n"
            "paradise-lost.txt:Brought Death into the World, and all our woe,"
            "\nparadise-lost.txt:Restore us, and regain the blissful Seat,\n"
            "paradise-lost.txt:Sing Heav'nly Muse, that on the secret top\n"
        )
        self.assertMultiLineEqual(
            grep("TO", FILENAMES, "-i"),
            expected
        )

    def test_multiple_files_several_matches_inverted_flag(self):
        self.assertMultiLineEqual(
            grep("a", FILENAMES, "-v"),
            "iliad.txt:Achilles sing, O Goddess! Peleus' son;\n"
            "iliad.txt:The noble Chief Achilles from the son\n"
            "midsummer-night.txt:If I refuse to wed Demetrius.\n"
        )

    def test_multiple_files_one_match_match_entire_lines_flag(self):
        self.assertMultiLineEqual(
            grep("But I beseech your grace that I may know",
                 FILENAMES, "-x"),
            "midsummer-night.txt:But I beseech your grace that I may know\n")

    def test_multiple_files_one_match_multiple_flags(self):
        self.assertMultiLineEqual(
            grep("WITH LOSS OF EDEN, TILL ONE GREATER MAN",
                 FILENAMES, "-n -i -x"),
            "paradise-lost.txt:4:With loss of Eden, till one greater Man\n")

    def test_multiple_files_no_matches_various_flags(self):
        self.assertMultiLineEqual(
            grep("Frodo", FILENAMES, "-n -l -x -i"),
            ""
        )


if __name__ == '__main__':
    unittest.main()

grep.py

def grep(pattern, files, flags=''):
    # Flags arrive as one space-separated string, e.g. "-n -i -x".
    given = set(flags.split())
    print_numbers = '-n' in given
    only_names = '-l' in given
    case_insensitive = '-i' in given
    inverted = '-v' in given
    entire_line = '-x' in given

    # Prefix lines with the file name only when several files are searched.
    print_file_names = len(files) > 1

    # Lower-case the pattern once instead of on every line.
    needle = pattern.lower() if case_insensitive else pattern

    found = []
    for file in files:
        # The with block closes the file even if reading fails.
        with open(file) as f:
            for n, line in enumerate(f.readlines(), 1):
                # Compare without the newline so a final line that lacks
                # one is treated like every other line.
                text = line.rstrip('\n')
                haystack = text.lower() if case_insensitive else text
                if entire_line:
                    matched = needle == haystack
                else:
                    matched = needle in haystack
                # -v keeps exactly the lines that did not match.
                if matched != inverted:
                    found.append((file, n, text))

    if only_names:
        # Report each file once, in the order it first matched.
        names = []
        for file, _, _ in found:
            if file not in names:
                names.append(file)
        return ''.join(name + '\n' for name in names)

    output = []
    for file, n, text in found:
        parts = []
        if print_file_names:
            parts.append(file)
        if print_numbers:
            parts.append(str(n))
        parts.append(text)
        output.append(':'.join(parts) + '\n')
    return ''.join(output)

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?