
nicolemon's solution to Markdown in the Python Track

Published at Jul 26 2018 · 0 comments

Note:

This exercise has changed since this solution was written.

Refactor a Markdown parser.

The markdown exercise is a refactoring exercise. There is code that parses a given string with Markdown syntax and returns the associated HTML for that string. Even though this code is confusingly written and hard to follow, somehow it works and all the tests are passing! Your challenge is to re-write this code to make it easier to read and maintain while still making sure that all the tests keep passing.

It would be helpful if you made notes of what you did in your refactoring in comments so reviewers can see that, but it isn't strictly necessary. The most important thing is to make the code better!

Exception messages

Sometimes it is necessary to raise an exception. When you do this, you should include a meaningful error message to indicate what the source of the error is. This makes your code more readable and helps significantly with debugging. Not every exercise will require you to raise an exception, but for those that do, the tests will only pass if you include a message.

To attach a message to an exception, pass it as an argument to the exception type. For example, instead of a bare raise Exception, write:

raise Exception("Meaningful message indicating the source of the error")
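As a slightly fuller illustration, here is a minimal sketch of raising a specific exception type with a message and reading that message back. The helper name `heading_level` and its 1 to 6 range check are assumptions made for this example; they are not part of the exercise or of the solution on this page.

```python
def heading_level(hashes):
    """Return the heading level for a run of '#' characters.

    Hypothetical helper, used only to illustrate exception messages.
    """
    level = len(hashes)
    if not 1 <= level <= 6:
        # The message names the bad value, which is what makes it useful
        # when debugging.
        raise ValueError(
            "heading level must be between 1 and 6, got {}".format(level)
        )
    return level


def describe_failure(hashes):
    """Return the error message for invalid input, or None if it is valid."""
    try:
        heading_level(hashes)
    except ValueError as error:
        return str(error)
    return None
```

Calling `describe_failure('#######')` returns the message text, so a test can check the wording as well as the exception type.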

Running the tests

To run the tests, run the appropriate command for your Python version:

  • Python 2.7: py.test markdown_test.py
  • Python 3.4+: pytest markdown_test.py

Alternatively, you can tell Python to run the pytest module (allowing the same command to be used regardless of Python version): python -m pytest markdown_test.py

Common pytest options

  • -v : enable verbose output
  • -x : stop running tests on first failure
  • --ff : run failures from previous test before running other test cases

For other options, see python -m pytest -h
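Because the test file for this exercise is built on unittest.TestCase, the standard library runner can execute it as well. The sketch below shows one way to get behavior roughly comparable to the -v and -x options using unittest alone. `ExampleTest` and `run_suite` are placeholder names invented for this illustration, not part of the exercise.

```python
import unittest


class ExampleTest(unittest.TestCase):
    """Placeholder test case standing in for MarkdownTest."""

    def test_upper(self):
        self.assertEqual('abc'.upper(), 'ABC')


def run_suite(test_case):
    """Run every test in test_case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    # verbosity=2 prints one line per test (comparable to -v);
    # failfast=True stops at the first failure or error (comparable to -x).
    runner = unittest.TextTestRunner(verbosity=2, failfast=True)
    return runner.run(suite)
```

The returned result exposes `wasSuccessful()` and `testsRun`, which is handy when the run is scripted rather than started from the command line.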

Submitting Exercises

Before submitting an exercise, make sure the solution is in the $EXERCISM_WORKSPACE/python/markdown directory.

You can find your Exercism workspace by running exercism debug and looking for the line that starts with Workspace.

For more detailed information about running tests, code style and linting, please see Running the Tests.

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

markdown_test.py

import unittest
from markdown import parse_markdown


# Tests adapted from `problem-specifications//canonical-data.json` @ v1.2.0

class MarkdownTest(unittest.TestCase):

    def test_paragraph(self):
        self.assertEqual(parse_markdown('This will be a paragraph'),
                         '<p>This will be a paragraph</p>')

    def test_italics(self):
        self.assertEqual(parse_markdown('_This will be italic_'),
                         '<p><em>This will be italic</em></p>')

    def test_bold(self):
        self.assertEqual(parse_markdown('__This will be bold__'),
                         '<p><strong>This will be bold</strong></p>')

    def test_mixed_normal_italics_and_bold(self):
        self.assertEqual(parse_markdown('This will _be_ __mixed__'),
                         '<p>This will <em>be</em> <strong>mixed</strong></p>')

    def test_h1(self):
        self.assertEqual(parse_markdown('# This will be an h1'),
                         '<h1>This will be an h1</h1>')

    def test_h2(self):
        self.assertEqual(parse_markdown('## This will be an h2'),
                         '<h2>This will be an h2</h2>')

    def test_h6(self):
        self.assertEqual(parse_markdown(
            '###### This will be an h6'), '<h6>This will be an h6</h6>')

    def test_unordered_lists(self):
        self.assertEqual(parse_markdown('* Item 1\n* Item 2'),
                         '<ul><li>Item 1</li>'
                         '<li>Item 2</li></ul>')

    def test_little_bit_of_everything(self):
        self.assertEqual(parse_markdown(
            '# Header!\n* __Bold Item__\n* _Italic Item_'),
            '<h1>Header!</h1><ul><li><strong>Bold Item</strong></li>'
            '<li><em>Italic Item</em></li></ul>')


if __name__ == '__main__':
    unittest.main()

markdown.py

import re

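# Patterns for the supported Markdown constructs.
# HEADER: one to six leading '#' characters, a space, then the heading text.
# EM / STRONG: text wrapped in '_' or '__'; the greedy '.*' groups mean each
# pattern picks out a single emphasized span per line.
# LI: a line starting with '* '.
HEADER = re.compile(r'(?P<hchar>^#{1,6}) (?P<line_content>.*)')
EM = re.compile(r'(?P<before>.*)_(?P<line_content>.*)_(?P<after>.*)')
STRONG = re.compile(r'(?P<before>.*)__(?P<line_content>.*)__(?P<after>.*)')
LI = re.compile(r'\* (?P<list_item>.*)$')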


def parse_header(line):
    # The number of leading '#' characters gives the heading level.
    match = HEADER.match(line)
    return '<h{size}>{line}</h{size}>'.format(
        size=len(match.group('hchar')),
        line=match.group('line_content')
    )


def parse_p(line):
    return '<p>{line}</p>'.format(line=line)

def parse_em(line):
    match = EM.match(line)
    return '{before}<em>{line}</em>{after}'.format(
        before=match.group('before'),
        line=match.group('line_content'),
        after=match.group('after')
    )


def parse_strong(line):
    match = STRONG.match(line)
    return '{before}<strong>{line}</strong>{after}'.format(
        before=match.group('before'),
        line=match.group('line_content'),
        after=match.group('after')
    )


def start_list(line):
    return '<ul>{line}'.format(line=line)


def end_list(line):
    return '</ul>{line}'.format(line=line)


def parse_li(line):
    match = LI.match(line)
    return '<li>{line}</li>'.format(
        line=match.group('list_item')
    )


def parse_markdown(markdown):
    res = ''
    in_list = False
    for line in markdown.split('\n'):
        # first inner formatting: strong before em, so that '__' is not
        # consumed as two single underscores
        if STRONG.match(line):
            line = parse_strong(line)
        if EM.match(line):
            line = parse_em(line)

        # now list items, headers, paragraphs
        if LI.match(line):
            line = parse_li(line)
            if not in_list:
                in_list = True
                line = start_list(line)
        else:
            if HEADER.match(line):
                line = parse_header(line)
            else:
                line = parse_p(line)
            # any non-list line (header or paragraph) closes an open list
            if in_list:
                in_list = False
                line = end_list(line)

        res += line

    # close a list that runs to the end of the input
    if in_list:
        res += '</ul>'

    return res
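
The EM and STRONG patterns above rely on greedy `.*` groups, and each of parse_em and parse_strong rewrites only one match, so a line containing several emphasized spans is not fully converted. One possible alternative, shown here only as a sketch, is to use `re.sub` with non-greedy groups so that every span on the line is replaced. The names `STRONG_SPAN`, `EM_SPAN`, and `format_inline` are hypothetical and do not appear in the solution.

```python
import re

# Non-greedy groups so each pair of delimiters is matched separately.
STRONG_SPAN = re.compile(r'__(.+?)__')
EM_SPAN = re.compile(r'_(.+?)_')


def format_inline(text):
    """Replace every __bold__ and _italic_ span in text with HTML tags.

    Strong is handled first so that '__' is not read as two '_' markers.
    """
    text = STRONG_SPAN.sub(r'<strong>\1</strong>', text)
    return EM_SPAN.sub(r'<em>\1</em>', text)
```

With this helper, `format_inline('__a__ and __b__ with _c_')` converts all three spans in one pass. It still does not handle nested or unbalanced markers, which the exercise's tests do not cover either.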

nicolemon's Reflection

REEEEEEEEEEEE