4d47's solution

to Markdown in the PHP Track

Published at Jul 13 2018 · 1 comment

Refactor a Markdown parser.

The Markdown exercise is a refactoring exercise. The provided code parses a string written in Markdown syntax and returns the corresponding HTML. Even though the code is confusingly written and hard to follow, it works and all the tests pass. Your challenge is to rewrite it so that it is easier to read and maintain while keeping all the tests passing.

It would be helpful if you made notes of what you did in your refactoring in comments so reviewers can see that, but it isn't strictly necessary. The most important thing is to make the code better!

Running the tests

  1. Go to the root of your PHP exercise directory, which is <EXERCISM_WORKSPACE>/php. To find the Exercism workspace, run:

     % exercism debug | grep Workspace
    
  2. Get PHPUnit if you don't have it already.

     % wget --no-check-certificate https://phar.phpunit.de/phpunit.phar
     % chmod +x phpunit.phar
    
  3. Execute the tests:

     % ./phpunit.phar markdown/markdown_test.php
    

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

markdown_test.php

<?php

require 'markdown.php';

class MarkdownTest extends PHPUnit\Framework\TestCase
{
    public function testParsingParagraph()
    {
        $this->assertEquals('<p>This will be a paragraph</p>', parseMarkdown('This will be a paragraph'));
    }

    public function testParsingItalics()
    {
        $this->assertEquals('<p><i>This will be italic</i></p>', parseMarkdown('_This will be italic_'));
    }

    public function testParsingBoldText()
    {
        $this->assertEquals('<p><em>This will be bold</em></p>', parseMarkdown('__This will be bold__'));
    }

    public function testMixedNormalItalicsAndBoldText()
    {
        $this->assertEquals('<p>This will <i>be</i> <em>mixed</em></p>', parseMarkdown('This will _be_ __mixed__'));
    }

    public function testWithH1Headerlevel()
    {
        $this->assertEquals('<h1>This will be an h1</h1>', parseMarkdown('# This will be an h1'));
    }

    public function testWithH2Headerlevel()
    {
        $this->assertEquals('<h2>This will be an h2</h2>', parseMarkdown('## This will be an h2'));
    }

    public function testWithH6Headerlevel()
    {
        $this->assertEquals('<h6>This will be an h6</h6>', parseMarkdown('###### This will be an h6'));
    }

    public function testUnorderedLists()
    {
        $this->assertEquals(
            '<ul><li><p>Item 1</p></li><li><p>Item 2</p></li></ul>',
            parseMarkdown("* Item 1\n* Item 2")
        );
    }

    public function testWithALittleBitOfEverything()
    {
        $this->assertEquals(
            '<h1>Header!</h1><ul><li><em>Bold Item</em></li><li><i>Italic Item</i></li></ul>',
            parseMarkdown("# Header!\n* __Bold Item__\n* _Italic Item_")
        );
    }
}

markdown.php

<?php

function parseMarkdown(string $markdown): string
{
    $listStarted = false;
    $lines = explode("\n", $markdown);

    foreach ($lines as &$line) {
        if (preg_match('/^(#{1,6})(.*)/', $line, $matches)) {
            $tag = 'h' . strlen($matches[1]);
            $line = "<$tag>" . trim($matches[2]) . "</$tag>";
        }

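        // Anchor the pattern so only a leading '*' starts a list item;
        // unanchored, a '*' anywhere in the line (even inside a heading) matched.
        if (preg_match('/^\*(.*)/', $line, $matches)) {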
            $startTag = '<li>';
            $endTag = '</li>';

            if (!$listStarted) {
                $listStarted = true;
                $startTag = '<ul><li>';
            }

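            // parseMarkdownInline() rewrites $matches[1] in place; items with
            // no inline markup are wrapped in <p>, as the tests expect.
            if (!parseMarkdownInline($matches[1])) {
                $startTag = "$startTag<p>";
                $endTag = "</p>$endTag";
            }

            $line = $startTag . trim($matches[1]) . $endTag;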
        } elseif ($listStarted) {
            $line = "</ul>$line";
            $listStarted = false;
        }

        if (!preg_match('/<h|<ul|<p|<li/', $line)) {
            $line = "<p>$line</p>";
        }

        parseMarkdownInline($line);
    }
    unset($line); // break the reference left behind by the by-reference foreach

    $html = implode('', $lines);
    if ($listStarted) {
        $html .= '</ul>';
    }
    return $html;
}

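/**
 * Replaces __bold__ and _italic_ markers in $line (modified in place).
 * Returns true if at least one replacement was made.
 */
function parseMarkdownInline(string &$line): bool
{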
    static $patterns = [
        'em' => '/(.*)__(.*)__(.*)/',
        'i' => '/(.*)_(.*)_(.*)/',
    ];
    $result = false;
    foreach ($patterns as $tag => $pattern) {
        if (preg_match($pattern, $line, $matches)) {
            $line = $matches[1] . "<$tag>" . $matches[2] . "</$tag>" . $matches[3];
            $result = true;
        }
    }
    return $result;
}

Community comments

Did not go insane and change everything, but instead did a few small refactorings.

  • Generalize heading branch
  • Remove duplicate lines with new var $startTag
  • Extract parseMarkdownInline and remove $isBold/$isItalic
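
The heading generalization mentioned in these notes can be illustrated with a minimal sketch. The helper name `headingToHtml` is hypothetical and does not appear in the solution; it only isolates the single pattern that the solution's heading branch uses, where the number of leading `#` characters determines the tag level.

```php
<?php

// Hypothetical helper for illustration only; the solution keeps this
// logic inline inside parseMarkdown().
function headingToHtml(string $line): ?string
{
    // One pattern covers h1 through h6: the length of the captured
    // run of '#' characters becomes the heading level.
    if (preg_match('/^(#{1,6})(.*)/', $line, $matches)) {
        $tag = 'h' . strlen($matches[1]);
        return "<$tag>" . trim($matches[2]) . "</$tag>";
    }

    return null; // the line is not a heading
}

echo headingToHtml('## This will be an h2'), "\n"; // prints <h2>This will be an h2</h2>
```

A single capture group presumably stands in for one near-identical branch per heading level, which would be why the note calls it a generalization.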
