4d47's solution

to Markdown in the PHP Track

Refactor a Markdown parser.

The markdown exercise is a refactoring exercise. There is code that parses a given string with Markdown syntax and returns the associated HTML for that string. Even though this code is confusingly written and hard to follow, somehow it works and all the tests are passing! Your challenge is to re-write this code to make it easier to read and maintain while still making sure that all the tests keep passing.

It would be helpful if you made notes of what you did in your refactoring in comments so reviewers can see that, but it isn't strictly necessary. The most important thing is to make the code better!

Running the tests

  1. Go to the root of your PHP exercise directory, which is <EXERCISM_WORKSPACE>/php. To find the Exercism workspace, run:

     % exercism debug | grep Workspace
    
  2. Get PHPUnit if you don't have it already.

     % wget --no-check-certificate https://phar.phpunit.de/phpunit.phar
     % chmod +x phpunit.phar
    
  3. Execute the tests:

     % ./phpunit.phar markdown/markdown_test.php
    

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

markdown_test.php

<?php

require 'markdown.php';

class MarkdownTest extends PHPUnit\Framework\TestCase
{
    public function testParsingParagraph()
    {
        $this->assertEquals('<p>This will be a paragraph</p>', parseMarkdown('This will be a paragraph'));
    }

    public function testParsingItalics()
    {
        $this->assertEquals('<p><i>This will be italic</i></p>', parseMarkdown('_This will be italic_'));
    }

    public function testParsingBoldText()
    {
        $this->assertEquals('<p><em>This will be bold</em></p>', parseMarkdown('__This will be bold__'));
    }

    public function testMixedNormalItalicsAndBoldText()
    {
        $this->assertEquals('<p>This will <i>be</i> <em>mixed</em></p>', parseMarkdown('This will _be_ __mixed__'));
    }

    public function testWithH1Headerlevel()
    {
        $this->assertEquals('<h1>This will be an h1</h1>', parseMarkdown('# This will be an h1'));
    }

    public function testWithH2Headerlevel()
    {
        $this->assertEquals('<h2>This will be an h2</h2>', parseMarkdown('## This will be an h2'));
    }

    public function testWithH6Headerlevel()
    {
        $this->assertEquals('<h6>This will be an h6</h6>', parseMarkdown('###### This will be an h6'));
    }

    public function testUnorderedLists()
    {
        $this->assertEquals(
            '<ul><li><p>Item 1</p></li><li><p>Item 2</p></li></ul>',
            parseMarkdown("* Item 1\n* Item 2")
        );
    }

    public function testWithALittleBitOfEverything()
    {
        $this->assertEquals(
            '<h1>Header!</h1><ul><li><em>Bold Item</em></li><li><i>Italic Item</i></li></ul>',
            parseMarkdown("# Header!\n* __Bold Item__\n* _Italic Item_")
        );
    }
}

markdown.php

<?php

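function parseMarkdown(string $markdown): string
{
    $listStarted = false;
    $lines = explode("\n", $markdown);

    foreach ($lines as &$line) {
        // Headings: one to six leading '#' characters map to <h1>..<h6>.
        if (preg_match('/^(#{1,6})(.*)/', $line, $matches)) {
            $tag = 'h' . strlen($matches[1]);
            $line = "<$tag>" . trim($matches[2]) . "</$tag>";
        }

        // List items. Note that the pattern is not anchored, so a '*'
        // anywhere in the line is treated as the start of an item.
        if (preg_match('/\*(.*)/', $line, $matches)) {
            $startTag = '<li>';
            $endTag = '</li>';

            // The first item of a run also opens the list.
            if (!$listStarted) {
                $listStarted = true;
                $startTag = '<ul><li>';
            }

            // Items without inline formatting are wrapped in <p>,
            // which is what the test suite expects.
            if (!parseMarkdownInline($matches[1])) {
                $startTag = "$startTag<p>";
                $endTag = "</p>$endTag";
            }

            $line = "$startTag" . trim($matches[1]) . $endTag;

        } elseif ($listStarted) {
            // First non-item line after a list closes it.
            $line = "</ul>$line";
            $listStarted = false;
        }

        // Anything not already wrapped in a block tag becomes a paragraph.
        if (!preg_match('/<h|<ul|<p|<li/', $line)) {
            $line = "<p>$line</p>";
        }

        parseMarkdownInline($line);
    }
    // Break the reference left behind by the by-reference foreach.
    unset($line);

    $html = join($lines);
    // Close a list that runs to the end of the input.
    if ($listStarted) {
        $html .= '</ul>';
    }
    return $html;
}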

// Rewrites bold and italic markers in $line in place.
// Returns true if at least one pattern matched.
function parseMarkdownInline(string &$line): bool
{
    // Order matters: '__' must be handled before the single '_' rule.
    static $patterns = [
        'em' => '/(.*)__(.*)__(.*)/',
        'i' => '/(.*)_(.*)_(.*)/',
    ];
    $result = false;
    foreach ($patterns as $tag => $pattern) {
        if (preg_match($pattern, $line, $matches)) {
            $line = $matches[1] . "<$tag>" . $matches[2] . "</$tag>" . $matches[3];
            $result = true;
        }
    }
    return $result;
}

What can you learn from this solution?

A huge amount can be learnt from reading other people’s code. This is why we wanted to give exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that I could read more about to develop my understanding?
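
On the first question, one compromise that stands out (a reading of the code, not something the author states) is that the greedy patterns in parseMarkdownInline rewrite at most one bold and one italic span per line, which is all the test suite requires. A minimal sketch of an alternative follows, assuming a hypothetical helper named renderInline that is not part of the solution. It uses preg_replace with lazy quantifiers so that every span on the line is converted:

```php
<?php

// Hypothetical alternative to parseMarkdownInline(); the name renderInline()
// is an assumption made for this sketch and does not appear in the solution.
function renderInline(string $text): string
{
    // Bold first, so that "__" pairs are consumed before the "_" rule runs.
    // The lazy quantifier (.+?) stops at the nearest closing marker.
    $text = preg_replace('/__(.+?)__/', '<em>$1</em>', $text) ?? $text;

    // Then italics, again matching the shortest possible span.
    return preg_replace('/_(.+?)_/', '<i>$1</i>', $text) ?? $text;
}

echo renderInline('__a__ and __b__ with _c_ and _d_'), "\n";
// prints: <em>a</em> and <em>b</em> with <i>c</i> and <i>d</i>
```

Unlike parseMarkdownInline, this sketch does not report whether anything matched. The solution relies on that result to decide whether a list item gets a <p> wrapper, so the sketch is not a drop-in replacement.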

Community comments

about 1 year ago
4d47 says

I did not go insane and change everything; instead I made a few small refactorings:

  • Generalize heading branch
  • Remove duplicate lines with new var $startTag
  • Extract parseMarkdownInline and remove $isBold/$isItalic
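
To illustrate the first refactoring in isolation, here is a minimal sketch of the generalized heading rule. It uses the same regular expression as the solution, but wraps it in a hypothetical helper named renderHeading, which is an assumption of this example and does not exist in the submitted code. A single pattern captures the run of '#' characters, and its length selects the tag, so no per-level branch is needed:

```php
<?php

// Hypothetical helper, not part of the solution: returns the heading HTML,
// or null when the line does not start with one to six '#' characters.
function renderHeading(string $line): ?string
{
    if (!preg_match('/^(#{1,6})(.*)/', $line, $matches)) {
        return null;
    }

    // The number of '#' characters captured is the heading level.
    $tag = 'h' . strlen($matches[1]);

    return "<$tag>" . trim($matches[2]) . "</$tag>";
}

echo renderHeading('# This will be an h1'), "\n";
// prints: <h1>This will be an h1</h1>
echo renderHeading('###### This will be an h6'), "\n";
// prints: <h6>This will be an h6</h6>
```

For a line that is not a heading, such as 'This will be a paragraph', the helper returns null and the caller can fall through to the list and paragraph rules.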