wahyu1971's solution

to ETL in the D Track

Published at Jul 16 2018 · 0 comments

We are going to do the Transform step of an Extract-Transform-Load.

ETL

Extract-Transform-Load (ETL) is a fancy way of saying, "We have some crufty, legacy data over in this system, and now we need it in this shiny new system over here, so we're going to migrate this."

(Typically, this is followed by, "We're only going to need to run this once." That's then typically followed by much forehead slapping and moaning about how stupid we could possibly be.)

The goal

We're going to extract some Scrabble scores from a legacy system.

The old system stored a list of letters per score:

  • 1 point: "A", "E", "I", "O", "U", "L", "N", "R", "S", "T"
  • 2 points: "D", "G"
  • 3 points: "B", "C", "M", "P"
  • 4 points: "F", "H", "V", "W", "Y"
  • 5 points: "K"
  • 8 points: "J", "X"
  • 10 points: "Q", "Z"

The shiny new Scrabble system instead stores the score per letter, which makes it much faster and easier to calculate the score for a word. It also stores the letters in lowercase, regardless of the case of the input letters:

  • "a" is worth 1 point.
  • "b" is worth 3 points.
  • "c" is worth 3 points.
  • "d" is worth 2 points.
  • Etc.
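
To see why the per-letter layout makes word scoring easier, here is a minimal sketch of a lookup-and-sum over the new format. The function name `scoreWord` and the tiny score table are assumptions made for illustration; they are not part of the exercise or its test suite.

```d
import std.ascii : toLower;

// Hypothetical helper (not part of the exercise): sum the per-letter
// scores of a word using the new letter -> score table.
int scoreWord(const int[dchar] scores, string word)
{
    int total = 0;
    foreach (dchar ch; word)
    {
        // `in` yields a pointer to the stored value, or null when the
        // letter is not in the table; unknown letters score nothing here.
        if (auto points = toLower(ch) in scores)
            total += *points;
    }
    return total;
}

unittest
{
    // Illustrative subset of the table, not the full dataset.
    const int[dchar] scores = ['a': 1, 'b': 3, 'c': 3];
    assert(scoreWord(scores, "Cab") == 7);
}
```

With the legacy layout, the same calculation would have to search every score's letter list for each character of the word.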

Your mission, should you choose to accept it, is to transform the legacy data format to the shiny new format.

Notes

A final note about scoring: Scrabble is played around the world in a variety of languages, each with its own unique scoring table. For example, an "E" is scored at 2 in the Māori-language version of the game while being scored at 4 in the Hawaiian-language version.

Getting Started

Make sure you have read the D page on exercism.io. It covers the basic setup of the development environment that the exercises expect.

Passing the Tests

Get the first test compiling, linking and passing by following the three rules of test-driven development. Create just enough structure by declaring namespaces, functions, classes, etc., to satisfy any compiler errors and get the test to fail. Then write just enough code to get the test to pass. Once you've done that, uncomment the next test by moving the following line past the next test.

static if (allTestsEnabled)

This may result in compile errors, because the newly enabled test may use constructs that you haven't yet declared or defined. Again, fix the compile errors minimally to get a failing test, then change the code minimally to pass the test. Refactor your implementation for readability and expressiveness, and then go on to the next test.
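
The gating described above can be sketched as follows. This is an illustration only: `notYetWritten` is a hypothetical name standing in for code you have not declared yet, and the assertions are placeholders rather than tests from the exercise.

```d
unittest
{
    // A compile-time constant; while it is 0 the gated block is skipped.
    immutable int allTestsEnabled = 0;

    // Outside the gate: always compiled and run.
    assert(1 + 1 == 2);

    static if (allTestsEnabled)
    {
        // Inside the gate: parsed but not compiled while the flag is 0,
        // so the undeclared (hypothetical) function causes no error yet.
        assert(notYetWritten() == 42);
    }
}
```

Moving the `static if` line past a test puts that test outside the gate, which is when any missing declarations start to show up as compile errors.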

Try to use standard D facilities in preference to writing your own low-level algorithms or facilities by hand. The D language reference (DRefLanguage) and the D standard library reference (DReference) are the places to look these up.

Source

The Jumpstart Lab team http://jumpstartlab.com

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

etl.d

module etl;

import std.array : array;
import std.algorithm.sorting : sort;
import std.algorithm.comparison : equal;

unittest
{

// test associative array equality
bool aaEqual (const int[dchar] lhs, const int[dchar] rhs)
{
	auto lhs_pairs = lhs.byKeyValue.array;
	auto rhs_pairs = rhs.byKeyValue.array;
	lhs_pairs.sort!(q{a.key < b.key});
	rhs_pairs.sort!(q{a.key < b.key});

	return equal!("a.key == b.key && a.value == b.value")(lhs_pairs, rhs_pairs);
}

immutable int allTestsEnabled = 0;

// transform one value
{
	immutable string[int] old = [1: "A"];

	const auto actual = transform(old);
	const int[dchar] expected = ['a': 1];

	assert(aaEqual(expected, actual));
}

static if (allTestsEnabled)
{
// transform more values
{
	immutable string[int] old = [1: "AEIOU"];

	const auto actual = transform(old);
	const int[dchar] expected = ['a': 1, 'e': 1, 'i': 1, 'o': 1, 'u': 1];

	assert(aaEqual(expected, actual));
}

// transforms more keys
{
	immutable string[int] old = [1: "AE", 2: "DG"];

	const auto actual = transform(old);
	const int[dchar] expected = ['a': 1, 'e': 1, 'd': 2, 'g': 2];

	assert(aaEqual(expected, actual));
}

// transforms a full dataset
{
	immutable string[int] old = [1: "AEIOULNRST",
								2: "DG",
								3: "BCMP",
								4: "FHVWY",
								5: "K",
								8: "JX",
								10: "QZ"];

	const auto actual = transform(old);

	const int[dchar] expected = ['a': 1, 'b': 3,  'c': 3, 'd': 2, 'e': 1,
								'f': 4, 'g': 2,  'h': 4, 'i': 1, 'j': 8,
								'k': 5, 'l': 1,  'm': 3, 'n': 1, 'o': 1,
								'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1,
								'u': 1, 'v': 4,  'w': 4, 'x': 8, 'y': 4,
								'z': 10];

	assert(aaEqual(expected, actual));
}

}

}

Solution

module etl;

import std.array : array;
import std.algorithm.sorting : sort;
import std.algorithm.comparison : equal;
import std.ascii : toLower;

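// Invert the legacy table (score -> letters) into the new one (letter -> score).
int[dchar] transform(const string[int] ii)
{
    int[dchar] res;
    foreach (key, value; ii)
    {
        // Iterate by dchar so each element is a whole code point,
        // not a single UTF-8 code unit.
        foreach (dchar ch; value)
        {
            // std.ascii.toLower lowercases ASCII letters only.
            res[toLower(ch)] = key;
        }
    }
    return res;
}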

unittest
{

    // test associative array equality
    bool aaEqual(const int[dchar] lhs, const int[dchar] rhs)
    {
        auto lhs_pairs = lhs.byKeyValue.array;
        auto rhs_pairs = rhs.byKeyValue.array;
        lhs_pairs.sort!(q{a.key < b.key});
        rhs_pairs.sort!(q{a.key < b.key});

        return equal!("a.key == b.key && a.value == b.value")(lhs_pairs, rhs_pairs);
    }

    immutable int allTestsEnabled = 0;

    // transform one value
    {
        immutable string[int] old = [1 : "A"];

        const auto actual = transform(old);
        const int[dchar] expected = ['a' : 1];

        assert(aaEqual(expected, actual));
    }

    // transform more values
    {
        immutable string[int] old = [1 : "AEIOU"];

        const auto actual = transform(old);
        const int[dchar] expected = ['a' : 1, 'e' : 1, 'i' : 1, 'o' : 1, 'u' : 1];

        assert(aaEqual(expected, actual));
    }

    // transforms more keys
    {
        immutable string[int] old = [1 : "AE", 2 : "DG"];

        const auto actual = transform(old);
        const int[dchar] expected = ['a' : 1, 'e' : 1, 'd' : 2, 'g' : 2];

        assert(aaEqual(expected, actual));
    }

    // transforms a full dataset
    {
        immutable string[int] old = [
            1 : "AEIOULNRST", 2 : "DG", 3 : "BCMP", 4 : "FHVWY", 5 : "K", 8 : "JX", 10 : "QZ"
        ];

        const auto actual = transform(old);

        const int[dchar] expected = ['a' : 1, 'b' : 3, 'c' : 3, 'd' : 2, 'e' : 1,
            'f' : 4, 'g' : 2, 'h' : 4, 'i' : 1, 'j' : 8, 'k' : 5, 'l' : 1, 'm' : 3,
            'n' : 1, 'o' : 1, 'p' : 3, 'q' : 10, 'r' : 1, 's' : 1, 't' : 1, 'u' : 1,
            'v' : 4, 'w' : 4, 'x' : 8, 'y' : 4, 'z' : 10];

        assert(aaEqual(expected, actual));
    }

    static if (allTestsEnabled)
    {
    }

}
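
The `aaEqual` helper in the test suite sorts the key/value pairs of both arrays before comparing them. As far as I know, D's built-in associative arrays can also be compared directly with `==`, which compares contents regardless of insertion order. The sketch below assumes that behaviour; `transform` is repeated from the solution only so that the block compiles on its own.

```d
import std.ascii : toLower;

// Same logic as the solution, repeated so this sketch is self-contained.
int[dchar] transform(const string[int] old)
{
    int[dchar] res;
    foreach (score, letters; old)
        foreach (dchar ch; letters)
            res[toLower(ch)] = score;
    return res;
}

unittest
{
    immutable string[int] old = [1: "AE", 2: "DG"];

    // Written in a different order than transform fills its result.
    int[dchar] expected = ['g': 2, 'a': 1, 'd': 2, 'e': 1];

    // Built-in associative array equality, no sorting helper needed.
    assert(transform(old) == expected);
}
```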

What can you learn from this solution?

A huge amount can be learned from reading other people’s code. This is why we wanted to give Exercism users the option of making their solutions public.

Here are some questions to help you reflect on this solution and learn the most from it.

  • What compromises have been made?
  • Are there new concepts here that you could read more about to improve your understanding?
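
As one possible answer to the first question: the solution lowercases with `std.ascii.toLower`, which only affects ASCII letters. That is enough for the English table in this exercise, but it would matter for tables that contain accented letters. The sketch below contrasts it with `std.uni.toLower`; the sample letter is an arbitrary choice for illustration.

```d
import ascii = std.ascii;
import uni = std.uni;

unittest
{
    dchar accented = 'É'; // U+00C9, outside the ASCII range
    dchar plain = 'Q';

    // std.ascii leaves non-ASCII characters unchanged.
    assert(ascii.toLower(accented) == 'É');

    // std.uni applies Unicode case mapping.
    assert(uni.toLower(accented) == 'é');

    // Both agree on plain ASCII letters.
    assert(ascii.toLower(plain) == 'q');
    assert(uni.toLower(plain) == 'q');
}
```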