remcopeereboom's solution

to Difference Of Squares in the Ruby Track

Published at Jul 13 2018 · 10 comments

Note:

This solution was written on an old version of Exercism. The tests below might not correspond to the solution code, and the exercise may have changed since this code was written.

Find the difference between the square of the sum and the sum of the squares of the first N natural numbers.

The square of the sum of the first ten natural numbers is (1 + 2 + ... + 10)² = 55² = 3025.

The sum of the squares of the first ten natural numbers is 1² + 2² + ... + 10² = 385.

Hence the difference between the square of the sum of the first ten natural numbers and the sum of the squares of the first ten natural numbers is 3025 - 385 = 2640.
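The arithmetic above can be checked by brute force. This is a minimal sketch, not part of the exercise; the constant names are made up for illustration.

```ruby
# Brute-force check of the worked example for N = 10.
# FIRST_TEN, SQUARE_OF_SUM and SUM_OF_SQUARES are illustrative names only.
FIRST_TEN = (1..10)

SQUARE_OF_SUM  = FIRST_TEN.sum**2            # (1 + 2 + ... + 10)²
SUM_OF_SQUARES = FIRST_TEN.sum { |i| i**2 }  # 1² + 2² + ... + 10²

puts SQUARE_OF_SUM                   # 3025
puts SUM_OF_SQUARES                  # 385
puts SQUARE_OF_SUM - SUM_OF_SQUARES  # 2640
```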


For installation and learning resources, refer to the exercism help page.

For running the tests provided, you will need the Minitest gem. Open a terminal window and run the following command to install minitest:

gem install minitest

If you would like color output, you can require 'minitest/pride' in the test file, or note the alternative instruction, below, for running the test file.

Run the tests from the exercise directory using the following command:

ruby difference_of_squares_test.rb

To include color from the command line:

ruby -r minitest/pride difference_of_squares_test.rb

Source

Problem 6 at Project Euler http://projecteuler.net/problem=6

Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.

difference_of_squares_test.rb

require 'minitest/autorun'
require_relative 'difference_of_squares'

# Common test data version: 1.1.0 7a1108b
class DifferenceOfSquaresTest < Minitest::Test
  def test_square_of_sum_1
    # skip
    assert_equal 1, Squares.new(1).square_of_sum
  end

  def test_square_of_sum_5
    skip
    assert_equal 225, Squares.new(5).square_of_sum
  end

  def test_square_of_sum_100
    skip
    assert_equal 25_502_500, Squares.new(100).square_of_sum
  end

  def test_sum_of_squares_1
    skip
    assert_equal 1, Squares.new(1).sum_of_squares
  end

  def test_sum_of_squares_5
    skip
    assert_equal 55, Squares.new(5).sum_of_squares
  end

  def test_sum_of_squares_100
    skip
    assert_equal 338_350, Squares.new(100).sum_of_squares
  end

  def test_difference_of_squares_1
    skip
    assert_equal 0, Squares.new(1).difference
  end

  def test_difference_of_squares_5
    skip
    assert_equal 170, Squares.new(5).difference
  end

  def test_difference_of_squares_100
    skip
    assert_equal 25_164_150, Squares.new(100).difference
  end

  # Problems in exercism evolve over time, as we find better ways to ask
  # questions.
  # The version number refers to the version of the problem you solved,
  # not your solution.
  #
  # Define a constant named VERSION inside of the top level BookKeeping
  # module, which may be placed near the end of your file.
  #
  # In your file, it will look like this:
  #
  # module BookKeeping
  #   VERSION = 1 # Where the version number matches the one in the test.
  # end
  #
  # If you are curious, read more about constants on RubyDoc:
  # http://ruby-doc.org/docs/ruby-doc-bundle/UsersGuide/rg/constants.html

  def test_bookkeeping
    skip
    assert_equal 4, BookKeeping::VERSION
  end
end

difference_of_squares.rb

class Squares
  attr_reader :n

  def initialize(n)
    @n = n
  end

  def square_of_sums
    ((n + 1) * n / 2)**2
  end

  def sum_of_squares
    n * (n + 1) * (2*n + 1) / 6
  end

  def difference
    # Fun:
    #   If you like you can factor out a term of sum to make this marginally
    #   faster to compute:
    #   n(n + 1) / 2 <- sum
    #   n(n + 1)(2n + 1) / 6 <- sum_of_squares
    #   n(n + 1)/2 * ((2n + 1)/3)
    #   sum * (2n + 1)/3
    #   ------------------------------------------------
    #   sum * sum - sum * (2n + 1)/3 = sum * (sum - (2n + 1)/3) <- difference
    square_of_sums - sum_of_squares
  end
end

Community comments

remcopeereboom

For some reason I wasn't using the attr_reader.

ericroberts

I always go back and forth between defining initialize and attr_reader and just using < Struct.new(:n). Do you have a reason for doing it this way? I would be interested to know.

remcopeereboom

@ericroberts Yeah, there can be seemingly weird problems with < Struct.new if you include the file multiple times (the new struct is different from the old one, which messes with classes being constants). Also deriving from Struct.new introduces an extra class level which you may or may not like. Using Struct.new is fine, but if you do, extend the resulting value, rather than derive from it.

But as you've been doing it for a while, you can already see the issues don't come up often...
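The two Struct styles discussed above can be compared side by side. This is a rough sketch under assumptions: the class names are hypothetical and neither appears in the solution, and the "file loaded twice" case is simulated by re-evaluating the class definition in place.

```ruby
# Illustrative names only; neither class appears in the solution above.

# Style 1: derive from an anonymous Struct class.
class DerivedSquares < Struct.new(:n)
  def sum
    n * (n + 1) / 2
  end
end

# Style 2: pass a block to Struct.new and assign the result to a constant.
BlockSquares = Struct.new(:n) do
  def sum
    n * (n + 1) / 2
  end
end

# Style 1 leaves an extra anonymous class in the ancestry; style 2 does not.
puts DerivedSquares.superclass.superclass == Struct  # true
puts BlockSquares.superclass == Struct               # true

# Re-evaluating the style 1 definition (as happens when a file is loaded
# twice) builds a *new* anonymous Struct, so Ruby rejects the reopening.
RELOAD_ERROR = begin
  class DerivedSquares < Struct.new(:n); end
  nil
rescue TypeError => e
  e
end
puts RELOAD_ERROR.message  # superclass mismatch for class DerivedSquares
```

Style 2 is one way to read "extend the resulting value, rather than derive from it": the methods are added to the struct class itself instead of to a subclass.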

kosh-jelly

Definite kudos for doing this in an atypical way. I imagine this would compute significantly faster compared to the iteration method.

But it has a premature optimization code smell.

You're making the code significantly less readable just to save a few CPU cycles. The minute it will take for you to remember what's happening here in 2 weeks isn't worth the microseconds saved.

"the most important function of computer code is to communicate the programmer's intent to a human reader. Any coding practice that makes your code harder to understand in the name of performance is a premature optimization."

More here: http://stackoverflow.com/questions/385506/when-is-optimisation-premature

remcopeereboom

@kosh-jelly: Everyone should immediately recognize the formulae for the sum of the first n integers. It is important in soooo many fields, so I don't think that is any premature optimization. I am an engineer so I may be biased, but I expect all my peers will much prefer this over the loop (not so sure about the second one).

Moreover, the methods are both one short line and the names of the methods make it perfectly clear what the code is computing. In general I think you are allowed to have somewhat complicated implementation, as long as your public interface is well named. Focus on names!

As an aside, I think there is a huge difference between choosing a faster algorithm and the optimization techniques the people in your thread are referring to. Certainly Knuth would not call faster algorithms "optimization" (although I agree that the difference between O(1) and O(n) is not as critical as, say, O(n) and O(n²)).
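To make the O(n) versus O(1) contrast concrete, here is a small sketch that checks both approaches agree and then times them with the standard library's Benchmark module. The helper names are hypothetical, and the timings depend entirely on the machine, so no expected numbers are shown.

```ruby
require 'benchmark'

# Illustrative helper names; not part of the published solution.
def iterative_sum_of_squares(n)
  (1..n).sum { |i| i * i }        # O(n): visits every integer up to n
end

def closed_form_sum_of_squares(n)
  n * (n + 1) * (2 * n + 1) / 6   # O(1): a fixed number of operations
end

n = 1_000_000
raise 'results differ' unless iterative_sum_of_squares(n) == closed_form_sum_of_squares(n)

iterative   = Benchmark.realtime { iterative_sum_of_squares(n) }
closed_form = Benchmark.realtime { closed_form_sum_of_squares(n) }
puts format('iterative:   %.6f s', iterative)
puts format('closed form: %.6f s', closed_form)
```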

kosh-jelly

Thanks for the response. This seemed like a case of cleverness over clarity to me, but you make valid points that these formulas, or at least the first one, are immediately recognizable to some and that the public interface is well named for those who aren't sure what's going on here.

marienfressinaud
commented almost 5 years ago

If you really care about performance you should consider computing square of sums and sum of squares during initialization so you don't calculate them each time you call methods :). (I know the solution was posted 6 months ago, but it could be useful for someone, one day, who knows?)

remcopeereboom

@marienfressinaud

If you really care about performance you should consider computing square of sums and sum of squares during initialization so you don't calculate them each time you call methods :).

Maybe, probably, but not necessarily.

If you pre-compute something, then there is some memory overhead. There is also the possibility that you will never call one of the methods, or even that you will never call either of them. In that case all of the computation is wasted. It all depends on how the code will be used. I'd argue that moving from an iterative solution to a direct solution is a free optimization (the lack of extra objects and method calls makes it faster on my machine for all n), whereas pre-computing is only faster in some cases and slower in others.
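One possible middle ground, not proposed in the exchange above, is lazy memoization: each value is computed on the first call and cached afterwards, so nothing is computed for methods that are never called. The sketch below uses a hypothetical class name, MemoizedSquares, to keep it separate from the published solution.

```ruby
# A hypothetical variant of the solution class, renamed to avoid clashing
# with it. Not the author's code.
class MemoizedSquares
  attr_reader :n

  def initialize(n)
    @n = n
  end

  # Computed on the first call only, then served from the instance variable.
  def square_of_sums
    @square_of_sums ||= ((n + 1) * n / 2)**2
  end

  def sum_of_squares
    @sum_of_squares ||= n * (n + 1) * (2 * n + 1) / 6
  end

  def difference
    square_of_sums - sum_of_squares
  end
end

puts MemoizedSquares.new(100).difference  # 25164150
```

The memory overhead mentioned above still applies once a value has been cached, and for arithmetic this cheap the cache may not pay for itself.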

sumproxy

Yours is the first solution I found that had the same idea behind it as mine had. I would just say that performance-wise you don't lose much to computing the common factor of square_of_sums and sum_of_squares in the initialize section. Then you can reuse the result in both functions and add caching and then... But you can just keep it as it is :)

esquinas

I tried to implement the "fun" part. As it is, it fails. Then I tried to work out the simplification myself and I managed to get it to pass all tests except for the difference of 5: Expected: 170 Actual: 180
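The reported 180 is consistent with one specific pitfall, although the comment does not show its code, so this is only a guess. Factoring the difference as sum * (sum - (2n + 1)/3) is algebraically fine, but Ruby's Integer division truncates (2n + 1)/3 whenever 2n + 1 is not a multiple of 3. For n = 5 that turns 11/3 into 3 and gives 15 * 12 = 180. The helper names below are hypothetical.

```ruby
# Hypothetical helpers reconstructing the factored form; the names and the
# assumption about what the commenter wrote are not from the thread.
def sum_to(n)
  n * (n + 1) / 2
end

# Divides too early: (2n + 1) / 3 truncates whenever 2n + 1 is not a
# multiple of 3, e.g. 11 / 3 == 3 for n = 5.
def difference_truncating(n)
  sum_to(n) * (sum_to(n) - (2 * n + 1) / 3)
end

# Keeps everything under one division: sum * (3 * sum - (2n + 1)) is
# three times the true difference, so the final Integer division is exact.
def difference_exact(n)
  sum_to(n) * (3 * sum_to(n) - (2 * n + 1)) / 3
end

puts difference_truncating(5)  # 180
puts difference_exact(5)       # 170
puts difference_exact(100)     # 25164150
```

The truncating version still passes for n = 1 and n = 100, because 3 and 201 are both divisible by 3, which would explain why only the n = 5 test failed.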
