Notes on Haskell

I have a fairly deep background in pure math, and Haskell appears to be the purest and mathiest of the many functional languages that are getting more and more fashionable of late. So while one of my close co-workers recently took a course in Scala and my boss has dipped his toes in Erlang, I’ve chosen Haskell for my introduction to the functional language craze. Now we just need two of my other fellow software engineers to delve into OCaml and F# and we’ll have a royal flush.

I’m still a rank amateur at Haskell and am going about it in a completely unsystematic way, but nevertheless I am making progress and wanted to document a couple recent breakthroughs of sorts that I’ve had with it. The first came so easily and elegantly that my eyes turned into hearts and I almost titled this blog post “Haskell is Cheating.” But then the second was a series of battles with foggy comprehension and unfamiliar paradigms that quickly eroded my optimism and then continued until I was nearly sure Haskell and I didn’t have a future.

Eventually I worked it out, though, so now that I’ve bounced back a little I think I should document all this. The two breakthroughs came on Project Euler problems 2 and 3, which is a little embarrassing but I did say I was a rank amateur. Just a little less of one today than I was yesterday.

Problem 2

Problem 2 was a lark. The statement of the problem is:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

Generating Fibonacci numbers comes probably just after Hello World and finding factorials in the Beginner Programming Problems Hall of Fame, and a straightforward recursive solution is easy in Haskell:

fibo 0 = 0
fibo 1 = 1
fibo n = fibo (n - 1) + fibo (n - 2)

But that solution won’t memoize: if you ask it for fibo 156 it calculates fibo 0 through fibo 155, and then if you ask for fibo 157 it calculates them all again on its way.

I came across another solution that took me a little time to understand:

fibs ∷ [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

fibo ∷ Int → Integer
fibo n = fibs !! n

The first and retrospectively stupidest question I had was how this memoizes any better than the naive solution above, but that is obviously because fibs is an infinite list. Thanks to Haskell’s lazy evaluation, it fills up only as far as we ask it to (using fibo) and stores everything preceding that. Okay.

The definition of fibs puzzled me but not for too long: zipwith combines corresponding elements of two lists using the predicate that it takes as its first argument. So fibs seeds the list of Fibonacci numbers with the first two, then generates subsequent ones by summing corresponding elements of that list and itself sans its first element (tail fibs). To put it another way, you get subsequent entries in fibs by summing what you’ve got so far with the same shifted up by one. Sort of like this:

fibs      0 1 1 2 3 5  8 
tail fibs 1 1 2 3 5 8  13  etc...
sum       1 2 3 5 8 13 21

Anyway, that was fine. What was beautiful, once I had that, was the actual solution to the problem. It looked like this:

sum $ filter even (takeWhile (< 4000000) fibs)

which is about as close as you can get to just restating the problem in code. Read function applications from right to left: take Fibonacci numbers while they’re less than 40 million, filter out the even ones, and sum the result. It’s beautiful. That’s when my eyes turned into hearts.

Problem 3

Unfortunately, then there was Problem 3:

The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143?

I’ve done a whole bunch of Euler Project problems in Python, and I found it very helpful to build up a library of useful functions along the way. So I didn’t just want to solve this problem; I wanted to use it as an opportunity to add a factors function, which would take an integer n and return a list of n’s prime factors, to my repertoire. After I did so, the solution to Problem 3 was a joke:

last $ Euler.factors 600851475143

Coming up with factors was the part that gave me problems. Here’s the basic approach that I used and to which my eventual success hewed pretty closely:

Write a divisible function to tell me if one number divides another. (This is just syntactic sugar for n mod m = 0.)
Use divisible to find the first divisor d of n by checking everything between 1 and √n.
Recursively call factors on n / d.

I’m sure this is far from optimal but I believe it’s the approach I used in Python, it worked pretty well there, and it seemed to me that it would work even better in Haskell. Well…

The most difficult thing about this problem actually turned out to be the √n part. Haskell has a built-in square root function for floating point numbers, of course, but unfortunately because of precision issues, just using it and then taking the floor of the result can lead to wrong answers and overflow errors for large enough numbers. I’m making a repository of useful stuff, I want it to be better than that. So I went and swiped some code based on Newton’s Method:

squareRoot ∷ Int → Int
squareRoot n 
    | n ==  1              = 1 
    | otherwise            = div (k + ( div n k)) 2
where k = squareRoot(n-1)

I implemented everything else and tried it. It worked okay, but when I ran factors on a six-digit number, it took probably ten to fifteen seconds. A back of the envelope calculation then yields about 200 days for the actual problem, assuming the runtime increased linearly, which it almost certainly would not have. Of course that didn’t matter, because when I compiled and ran the same test instead of using GHCi, I got a stack overflow error.

I figured the problem was in my factors definition. Some googling for the error yielded some interesting and, to me, unintuitive results about tail recursion. Sure enough, factors was not tail call recursive; my recursive call did a division and a list append after returning. I floundered about with some attempts to force strict evaluation but they were more or less at random and so of course yielded nothing. Finally I added an accumulating parameter to factors to make it properly tail recursive. The stack overflow persisted.

OK. I decided to try some bigger guns and dip a toe into Haskell profiling. I realize now that that page says heap profiling, whereas I had a stack overflow, which may be why the option that seemed to be necessary for me wasn’t on it. In any case, I did some more random flailing about with command line options that produced useless files before compiling using ghc -prof -auto-all and running with ./problem3 +RTS -p -hy to get something useful: a file named problem3.prof that told me most of my time was actually being spent in squareRoot!

Treachery! That innocuous-looking code I’d pilfered from the web was terrible! And now that I looked at it again, why the hell was it recursing on n - 1? That would seem to mean that calculating the square root of 261,315 would require doing the same for 261,314 other numbers as well!

I scrapped that code and wrote a stupid binary search my damn self, using not one but two accumulating parameters to make sure it was tail-recursive. Here’s the result of my sweat and tears (no blood, I mean Jesus it’s just computer programming):

squareRoot' ∷ Integer → Integer → (Integer, Integer) → Integer
squareRoot' n g (min, max)
    | g^2 < n   = if (g + 1)^2 > n
                      then g
                      else squareRoot' n ((max + g) `div` 2) (g, max)
    | g^2 > n   = squareRoot' n ((min + g) `div` 2) (min, g)
    | otherwise = g

squareRoot n = squareRoot' n (n `div` 2) (0, n)

Needless to say, I’m not nearly as euphoric about this as I was about Problem 2. In fact I kind of hate this code. Accumulating parameters are gross and ugly and they make this function take way too many parameters.

On the plus side, according to the same profiling I’d done before, it appears that my code ran in about 2 μs, which I must admit is a significant improvement on 200 days (if the old slow code was linear) or the lifespan of the universe (if it was exponential). At some point I may go back and see which of my optimizations were necessary, but I think I’m done touching this code for now. Fighting with it was demoralizing at first, but I think it was necessary for me to get a more realistic idea of how Haskell operates.