Project Euler Problem 14

This problem was rather fun to solve. The Collatz sequence is a rather simple pattern that is as-yet mathematically unproven to converge, at least according to wikipedia.

Because writing a solution to the problem is remarkably simple, I chose to focus on brainstorming a variety of lazy solutions to the problem to see what I could learn about clojure’s performance with lazy sequences. Readers who finish this post may also see how to easily create stack and heap overflows…

But before we dive into descriptions about Collatz sequences, let’s define a function to find the position of the maximum element in the sequence. I had two initial ideas on how to do this:

;; The easy way
(use 'clojure.contrib.seq-utils)
(defn max-at [s]
  (reduce #(if (> (second %1) (second %2)) %1 %2) (indexed s)))

;; Doesn't create ephemeral garbage, so is faster
(defn max-at [s]
  "Returns [xi x], where xi is the index number of the max element,
  (indexed from 1) and x is the max of s."
  (letfn [(find-max [s n x xi]
             (let [f (first s)
                   r (rest s)]
               (if (nil? f)
                 [xi x]
                 (if (> f x)
                   (recur r (inc n) f n)
                   (recur r (inc n) x xi)))))]
    (find-max s 1 0 0)))

The first definition is succinct but unfortunately generates lots of ephemeral garbage because it needs to create so many small vectors. Because I anticipate that this type of function will be used a lot in future problems, I took the time to write the second definition. Because it uses recur, the second definition traverses the list without needing much memory and is so is faster.

Next: Collatz Sequences! I started with the typical brute force, low-memory-use approach, avoiding both recursion and the necessity of even recording the particular numbers in the chain. This type of solution uses a minimum amount of memory and is a fairly tight loop. Remember that reduceing lazy sequences means only two elements need to be produced and held in memory at one time.

(defn collatz-seq-length [num]
  (loop [len 1
         n  num]
    (cond
     (= n 1) len 
     (odd? n) (recur (inc len) (+ 1 (* 3 n)))
     true     (recur (inc len) (/ n 2)))))

(defn euler-14-loop [num]
  (max-at (map collatz-seq-length (range 1 num))))

(dotimes [_ 5]
  (time (euler-14-loop 1000000)))

"Elapsed time: 10459.657878 msecs"
"Elapsed time: 10401.909871 msecs"
"Elapsed time: 10394.646465 msecs"
"Elapsed time: 10399.737577 msecs"
"Elapsed time: 10394.974482 msecs"

Performance wasn’t bad, but we could easily improve upon it with a little memoization, right?

(def memoized-collatz-seq-length
     (memoize
      (fn [n]
        (cond
         (= n 1) 1 
         (odd? n) (+ 1 (memoized-collatz-seq-length (+ 1 (* 3 n))))
         true     (+ 1 (memoized-collatz-seq-length (/ n 2)))))))

(defn euler-14-memoized [num]
  (max-at (map memoized-collatz-seq-length (range 1 num))))

(dotimes [_ 5]
  (time (euler-14-memoized 1000000)))

"Elapsed time: 7803.967211 msecs"
"Elapsed time: 1877.55123 msecs"
"Elapsed time: 774.759701 msecs"
"Elapsed time: 743.353267 msecs"
"Elapsed time: 739.709821 msecs"

As expected, memoization cached the collatz length values for us after they were created, but it didn’t help us much to create the values in the first place so the first run. Maybe we can improve on that?

One obvious thing about collatz sequences is that if they all return to 1 in the end, then if we compute many collatz sequences we are going to traverse the same pattern of numbers over and over again.

By caching the sequences of numbers that we have traversed before, we should be able to trade some of our surplus memory for a mild increase in speed. The larger the collatz numbers we try to compute, the more this type of tradeoff will become computationally advantageous.

A russian engineering professor of mine once said that we should always start by imagining the ideal final result of our solution. So, I wrote down this (broken) piece of code:

;; Doesn't work

(def lazy-chain-length
     (letfn [(chain [n]
               (lazy-seq
                (cons (if (odd? n)
                        (+ 1 (nth lazy-chain-length (dec (+ 1 (* 3 n)))))
                        (+ 1 (nth lazy-chain-length (dec (/ n 2)))))
                      (chain (+ n 1)))))]
       (chain 1)))

(max-at (take 10000000 lazy-chain-length))

Why doesn’t this work? The basic problem is that this type of recursive definition, however beautiful, requires that you somehow use lazy values that will be found ‘in the future’ that haven’t been computed yet. Although it is fine for lazy-chain-length to define itself in terms of past values (/ n 2), it has no way of knowing the values of ‘future’ collatz values (+ 1 (* 3 n)), so trying to define this causes Clojure to freak out when it sees this definition. (Special thanks to the fine folk at Stack Overflow for helping me understand this better)

Instead of using the lazy seq itself to store collatz sequence lengths, whate if we pushed the values into a hashmap? Hash maps in clojure are pretty fast, after all.

(defn collatz-hashmap [n cache]
  (if-let [v (cache n)]
    [v cache]
    (let [[p cache] (if (odd? n)
                      (collatz-hashmap (+ 1 (* 3 n)) cache)
                      (collatz-hashmap (/ n 2) cache))]
      [(inc p) (assoc cache n (inc p))])))

(def lazy-collatz-hashmap
     (map first (iterate (fn [[v cache n]]
                           (concat (collatz-hashmap n cache) [(inc n)]))
                         [1 {1 1} 2])))

(defn euler-14-hashmap [num]
  (max-at (take num lazy-collatz-hashmap))) 

(dotimes [_ 5]
  (time (euler-14-hashmap 1000000)))

No message.
  [Thrown class java.lang.StackOverflowError]

Oops! This recursive definition is recursing too deep for the small default stack size of the JVM on which Clojure is running. But before we rewrite that, what was the performance like?

(dotimes [_ 5]
  (time (euler-14-hashmap 100000)))

"Elapsed time: 499.264841 msecs"
"Elapsed time: 24.243182 msecs"
"Elapsed time: 24.289533 msecs"
"Elapsed time: 24.200116 msecs"
"Elapsed time: 24.282115 msecs"


(dotimes [_ 5]
  (time (euler-14-memoized 100000)))

"Elapsed time: 323.063032 msecs"
"Elapsed time: 62.479948 msecs"
"Elapsed time: 111.146193 msecs"
"Elapsed time: 55.228571 msecs"
"Elapsed time: 55.107326 msecs"

So hashmaps don’t seem to beat memoized functions, even the first time. But still, what if modify the code to keep from blowing the stack? Might it not become more advantageous for larger numbers of collatz sequence lengths?

Instead of using recursion at the functional level, let’s just build our own ‘stack’ using a vector to store unknown values, and use recur to avoid using the JVM stack. Once a known/cached Collatz sequence length has been reached, we can then populate the map upwards and add a bunch of numbers all at once.

(defn collatz-next [n]
  (cond
    (= n 1) 1
    (odd? n) (+ 1 (* 3 n))
    true (/ n 2)))

(defn collatz-nonrecursing [[_ cache xs]]
  (if-let [v (cache (last xs))]
    [(+ v -1 (count xs))
     (reduce #(let [[m p] %2] (assoc %1 m p))
             cache
             (map vector (reverse xs) (range v (+ v (count xs)))))
     [(inc (first xs))]]
    (recur [0 cache (conj xs (collatz-next (last xs)))])))

(def lazy-collatz-nonrecursing
     (map first (iterate collatz-nonrecursing [1 {1 1} [2]])))

(defn euler-14-nonrecursing [num]
  (max-at (take num lazy-collatz-nonrecursing)))

(dotimes [_ 5]
  (time (euler-14-nonrecursing 1000000)))

"Elapsed time: 21587.162081 msecs"
"Elapsed time: 268.052347 msecs"
"Elapsed time: 239.52824 msecs"
"Elapsed time: 240.095869 msecs"
"Elapsed time: 236.408247 msecs"

Good grief! That was a dismal failure! In fact, the above code causes a heap overflow on my other, less-powerful computer if I use an inefficient max-at function.

My guess is that however beautiful list comprehensions are for letting one bundle together multiple return values and then later parse them out again, the memory penalty incurred can be prohibitive, as it was in this case.

Giving up on the hashmap approach, for my final attempt, I decided that instead of trying to cache each and every part of the Collatz sequence tree structure, I would only cache the values below my current place of interest (as the memoization system does automatically). Because lookup time becomes an issue for lazy seqs (which are essentially just iterators with nth performance of O(n), not constant time), I needed to figure out a way to make lookups to earlier parts of the lazy seq be in constant time.

Once again, Stack Overflow came to the rescue and reminded me how I might create a lazy sequence that is also a vector, and so has constant time access with nth. Since is good practice to write smaller, easily-testable functions, I also partitioned my code a little more elegantly and added documentation strings. Here is the complete solution:

(defn collatz-next [n]
  "Returns the next number in a collatz sequence."
  (cond
    (= n 1) 1
    (odd? n) (+ 1 (* 3 n))
    true (/ n 2)))

(defn collatz-seq [n]
  "Returns a lazy list of numbers in a collatz sequence."
  (iterate collatz-next n))

(defn collatz-stack [vc n]
  "Returns a lazy list of numbers in a collatz sequence that aren't in vector vc."
  (let [stack (take-while #(not (and (> n %) (vc %))) (collatz-seq n))]
    stack))

(defn collatz-value [vc]
  "'Appends' the length of the collatz sequence for the last value.
  Zero indexed... Ex: Given [0 1 2], returns 8."
  (let [n (count vc)
        s (collatz-stack vc n)]
    (conj vc (+ (count s) (vc (collatz-next (last s)))))))

(def lazy-collatz-vec (iterate collatz-value [0 1 2]))

(defn euler-14-lazy-vec [n]
  (max-at (nth lazy-collatz-vec n)))

(dotimes [_ 5]
  (time (euler-14-lazy-vec 1000000)))

"Elapsed time: 5556.394195 msecs"
"Elapsed time: 117.392197 msecs"
"Elapsed time: 163.767481 msecs"
"Elapsed time: 107.376708 msecs"
"Elapsed time: 178.192841 msecs"

Success! Performance is mildly better than it was for the tight loop solution (euler-14-loop) and even the memoized solution if we take the first 1,000,000 entries. But what if we try for 10,000,000 entries? Shouldn’t the algorithm (and also memoization system) get better and better for larger numbers because the lengths of already-computed paths back to 1 will be relatively larger?

(dotimes [_ 5]
  (time (euler-14-loop 10000000)))

"Elapsed time: 102569.50535 msecs"
"Elapsed time: 91814.976501 msecs"
"Elapsed time: 91668.209228 msecs"
"Elapsed time: 91615.01015 msecs"
"Elapsed time: 91621.117394 msecs"

(dotimes [_ 5]
  (time (euler-14-memoized 10000000)))

GC overhead limit exceeded
 [Thrown class java.lang.OutOfMemoryError]

(dotimes [_ 5]
  (time (euler-14-lazy-vec 10000000)))

GC overhead limit exceeded
  [Thrown class java.lang.OutOfMemoryError]

In the immortal words of Homer Simpson, “D’oh!”

I guess the implementation overhead is just getting too big… Anybody see a way around this?

Finally, what if we pulled out the Java hatchet and began hacking away using (int-array), manually managing the memory and preallocating it, and ditching the ‘lazy’ method? Well, it felt really wrong to write imperative code in Clojure, but this worked:

(defn collatz-stack-array [ary n max]
  "Returns a lazy list of numbers in a collatz sequence that aren't in array ary."
  (reverse (take-while #(not (and (> max %)
                           (not (== 0 (aget ary (int %))))))
                (collatz-seq n))))

(defn stack-into-array [ary max stack]
  "PERFORMS SIDE EFFECT ONLY!
  Given an array and a list of elements not in the array, pushes their
  values inside, if possible."
  (let [known (int (collatz-next (first stack)))]
    (aset-int ary 0 (aget ary known)) ;; Use zero index as temp variable
    (dorun (map #(do
                   (aset-int ary 0 (inc (aget ary 0)))
                   (when (> max %)
                     (aset-int ary % (aget ary 0))))
                stack))))

(defn collatz-array [num]
  "Returns a java int array of the lengths of collatz sequences."
  (let [ary (int-array num 0)]
    (aset-int ary 1 1)
    (loop [i 2]
      (when (zero? (aget ary i))
        (stack-into-array ary num (collatz-stack-array ary i num)))
      (when (< (inc i) num)
        (recur (inc i))))
    ary))

(defn euler-14-intarray [num]
  (let [ary (collatz-array num)]
    (areduce ary i ret [0 0]
             (if (> (second ret) (aget ary i))
               ret
               [i (aget ary i)]))))

(dotimes [_ 5]
  (time (euler-14-intarray 1000000)))
"Elapsed time: 2654.909547 msecs"
"Elapsed time: 2663.26126 msecs"
"Elapsed time: 2733.119015 msecs"
"Elapsed time: 2688.467888 msecs"
"Elapsed time: 2738.851569 msecs"

(time (euler-14-intarray 10000000))
"Elapsed time: 28301.846968 msecs"

Note that the performance is now about 3-4x better than the original loop algorithm, and that it seems to scale to even larger problems (10,000,000 entries). I think there is still some java reflection occurring here that more type hinting could remove, so it seems plausible this could go much faster.

Since I basically beat this problem to death with repetition, I didn’t find any insights from looking at other people’s code, so it’s time to summarize what we learned:

Trading memory for computation time doesn’t always work, especially using a technique like laziness with significant implementation overhead
Lazy vector creation is still an interesting technique for vectors smaller than a couple million elements.
Using mutable java arrays and side effects feels terribly dirty in Clojure, but is at least possible.
Automatic memoizations are pretty damn good, and trying to beat them without good reason is probably a premature optimization.

Art is Never Finished

...Only Abandoned. – Leonardo da Vinci

Project Euler Problem 14