Project Euler Problem 52

Ivar Thorson bio photo By Ivar Thorson

Problem 52 was too easy, so I took the time to use it’s simplicity and study how lazy sequences are chunked, and how they are grouped when apply‘d.

Because this is such a simple problem, I’ll show you the step-by-step refactoring process as I took as I tried to improve the performance and simplicity of the code.

Our goal in this problem was to find a number that when multiplied by 2,3,4,5, and 6 has the exact same digits, but in a different order.

First, let’s write something that solves the problem.

(defn digs [n] (sort (str n)))

(defn euler-52 []
  (first
   (for [x (iterate inc 1)
         :when (every? #(= (digs x) %)
                       (map #(digs (* x %)) [2 3 4 5 6]))]
     x)))

(time (euler-52)) ;; "Elapsed time: 563.332529 msecs"

Since that was trivial to write, let’s see if we can’t make it more elegant and faster. First, let’s make it faster by using the short-circuiting behavior of and:

;; short circuits using and
(defn euler-52-b []
  (loop [n 1
         d (digs n)]
    (if (and (= d (digs (* 2 n)))
             (= d (digs (* 3 n)))
             (= d (digs (* 4 n)))
             (= d (digs (* 5 n)))
             (= d (digs (* 6 n))))
      n
      (recur (inc n) (digs (inc n))))))

(time (euler-52-b)) ;; "Elapsed time: 159.1435 msecs"

What about if we wanted to improve the brevity? Maybe we can use apply and = to avoid treating the n case different from the 2n, 3n, …6n, cases.

;; Short circuiting broken because of chunking
(defn euler-52-c []
  (first
   (filter
    (fn [n] (apply = (map #(digs (* n %)) [1 2 3 4 5 6])))
    (iterate inc 1))))

(time (euler-52-c)) ;; "Elapsed time: 603.714745 msecs"

You may be wondering, “Why did the performance get so bad? Shouldn’t the performance be nearly as good as euler-52-b?”

The answer to this question is that the chunking behavior of sequences (which takes things in groups of 32) meant that the entire array [1 2 3 4 5 6] was being evaluated and passed through the map. Also, the behavior of apply when given a long list is that it will take its arguments in groups of four, which is more than is needed.

If you want to see this chunking behavior visually, try the following two lines of code in your REPL:

(apply = (map #(do (print %) %) (iterate inc 1))) ;; Groups by 4 during apply
(apply = (map #(do (print %) %) (range)))  ;; Seqs chunk by 32's

Thinking of workarounds for the natural behavior of apply (it may be instructive to look in clojure.core’s source to see how apply is implemented if you are curious why it acts this way), I tried to use take-while and count that all the elements were consumed, but the result is a little cumbersome.

;; Short circuits because of take-while
(defn euler-52-d []
  (first
   (filter
    (fn [n]
      (let [d (digs n)]
        (= 5 (count (take-while #(= d (digs (* n %))) [2 3 4 5 6])))))
    (iterate inc 1))))

(time (euler-52-d)) ;; "Elapsed time: 236.771 msecs"

Changing hats once again and thinking once again about brevity, let’s use recur to shorten our definition…

;; Nice and short
(defn euler-52-f [n]
  (if (apply = (map #(sort (str (* n %))) [1 2 3 4 5 6]))
    n
    (recur (inc n))))

(time (euler-52-f 1)) ;; "Elapsed time: 552.148607 msecs"

That worked well and is certainly clearer. Now let’s combine it with our earlier take-while trick and see if we can improve performance again:

(defn euler-52-g [n]
  (let [d (sort (str n))]
    (if (= 5 (count (take-while #(= d (sort (str (* n %)))) [2 3 4 5 6])))
      n
      (recur (inc n)))))

(time (euler-52-g 1)) ;; "Elapsed time: 187.475716 msecs"

Great! I think most people would just stop at euler-52-g in the refactoring process, but let’s do one more just to flex our computer scientist muscles, and then we’ll call it quits for today.

Theoretically, short-circuiting behavior on our sequence processing could improve performance by removing the unnecessary computation invoked by chunking and the take-by-four behavior of apply. If we had a version of apply that short-circuited, we could write a function like euler-52-f that would be as performant as anything else.

Of course, we don’t want to actually rewrite apply, and nothing in the standard sequence library quite fits. So, let’s take the time to write two functions that may be useful in the future:

(defn reducep
  "Like reduce, but for use with a predicate. Short-circuits on first false."
  ([p coll]
     (if-let [s (seq coll)]
       (reducep p (first s) (next s))
       (p)))
  ([p val coll]
     (if-let [s (seq coll)]
       (if-let [v (p val (first s))]
         (recur p (first s) (next s))
         false)
       true)))
    
(defn unchunk
  "Forces a sequence to be evaluated element by element.
  Modified from code by Stuart Sierra"
  [s]
  (when-first [x s]
    (lazy-seq (cons x (unchunk (next s))))))

With those two functions in hand, we can have our cake and eat it too: brevity and reasonable performance.

Using just unchunk, we can write:

(defn euler-52-h [n]
  (let [d (sort (str n))]
    (if (every? #(= d %1) (map #(sort (str (* n %)))
                               (unchunk [2 3 4 5 6])))
      n
      (recur (inc n)))))

Or better yet,

(defn euler-52-z [n]
  (if (reducep = (map #(sort (str (* n %))) (unchunk [1 2 3 4 5 6])))
        n
        (recur (inc n))))

(time (euler-52-z 1)) ;; "Elapsed time: 336.205473 msecs"

Which version did you like the best? The easy-to-understand euler-52-b? The brief definition in euler-52-f or it’s more performant sister euler-52-g? Or the reducep abstraction approach of euler-52-z?

Ciao!