Project Euler Problem 22

By Ivar Thorson

Project Euler Problem 22 gives us our first opportunity to use regular expressions. It’s amazing how a single re-seq basically takes care of extracting data from the file format.

This code also features a neat way that I discovered to initialize a hashmap as a lookup table.

(def *letters* "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

(def char2num (apply hash-map (interleave *letters* (iterate inc 1))))

(defn letterscore
  "Returns the score of the letters of string s."
  [s]
  ;; A map is itself a function of its keys, so no wrapper fn is needed.
  (reduce + (map char2num s)))

(defn get-names [namestr]
  (re-seq #"\w+" namestr))

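(defn euler-22 [namefile]
  ;; slurp reads the whole file into one string; names are then sorted
  ;; and each score is multiplied by its 1-based position in the list.
  (let [names (sort (get-names (slurp namefile)))]
    (reduce + (map * (map letterscore names) (iterate inc 1)))))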

(euler-22 "/home/ivar/names.txt")
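
To try the pipeline at the REPL without names.txt, here is a minimal sketch that runs the same steps on an in-memory string. The names letters, char->num, letter-score and total-score are my own for this illustration, and zipmap is swapped in for the apply/interleave trick; it builds the same lookup table.

```clojure
(def letters "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

;; zipmap pairs each letter with its 1-based position: {\A 1, \B 2, ...}
;; It stops as soon as the shorter sequence (the letters) runs out.
(def char->num (zipmap letters (iterate inc 1)))

(defn letter-score
  "Sum of the alphabet positions of the characters in s."
  [s]
  (reduce + (map char->num s)))

(defn total-score
  "Sorts the names found in namestr and sums score * position."
  [namestr]
  (let [names (sort (re-seq #"\w+" namestr))]
    (reduce + (map * (map letter-score names) (iterate inc 1)))))

;; The example from the problem statement: 3 + 15 + 12 + 9 + 14
(letter-score "COLIN")                       ; => 53

;; Sorted order is ANN (29), COLIN (53), MARY (57):
;; 29*1 + 53*2 + 57*3
(total-score "\"MARY\",\"ANN\",\"COLIN\"")   ; => 306
```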

My understanding of Clojure must be improving, because this felt particularly smooth to write and elegant to look at afterward. The only question that remains is what small changes we might make to this code to make it more extensible in the future. What should happen if there is an error? What if we want to use this code with other alphabets?

In the latter case, we just need to change the regular expression and rebuild the char2num lookup table from the new alphabet:

;; Need to redefine char2num hash to use letterscore with different alphabets
(def *letters*
     "あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをん")

(def char2num
     (apply hash-map (interleave *letters* (iterate inc 1))))

(defn letterscore [s]
  (reduce + (map #(char2num %) s)))

(defn get-names [namestr]
  (re-seq #"\p{L}+" namestr))

(def japanese-namescores
     (let [names (get-names "\"ともこ\",\"みえこ\",\"ひろき\",\"まさし\"")]
       (apply hash-map (interleave (map letterscore names) names))))

(println japanese-namescores)
{65 ともこ, 77 ひろき, 46 みえこ, 54 まさし}

Kind of neat, huh?
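
Redefining the global vars for every alphabet gets awkward quickly. One possible way around that, sketched here under the assumption that a score is always "position in the alphabet, starting from 1", is a function that takes the alphabet and returns a scorer. make-scorer, latin-score and hiragana-score are hypothetical names, not part of the code above.

```clojure
(defn make-scorer
  "Returns a function that scores a string against the given alphabet.
  Hypothetical helper for illustration."
  [alphabet]
  ;; Build the lookup table once and close over it.
  (let [table (zipmap alphabet (iterate inc 1))]
    (fn [s] (reduce + (map table s)))))

(def latin-score
  (make-scorer "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))

(def hiragana-score
  (make-scorer
   "あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをん"))

(latin-score "COLIN")      ; => 53
(hiragana-score "ともこ")   ; => 65
```

Each scorer carries its own table, so both alphabets can be used side by side in the same session.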

user> (re-seq #"\p{L}+" "日本語 の 文章 は スペース の 必要 が ない って、 本当?")
("日本語" "の" "文章" "は" "スペース" "の" "必要" "が" "ない" "って" "本当")
user> (re-seq #"\p{L}+" "It's really true that Japanese sentences don't need spaces?")
("It" "s" "really" "true" "that" "Japanese" "sentences" "don" "t" "need" "spaces")
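
The English example splits "It's" and "don't" because the apostrophe is not a letter. If contractions should stay whole, one option is to allow the apostrophe inside the character class. This is only a sketch: get-words is a hypothetical name, it handles only the straight ASCII apostrophe, and it would also keep quote marks attached to a word quoted with single quotes.

```clojure
(defn get-words
  "Splits s into runs of Unicode letters and ASCII apostrophes."
  [s]
  (re-seq #"[\p{L}']+" s))

(get-words "It's really true that Japanese sentences don't need spaces?")
;; => ("It's" "really" "true" "that" "Japanese" "sentences" "don't" "need" "spaces")
```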

Clearly there is a lot here to explore.
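
As for the error question raised earlier: with the code as written, a character that is missing from the lookup table comes back as nil, and the addition then fails with a NullPointerException that says nothing about which name was at fault. A minimal sketch of a stricter scorer, assuming we would rather fail loudly with some context, could look like this (char->num and strict-letter-score are hypothetical names):

```clojure
(def char->num
  (zipmap "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (iterate inc 1)))

(defn strict-letter-score
  "Like letterscore, but throws an ex-info naming the offending
  character and string when a character is not in the table."
  [s]
  (reduce + (map (fn [c]
                   (or (char->num c)
                       (throw (ex-info "Character not in alphabet"
                                       {:char c :name s}))))
                 s)))

(strict-letter-score "COLIN")   ; => 53

;; Lowercase letters are not in the table, so this throws;
;; ex-data recovers the attached map.
(try
  (strict-letter-score "Colin")
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))               ; => {:char \o, :name "Colin"}
```

Whether to throw, skip the character, or score it as zero depends on the data; throwing just makes the bad input visible.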