Earlier, I wrote how the straight translation to CL had sped things up:
sbcl: table.lisp
real 0m2.868s user 0m2.259s sys 0m0.001s
That was pretty good I think, but could do better. Also, there is room to demonstrate why language features can play a role in other benchmarks such as programmer productivity and program flexibility.
I did some more tinkering, and added the following, which comes (I think) from one of Peter Norvig's texts:
(defun memoize-symbol (funsym) "Memoizes the function named by the given symbol." (let ((unmemoized (symbol-function funsym)) (table (make-hash-table :test #'equalp))) (setf (symbol-function funsym) #'(lambda (&rest args) (multiple-value-bind (value found) (gethash args table) (if found value (setf (gethash args table) (apply unmemoized args))))))))
What this does bears some thinking about: take an arbitrary function, and replace it with a version which caches previous results. Twelve lines, including a doc-string, to fundamentally change how an existing function works, without altering how you call it. *WHOA*
Now, I did make a couple small changes, because my first version of hstry returns multiple values, but the simple cache here only grabs one. Also, it didn't have a way of making use of sub-results, so there would only be a benefit if the function was called more than once.
In essence, the following code makes hstry recursive, and hsdisp is tweaked a little to take a single list instead of multiple values.
(defun hstry2r (hs) (let ((length 0) (height hs) (result nil))
(when (> hs 1) (if (= (mod hs 2) 0) (setf hs (/ hs 2)) (progn (setf hs (+ (* 3 hs) 1)) (if (> hs height) (setf height hs)))) (setf result (hstry2r hs)) (setf length (1+ (car result))) (when (> (cadr result) height) (setf height (cadr result)))) (list length height)))
(defun hsdisp2 (maxTry) (let ((hs 0) (hsMaxLength 0) (hsMaxHeight 0) (hsML '()) (hsMH '()) (lh '()))
(loop while (< hs maxtry) do (setf hs (1+ hs)) (setf lh (hstry2r hs)) (if (> (car lh) hsMaxLength) (progn (setf hsMaxLength (car lh)) (push (list hs hsMaxLength) hsML))) (if (> (cadr lh) hsMaxHeight) (progn (setf hsMaxHeight (cadr lh)) (push (list hs hsMaxHeight) hsMH))))
(format t "~10,A ~10,A~%" "hs" "length maxima") (dolist (x (reverse hsML)) (format t "~{~10,d ~}~%" x)) (format t "~%~%~10,A ~10,A~%" "hs" "height maxima") (dolist (x (reverse hsMH)) (format t "~{~10,d ~}~%" x)) ))
So now, what's the difference? Well, as-is, it is slower, mainly because it allocates more RAM for temporary use.
4.93 seconds of real time 4.158 seconds of user run time 0.383 seconds of system run time [Run times include 1.143 seconds GC run time.]
OK, fine, not so impressive, that's about twice as slow? Then I execute:
(memoize-symbol 'hstry2r)
This just turned my recursive version of hstry into the caching, recursive version. Because it breaks the process into calls of itself with other arguments, there are often duplicates which we can cache.
1.593 seconds of real time 1.252 seconds of user run time 0.06 seconds of system run time [Run times include 0.316 seconds GC run time.]
Much better, the benchmark now runs in about half the time my original needed, with little modification to the code. Now though, it gets better. If this result was something we'd want again, we get this:
0.158 seconds of real time 0.144 seconds of user run time 0.005 seconds of system run time
Almost identical to the C version! hsdisp2 still calls hstry2r 100000 times, but those results are all cached, leaving us with function call and hash lookup as the overhead.
There are a few points:
- An interactive development system lets you try things out before you have even saved a file, then splice it into code once it works. In some systems, you can even do that without shutting down the program.
- Language flexibility lets you easily make powerful changes that would cause oodles of grief to do in a lower-level language.
- The time saved lets you spend more time on the hot spots if you need to. If you don't need to, you are just done earlier, so go outside and enjoy the sun. Writing an entire program in a low-level language is a waste of time if you don't have to do that. Like Ihor's other example though with Python, you could do a piece in C to make it faster, and still have the rest of the program done the easy way.
Cheers, Tim