During last night's meeting a question was raised about benchmarks for the python language, or how fast it is relative to other computer languages. Since I am a computer language buff, I will attempt to give a more informative answer to this question, (to augment and correct my utterance of last evening).
Basically a benchmark is an implementation of a timing procedure, that determines how fast a specified program runs in a given computer language, or a software manufacturer's implementation of a given language. Various accounts have been written as to what should be the components of the ideal benchmark, (using the sieve of Eratosthenes, etc.). Of course, these, can give an idea as to how fast various languages perform. However, a benchmark should be a reflection of the type of programs, you are most interested in using.
If your applications will mostly be using disk file i/o, or large memory transfers, then a number crunching algorithm, as a benchmark, may not give you a relevant answer. Even, if you are focused on number crunching, the numerical algorithms used in calculating say, perspective views in a graphics program may be quite different from a program performing finite element analysis, or number theory conjectures.
(Urban Legend Gossip Note: a software manufacturer may fine tune or optimize their product to run more quickly with the well known benchmark algorithms.)
The above then becomes an argument for designing your own benchmark. By carefully noting what you want to perform with your programs, you can determine what kind of routines will be used. These are the routines you should use in your benchmark.
Having said the above, I will give a benchmark with results, as an example --knowing full well that that I can not please everyone-- the results will only be an indication of run times that may or may not be relevant to your application needs.
In my benchmark, I decided to test run times for: function (or subroutine) calls, and basic arithmetic. I decided in using a not too generally well known algorithm, but one having documented results.
The benchmark I chose, consists of an algorithm that calculates the length maxima, and the maxima of height maxima, of consecutive hailstone numbers from 1 to 100 000. For the definition and properties of hailstone numbers, as used here, refer to the Collatz conjecture. One of the earliest references to this with tabulated results is by:
Brian Hayes SCIENTIFIC AMERICAN (Computer Recreations) Jan. 1984, page 10.
No matter what language is used, all of my benchmark programs are written in the same format. This consists of a main program that writes an output table to a data file, the table entries are calculated by calling a single function or a subroutine.
The benchmark test suite consists of programs written in: C, Java, Python, and a hybrid (consisting of a main routine in Python with a C subroutine). Here are the results...
Running table.c (written in C), elapsed time is:
real 0m0.500s user 0m0.500s sys 0m0.000s
Running table.java (written in Java), elapsed time is:
real 0m15.413s user 0m14.930s sys 0m0.070s
Running table.py (written in Python), elapsed time is:.
real 1m57.656s user 1m57.620s sys 0m0.030s
Running combo.py (written in Python, with a subroutine in C), elapsed time is:
real 0m5.161s user 0m5.090s sys 0m0.070s
Here is the source code...
****************************** table.c *************************************** #include <stdio.h> #include <stdlib.h> #include <sys/types.h>
/* prototypes */ void hsTry(long int, long int *); void hsDisp(void);
/* globals */ static int maxTry =100000; static char fname1[] = "Hsc.data";
main() { hsDisp(); }
void hsDisp(void) { /* prints the longest path and largest height, (as encountered so far) */ typedef struct { char hst; long int hs; long int hsMax; } rcdHM;
int i =0, ecMPH =1; long int hs = 0, hsMaxPath =0, hsMaxHeight =0, hsPH[2] ={0, 0}; rcdHM *hsMPH; rcdHM *idxMPH ;
FILE *fil1;
hsMPH = (rcdHM *) calloc(ecMPH,sizeof(rcdHM)); if (!hsMPH) { printf("\n calloc error \n"); exit(1);} idxMPH =hsMPH;
while (hs < maxTry) { hs++; hsTry(hs, hsPH); if (hsPH[0] > hsMaxPath) { hsMaxPath =hsPH[0]; ecMPH++; hsMPH = (rcdHM *) realloc(hsMPH, ecMPH * sizeof(rcdHM)); if (hsMPH == NULL) { printf("realloc error"); exit(1);} else { idxMPH++; idxMPH->hst ='l'; idxMPH->hs =hs; idxMPH->hsMax =hsMaxPath; } } if (hsPH[1] > hsMaxHeight) { hsMaxHeight =hsPH[1]; ecMPH++; hsMPH = (rcdHM *) realloc(hsMPH, ecMPH * sizeof(rcdHM)); if (hsMPH == NULL) { printf("realloc error"); exit(1);} else { idxMPH++; idxMPH->hst ='h'; idxMPH->hs =hs; idxMPH->hsMax =hsMaxHeight; } } }
if ( !(fil1=fopen(fname1,"w")) ) exit(1); fprintf(fil1, "calculates hailstone numbers \n"); fprintf(fil1, " hs path maxima \n"); idxMPH =hsMPH; for(i=1;i <= ecMPH; i++,idxMPH++) if (idxMPH->hst=='l') {fprintf(fil1, " %9i %9i \n", idxMPH->hs, idxMPH->hsMax);}
idxMPH =hsMPH; fprintf(fil1, "\n hs height maxima \n"); for(i=1;i <= ecMPH; i++,idxMPH++) if (idxMPH->hst=='h') {fprintf(fil1, " %9i %9i \n", idxMPH->hs, idxMPH->hsMax);}
free(hsMPH); fclose(fil1); /* clean up */ }
void hsTry(long int hs, long int hsPH[]) { /* input positive integer, returns path length and maximum height */ /* PathLength = hsPH[0], Height = hsPH[1] */ hsPH[0] =0; if (hs <= 0) printf("out of bounds \n"); while(1) { if (hs == 1) break; else if (hs%2 == 0) hs =hs/2; else if (hs%2 == 1) { hs =(hs*3)+1; if (hs > hsPH[1]) hsPH[1] =hs; } hsPH[0]++; } }
****************************** table.java *************************************** import java.io.*; import java.lang.*; import java.text.*; import java.util.*; import java.lang.reflect.Array; import java.lang.Math.*;
public class Hsj { static int maxTry = 100000; static PrintWriter out = new PrintWriter(System.out, true);
public static long [] hsTry(long hs, long [] hsPH) { // input positive integer, returns path length and maximum height /* PathLength = hsPH[0], Height = hsPH[1] */ hsPH[0] =0; if (hs >= 0) { while (true) { if (hs == 1) break; else if (hs%2 == 0) hs/=2; else if (hs%2 == 1) { hs =(hs+hs+hs)+1; if (hs > hsPH[1]) hsPH[1] =hs; } hsPH[0]++; } } return hsPH; }
public static void hsDisp() throws IOException { // Prints the longest path and largest height, (as encountered so far) File outputFile = new File("Hsj.data"); FileWriter output = new FileWriter(outputFile);
Vector hsMP = new Vector(); Vector hsMH = new Vector();
long hs =0, hsMaxPath =0, hsMaxHeight =0;
long hsPH[] = new long [2];
while (hs < maxTry) { hs+=1; hsTry(hs, hsPH); if (hsPH[0] > hsMaxPath) { hsMaxPath =hsPH[0]; hsMP.add(new Long(hs)); hsMP.add(new Long(hsMaxPath)); } if (hsPH[1] > hsMaxHeight) { hsMaxHeight =hsPH[1]; hsMH.add(new Long(hs)); hsMH.add(new Long(hsMaxHeight)); } } output.write(" calculate hailstone numbers \n");
DecimalFormat hfmt = new DecimalFormat(" 000000000000"); output.write(" hs length maxima \n");
for (int i =0; i < hsMP.size()-1; i+=2) { output.write(hfmt.format(hsMP.elementAt(i)) +hfmt.format(hsMP.elementAt(i+1))+"\n"); }
output.write("\n hs height maxima \n"); for (int i=0; i < hsMH.size()-1; i+=2) { output.write(hfmt.format(hsMH.elementAt(i)) +hfmt.format(hsMH.elementAt(i+1))+"\n"); }
output.close(); }
public static void main (String args[]) { try{ hsDisp(); }catch(IOException event) { out.println("io error");} }
}
****************************** table.py *************************************** #! $(env_path)/env $(py_path)/python # above parameters need to be configured to your own system
# Basic Hailstone number properties, (refer to Collatz conjecture)
def hsTry(hs): 'Find path length and maximum height for a hailstone number.' length = 0 ; height = hs
while (hs>1): if hs%2 == 0: hs/=2 elif hs%2 == 1: hs = hs+hs+hs+1 if hs > height: height =hs length+=1 return [length, height]
def hsDisp(maxTry): '''Write to file, length maxima and maxima of height maxima, of a sequence of hailstone numbers up to maxTry.'''
hs =0; hsMaxLength =0; hsMaxHeight =0 hsPH =[0,0]; hsML =[]; hsMH =[];
while (hs < maxTry): hs+=1 hsPH =hsTry(hs) if (hsPH[0] > hsMaxLength): hsMaxLength =hsPH[0] hsML.append((hs,hsMaxLength)) if (hsPH[1] > hsMaxHeight): hsMaxHeight =hsPH[1] hsMH.append((hs,hsMaxHeight))
fil=open('Hsp.data','w') fil.write('calculate hailstone numbers \n\n') fil.write(" hs length maxima \n") for x,y in hsML: fil.write(' %9d %9d\n' % (x,y)) fil.write("\n hs height maxima \n") for x,y in hsMH: fil.write(' %9d %9d\n' % (x,y)) fil.close
hsDisp(100000)
****************************** combo.py *************************************** #! $(env_path)/env $(py_path)/python # above parameters need to be configured to your own system
# Basic Hailstone number properties, (refer to Collitz conjecture)
# Check for module include path, if not found include it. import sys if filter(lambda x: sys.path[x]=='/examples', range(0,len(sys.path)))==[] : (sys.path).append('/examples')
import Calc
def hsDisp(maxTry): '''write to file, length maxima and maxima of height maxima, of a sequence of hailstone numbers up to maxTry'''
hs =0; hsMaxLength =0; hsMaxHeight =0 hsPH =[0,0]; hsML =[]; hsMH =[];
while (hs < maxTry): hs+=1 hsPH =Calc.hsTry(hs) if (hsPH[0] > hsMaxLength): hsMaxLength =hsPH[0] hsML.append((hs,hsMaxLength)) if (hsPH[1] > hsMaxHeight): hsMaxHeight =hsPH[1] hsMH.append((hs,hsMaxHeight))
fil=open('table.data','w') fil.write('calculate hailstone numbers \n\n') fil.write(" hs length maxima \n") for x,y in hsML: fil.write(' %9d %9d\n' % (x,y)) fil.write("\n hs height maxima \n") for x,y in hsMH: fil.write(' %9d %9d\n' % (x,y)) fil.close
hsDisp(100000)
****************************** Calc.c (called C subroutine) ******************* #include "Python.h"
static void hextTry(long int, long int *); static PyObject *
hs_call(self, args) PyObject *self; PyObject *args; { long int hs, hsPH[2]={0,0}; if (!PyArg_ParseTuple(args, "l", &hs)) return NULL; hsPH[1]=hs; hextTry(hs, hsPH); return Py_BuildValue("[l,l]",hsPH[0],hsPH[1]); }
static PyMethodDef HextMethods[] = { {"hsTry", hs_call, METH_VARARGS, "Find path length and maximum height for a hailstone number." }, {NULL, NULL, 0, NULL} /* Sentinel (depends on python version) */ };
static void hextTry(long int hs, long int hsPH[]) { /* input positive integer, returns path length and maximum height */ /* PathLength = hsPH[0], Height = hsPH[1] */ hsPH[0] =0; if (hs <= 0) printf("out of bounds \n"); while(1) { if (hs == 1) break; else if (hs%2 == 0) hs =hs/2; else if (hs%2 == 1) { hs =(hs*3)+1; if (hs > hsPH[1]) hsPH[1] =hs; } hsPH[0]++; } }
void initHext() { (void) Py_InitModule("Hext",HextMethods); }
In conclusion, I determined that, Python is suited as a prototyping language. Not shown in the timing data, is how quickly it took me to develop complete running error free code in python. This was much less than for any of the other languages. That is, you can develop a scratch pad or prototype application using python, that can give a rough or partial estimate of results for a problem. If need be, the application can be rewritten in a faster language after some of the bugs and coding problems have been initially solved by using python. In the long run, much development time can be saved by using this approach.
-Ihor
"Ihor" == Ihor Jakowec ihor@autobahn.mb.ca writes:
Ihor> During last night's meeting a question was raised about Ihor> benchmarks for the python language, or how fast it is Ihor> relative to other computer languages. Since I am a computer Ihor> language buff, I will attempt to give a more informative Ihor> answer to this question, (to augment and correct my Ihor> utterance of last evening).
Just for fun, I re-coded your Python version as Common Lisp, and ran the basic C and Python-only versions for comparison. Note that if you grok the syntax OK, the Lisp one is about as concise and legible as the Python, or at least, not nearly as painful to read as the C. Quicker to write too, I bet. <grin>
By the way, your benchmark is about as Python-unfriendly as it could be, being calculation-intensive with lots of function calls. Many real-world apps easily fall into the "fast enough" category with little work.
Anyway, here's the code and times on my system. Oh yeah, I just dumped it out to standard output, so I'd add a couple lines for real file output.
Cheers, Tim
;;;;;;;;;;;;;; table.lisp, using SBCL ;;;;;;;;;;;;;;;;;;;;;
(defun hstry (hs) (let ((length 0) (height hs)) (loop while (> hs 1) do (if (= (mod hs 2) 0) (setf hs (/ hs 2)) (progn (setf hs (+ (* 3 hs) 1)) (if (> hs height) (setf height hs)))) (setf length (1+ length))) (values length height)))
(defun hsdisp (maxTry) (let ((hs 0) (hsMaxLength 0) (hsMaxHeight 0) (hsML '()) (hsMH '()))
(loop while (< hs maxtry) do (setf hs (1+ hs)) (multiple-value-bind (l h) (hstry hs) (if (> l hsMaxLength) (progn (setf hsMaxLength l) (push (list hs hsMaxLength) hsML))) (if (> h hsMaxHeight) (progn (setf hsMaxHeight h) (push (list hs hsMaxHeight) hsMH)))))
(format t "~10,A ~10,A~%" "hs" "length maxima") (dolist (x (reverse hsML)) (format t "~{~10,d ~}~%" x)) (format t "~%~%~10,A ~10,A~%" "hs" "height maxima") (dolist (x (reverse hsMH)) (format t "~{~10,d ~}~%" x)) ))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
C: table.c
real 0m0.134s user 0m0.128s sys 0m0.002s
python2.2: table.py
real 0m21.171s user 0m20.237s sys 0m0.022s
python2.3: table.py
real 0m15.826s user 0m14.756s sys 0m0.021s
sbcl: table.lisp
real 0m2.868s user 0m2.259s sys 0m0.001s
Earlier, I wrote how the straight translation to CL had sped things up:
sbcl: table.lisp
real 0m2.868s user 0m2.259s sys 0m0.001s
That was pretty good I think, but could do better. Also, there is room to demonstrate why language features can play a role in other benchmarks such as programmer productivity and program flexibility.
I did some more tinkering, and added the following, which comes (I think) from one of Peter Norvig's texts:
(defun memoize-symbol (funsym) "Memoizes the function named by the given symbol." (let ((unmemoized (symbol-function funsym)) (table (make-hash-table :test #'equalp))) (setf (symbol-function funsym) #'(lambda (&rest args) (multiple-value-bind (value found) (gethash args table) (if found value (setf (gethash args table) (apply unmemoized args))))))))
What this does bears some thinking about: take an arbitrary function, and replace it with a version which caches previous results. Twelve lines, including a doc-string, to fundamentally change how an existing function works, without altering how you call it. *WHOA*
Now, I did make a couple small changes, because my first version of hstry returns multiple values, but the simple cache here only grabs one. Also, it didn't have a way of making use of sub-results, so there would only be a benefit if the function was called more than once.
In essence, the following code makes hstry recursive, and hsdisp is tweaked a little to take a single list instead of multiple values.
(defun hstry2r (hs) (let ((length 0) (height hs) (result nil))
(when (> hs 1) (if (= (mod hs 2) 0) (setf hs (/ hs 2)) (progn (setf hs (+ (* 3 hs) 1)) (if (> hs height) (setf height hs)))) (setf result (hstry2r hs)) (setf length (1+ (car result))) (when (> (cadr result) height) (setf height (cadr result)))) (list length height)))
(defun hsdisp2 (maxTry) (let ((hs 0) (hsMaxLength 0) (hsMaxHeight 0) (hsML '()) (hsMH '()) (lh '()))
(loop while (< hs maxtry) do (setf hs (1+ hs)) (setf lh (hstry2r hs)) (if (> (car lh) hsMaxLength) (progn (setf hsMaxLength (car lh)) (push (list hs hsMaxLength) hsML))) (if (> (cadr lh) hsMaxHeight) (progn (setf hsMaxHeight (cadr lh)) (push (list hs hsMaxHeight) hsMH))))
(format t "~10,A ~10,A~%" "hs" "length maxima") (dolist (x (reverse hsML)) (format t "~{~10,d ~}~%" x)) (format t "~%~%~10,A ~10,A~%" "hs" "height maxima") (dolist (x (reverse hsMH)) (format t "~{~10,d ~}~%" x)) ))
So now, what's the difference? Well, as-is, it is slower, mainly because it allocates more RAM for temporary use.
4.93 seconds of real time 4.158 seconds of user run time 0.383 seconds of system run time [Run times include 1.143 seconds GC run time.]
OK, fine, not so impressive, that's about twice as slow? Then I execute:
(memoize-symbol 'hstry2r)
This just turned my recursive version of hstry into the caching, recursive version. Because it breaks the process into calls of itself with other arguments, there are often duplicates which we can cache.
1.593 seconds of real time 1.252 seconds of user run time 0.06 seconds of system run time [Run times include 0.316 seconds GC run time.]
Much better, the benchmark now runs in about half the time my original needed, with little modification to the code. Now though, it gets better. If this result was something we'd want again, we get this:
0.158 seconds of real time 0.144 seconds of user run time 0.005 seconds of system run time
Almost identical to the C version! hsdisp2 still calls hstry2r 100000 times, but those results are all cached, leaving us with function call and hash lookup as the overhead.
There are a few points:
- An interactive development system lets you try things out before you have even saved a file, then splice it into code once it works. In some systems, you can even do that without shutting down the program.
- Language flexibility lets you easily make powerful changes that would cause oodles of grief to do in a lower-level language.
- The time saved lets you spend more time on the hot spots if you need to. If you don't need to, you are just done earlier, so go outside and enjoy the sun. Writing an entire program in a low-level language is a waste of time if you don't have to do that. Like Ihor's other example though with Python, you could do a piece in C to make it faster, and still have the rest of the program done the easy way.
Cheers, Tim
->The benchmark test suite consists of programs written in: ->C, Java, Python, and a hybrid (consisting of a main routine ->in Python with a C subroutine). Here are the results... -> I would be interested to know what versions of the various interpretors you used are as well as the OS & load you ran the tests under.
An inital glance the benchmarks look about right for how the various languages operate. While I don't agree with you on the conclusion that Python should be used as a prototyping language you need to take into account the amount of effort needed to do something 'useful' in a particular language. C is almost never the best choice these days (and I'm a huge C fan) as to get anything complex done it takes lots of effort. C++ is a bit better as it provides for some primitive 'Generics' and Java is a 'jack of all trades' lanugage. The various scripting languages provide few neat features that don't exist in the above languages such as the code=data paradigm of the 'eval' function in perl, garbage collection (yes Java has it but its pretty much a rogue feature you can't control) and weak typing. Their cost is obviously brute speed though machines these days are 'fast enough' to ignore preformance on most (not all) problems.
Enough of my rant, congratulations on a great post I really enjoyed it.
That is all,
-- Sean "I'm pretty sure this is true . . . I wouldn't bet my life on it, but I'd bet yours." - Erik Guentner