Thursday, April 7, 2011

Numerical Derivatives Comparison of Scheme and Fortran

This short note compares the performance of numerical derivatives in Scheme and Fortran. In my profession, I compute numerical derivatives daily. Often they are part of a simulation routine, so computational time is critically important. The apparent gold standard for numerical performance is Fortran. It is said that if you have a numerical task to solve, Fortran will do the job and land you within about 80% of the best achievable performance, if not in the top spot. At the other end of the spectrum, Scheme is known for getting the job done with 20% of the effort, but with no guarantees on performance (some might say "guaranteed to be slow", but that's cruel).

Well, here's a case where Scheme is much faster than Fortran. The example comes from computational fluid dynamics. Given a function u(x), define a second function t(x) = du/dx and a third function f(x) = dt/dx, with u(x) = x*x + x + 1. The speeds below are normalized by the run time of Intel's Fortran compiler, ifort, so a value greater than 1.0 is faster than ifort and a value less than 1.0 is slower.

  • Stalin with gcc: 0.065
  • Stalin with icc: 2.2
  • ifort: 1.0 (reference)
  • Chicken Scheme with gcc: 0.017

For comparison, I also defined u(x) = sin(x), so that each evaluation goes through a library sin(x) function. The results are what I would expect with multiple calls to a library routine.

  • Stalin with gcc: 0.70
  • Stalin with icc: 1.1
  • ifort: 1.0 (reference)
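
To make the computation concrete, here is a minimal sketch of the kind of routine being timed, written in Python purely for illustration. It is not the Scheme or Fortran code behind the numbers above. The central-difference formula, the step size h, and the names derivative, u_poly, t, and f are all assumptions; the post does not say which difference scheme was used.

```python
import math

def derivative(g, h=1.0e-3):
    """Return a function approximating dg/dx with a central difference.

    The central-difference formula and the step h are assumptions made
    for this sketch; the original routines may use a different scheme.
    """
    def dg(x):
        return (g(x + h) - g(x - h)) / (2.0 * h)
    return dg

def u_poly(x):
    # u(x) = x*x + x + 1, the polynomial test function from the text
    return x * x + x + 1.0

# t(x) = du/dx and f(x) = dt/dx, built by passing functions around
t = derivative(u_poly)   # exact answer is 2x + 1
f = derivative(t)        # exact answer is 2

# The same construction with a library function, u(x) = sin(x)
t_sin = derivative(math.sin)   # exact answer is cos(x)
f_sin = derivative(t_sin)      # exact answer is -sin(x)

if __name__ == "__main__":
    print(t(1.0), f(1.0))
    print(t_sin(0.5), f_sin(0.5))
```

Passing the function in as an argument is what lets the same derivative routine serve both test cases. It also means the compiler has to see through the function call to optimize well, which is presumably where an aggressive whole-program optimizer can make a difference.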

With Intel's C compiler as the back end, the Stalin Scheme compiler ("Stalin brutally optimizes") produces code that runs more than twice as fast as the code from Intel's Fortran compiler on the polynomial test. I had heard rumors of results like this, but this is the first time I've seen it for myself.

Is anyone interested in the code? Let me know.

Update:
Hand coding the routine in Fortran (rather than passing a function) gave an order of magnitude improvement. Hand coding the Scheme routine decreased performance by about 25%. Changing from mutation with set! to recursion increased Scheme's performance by about 10%.
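
For readers unsure what "hand coding" versus "passing a function" means here, the sketch below shows the two styles side by side, again in Python for illustration only. The function names, the step size h, and the central-difference stencil are assumptions, not the code that was actually timed, and the sketch says nothing about the relative speed of the two styles in Scheme or Fortran.

```python
def second_derivative_generic(u, x, h=1.0e-3):
    """Function-passing style: u is supplied by the caller."""
    def t(y):
        # t(y) = du/dy by central difference (assumed scheme)
        return (u(y + h) - u(y - h)) / (2.0 * h)
    # f(x) = dt/dx, applying the same difference to t
    return (t(x + h) - t(x - h)) / (2.0 * h)

def second_derivative_hand_coded(x, h=1.0e-3):
    """Hand-coded style: u(x) = x*x + x + 1 is written into the stencil."""
    xp = x + 2.0 * h
    xm = x - 2.0 * h
    up = xp * xp + xp + 1.0
    u0 = x * x + x + 1.0
    um = xm * xm + xm + 1.0
    # Expanding the nested central differences gives this three-point form
    return (up - 2.0 * u0 + um) / (4.0 * h * h)

def u_poly(x):
    return x * x + x + 1.0

if __name__ == "__main__":
    print(second_derivative_generic(u_poly, 1.0))
    print(second_derivative_hand_coded(1.0))
```

Both functions evaluate the same stencil, so they should agree up to rounding. The hand-coded version gives up generality in exchange for handing the compiler a single straight-line expression.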

Conclusion:
My best Fortran version is about 3 times faster than my best Scheme version.