Timing data comparing CClasp to C++, SBCL and Python

Work on CClasp (Clasp using Robert Strandh’s Cleavir compiler) is moving forward, here is some timing data that I generated comparing CClasp performance to C++, SBCL and Python.

NOTE: this test is a specific test of an algorithm that uses FIXNUM arithmetic. I have inlined simple FIXNUM arithmetic (+, -, <, =, >, and fixnump) and so these operations are fast. Code that uses other functions will run a lot slower until inlining is implemented more broadly.

 

timing

I’m calculating the 78th Fibonacci number 10,000,000 times in each case. For these integer arithmetic heavy functions, CClasp performs pretty well (~4x slower than C++). Once type inference is added as well as a few other optimizations CClasp should be generating performant code.

Note: There are compiler settings (loop unrolling) where the C code runs even faster than SBCL, it’s just for this specific test, with the compiler settings below that SBCL comes out a little faster than C++. I don’t want to start an argument about the speed of SBCL vs C++ here, my point is that CClasp has come a long way from being hundreds of times slower than C++ to within a factor of 4.

Here is the C++ code, it converts the numbers back and forth from Common Lisp representations:


Integer_sp core_cxxFibn(Fixnum_sp reps, Fixnum_sp num) {
  long int freps = clasp_to_fixnum(reps);
  long int fnum = clasp_to_fixnum(num);
  long int p1, p2, z;
  for ( long int r = 0; r<freps; ++r ) {
    p1 = 1;
    p2 = 1;
    long int rnum = fnum - 2;
    for ( long int i=0; i<rnum; ++i ) {
      z = p1 + p2;
      p2 = p1;
      p1 = z;
    }
  }
  return Integer_O::create(z);
}

Here is the Common Lisp code:


(defun fibn (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let* ((p1 1)
             (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

Here is the Python code:


import time
def fibn(reps,num):
    for r in range(0,reps):
        p1 = 1
        p2 = 1
        rnum = num - 2
        for i in range(0,rnum):
            z = p1 + p2
            p2 = p1
            p1 = z
    return z
start = time.time()
res = fibn(10000000, 78)
end = time.time()
print( "Result = %f\n", res)
print( "elapsed time: %f seconds\n" % (end-start))

More details.

CClasp version is 0.3-test-10

It was compiled using settings:
“clang++” -x c++ -O3 -gdwarf-4 -g -Wgnu-array-member-paren-init -Wno-attributes -Wno-deprecated-register -Wno-unused-variable -ferror-limit=999 -fvisibility=default -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk -mmacosx-version-min=10.7 -std=c++11 -stdlib=libc++ -O3 -O3 -gdwarf-4 -g -O3 -Wno-inline -DBUILDING_CLASP -DCLASP_GIT_COMMIT=\”cf99526\” -DCLASP_VERSION=\”0.3-test-10\” -DCLBIND_DYNAMIC_LINK -DDEBUG_CL_SYMBOLS -DDEBUG_FLOW_CONTROL -DEXPAT -DINCLUDED_FROM_CLASP -DINHERITED_FROM_SRC -DNDEBUG -DPROGRAM_CANDO -DREADLINE -DTRACK_ALLOCATIONS -DUSE_BOEHM -DUSE_CLASP_DYNAMIC_CAST -DUSE_STATIC_ANALYZER_GLOBAL_SYMBOLS -D_ADDRESS_MODEL_64 -D_RELEASE_BUILD -D_TARGET_OS_DARWIN -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I”../../../../include” -I”../../../../projects/cando/include” -I”../../../../src” -I”../../include” -I”../../src/cffi” -I”../../src/core” -I”../../src/gctools” -I”../../src/llvmo” -I”../../src/main” -I”../../src/serveEvent” -I”../../src/sockets” -I”/Users/meister/Development/externals-clasp/build/common/include” -I”/Users/meister/Development/externals-clasp/build/release/include” -c -o “../../src/main/bin/boehm/cando/clang-darwin-4.2.1/release/link-static/main.o” “../../src/main/main.cc”

The C-code is embedded within Clasp and is thus compiled with the same settings.

The Clang version is 3.6.1

The Python version is 2.7.6. I ran the python code using: python fib.py

The SBCL version is: SBCL 1.2.11

These were run on a MacBook Pro (Retina, 15-inch, Early 2013)

Advertisements

29 thoughts on “Timing data comparing CClasp to C++, SBCL and Python

    • Chemistry is the science that ties together all other science! It is physics, it is math, it is biology, it is astronomy, it is EVERYTHING!

  1. You can compile the Python code with two lines of code (using Numba):

    import numba

    @numba.jit
    def fibn(reps,num):
    ….

    Using this compilation the Python time dropped on my machine from 51 sec to 0.46 sec

  2. Can you benchmark Scala, Go, Erlang, D and Rust too? Especially Rust. Word on the street is that it’s the Next Big Thing. It’s going to take the world by storm! It’s described as “blazingly fast” and real safe too!

  3. I managed to make the python code a little bit faster, nothing ground breaking:

    #!usr/bin/env python3

    import time

    def fibn(reps, num):
    for r in range(reps):
    p1 = p2 = 1
    rnum = num – 2
    for i in range(rnum):
    p2, p1 = p1, p1 + p2
    return p1

    start = time.time()
    res = fibn(10000000, 78)
    end = time.time()

    print(“Result = %f\n” % res)
    print(“elapsed time: %f seconds\n” % (end-start))

    Of course, this does not change too much the things for python or the conclusions of the post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s