Work on CClasp (Clasp using Robert Strandh’s Cleavir compiler) is moving forward. Here is some timing data that I generated comparing CClasp performance to C++, SBCL, and Python.
NOTE: This is a narrow test of an algorithm that uses FIXNUM arithmetic. I have inlined simple FIXNUM arithmetic (+, -, <, =, >, and fixnump), so these operations are fast. Code that uses other functions will run a lot slower until inlining is implemented more broadly.
I’m calculating the 78th Fibonacci number 10,000,000 times in each case. For these integer-arithmetic-heavy functions, CClasp performs pretty well (~4x slower than C++). Once type inference and a few other optimizations are added, CClasp should generate faster code.
Note: There are compiler settings (loop unrolling) where the C++ code runs even faster than SBCL. It is only for this specific test, with the compiler settings below, that SBCL comes out a little faster than C++. I don’t want to start an argument about the speed of SBCL vs C++ here. My point is that CClasp has come a long way, from being hundreds of times slower than C++ to within a factor of 4.
Here is the C++ code. It converts the numbers back and forth from Common Lisp representations:
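Integer_sp core_cxxFibn(Fixnum_sp reps, Fixnum_sp num) {
  // Unbox the Common Lisp fixnums into native integers.
  long int freps = clasp_to_fixnum(reps);
  long int fnum = clasp_to_fixnum(num);
  // z starts at 0 so the result is defined even when no iteration runs.
  long int p1, p2, z = 0;
  for ( long int r = 0; r<freps; ++r ) {
    p1 = 1;
    p2 = 1;
    long int rnum = fnum - 2;
    for ( long int i=0; i<rnum; ++i ) {
      z = p1 + p2;
      p2 = p1;
      p1 = z;
    }
  }
  // Box the native result back into a Common Lisp integer.
  return Integer_O::create(z);
}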
Here is the Common Lisp code:
(defun fibn (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let* ((p1 1)
             (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))
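As a quick sanity check on the (unsigned-byte 53) declaration above, the following Python sketch confirms that the 78th Fibonacci number still fits in 53 unsigned bits while the 79th does not. The helper name `fib` and the constant `LIMIT` are introduced here for illustration only and are not part of the benchmark; the indexing assumes fib(1) = fib(2) = 1, matching the loops in the listings.

```python
def fib(n):
    """Return the n-th Fibonacci number, with fib(1) == fib(2) == 1."""
    p1, p2 = 1, 1
    # Same recurrence as the benchmark: n - 2 additions starting from (1, 1).
    for _ in range(n - 2):
        p1, p2 = p1 + p2, p1
    return p1

# Exclusive upper bound of the Common Lisp type (unsigned-byte 53).
LIMIT = 2 ** 53

print(fib(78))          # 8944394323791464
print(fib(78) < LIMIT)  # True
print(fib(79) < LIMIT)  # False
```

Under these assumptions, 78 is the largest index whose result stays inside the declared type, which is consistent with the choice of 78 for the test.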
Here is the Python code:
import time

def fibn(reps,num):
    for r in range(0,reps):
        p1 = 1
        p2 = 1
        rnum = num - 2
        for i in range(0,rnum):
            z = p1 + p2
            p2 = p1
            p1 = z
    return z

start = time.time()
res = fibn(10000000, 78)
end = time.time()
print( "Result = %d\n" % res)
print( "elapsed time: %f seconds\n" % (end-start))
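To compare runs across languages, it can help to convert an elapsed time into a cost per inner-loop addition. The sketch below does that arithmetic for the parameters used in this test; the function names `total_additions` and `ns_per_addition` are hypothetical helpers, and the elapsed value in the usage line is a placeholder, not a measured result.

```python
def total_additions(reps, num):
    """Number of inner-loop additions: reps outer passes of (num - 2) steps."""
    return reps * (num - 2)

def ns_per_addition(elapsed_seconds, reps, num):
    """Average nanoseconds per inner-loop addition for one timed run."""
    return elapsed_seconds * 1e9 / total_additions(reps, num)

# The benchmark parameters: 10,000,000 repetitions of the 78th Fibonacci number.
print(total_additions(10000000, 78))  # 760000000

# Placeholder elapsed time of 0.76 s, chosen only to make the arithmetic easy
# to follow; it is not a timing from any of the implementations.
print(ns_per_addition(0.76, 10000000, 78))
```

Each timed run therefore covers 760 million additions, so a difference of one nanosecond per addition shows up as roughly 0.76 seconds of wall-clock time.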
More details.
CClasp version is 0.3-test-10
It was compiled using settings:
"clang++" -x c++ -O3 -gdwarf-4 -g -Wgnu-array-member-paren-init -Wno-attributes -Wno-deprecated-register -Wno-unused-variable -ferror-limit=999 -fvisibility=default -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk -mmacosx-version-min=10.7 -std=c++11 -stdlib=libc++ -O3 -O3 -gdwarf-4 -g -O3 -Wno-inline -DBUILDING_CLASP -DCLASP_GIT_COMMIT=\"cf99526\" -DCLASP_VERSION=\"0.3-test-10\" -DCLBIND_DYNAMIC_LINK -DDEBUG_CL_SYMBOLS -DDEBUG_FLOW_CONTROL -DEXPAT -DINCLUDED_FROM_CLASP -DINHERITED_FROM_SRC -DNDEBUG -DPROGRAM_CANDO -DREADLINE -DTRACK_ALLOCATIONS -DUSE_BOEHM -DUSE_CLASP_DYNAMIC_CAST -DUSE_STATIC_ANALYZER_GLOBAL_SYMBOLS -D_ADDRESS_MODEL_64 -D_RELEASE_BUILD -D_TARGET_OS_DARWIN -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I"../../../../include" -I"../../../../projects/cando/include" -I"../../../../src" -I"../../include" -I"../../src/cffi" -I"../../src/core" -I"../../src/gctools" -I"../../src/llvmo" -I"../../src/main" -I"../../src/serveEvent" -I"../../src/sockets" -I"/Users/meister/Development/externals-clasp/build/common/include" -I"/Users/meister/Development/externals-clasp/build/release/include" -c -o "../../src/main/bin/boehm/cando/clang-darwin-4.2.1/release/link-static/main.o" "../../src/main/main.cc"
The C++ code is embedded within Clasp and is thus compiled with the same settings.
The Clang version is 3.6.1
The Python version is 2.7.6. I ran the Python code using: python fib.py
The SBCL version is: SBCL 1.2.11
These tests were run on a MacBook Pro (Retina, 15-inch, Early 2013).