Linking LLVM bitcode files for a dynamic language

Clasp Common Lisp is a dynamic language in which every top-level form needs to be evaluated in the top level environment. Clasp compiles Common Lisp code to bitcode files and then links them together into a shared library or an executable. When the library or the executable are loaded, each top-level form needs to be evaluated. This is until Clasp gains the ability to save a running environment to a file, which it doesn’t have yet. Even then, the ability to play back the top-level forms will be needed to create the environment to write to a file.

So the compiled bitcode files need to keep track of the top-level forms so that they can be played back at startup. Clasp does this by defining a “main” function with internal linkage (called “run-all”) for each llvm::Module. “run-all” evaluates every top-level form that was compiled into the llvm:Module. The tricky part is how does this main function get exposed to the outside world so that it can be called if a single bitcode file is loaded into clasp or linked together into a library or executable and invoked with other “run-all” functions from other modules.

Clasp creates a global variable in each module called “global-run-all-array” that stores an array of initially one function pointer that points to the module’s “run-all” function.  The “global-run-all-array” global variable is defined with “appending” linkage. What this does is when bitcode files get linked together by the system linker, the “global-run-all-array” will have all of the “run-all” functions appended together and put back into the “global-run-all-array” global variable.

Then there is the problem to determine the number of entries in the “global-run-all-array”.  Clasp solves that by ensuring that the last module linked in a list of modules has a two element “global-run-all-array” where the second element is NULL and by adding a second global variable called “global-epilogue”.

When a bitcode file or a shared library is loaded into Clasp, it checks for the “global-epilogue” symbol, if it finds it then it knows that the “global-run-all-array” contains a NULL terminated array of function pointers to call.   If “global-epilogue” is not present, then it knows that “global-run-all-array” contains a single function pointer.

Clasp then invokes each of the “global-run-all-array” functions one after the other and each one of them invokes the compiled functions for the top-level forms for each of the bitcode files.

This only takes a few seconds when starting up Clasp.

Note: I haven’t been actively blogging – because I’m very, very actively programming.  If you want to say hi, I’m on IRC, #clasp almost every day.

Improving the Clasp programming experience

I’ve been working quietly to improve the Clasp programming experience by implementing a C++ scraper for Clasp that extracts all information from the C++ code required to expose that code to Common Lisp.

This means functions, classes, methods, symbols, enums and initialization functions are identified by the scraper and code is automatically generated to expose these entities to the Common Lisp programming environment.  This replaces thousands of calls that were hand coded by me and were called at startup to expose C++ functionality to Common Lisp with automatically generated code.  To expose a C++ function now all you do is some variation on this:

namespace ext {
CL_LAMBDA(x y &optional (z 0));
CL_DOCSTRING(R”doc(* Arguments
– x :: A fixnum
– y :: A fixnum
– z :: A fixnum
* Description
Add up to three fixnums together as long as they fit into an int.
If they don’t, an error will be signaled.)doc”);
CL_DEFUN int add_numbers(int x, int y, int z) {
return x + y;

The CL_LAMBDA(…), CL_DOCSTRING(…), and CL_DEFUN tags are read from the C++ source code by the scraper and code is generated to enable things like (ext:add-numbers 4 5) to be called from Common Lisp.  Above is a complicated example using the optional CL_LAMBDA(…) and CL_DOCSTRING(…) macros – something as simple as below is all that is necessary to bind the C++ function to the ext:add-two-numbers symbol:

CL_DEFUN int ext__add_two_numbers(int x, int y) {return x+y;}

The C-preprocessor is run on all of the source code with macros defined for tags like CL_DEFUN, CL_DEFMETHOD, CL_DOCSTRING(…), LISP_CLASS(…) etc. prior to the scraper searching for the tags. The scraper then generates code that is run at startup to expose everything. The scraper is run every time Clasp is built but it’s smart about only rebuilding generated code that needs to be rebuilt.

There are several benefits to this: (1) adding functions, classes, methods etc. to Clasp Common Lisp is easier now because you just add the entity and there are no more worries about where to add the call at startup to expose the entity. (2) The scraper builds a database of source location for all of these entities. So now, from Slime you can hit M-. on a Clasp symbol and emacs will jump to the source in the C++ source or Common Lisp source – wherever it is defined.

At the same time I’ve added code to extract and generate lambda-lists and docstrings for C++ functions and methods and they are also available now within the Common Lisp environment.

The scraper is written in standard Common Lisp and sbcl has been added as a build dependency for Clasp. The python scrapers Clasp used up to this point have all been retired and python is no longer a dependency for Clasp.

Why Common Lisp for Scientific Programming?

What makes any language suitable for scientific computing?

The most important goal is translating ideas into fast code and building on other people’s work. We are working on improving Clasp’s ability to generate fast code based on the excellent LLVM library and Clasp can expose C++/C/Fortran libraries to build on the work of others. I’ve programmed in many languages and Common Lisp is the best at expressing ideas. Every language gets translated into an abstract syntax tree on its way to native code, Lisp code is an abstract syntax tree.  There is no programming concept that can’t be expressed compactly in Lisp, this is not true in other languages.  You cannot yet express multiple dispatch functions, dynamic variables or Common Lisp style macros (a few CL features) compactly in R, Fortran, C++, or Python.

Why are R, Fortran, C++, or Python considered suitable for scientific computing?

It is the wrong question – those languages are just the languages that people started using and so they keep using them.  It is not a choice most people make – it is inertia.

I choose Common Lisp.

Clasp 0.4 – Joining Common Lisp and C++

Announcing Clasp 0.4 – a new release that incorporates a brand new compiler – capable of generating 200x faster code than previously, many bug fixes, a more complete Common Lisp implementation, and C++ interoperation.

Get Clasp 0.4 here

New features:

  • Clasp has a completely new, optimizing/inlining compiler called cclasp!
  • Fixnum, character and single-float types are immediate values.
  • General object pointers and cons pointers are tagged for speed.
  • Clbind library allows programmers to expose external C++ libraries.
  • Lots of bug fixes and stability improvements.

What is Clasp?

Clasp is a new Common Lisp implementation that seamlessly interoperates with C++ libraries using LLVM for compilation to native code. This allows Clasp to take advantage of a vast array of preexisting libraries and programs, such as out of the scientific computing ecosystem. Embedding them in a Common Lisp environment allows you to make use of rapid prototyping, incremental development, and other capabilities that make Common Lisp such a powerful language.

Getting Clasp

Get Clasp 0.4 here

Precompiled and prepackaged versions of Clasp will be available for a limited number of distributions. Check the releases to see if there is something available for you.

At the moment, Clasp is supported on Linux and Mac OS X. On these systems, you should be able to build it from source if a pre-made package is not available or workable for you. In case you cannot get it to compile even with the instructions below, the quickest way to get help is to either file an issue, or to chat with us directly.

Building on most systems will take around 4GB of RAM and 2-4 hours with a relatively modern processor.

Acknowledgements (#clasp IRC channel nicknames)

Robert Strandh (beach) – the Cleavir compiler and guidance.

Shinmera – build testing and organization.

stassats – guidance, bug finding and speeding up the compiler.

flash- – feedback and debugging of the clbind library.

SAL9000 – designing list iterator and feedback on ASTMatcher library.

faheem – guidance in setting up the build system.

Timing data comparing CClasp to C++, SBCL and Python

Work on CClasp (Clasp using Robert Strandh’s Cleavir compiler) is moving forward, here is some timing data that I generated comparing CClasp performance to C++, SBCL and Python.

NOTE: this test is a specific test of an algorithm that uses FIXNUM arithmetic. I have inlined simple FIXNUM arithmetic (+, -, <, =, >, and fixnump) and so these operations are fast. Code that uses other functions will run a lot slower until inlining is implemented more broadly.



I’m calculating the 78th Fibonacci number 10,000,000 times in each case. For these integer arithmetic heavy functions, CClasp performs pretty well (~4x slower than C++). Once type inference is added as well as a few other optimizations CClasp should be generating performant code.

Note: There are compiler settings (loop unrolling) where the C code runs even faster than SBCL, it’s just for this specific test, with the compiler settings below that SBCL comes out a little faster than C++. I don’t want to start an argument about the speed of SBCL vs C++ here, my point is that CClasp has come a long way from being hundreds of times slower than C++ to within a factor of 4.

Here is the C++ code, it converts the numbers back and forth from Common Lisp representations:

Integer_sp core_cxxFibn(Fixnum_sp reps, Fixnum_sp num) {
  long int freps = clasp_to_fixnum(reps);
  long int fnum = clasp_to_fixnum(num);
  long int p1, p2, z;
  for ( long int r = 0; r<freps; ++r ) {
    p1 = 1;
    p2 = 1;
    long int rnum = fnum - 2;
    for ( long int i=0; i<rnum; ++i ) {
      z = p1 + p2;
      p2 = p1;
      p1 = z;
  return Integer_O::create(z);

Here is the Common Lisp code:

(defun fibn (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let* ((p1 1)
             (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))

Here is the Python code:

import time
def fibn(reps,num):
    for r in range(0,reps):
        p1 = 1
        p2 = 1
        rnum = num - 2
        for i in range(0,rnum):
            z = p1 + p2
            p2 = p1
            p1 = z
    return z
start = time.time()
res = fibn(10000000, 78)
end = time.time()
print( "Result = %f\n", res)
print( "elapsed time: %f seconds\n" % (end-start))

More details.

CClasp version is 0.3-test-10

It was compiled using settings:
“clang++” -x c++ -O3 -gdwarf-4 -g -Wgnu-array-member-paren-init -Wno-attributes -Wno-deprecated-register -Wno-unused-variable -ferror-limit=999 -fvisibility=default -isysroot /Applications/ -mmacosx-version-min=10.7 -std=c++11 -stdlib=libc++ -O3 -O3 -gdwarf-4 -g -O3 -Wno-inline -DBUILDING_CLASP -DCLASP_GIT_COMMIT=\”cf99526\” -DCLASP_VERSION=\”0.3-test-10\” -DCLBIND_DYNAMIC_LINK -DDEBUG_CL_SYMBOLS -DDEBUG_FLOW_CONTROL -DEXPAT -DINCLUDED_FROM_CLASP -DINHERITED_FROM_SRC -DNDEBUG -DPROGRAM_CANDO -DREADLINE -DTRACK_ALLOCATIONS -DUSE_BOEHM -DUSE_CLASP_DYNAMIC_CAST -DUSE_STATIC_ANALYZER_GLOBAL_SYMBOLS -D_ADDRESS_MODEL_64 -D_RELEASE_BUILD -D_TARGET_OS_DARWIN -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I”../../../../include” -I”../../../../projects/cando/include” -I”../../../../src” -I”../../include” -I”../../src/cffi” -I”../../src/core” -I”../../src/gctools” -I”../../src/llvmo” -I”../../src/main” -I”../../src/serveEvent” -I”../../src/sockets” -I”/Users/meister/Development/externals-clasp/build/common/include” -I”/Users/meister/Development/externals-clasp/build/release/include” -c -o “../../src/main/bin/boehm/cando/clang-darwin-4.2.1/release/link-static/main.o” “../../src/main/”

The C-code is embedded within Clasp and is thus compiled with the same settings.

The Clang version is 3.6.1

The Python version is 2.7.6. I ran the python code using: python

The SBCL version is: SBCL 1.2.11

These were run on a MacBook Pro (Retina, 15-inch, Early 2013)

I gave a talk on Clasp and my Chemistry at Google in Cambridge Mass last week

Here is the link for the talk.

This talk describes our unique approach to constructing large, atomically precise molecules (called “Molecular Lego” or “spiroligomers”) that could act as new therapeutics, new catalysts (molecules that make new chemical reactions happen faster) and ultimately to construct atomically precise molecular devices. Then I describe Clasp and CANDO, a new implementation of the powerful language Common Lisp. Clasp is a Common Lisp compiler that uses LLVM to generate fast machine code and it interoperates with C++. CANDO is a molecular design tool that uses Clasp as its programming language. Together I believe that these are the hardware (molecules) and the software (the CANDO/Clasp compiler) that will enable the development of sophisticated molecular nanotechnology.

For more info see:

What a great place Google was! My host, Martin Cracauer was fantastic, he made me feel really, really welcome and made sure that the talk would be recorded and put up on the web. He arranged it so that I could spend the afternoon talking with him and Doug and James, two Lisp/compiler gurus at Google. He also gave me a tour of Google, it was great.