Linking LLVM bitcode files for a dynamic language

Clasp Common Lisp is a dynamic language in which every top-level form needs to be evaluated in the top-level environment. Clasp compiles Common Lisp code to bitcode files and then links them together into a shared library or an executable. When the library or executable is loaded, each top-level form needs to be evaluated. This will remain the case until Clasp gains the ability to save a running environment to a file, which it doesn’t have yet. Even then, the ability to play back the top-level forms will be needed to create the environment that gets written to the file.

So the compiled bitcode files need to keep track of the top-level forms so that they can be played back at startup. Clasp does this by defining a “main” function with internal linkage (called “run-all”) for each llvm::Module. “run-all” evaluates every top-level form that was compiled into the llvm::Module. The tricky part is exposing this main function to the outside world, so that it can be called when a single bitcode file is loaded into Clasp, or invoked alongside the “run-all” functions of other modules when the file is linked into a library or executable.
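As a rough illustration of what one module’s “run-all” amounts to, here is a minimal C++ sketch. It is not Clasp’s generated code: the names `evaluation_log`, `top_level_form_0`, `top_level_form_1` and `run_all` are hypothetical, and the anonymous namespace only stands in for internal linkage.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log standing in for the side effects of evaluating forms.
std::vector<std::string> evaluation_log;

namespace {  // anonymous namespace: internal linkage, like the module's "run-all"

// Stand-ins for the compiled functions of two top-level forms.
void top_level_form_0() { evaluation_log.push_back("(defvar *x* 1)"); }
void top_level_form_1() { evaluation_log.push_back("(defun f () *x*)"); }

// Plays back every top-level form of this module, in source order.
void run_all() {
    top_level_form_0();
    top_level_form_1();
}

}  // namespace
```

Because `run_all` has internal linkage, nothing outside the module can name it directly, which is exactly the exposure problem described above.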

Clasp creates a global variable in each module called “global-run-all-array” that stores an array, initially holding one function pointer, that points to the module’s “run-all” function. The “global-run-all-array” global variable is defined with “appending” linkage. This means that when bitcode files are linked together by the system linker, their “global-run-all-array” arrays are appended together, and the combined array of “run-all” function pointers becomes the “global-run-all-array” global variable of the result.
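The effect of appending linkage can be modelled in plain C++ as concatenating per-module arrays. This is only a simulation under stated assumptions: `link_appending`, `module_a_array` and `module_b_array` are hypothetical names, real linking happens on LLVM globals rather than `std::vector`, and the sketch simply keeps the order in which the modules are passed in.

```cpp
#include <cassert>
#include <vector>

using RunAllFn = void (*)();

// Hypothetical counters so that each call has an observable effect.
int module_a_runs = 0;
int module_b_runs = 0;

static void run_all_a() { ++module_a_runs; }
static void run_all_b() { ++module_b_runs; }

// Before linking, each module has a one-element array with its own run-all.
const std::vector<RunAllFn> module_a_array = {run_all_a};
const std::vector<RunAllFn> module_b_array = {run_all_b};

// Models the result of appending linkage: the arrays contributed by the
// linked modules are concatenated into one array.
std::vector<RunAllFn> link_appending(
        const std::vector<std::vector<RunAllFn>>& modules) {
    std::vector<RunAllFn> merged;
    for (const auto& arr : modules) {
        merged.insert(merged.end(), arr.begin(), arr.end());
    }
    return merged;
}
```

Calling `link_appending({module_a_array, module_b_array})` yields a two-element array, one entry per module, but nothing in the array itself says where it ends.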

That leaves the problem of determining the number of entries in the “global-run-all-array”. Clasp solves it by ensuring that the last module in a list of linked modules has a two-element “global-run-all-array” whose second element is NULL, and by adding a second global variable called “global-epilogue”.

When a bitcode file or a shared library is loaded into Clasp, it checks for the “global-epilogue” symbol. If it finds it, then it knows that the “global-run-all-array” contains a NULL-terminated array of function pointers to call. If “global-epilogue” is not present, then it knows that “global-run-all-array” contains a single function pointer.
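The check above, together with the calls it leads to, might look roughly like the following C++ sketch. The function `invoke_run_all` and its `has_epilogue` flag are assumptions made for illustration; in Clasp the flag would come from a symbol lookup for “global-epilogue”, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>

using RunAllFn = void (*)();

// Hypothetical counter so that each call has an observable effect.
int calls = 0;
static void run_all_a() { ++calls; }
static void run_all_b() { ++calls; }

// Calls the run-all functions stored in `array` and returns how many ran.
// `has_epilogue` models whether the "global-epilogue" symbol was found.
std::size_t invoke_run_all(RunAllFn const* array, bool has_epilogue) {
    if (!has_epilogue) {
        // Single bitcode file: exactly one function pointer, no terminator.
        array[0]();
        return 1;
    }
    // Linked library or executable: walk until the NULL terminator that
    // the last module contributed.
    std::size_t n = 0;
    while (array[n] != nullptr) {
        array[n]();
        ++n;
    }
    return n;
}

// Linked case: two modules, the last one supplying the trailing NULL.
RunAllFn const linked_array[] = {run_all_a, run_all_b, nullptr};
// Unlinked case: one module, one entry.
RunAllFn const single_array[] = {run_all_a};
```

The flag matters because a single-module array has no terminator, so walking it in search of NULL would read past its end.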

Clasp then invokes each of the “global-run-all-array” functions one after the other, and each one invokes the compiled functions for the top-level forms of its own bitcode file.

This only takes a few seconds when starting up Clasp.

Note: I haven’t been actively blogging – because I’m very, very actively programming.  If you want to say hi, I’m on IRC, #clasp almost every day.

