Why did I write the Clasp compiler?

Imagine that you have a collection of highly optimized C++ libraries that do some amazing stuff in a very time- and memory-efficient way, and you want to write experimental code that does complicated things by calling the APIs of those libraries – what do you do?

You could write everything in C++ – but C++ is a tedious language for writing complex, experimental code because it demands a lot of boilerplate. I write C++ code for a couple of hours a day, and much of that time is consumed by figuring out how to express what I want to do in C++ and then rewriting the code when I realize that it fell short of what I wanted.

You could write a wrapper library or foreign function interface to expose your C++ libraries in some dynamic language like Python or Lua and then write your experimental code in that. However, you will very quickly find that you are spending much of your time writing and maintaining the foreign function interface. You will also encounter nasty problems arising from differences in memory management, exception handling and myriad other details between C++ and the dynamic language you choose.

I’ve spent a lot of time doing both of these things and the problems that I’ve encountered have invariably derailed my ambitions and my ideas.

So I decided to do something different.

I decided to write a language from the ground up that would smoothly interoperate with C++.

I chose the most expressive, well-defined language in existence – Common Lisp.  I’ll say more about why later.

When I started, the LLVM library (http://llvm.org) was just coming into maturity, and I decided to use it as the back-end for my language. I could leverage all of the work done by really smart hobbyists and people at Google, Apple and many other companies to perform much of the back-end code generation. Choosing LLVM would also give me access to the entire LLVM ecosystem of tools, and I would be able to incorporate the Clang C++/C/Objective-C compiler as a C++ library to more closely weave together Common Lisp and C++.

So now we can write complex code in Common Lisp that calls C++ APIs and have C++ code call Common Lisp code while spending very little time worrying about the interface between them.

Two C++ libraries that are already exposed within Clasp are LLVM and Clang. The LLVM library exposes a C++ API for LLVM-IR generation, and the Clasp compiler, which is written in Common Lisp, uses that API. The same API is available to any code running within Clasp, and I think it’s a great playground for exploring LLVM-IR.

I also exposed the Clang C++ compiler front-end library, including the entire Clang AST (Abstract Syntax Tree) and the ASTMatcher library. This enables a programmer to write C++ static analyzers and C++ refactoring tools in Common Lisp! I used it to analyze the entire Clasp source code (165 C++ source files) and automatically build ~10,000 lines of C++ code that interfaces Clasp with the Memory Pool System copying garbage collector by Ravenbrook Limited (http://www.ravenbrook.com/project/mps).

 

About me

I’m a chemistry professor who writes software to design molecules that I hope will make the world a better place. I’ve been programming since I was 12, and I have written code in a lot of programming languages, including BASIC, x86 assembler, Pascal, Prolog, Fortran, Smalltalk, Tcl, PHP, Python, C, C++ and Common Lisp. More on this later.

I run a research group in the Chemistry Department at Temple University in Philadelphia, PA. We have developed a way to make the largest, most complex and most “programmable” molecules outside of biology. We call them “spiroligomers”. They are large molecules whose shape and functional groups are both programmable, which lets us construct molecules that bind proteins as therapeutics and accelerate chemical reactions the way that enzymes do. The goal is to create molecules that can do everything that proteins can do in nature but that can be designed and evolved by human beings. More on this later.