In preparation for making Clasp generate performant code, I’ve switched to using tagged pointers and immediate fixnums, characters and single-floats. This is a common technique in dynamic language implementations where you represent small values directly within pointers rather than storing them on the heap. The tagging scheme in Clasp is as follows for 64-bit systems (32-bit is not supported at the moment).
A pointer provides 64 bits (63 … 0) to work with.
| Type | Tag | Description |
|---|---|---|
| FIXNUM | #b00 | The other 62 bits store a signed FIXNUM. Because the tag bits are zero, certain arithmetic operations (addition, subtraction, comparison) can be very fast. |
| GENERAL | #b001 | A pointer to a general object on the heap. The pointer is 8-byte aligned in memory, so its lower three bits are free to hold the tag. |
| CONS | #b011 | A pointer to a CONS cell. Robert Strandh suggested this so that Common Lisp code can easily distinguish CONS cells from non-CONS cells and speed up list traversal. |
| VA-LIST | #b101 | A pointer to a wrapped va_list structure pointing to a list of arguments on the stack. |
| CHARACTER | #b010 | A character stored as a 32-bit value (>> 5 to unbox). |
| SINGLE-FLOAT | #b110 | A single-precision C/C++ float, a 32-bit value (>> 5 to unbox). |
| GCTAG | #b111 | Headerless objects like CONS cells use this tag to indicate that the tagged word is used by the garbage collector. |
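To make the scheme a bit more concrete, here is a minimal C++ sketch of boxing and unboxing fixnums and tagging a CONS pointer. Only the tag values come from the table above; every name in it (Word, tag_fixnum, fixnum_add, tag_cons and so on) is made up for illustration and is not Clasp's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; only the tag values are from the table.
using Word = std::uint64_t;

constexpr Word kFixnumMask = 0b11;   // FIXNUM is identified by the low two bits
constexpr Word kFixnumTag  = 0b00;
constexpr Word kPtrMask    = 0b111;  // pointer tags live in the low three bits
constexpr Word kGeneralTag = 0b001;
constexpr Word kConsTag    = 0b011;

// Box: shift the value left by two so the tag bits are zero.
constexpr Word tag_fixnum(std::int64_t n) { return static_cast<Word>(n) << 2; }

// Unbox: shift right by two. This assumes an arithmetic (sign-preserving)
// shift, which mainstream compilers provide and C++20 guarantees.
constexpr std::int64_t untag_fixnum(Word w) {
  return static_cast<std::int64_t>(w) >> 2;
}

constexpr bool is_fixnum(Word w)  { return (w & kFixnumMask) == kFixnumTag; }
constexpr bool is_general(Word w) { return (w & kPtrMask) == kGeneralTag; }
constexpr bool is_cons(Word w)    { return (w & kPtrMask) == kConsTag; }

// With a zero tag, (a << 2) + (b << 2) == (a + b) << 2, so two boxed
// fixnums can be added without unboxing (overflow checks omitted here).
constexpr Word fixnum_add(Word a, Word b) { return a + b; }

// Tag an 8-byte aligned address as a CONS pointer.
inline Word tag_cons(void* p) {
  Word raw = reinterpret_cast<std::uintptr_t>(p);
  assert((raw & kPtrMask) == 0 && "pointer must be 8-byte aligned");
  return raw | kConsTag;
}

// Clear the low three bits to recover the real address.
inline void* untag_pointer(Word w) {
  return reinterpret_cast<void*>(static_cast<std::uintptr_t>(w & ~kPtrMask));
}
```

Comparison works the same way as addition: the ordering of two boxed fixnums, read as signed 64-bit integers, matches the ordering of the values they hold.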
I’m especially interested in people’s thoughts on embedding something like a double-precision float within 61 bits. I’m 99.9(repeating)% sure it can’t be done but I’d love something like that. Arithmetic, Arrr! she be a harsh mistress.
All pointer access in Clasp is performed using a C++ template class called smart_ptr, and it was relatively easy to modify this class to manage the tagged pointers. The untagging/tagging of OTHER-PTR and CONS-PTR pointers is carried out by overloading the C++ dereferencing operators, operator-> and operator*, of the smart_ptr template class.
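The idea of hiding the untagging behind the dereferencing operators can be sketched roughly like this. The class tagged_ptr and the struct DemoCons are hypothetical stand-ins, not Clasp's real smart_ptr, and the sketch assumes the tag always sits in the low three bits of an 8-byte aligned address.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kTagMask = 0b111;  // low three bits hold the tag

// Hypothetical stand-in for a tagged smart pointer.
template <typename T>
class tagged_ptr {
 public:
  // Store the address with the tag OR-ed into the low bits.
  tagged_ptr(T* raw, std::uintptr_t tag)
      : bits_(reinterpret_cast<std::uintptr_t>(raw) | tag) {
    assert((reinterpret_cast<std::uintptr_t>(raw) & kTagMask) == 0);
    assert((tag & ~kTagMask) == 0);
  }

  std::uintptr_t tag() const { return bits_ & kTagMask; }

  // Strip the tag to get the real address back.
  T* untagged() const { return reinterpret_cast<T*>(bits_ & ~kTagMask); }

  // Callers dereference as usual; the untagging happens here.
  T* operator->() const { return untagged(); }
  T& operator*() const { return *untagged(); }

 private:
  std::uintptr_t bits_;
};

// Toy object type, forced to 8-byte alignment so the low bits are free.
struct alignas(8) DemoCons {
  long car;
  long cdr;
};
```

Code that uses such a pointer never sees the tag: `p->car` and `(*p).cdr` behave as they would with a plain pointer.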
It was a bit tricky to get these immediate values to play nice with C++ types and inheritance because the Clasp C++ code does not treat all Common Lisp types as a single undifferentiated pointer type. Clasp uses C++ types like T_sp, List_sp, Cons_sp, Float_sp, Symbol_sp, HashTable_sp etc. to distinguish different Common Lisp pointer types from each other within the C++ code. An assignment like:
T_sp x = y;
is valid and very common within the Clasp C++ code. But:
Symbol_sp s = gc::As<Symbol_sp>(a);
will signal a Common Lisp error if a doesn’t point to a SYMBOL at the time of the assignment to s.
To deal with this I defined a template class, TaggedCast<ToPtr,FromPtr>, that provides an “isA” function and a “castOrNULL” function. These can be specialized to simulate inheritance between the classes that represent immediate values and the regular C++ class hierarchy.
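Here is a rough sketch of how such a cast template might look. The names TaggedCast, isA and castOrNULL are from the text above, but the signatures, the toy class hierarchy (T_O, Integer_O, Bignum_O, Symbol_O), the Fixnum_I marker type and the helper functions are all assumptions made for this example, and the sketch relies on single inheritance so that every object shares its address with its T_O base.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

using Word = std::uintptr_t;
constexpr Word kFixnumMask = 0b11, kFixnumTag = 0b00;
constexpr Word kPtrMask = 0b111, kGeneralTag = 0b001;

// Toy heap hierarchy (hypothetical names).
struct T_O { virtual ~T_O() = default; };
struct Integer_O : T_O {};
struct Bignum_O : Integer_O {};
struct Symbol_O : T_O {};
struct Fixnum_I {};  // marker for immediate fixnums; never allocated

template <typename P> Word bits(P p) { return reinterpret_cast<Word>(p); }
inline bool is_fixnum_word(Word w) { return (w & kFixnumMask) == kFixnumTag; }
inline T_O* heap_object(Word w) { return reinterpret_cast<T_O*>(w & ~kPtrMask); }

// Tag the address of a heap object as GENERAL.
template <typename T> T* tag_general(T* raw) {
  assert((bits(raw) & kPtrMask) == 0);
  return reinterpret_cast<T*>(bits(raw) | kGeneralTag);
}

// Primary template: only GENERAL pointers qualify; untag, then dynamic_cast.
template <typename ToPtr, typename FromPtr>
struct TaggedCast {
  using To = typename std::remove_pointer<ToPtr>::type;
  static bool isA(FromPtr p) {
    Word w = bits(p);
    return (w & kPtrMask) == kGeneralTag &&
           dynamic_cast<To*>(heap_object(w)) != nullptr;
  }
  static ToPtr castOrNULL(FromPtr p) {
    return isA(p) ? reinterpret_cast<ToPtr>(p) : nullptr;
  }
};

// Immediate fixnum: decided by the tag bits alone, no memory is touched.
// Caveat of this sketch: fixnum 0 is the all-zero word, so castOrNULL
// cannot tell it apart from NULL; use isA for immediates.
template <typename FromPtr>
struct TaggedCast<Fixnum_I*, FromPtr> {
  static bool isA(FromPtr p) { return is_fixnum_word(bits(p)); }
  static Fixnum_I* castOrNULL(FromPtr p) {
    return isA(p) ? reinterpret_cast<Fixnum_I*>(p) : nullptr;
  }
};

// "Simulated inheritance": an Integer is either an immediate fixnum
// or a heap object derived from Integer_O.
template <typename FromPtr>
struct TaggedCast<Integer_O*, FromPtr> {
  static bool isA(FromPtr p) {
    Word w = bits(p);
    if (is_fixnum_word(w)) return true;
    return (w & kPtrMask) == kGeneralTag &&
           dynamic_cast<Integer_O*>(heap_object(w)) != nullptr;
  }
  static Integer_O* castOrNULL(FromPtr p) {
    return isA(p) ? reinterpret_cast<Integer_O*>(p) : nullptr;
  }
};
```

The point of the Integer_O specialization is that a fixnum, which is not a C++ object at all, still answers "yes" when asked whether it is an integer, while the primary template keeps handling ordinary heap objects.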