by aidenn0 3038 days ago
Interestingly enough, SBCL on x86/x64 has a conservative, but moving, GC. It can know some, but not all roots precisely, so it pins any objects that are reachable through conservative roots.
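
To make the pinning idea concrete, here is a toy sketch in Python. It is not SBCL's collector: the `Obj` class, the `collect` function, and the flat address model are all made up for illustration, objects have no fields (so there is no transitive tracing), and a real collector may pin at a coarser granularity than single objects. It only shows the rule that an object hit by an ambiguous word stays put, while an object reachable only from precise roots can be moved and its root rewritten.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    addr: int            # current start address (hypothetical flat address space)
    size: int            # size in words
    pinned: bool = False

def collect(heap, precise_roots, ambiguous_words, to_space_base):
    """Toy model: pin objects hit by ambiguous words, move the other live ones."""
    # An ambiguous word may be a pointer or just an integer that looks like one.
    # We cannot rewrite it, so any object it points into must not move.
    for obj in heap:
        obj.pinned = any(obj.addr <= w < obj.addr + obj.size
                         for w in ambiguous_words)

    # Live = pinned (conservatively reachable) or named by a precise root.
    live = [o for o in heap if o.pinned or o.addr in precise_roots]

    # Copy every unpinned live object to to-space and record where it went.
    forwarding = {}
    next_free = to_space_base
    for obj in live:
        if obj.pinned:
            continue
        forwarding[obj.addr] = next_free
        obj.addr = next_free
        next_free += obj.size

    # Precise roots are known to be pointers, so they can be updated safely.
    new_roots = {forwarding.get(r, r) for r in precise_roots}
    return live, new_roots
```

With three objects at 100, 200, and 300, a precise root at 100, and a stack word 202 that may or may not be a pointer, the object at 100 is moved, the one at 200 is pinned in place, and the one at 300 is dropped.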

Its earlier implementations were on RISC chips that had 24 or more GPRs, so the implementation was simple: two stacks, with the local registers divided in half between boxed and unboxed values. This obviously didn't work when porting to x86, which had far fewer registers.

The ARM port, I believe, uses the non-conservative approach, despite having one fewer register than x64 (the x64 port was derived from the x86 port, so it uses the same register assignments).

2 comments

FWIW Clozure CL has a precise gc on x86/x64. The register usage is detailed here: https://ccl.clozure.com/docs/ccl.html#register-and-stack-usa...

> divide the local registers in half for boxed and unboxed values

Fondly remembers the separate address and data registers on 68000. Why didn't they go back to this approach for x86_64 (16 registers), now that no one really cares about 32-bit x86?

The conservative GC approach has worked well enough in practice that nobody is going to do the work. Also, there is a performance tradeoff in non-allocating code: sometimes you need more unboxed registers, other times you need more boxed registers, so with only ~6 of each[1] you will run into register pressure.

1: 2 stacks means 2 stack pointers and 2 frame pointers, leaving only 12 registers for values; it's also possible that the SBCL ABI uses a global register for something else as well, which would leave only 11. PowerPC is a really luxurious platform in which you have 32 GPRs, so even if you use 8 GPRs for various bookkeeping purposes that leaves 24 remaining, which is enough for pretty much everyone.
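
The arithmetic in the footnote can be written out as a quick sketch. `split_budget` is a made-up helper, and the even split between boxed and unboxed registers is an assumption taken from the "divide in half" scheme described earlier in the thread, not from any actual ABI document.

```python
def split_budget(total_gprs, bookkeeping):
    """Return (registers left for values, registers per class if split evenly)."""
    values = total_gprs - bookkeeping
    return values, values // 2

# x64 with two stacks: 2 stack pointers + 2 frame pointers reserved.
print(split_budget(16, 4))   # (12, 6)

# Same, if one more register is reserved globally.
print(split_budget(16, 5))   # (11, 5)

# PowerPC: 32 GPRs with 8 used for bookkeeping.
print(split_budget(32, 8))   # (24, 12)
```

Six boxed plus six unboxed is where the "~6 of each" figure comes from; a function that wants eight unboxed values at once would have to spill even though boxed registers are sitting idle.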