Hacker News new | ask | show | jobs
by willvarfar 3038 days ago
This is just off the top of my head, but it made me wonder: are there any VMs that put a stack map header of some sort as a literal in the stack?

E.g. for each frame the compiler orders roots first and then other primitives. Then, as you enter the frame, write the number of roots to the stack. When the GC walks the stack it can see precisely which are roots.

5 comments

It can be done. In fact something very close to that is done in the Delphi compiler.

Delphi has a variety of reference counted types. Originally just strings, but later dynamic arrays and COM-style interfaces, all use automatic reference counting. Assignment and copying of these types are done via runtime calls, not just for variables of these types, but also structures and static arrays, recursively, that contain these types.

When allocating locals of these types, and of types that contain these types, the compiler also writes out an RTTI structure describing the stack layout, a bit like the activation record was an actual Pascal record. This RTTI is used to correctly decrement references when the stack is popped in normal case, or during exception unwinding.

The RTTI scheme is smart enough to encode several variables of the same type as being like a static array of that type, etc.

All this doesn't help with liveness, of course, which will still be a problem for the code presented in the article. In fact the efficiency of the encoding is in direct opposition to liveness; encoding things as contiguous blocks of pointers will most likely artificially extend their lifetime to the whole call frame.

That could be done, but generally precise collectors keep a fixed set of tables outside the stack, that way they only need to figure out which method frames are on the stack and then consult those tables. That incurs fewer writes.
The paper "Accurate Garbage Collection in an Uncooperative Environment" goes over a very roughly similar technique for compiling a GC language to C in a way that lets you easily find the roots because managed objects are stuffed in structs on the C stack in a certain way.

It's a really really neat paper that I've been itching to implement for a while.

The problem is that it all depends on the liveness of the variables. The same value on the stack can be a root at the begining of the method as the corresponding variable is still alive, and later it becomes useless as the variable is already dead. So, you still need to know a location of each live reference at every safepoint (and that means several stack-maps for each method).
You can zero them when they are not live.
Yeah, we've tried that. It was one of the attempts of improving conservative GC (no dead values on the stack => no false roots). Unfortunately, it causes noticeable performance degradation, so it is easier to consult with stack maps about liveness of the variable.
Could you use debug information for this? I think the stack frame layout should already be included there.