

by kazinator 2339 days ago

The strategy in the GC for determining the stack top for hunting GC roots will not work on all architectures.

On aarch64, the address of a local dummy variable may be above a register save area in the stack frame, and thus the scan will miss some GC roots.

In TXR Lisp, I used to use a hacked constant on aarch64: STACK_TOP_EXTRA_WORDS. It wasn't large enough to straddle the area, and so operation on aarch64 was unreliable.

http://www.kylheku.com/cgit/txr/commit/?id=3aa731546c4691fac...
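For illustration, the dummy-variable strategy looks roughly like the sketch below. This is not TXR's actual code: EXTRA_WORDS, naive_stack_base and naive_count_on_stack are made-up names, the fudge factor only mimics the role of STACK_TOP_EXTRA_WORDS, and a downward-growing stack is assumed. Register contents are not handled at all here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fudge factor, in the spirit of STACK_TOP_EXTRA_WORDS. */
#define EXTRA_WORDS 4

/* Highest stack address to scan; the caller records it near program start. */
static uintptr_t *naive_stack_base;

/* Count the stack words equal to needle, scanning upward from a guessed top.
   A real collector would treat each word as a possible object reference. */
size_t naive_count_on_stack(uintptr_t needle)
{
    uintptr_t dummy = 0;
    /* Guessed stack top: the dummy's address, nudged down by the fudge
       factor. Anything the compiler saved below this point is never seen. */
    uintptr_t *p = (uintptr_t *) ((uintptr_t) &dummy
                                  - EXTRA_WORDS * sizeof dummy);
    size_t hits = 0;

    for (; p < naive_stack_base; p++)
        if (*p == needle)
            hits++;

    return hits;
}
```

If the register save area extends more than EXTRA_WORDS below the dummy, the loop starts too high and the saved registers are skipped, which is the failure described above.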

A good stack-top-getting trick occurred to me: call alloca for a small amount of memory and use that address. It has to be below everything. alloca cannot start allocating above some register save area in the frame, because then it would collide with it; alloca has to know the real stack top and work from there.

Since we need to scan registers anyway, we alloca a block the size of the register file (e.g. a setjmp jmp_buf) and save the registers into it: kill two birds with one stone.

http://www.kylheku.com/cgit/txr/commit/?id=7d5f0b7e3613f8e8b...
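A minimal sketch of the alloca idea, under stated assumptions, and not the code from the commit: every name here (mark_fn, gc_set_stack_base, count_ctx, count_word, gc_scan_stack) is hypothetical, the stack is assumed to grow downward, and alloca is assumed to be declared in <alloca.h> (some systems put it in <stdlib.h>). Some C libraries obscure certain saved registers inside jmp_buf, so whether setjmp captures every root is an assumption carried over from the post.

```c
#include <alloca.h>
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

/* Called once for every word found in the scanned range. */
typedef void (*mark_fn)(uintptr_t word, void *ctx);

/* Highest stack address to scan; recorded near program start. */
static uintptr_t *stack_base;

void gc_set_stack_base(void *base)
{
    stack_base = base;
}

/* Example callback: counts the words equal to a given value. */
struct count_ctx {
    uintptr_t needle;
    unsigned hits;
};

void count_word(uintptr_t word, void *ctx)
{
    struct count_ctx *c = ctx;
    if (word == c->needle)
        c->hits++;
}

void gc_scan_stack(mark_fn mark, void *ctx)
{
    /* Per the reasoning above, alloca must allocate from the real stack
       top, so this block should sit below the locals and below any
       register save area of this frame. */
    jmp_buf *regs = alloca(sizeof *regs);
    uintptr_t *p;

    assert(stack_base != 0);

    /* Save the register file into the block so it gets scanned too. */
    setjmp(*regs);

    /* Walk from the alloca block up to the recorded stack base. */
    for (p = (uintptr_t *) regs; p < stack_base; p++)
        mark(*p, ctx);
}
```

A caller would record the base once, for example gc_set_stack_base(&some_local_in_main + 1), and then pass its own marking callback to gc_scan_stack. Frames created by the mark calls land below the alloca block, so they do not affect the scanned range.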

2 comments

naasking 2339 days ago

> On aarch64, the address of a local dummy variable may be above a register save area in the stack frame

Then use two stack frames! Every problem can be solved by adding an additional level of indirection. ;-)

kazinator 2339 days ago

In this case it won't help, because:

0. We are already in a frame that doesn't take any arguments of the "val" object type; how come that's not good enough?

1. The current stack frame is entered with a bunch of callee-saved registers, some of which contain GC roots.

2. The current stack frame's code saves some of them: those ones that it clobbers locally. It leaves others in their original registers.

3. Thus, if another stack frame is called, there are still some callee-saved registers, probably containing GC roots, and some of these will go into the area below the locals.

4. You might think that if we save all the necessary registers ourselves into the stack and then make another stack frame, we would be okay. But in fact, no. By the time we save registers, the compiler-generated function entry has already executed, saved some of those registers into the below-locals save area, and clobbered them for its own use! So our snapshot possibly misses GC roots. The compiler-generated code always has "first dibs" on the incoming registers, to push them into the below-locals save area, thus kicking the GC roots farther up the stack.

z92 2339 days ago

I use argv[0] as stack head.