Hacker News
by wonnage 5458 days ago
Can someone dissect this a little more? My understanding is the pointer to str never gets written to the stack, and so str on the heap might get freed before zstream_append_input makes use of it. But how could the GC see this/what is the faulty assumption?

My understanding is that Ruby GC just runs through its heap of Ruby objects and sees which of them are reachable based on other objects in the Ruby heap and C-stack/registers.

The faulty assumption seems to be that counting references only to RVALUEs (Ruby objects in the heap) is enough to determine whether a part of memory can be freed. This breaks down in C extensions where macros extract some part of the object, or something it points to, for use. In this case RSTRING_PTR extracts the C char array used by str for zstream_append_input to use (let's call it arr).

If zstream_append_input or any call underneath it tries to allocate a new Ruby object, the GC may run, and str (and thus arr) may get freed because there are no references left to it anymore (none on the heap, on the stack, or in a register, because the register value was overwritten).
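To make that concrete, here is a toy model in plain C. It is not Ruby's collector: ToyString, toy_heap, toy_mark and toy_survives are all made-up names, and the "roots" array stands in for whatever the stack and registers hold when a collection starts. The only thing it tries to show is that a collector which recognizes object addresses alone will not count a pointer to the character buffer as a reference to the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Ruby string object (RVALUE). */
typedef struct {
    char *buf;     /* separately stored characters, like the data behind RSTRING_PTR */
    bool  marked;  /* set by toy_mark when a root refers to this object */
} ToyString;

#define TOY_HEAP_SLOTS 4
ToyString toy_heap[TOY_HEAP_SLOTS];

/* A root word keeps a slot alive only if it equals the slot's own address. */
void toy_mark(const uintptr_t *roots, size_t nroots) {
    assert(roots != NULL || nroots == 0);
    for (size_t s = 0; s < TOY_HEAP_SLOTS; s++) {
        toy_heap[s].marked = false;
        for (size_t i = 0; i < nroots; i++) {
            if (roots[i] == (uintptr_t)&toy_heap[s]) {
                toy_heap[s].marked = true;
            }
        }
    }
}

/* Reports whether the string in slot 0 would survive a collection when the
   only root is either the object pointer (str) or the buffer pointer (arr). */
bool toy_survives(bool keep_object_pointer) {
    static char storage[] = "compressed input";
    ToyString *str = &toy_heap[0];
    str->buf = storage;
    char *arr = str->buf;  /* what the C code actually keeps using */

    uintptr_t roots[1];
    roots[0] = keep_object_pointer ? (uintptr_t)str : (uintptr_t)arr;
    toy_mark(roots, 1);
    return toy_heap[0].marked;
}
```

In this model toy_survives(true) reports the object as live, while toy_survives(false) reports it as collectable even though arr is still in use.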

And this seems to require all Ruby C-extension writers to lock the objects they're using through macros with RB_GC_GUARD.
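A rough sketch of that pattern in plain C, with stand-ins so it compiles without ruby.h. ToyString, TOY_VALUE, TOY_GC_GUARD and toy_append_input are hypothetical names; a real extension would use VALUE, RSTRING_PTR and RB_GC_GUARD from Ruby's headers, and the macro below only imitates the volatile-access idea rather than reproducing Ruby's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a Ruby string and its VALUE handle. */
typedef struct {
    char  *ptr;
    size_t len;
} ToyString;
typedef ToyString *TOY_VALUE;

/* Reading the handle through a volatile lvalue is meant to stop the compiler
   from discarding the handle before this point in the function. */
#define TOY_GC_GUARD(v) (*(volatile TOY_VALUE *)&(v))

/* Copies up to cap bytes of str's contents into dst and returns the count. */
size_t toy_append_input(char *dst, size_t cap, TOY_VALUE str) {
    assert(dst != NULL && str != NULL);
    const char *arr = str->ptr;  /* interior pointer, analogous to RSTRING_PTR(str) */
    size_t n = str->len < cap ? str->len : cap;
    memcpy(dst, arr, n);         /* in Ruby, work here might allocate and trigger GC */
    (void)TOY_GC_GUARD(str);     /* the last use of str comes after the last use of arr */
    return n;
}
```

The placement matters more than the macro itself: the guard goes after the final use of the extracted pointer, so the handle is still referenced for the whole time arr is in use.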

Edit: note that there are no references left to str

The point is that the GC cannot see that and so assumes that the object is no longer referenced and can be freed. A conservative collector works by scanning the live memory of the process for things that look like pointers into the same live memory and then assumes that all objects that are not the target of any of these pointers are garbage. Tough luck if the only reference to a live object lives in a register.
Registers are scanned, too. The bug is not that the ref is in a register. The bug is that there are no refs anywhere: not on the stack and not in any register.
This statement confused the heck out of me (wow! magic free memory), but of course pointers are still being held to the contents of the memory, just not to the start of the object, which is what the GC cares about.

Perhaps the GC could be modified to track pointers not just to the head of an object but to any address within it. Alternatively, C coders working with Ruby could just say "I'm using this GC object" before calling C code.
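The first idea could look roughly like this: a minimal sketch, assuming the collector knows the start address and size of every block it manages. ToyBlock and toy_mark_interior are made-up names, and nothing here is taken from Ruby's source. In Ruby's case the character buffer may be a separate allocation from the string object, so a real change would also have to map buffer addresses back to their owning object, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record the collector keeps for each managed block. */
typedef struct {
    uintptr_t start;   /* address of the first byte of the block */
    size_t    size;    /* length of the block in bytes */
    bool      marked;  /* set when some root points at or into the block */
} ToyBlock;

/* Marks every block that a root word points into, at any offset. */
void toy_mark_interior(ToyBlock *blocks, size_t nblocks,
                       const uintptr_t *roots, size_t nroots) {
    assert(blocks != NULL || nblocks == 0);
    assert(roots != NULL || nroots == 0);
    for (size_t b = 0; b < nblocks; b++) {
        blocks[b].marked = false;
        for (size_t i = 0; i < nroots; i++) {
            uintptr_t w = roots[i];
            /* Inside the block if start <= w < start + size. */
            if (w >= blocks[b].start && w - blocks[b].start < blocks[b].size) {
                blocks[b].marked = true;
            }
        }
    }
}
```

The trade-off is that every scanned word now needs a range lookup rather than an exact match, and more stray integers will happen to look like pointers, so more garbage may be retained.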

I don't see this as a fatal flaw at all. Sounds like it's just a bug. Now if, as many here assert, this bug is present all over the Ruby VM, then that's pretty unfortunate. Is that the case, or just hyperbole?