
by thibaut_barrere 5461 days ago

I do appreciate the technicality of the article, but I'm not sure I agree with the first point of the conclusion: how does it make MRI (and related) 'fatally flawed'? (Real question.)

What makes it 1/ irreversible and 2/ bad for today's users?

EDIT: also, I wouldn't stop using Ruby because of that; I would use JRuby, Rubinius, or IronRuby (if I understand correctly, these are not affected?)

3 comments

jjore 5461 days ago

A fairly commonsensical approach is to just require all extension authors to annotate their code properly. At some basic level, this is what Perl does with XS, its oft-maligned DSL for generating C code that happens to do all the right declarations. You might then end up writing your code using more macros. It's certainly not pretty, but it is sound.

A plausible rewrite of that function in an XS for Ruby would leave the function declaration and wrapper code to your equivalent of xsubpp, which would execute your DSL and transform the wrapped code into fully functional C. If you build a C-using extension for Perl, you'll find an XS file like http://cpansearch.perl.org/src/SIMON/Devel-Pointer-1.00/Poin... which, during the `perl Makefile.PL && make` step, is transformed via `xsubpp Pointer.xs > Pointer.c` and then compiled as normal C.

phillmv 5461 days ago

It's a bit hysterical.

Shit! MRI/YARV/REE are inherently fatally flawed! All that code I have running in production must be a FIGMENT OF MY IMAGINATION! SAVE YOURSELVES

benblack 5461 days ago

I am running this code in production, hence it cannot have bugs. QED.

Yours in perpetual bogglement,

Lil' B

msbarnett 5461 days ago

That is an interesting strawman you've constructed, as accepting it requires the reader to conflate the idea of bugs in general and "fatal flaws".

Obviously all non-trivial code working in production not only can have bugs, but will have bugs. Just as obviously, no reasonable person would consider those "fatal flaws" for any reasonable definition of the word fatal.

MRI/YARV's Conservative GC opens up some bedevilling classes of bugs for gem writers, obviously. Calling that a "fatal flaw" when millions of lines of production code continue to function despite its presence is nothing but over-the-top hyperbole.

pshc 5461 days ago

I think the author's definition of "fatally flawed" in this article is more along the lines of "this is an evolutionary dead end and I won't have anything to do with it in the long run" rather than "cannot work under any circumstance."

rbranson 5461 days ago

Erlang is 100% bug free. Use that instead.

strmpnk 5461 days ago

You should see Joe's rants on Erlang too. Not as bad as MRI, but there are plenty of things to gripe about in BEAM.

strmpnk 5461 days ago

How does this deserve a downvote? Erlang's BEAM VM is pretty amazing, but it's not without some pretty weird artifacts in the source, many of which I've discovered via Joe's Twitter ranting.

(EDIT: I guess people don't like unpopular views at all, that's fine, long live jokes, forget the facts.)

msbarnett 5461 days ago

I'd rather see a technical analysis of Erlang that didn't read like it was written by Kanye West and sponsored by Axe Body Spray.

jamesgolick 5461 days ago

This would be funny if it made any sense and if it wasn't a ripoff of an old Merlin Mann tweet about DHH.

codahale 5461 days ago

Perhaps you should write that, then.

jcapote 5461 days ago

That's a pretty cool story you got there.

koudelka 5461 days ago

The point was clearly not that it has no bugs, but that if something is working to spec, it's working.

dlikhten 5461 days ago

I fail to see how this is an all-hands, abandon-ship issue. If it's a critical issue in all 3 interpreters, it should be fixed ASAP if possible. At worst, with a flag.

If Rubinius/IronRuby/JRuby have no issues, this may become moot eventually, as Rubinius has been gaining lots of traction recently and is getting faster with each release, outperforming the standard Ruby VMs in many cases.

evanphx 5461 days ago

Neither Rubinius nor JRuby (and probably IronRuby too) has this issue, because they all use accurate garbage collection rather than conservative. Accurate GC requires much more bookkeeping, since all pointers must always be properly identified, but if you start writing a system with accurate GC, it's pretty easy. Bugs like this are a direct result of a conservative GC strategy (and these bugs, as I'm sure you got from reading Joe's post, really really suck to find).

pmjordan 5461 days ago

This class of subtle bugs exists whether or not your GC is accurate as soon as you take the red pill and leave the VM environment. If you forget to add your C pointer to the accurate GC's root set, you're just as dead. Related story: http://news.ycombinator.com/item?id=217189

evanphx 5461 days ago

But that is by definition a tractable problem because the source will show that the root set isn't being used properly. (additionally, in practice this proves to be a rare and easy to fix bug)

jleader 5461 days ago

I think the author has a valid point that the "conservative" garbage collection approach has a flaw in its assumptions about the behavior of C compiler optimizations, and it doesn't sound like something easy to fix without a rewrite (i.e. switching to "accurate" GC). This sort of flaw will continue producing new surprising bugs, potentially any time the code is changed, or any time the compiler's optimizations change. These sorts of bugs are frustrating to track down, because they depend both on details of code optimization, and on details of memory allocation/deallocation history. If you compile with debugging options, you may change what optimizations are used; if you insert debug prints for some old-school log-based analysis, you may change the allocation/deallocation history, so the GC gets triggered in a different place.

jjore 5461 days ago

Right now, doesn't the GC traverse the entire heap and keep all objects where the memory's value looks like it might possibly be a pointer to some other object in memory?

This certainly isn't an awesome solution but couldn't the GC backtrace(3) the current process and look at %eax at all C stack frames to additionally include that value in the "pointers currently plausibly in flight" list?
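For readers who want the "looks like it might possibly be a pointer" test spelled out, here is a minimal sketch in plain C. It is not MRI's collector: `toy_heap`, `ToyObject`, `object_containing`, and `conservative_mark` are hypothetical names invented for illustration, and the "stack" is just an array of words handed to the marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: a fixed table of objects, not MRI's real layout. */
#define HEAP_OBJECTS 4

typedef struct {
    char payload[32];
    bool marked;
} ToyObject;

static ToyObject toy_heap[HEAP_OBJECTS];

/* Return the object whose storage contains `word`, or NULL when the word
 * does not point anywhere into the toy heap. */
static ToyObject *object_containing(uintptr_t word) {
    uintptr_t base = (uintptr_t)&toy_heap[0];
    uintptr_t end = base + sizeof toy_heap;
    if (word < base || word >= end) {
        return NULL;
    }
    return &toy_heap[(word - base) / sizeof(ToyObject)];
}

/* Conservative mark: any scanned word that could be a pointer into the heap
 * keeps its object alive. Returns the number of objects marked live. */
static size_t conservative_mark(const uintptr_t *words, size_t count) {
    size_t live = 0;
    for (size_t i = 0; i < HEAP_OBJECTS; i++) {
        toy_heap[i].marked = false;
    }
    for (size_t i = 0; i < count; i++) {
        ToyObject *obj = object_containing(words[i]);
        if (obj != NULL && !obj->marked) {
            obj->marked = true;
            live++;
        }
    }
    return live;
}
```

The marker has no type information, so an integer that happens to fall inside the heap range keeps an object alive, and an object whose address appears in none of the scanned words is treated as garbage even if the program is still using memory that depends on it.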

pmjordan 5461 days ago

The problem is this[1]: strings are compound objects, which use 2 memory allocations: one for the object representation, the other for the memory holding the character array. The problem arises when you access the character array but technically no longer need the string object itself. The C compiler notices that you don't use the pointer to the string object anymore, so it doesn't bother keeping it on the stack. It is allowed to do this. The GC's mark phase now runs; it inspects all the stack frames and the global roots. It detects that no references to the string object exist and decides to collect it. There happens to be a destructor function associated with that memory object, which frees the character array, as the character array is manually memory managed. It blows up when you then try to access that character array directly.[2]

The correct way to handle this is to add the object reference to the GC's "root" set while you're using its guts, and to remove it again when you're done.
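A rough sketch of that root-set discipline, in plain C rather than against any real VM. Everything here is an assumption made for illustration: `ToyString`, `gc_add_root`, `gc_remove_root`, `gc_collect`, and `length_while_pinned` are hypothetical names, and the "collector" only looks at one object and an explicit root table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string: header and character buffer are two separate
 * allocations, as described above. */
typedef struct {
    char *chars;     /* manually managed character buffer */
    bool  collected; /* set by the toy collector once the buffer is freed */
} ToyString;

#define MAX_ROOTS 8
static ToyString *roots[MAX_ROOTS];
static size_t root_count = 0;

static ToyString *toy_string_new(const char *text) {
    ToyString *s = malloc(sizeof *s);
    char *buf = malloc(strlen(text) + 1);
    if (s == NULL || buf == NULL) {
        abort(); /* keep the sketch short: no recovery path */
    }
    strcpy(buf, text);
    s->chars = buf;
    s->collected = false;
    return s;
}

static void gc_add_root(ToyString *s) {
    assert(root_count < MAX_ROOTS);
    roots[root_count++] = s;
}

static void gc_remove_root(ToyString *s) {
    for (size_t i = 0; i < root_count; i++) {
        if (roots[i] == s) {
            roots[i] = roots[--root_count]; /* order is irrelevant */
            return;
        }
    }
}

static bool is_rooted(const ToyString *s) {
    for (size_t i = 0; i < root_count; i++) {
        if (roots[i] == s) {
            return true;
        }
    }
    return false;
}

/* Toy collection of a single candidate: an unrooted object has its
 * "destructor" run, which frees the character buffer. The header is left
 * allocated so the caller can inspect the result. */
static void gc_collect(ToyString *s) {
    if (s->collected || is_rooted(s)) {
        return;
    }
    free(s->chars);
    s->chars = NULL;
    s->collected = true;
}

/* Use the raw buffer only while the owning object is in the root set. */
static size_t length_while_pinned(ToyString *s) {
    gc_add_root(s);
    const char *p = s->chars;
    gc_collect(s);          /* a collection mid-use: s is rooted, so p stays valid */
    size_t n = strlen(p);
    gc_remove_root(s);
    return n;
}
```

Without the `gc_add_root` call, the `gc_collect` in the middle of `length_while_pinned` would free the buffer that `p` still points at, which is the failure described above.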

Another possible solution is to allocate the string object and its character representation in one chunk of memory. This only works for immutable strings which never share substructure, though. The reason this works is that most conservative GCs will consider objects live as long as there is a pointer pointing to somewhere within a chunk of memory, not necessarily at the beginning.
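The single-allocation layout can be sketched with a C99 flexible array member. This is only an illustration under stated assumptions: `InlineString`, `inline_string_new`, and `points_into` are hypothetical names, and `points_into` stands in for the range check that a conservative collector honoring interior pointers would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable string: header and characters share one chunk. */
typedef struct {
    size_t length;
    char   chars[]; /* C99 flexible array member, stored right after the header */
} InlineString;

/* Total size of the chunk, including the terminating NUL. */
static size_t inline_string_size(size_t length) {
    return sizeof(InlineString) + length + 1;
}

static InlineString *inline_string_new(const char *text) {
    size_t length = strlen(text);
    InlineString *s = malloc(inline_string_size(length));
    if (s == NULL) {
        return NULL;
    }
    s->length = length;
    memcpy(s->chars, text, length + 1);
    return s;
}

/* True when `p` points anywhere inside the chunk that holds `s`,
 * not only at its first byte. */
static bool points_into(const InlineString *s, const void *p) {
    uintptr_t base = (uintptr_t)s;
    uintptr_t addr = (uintptr_t)p;
    return addr >= base && addr < base + inline_string_size(s->length);
}
```

Because the characters live inside the same chunk as the header, a surviving `char *` into `chars` is itself an interior pointer to the object, so there is no separate buffer for a destructor to free out from under it.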

[1] note: I'm not a Ruby coder but I fixed a very similar problem in a Lua implementation about 4 years ago. That one wasn't even conservative GC. EDIT: I told the story of that bug on HN 3 years (!) ago http://news.ycombinator.com/item?id=217189

[2] worse, it probably doesn't blow up immediately and instead causes memory corruption.