| The numbers for Ruby on Truffle are meaningless until people have had a chance to hit it hard with all the particular oddities of Ruby. Consider that it took them 6 months to hit 45% of RubySpec. You can reach 45% of RubySpec fairly easily if you go for the softest targets (I'm not saying that's what they've done - I haven't checked). [EDIT: I see they're doing some interesting things that certainly ought to beat MRI. If I understand it correctly, it seems like they are somehow collapsing type checks for multiple operations. Of course the devil is in the details - if they are trying to defer type or method checks, and throw away results if the checks fail (which should be rare), that will only be safe if any modifications that do happen do not introduce or remove side effects that can't be "rolled back", but I might be misunderstanding their presentation.] The problem is the multitude of bizarre things that are legal Ruby. Like people doing eval("class Fixnum; def + other; 42; end; end;"). Yes, that's legal, and yes, that means any integer arithmetic in your application is suddenly broken. More importantly, it means any optimisations based on your beliefs about what a piece of code is meant to do, while most likely right, can turn out to be horribly wrong, and so are problematic for a VM or compiler without a substantial amount of logic to detect this and bail out from optimised code to safe fallbacks. Doing so without slowing down the code when your guesses are right is hard because of how many ways there are of changing the behaviour of code in Ruby. Unless your compiler understands eval() and can reason about the contents of the eval string, it can make pretty much zero guarantees about the state of the world after an eval() call, and so it can make pretty much zero guarantees about the state of the world after any method call that could reach such an eval() call.
Admittedly, that's a stupid thing to do, but it's legal in Ruby, and while the above example is extreme, you do find a lot of usage that is roughly equivalent. E.g. autoload creates as much unpredictability as eval. So does a 'require' or 'load' that might get triggered later in execution. The reason those are important is that they make a massive number of optimisations far harder: you can't blindly cache method pointers, for example, because any method call potentially invalidates them. You can't even cache class pointers, because they can change: you can return from a method call and suddenly an object has an eigenclass. You can't inline functions without guarding them somehow to fall back to the full method call when it turns out some idiot did redefine Fixnum#+. You can't assume seemingly "safe" stuff like Fixnum#+(some other Fixnum) will even return an object of the type you assume, for the same reason - someone might decide to implement a DSL that redefines it. Frankly, it'd be fantastic to start deprecating some of the more obnoxious things like these, and weeding out the few uses of them, but as it stands today, a fast Ruby subset is "easy". A fast complete Ruby implementation is an entirely different beast. A fast incomplete Ruby implementation that refuses to support some of the most noxious corner cases would still be extremely useful for a lot of people, though. (In the interest of disclosure, since I'm talking about another Ruby implementation: I'm writing a series on my own slow process of writing a Ruby compiler, though my goals are very different - mostly focused on writing about the process.) |
I'll talk you through exactly how we solve the problem of redefining Fixnum, as one example of how we've tackled these problems.
Whenever you use Fixnum#+ in one of your methods, we look up what that method is and cache it so we can call it quickly next time. We never again check that this cache is still valid. The trick is that we do the opposite - any time you do something that could invalidate that cache, we find the installed machine code that uses it, and delete it. If the machine code is still running somewhere on some stack for some thread or fibre, we jump from the machine code into an interpreted version which looks up the method again and carries on.
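To make the shape of that idea concrete, here is a toy model in plain Ruby. It is only a sketch under some loud assumptions, not how the real implementation works: CallSite, Registry and Money are hypothetical names, the call site only ever sees one receiver class, and a nil test on the cache stands in for "the machine code was deleted" (the real thing has no such test on the fast path). It uses a made-up Money class rather than patching integer arithmetic, so it is safe to run.

```ruby
# Toy model of "invalidate on redefinition" caching. CallSite, Registry and
# Money are hypothetical names, not part of any real Ruby implementation.

# Remembers which call sites depend on which [class, method name] pair.
class Registry
  @dependents = Hash.new { |hash, key| hash[key] = [] }

  class << self
    def register(klass, name, site)
      @dependents[[klass, name]] << site
    end

    # Called when a method is (re)defined: throw away every dependent cache.
    def invalidate(klass, name)
      sites = @dependents.delete([klass, name]) || []
      sites.each(&:invalidate!)
    end
  end
end

# One call site with a single cached target (monomorphic, for simplicity).
class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @cached = nil
    @lookups = 0
  end

  def call(receiver, *args)
    # The nil test stands in for "the compiled code was deleted"; the cached
    # method itself is never re-validated against the class.
    fill_cache(receiver) if @cached.nil?
    @cached.bind(receiver).call(*args)
  end

  def invalidate!
    @cached = nil
  end

  private

  def fill_cache(receiver)
    @lookups += 1
    @cached = receiver.class.instance_method(@name)
    Registry.register(receiver.class, @name, self)
  end
end

class Money
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  def +(other)
    Money.new(cents + other.cents)
  end

  # Ruby calls this hook whenever an instance method is defined or redefined.
  def self.method_added(name)
    Registry.invalidate(self, name)
  end
end

site = CallSite.new(:+)
a = Money.new(100)
b = Money.new(250)

first  = site.call(a, b).cents  # slow path: looks up Money#+ and caches it
second = site.call(a, b).cents  # fast path: reuses the cached method

# Monkey-patch Money#+; method_added fires and clears the call site.
class Money
  def +(other)
    Money.new(42)
  end
end

third = site.call(a, b).cents   # looks up again and sees the new definition
```

The point of the design is that the cost of keeping caches correct is paid by whoever redefines a method, which is rare, rather than by every call, which is common.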
So Kernel#eval makes no difference. If something that you eval breaks the cached method calls later in the same method, that's not a problem: as long as you're still running the same machine code, you can't have redefined Fixnum#+. If you had redefined it, you'd be back in the interpreter getting ready to compile again with new caches.
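A small plain-Ruby illustration of why the route by which a method gets redefined does not matter: the redefinition itself is the observable event. This is a sketch using the standard method_added hook as a stand-in for whatever a VM does internally; Point and its redefinitions list are hypothetical names made up for the example.

```ruby
# Point records every time one of its instance methods is defined or redefined.
class Point
  @redefinitions = []

  class << self
    attr_reader :redefinitions

    # Standard Ruby hook, called after an instance method is added to Point.
    def method_added(name)
      @redefinitions << name
    end
  end

  def +(other)
    :original
  end
end

# Redefinition hidden inside a string passed to eval.
eval("class Point; def +(other); 42; end; end")

# Redefinition via define_method.
Point.send(:define_method, :+) { |other| :dynamic }

result = Point.new + nil
```

All three definitions of Point#+ (the ordinary def, the one inside eval, and the define_method one) go through the same hook, so an invalidation scheme attached to that event does not need to understand the contents of the eval string.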
I'll also just point out that running RubySpec means we are successfully running something like 5000 lines of off-the-shelf unmodified systems code, just for the harness before we even get to the tests.
Our theory is that we can make Ruby very fast, without having to forgo any of your favourite random dynamic monkey-patching features.
Watch the video: http://medianetwork.oracle.com/video/player/2623645003001
Join us on the mailing list: http://mail.openjdk.java.net/mailman/listinfo/graal-dev