by grashalm01 3334 days ago
Your mission seems to match what we are trying to do with Graal and Truffle. The Truffle API aims to be stable as well. We also provide basic building blocks like an object model. I am curious how you plan to support speculative optimizations that need to deoptimize and reconstruct interpreter stack frames. In my experience that's essential for building high performance dynamic language implementations.

I wonder whether a "bridge-across-the-void" approach would work. The idea is that a speculative optimization would not discard its original form, so that deoptimization is as cheap as a rollback. Then, when speculative optimizations are performed, they form a kind of bridging branching structure which reaches across the void of invalid low-level IRs, trying to find a safe point at which the optimization can complete.
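
To make the rollback idea concrete, here is a minimal sketch in Python. Everything in it (`SpeculativeAdd`, `generic_add`, the `deopt_count` counter) is a hypothetical name made up for illustration, not anything from Truffle or from the project under discussion. The guard sits at function entry, so there is no half-executed state to reconstruct; that sidesteps the hard part of the question rather than answering it.

```python
from typing import Callable


def generic_add(a, b):
    """Original, unspecialized form: works for any operands that support +."""
    return a + b


class SpeculativeAdd:
    """Hypothetical call site that speculates on plain int operands.

    The generic form is never discarded, so deoptimizing is just switching
    the active implementation back to it.
    """

    def __init__(self) -> None:
        self.active: Callable = self._fast_int_add
        self.deopt_count = 0  # instrumentation for this sketch only

    def _fast_int_add(self, a, b):
        # Guard: the speculation only holds while both operands are ints.
        if type(a) is int and type(b) is int:
            return a + b  # stands in for a specialized integer add
        # Guard failed: roll back to the retained generic form for good.
        self.deopt_count += 1
        self.active = generic_add
        return generic_add(a, b)

    def __call__(self, a, b):
        return self.active(a, b)
```

Calling the site with ints stays on the fast path; the first call with any other operand type flips `active` back to `generic_add`, and later calls never re-enter the speculated version.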

I know that Truffle is based on partial evaluation; how close are you to Futamura #2? That'd relieve you of the burden of having to care about what an "interpreter stack frame" even looks like. OTOH I can confirm that the same kind of speculation in optimization has to occur during self-application.

It really is a thorny problem, isn't it?

The interpreter I'm building right now will essentially be a JIT, but it will JIT to an internal IR instead of machine code. My plan is to eventually have it operate on compact and optimized stack frames as native code would.

> essential for high performance dynamic language implementations

Is it? Objective-C is a dynamic language (AOT-compiled) that doesn't have these features, and it is possible to write very high performance code with it.

I don't have any numbers, but I think it's generally possible to write very high performance code in Objective C... by not using the dynamic language features of it such as message sends.

Objective C I believe does a globally cached method lookup for every message send (!) and so can't inline through message sends (!), and since inlining is the mother of all optimisations I would imagine this would severely limit performance if you tried to use a lot of message sends in your inner loops. We should actually think about doing that experiment to see what the cost would be.
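
A rough sketch of what I mean by a cached lookup on every send, written in Python for brevity. This is not how the Objective-C runtime is actually implemented; `METHOD_CACHE`, `lookup`, `msg_send` and `Counter` are hypothetical names for illustration. The point is that the call target is only known after `lookup` returns, so a static compiler has nothing to inline at the call site.

```python
# Hypothetical global cache shared by every call site: (class, selector) -> function.
METHOD_CACHE = {}
SLOW_LOOKUPS = {"count": 0}  # instrumentation for this sketch only


def lookup(cls, selector):
    """Return the function implementing `selector` for `cls`, using the cache."""
    key = (cls, selector)
    imp = METHOD_CACHE.get(key)
    if imp is None:
        # Cache miss: walk the class hierarchy once and remember the result.
        SLOW_LOOKUPS["count"] += 1
        for klass in cls.__mro__:
            if selector in vars(klass):
                imp = vars(klass)[selector]
                break
        else:
            raise AttributeError(f"{cls.__name__} does not respond to {selector!r}")
        METHOD_CACHE[key] = imp
    return imp


def msg_send(receiver, selector, *args):
    """Every send goes through the lookup, even when the cache hits."""
    imp = lookup(type(receiver), selector)
    return imp(receiver, *args)


class Counter:
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n
```

Sending `"increment"` to a `Counter` in a loop takes the slow path once and hits the cache afterwards, but each iteration still pays for the hash lookup and the indirect call.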

In a language like Ruby almost all operators, even basic arithmetic, are dynamic method calls, so you can't avoid using message sends anywhere. I think if you tried to do that in Objective C things might grind to a halt.

Objective C also lacks many dynamic language features which are the ones solved through speculative optimisation, such as integer overflow, access to frames as objects, and so on.
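
To illustrate the integer overflow case, here is a sketch in Python of the shape such a speculation takes. Python integers are already arbitrary precision, so the 64-bit range check is simulated and the "bignum" path is the same `+`; the names `add_int64_speculative`, `add_generic` and `OverflowDeopt` are made up for this example.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


class OverflowDeopt(Exception):
    """Raised when the fixed-width speculation no longer holds."""


def add_int64_speculative(a, b):
    """Fast path: assumes the result fits in a signed 64-bit integer."""
    result = a + b
    if result < INT64_MIN or result > INT64_MAX:
        raise OverflowDeopt()
    return result


def add_generic(a, b):
    """Slow path: Python's arbitrary-precision ints stand in for a bignum."""
    return a + b


def add(a, b):
    """Try the speculated path first and fall back when the guard fails."""
    try:
        return add_int64_speculative(a, b), "fast"
    except OverflowDeopt:
        return add_generic(a, b), "slow"
```

A language with silently promoting integers has to keep that check on every arithmetic operation; C-style integers in Objective C never need it.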

I'm not an expert on Objective C though, so I'm happy to be corrected.

> I don't have any numbers

I do :-) Wrote a book about it, in fact.

> but I think it's generally possible to write very high performance code in Objective C... by not using the dynamic language features of it such as message sends.

Yes, that's a common misconception...with a grain of truth. In my experience, you get the best performance by judiciously mixing dynamic and static features. And yes, that means eschewing some dynamic features in some inner loops (the 97:3 rule applies). However, you can also often gain significant performance by hiding behind a polymorphic dispatch.

For example, I reimplemented Apple's binary plist parsers+generators in Objective-C (from C) for a significant speed boost: the polymorphic implementation allowed me to put in override points for things such as lazy loading, and interface-based (de-)serialization removes the need for a generic intermediate representation. Compared to those advantages, the cost of message-sends is negligible (and optimizable if it becomes a problem).

> Objective C I believe does a globally cached method lookup for every message send(!)

Yes. It's quite fast and despite what people fret about rarely a problem.

> and so can't inline through message sends (!),

If it does become a problem (measure, measure, measure!), there are techniques to avoid the lookup: IMP-cache, convert to C function call, convert to inline function, convert to macro.
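
The first of those, IMP caching, amounts to resolving the method once outside the loop and calling the result directly inside it. In Objective-C that would go through `methodForSelector:`; the sketch below shows the same shape in Python, with `Accumulator` and both function names being hypothetical.

```python
class Accumulator:
    def __init__(self):
        self.total = 0

    def add(self, x):
        self.total += x


def sum_with_dispatch(acc, values):
    """Looks the method up by name on every iteration."""
    for v in values:
        getattr(acc, "add")(v)
    return acc.total


def sum_with_cached_imp(acc, values):
    """Resolves the implementation once, then calls it as a plain function."""
    add_imp = type(acc).add  # the cached "IMP": an ordinary function object
    for v in values:
        add_imp(acc, v)
    return acc.total
```

The trade-off is that the cached function stays in use even if the method is replaced while the loop runs, so this only belongs where that cannot happen.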

> since inlining is the mother of all optimisations

Hmm...the mother of all optimizations is measuring and removing unnecessary code. Then comes eliminating/reducing and "sequentializing" memory access.

Very few of these can be automated.

Inlining is nice, too.

Thanks for that extra info.

How do you think Objective C would perform if every operator was a dynamic method call as it is in Ruby? Surely then you'd start to get frustrated with the overhead? That's why languages like Ruby need the speculative optimisations.

> How do you think Objective C would perform if every operator was a dynamic method call as it is in Ruby?

Depends very much on what you mean by "every": the 97:3 rule applies, and is almost certainly even more highly skewed today[1]. So for the vast majority of code, it wouldn't matter. Correction: doesn't matter. For example, Apple's Swift language produces code that is incredibly slow when non-optimized; loops and the like can easily be 1000x (a thousand times!) slower than optimized. And yet Xcode's debug builds default to non-optimized, and people don't report that their debug builds are unusable.

Another example: I implemented the central re-pagination loop in my BookLightning imposition app[2] in my Objective-Smalltalk language[3], which currently has just about the slowest implementation imaginable (an AST-walker inefficiently implemented in Objective-C), at least an order of magnitude slower than Ruby. Despite that, BookLightning is at least an order of magnitude faster than the OS X print system, which is largely written in C. Why? It computes by page (which is sufficient for this task), rather than by individual PDF graphical element. That difference is so great that the speed of the steering code controlling the computation just doesn't matter.

> Surely then you'd start to get frustrated with the overhead?

As long as Objective-C were still a hybrid language: probably not, because I could always eliminate the overhead in the (very) few places that mattered, and could do so reliably/predictably [4]. In fact, for Objective-Smalltalk I am very much leaning towards that approach (Smalltalk-ish by default, optimizations optional), and so far things are looking good.

> That's why languages like Ruby need the speculative optimisations.

Or C libraries, which is what I believe high performance Ruby code does.

p.s.: I think Truffle and Graal are awesome, and as a researcher I wish I'd come up with them. When doing actual practical performance work, I prefer simpler and more predictable tech.

[1] "The Death of Optimizing Compilers" http://cr.yp.to/talks/2015%2E04%2E16/slides-djb-20150416-a4....

[2] http://www.metaobject.com/Products/

[3] http://objective.st/

[4] http://blog.metaobject.com/2015/10/jitterdammerung.html

But going back to the original argument that was being made: you say that you don't need speculative optimisations to make something like Objective C fast. But you also say that, to get there, you avoid the dynamic features where you need performance and use macros and C functions instead.

So yes, you don't need speculative optimisations... as long as you apply similar optimisations manually in the source code yourself. So I'm not convinced :)