Hacker News
by legulere 3780 days ago
tl;dr: B3 will replace LLVM in WebKit's FTL JIT. LLVM isn't fast enough for JIT compilation, mainly because it is so memory-hungry, and it misses optimisations that depend on JavaScript semantics. They got around a 5x reduction in compile time and an overall performance boost ranging from 0% to around 10%.

Actually the bigger reason is compile time - better optimizations based on JavaScript semantics are a secondary advantage.
I think that's mostly accurate, in the sense that we wouldn't have done this if it was only motivated by specializing for JavaScript semantics. We had gotten pretty good at having our high-level compiler (DFG) burn away the JavaScript crazy and leave behind fairly tight code for LLVM to optimize.

But as soon as we realized that we had such a huge compile time opportunity, of course we optimized the heck out of the new compiler for the kinds of things that we always wished LLVM could do, like very lightweight patchpoints and some opcodes that are an obvious nod to what dynamic languages want (ChillDiv, ChillMod, CheckAdd, CheckSub, CheckMul, etc).
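
For readers who haven't seen these opcodes, here is a rough Python model of what the "chill" and "check" variants are generally understood to mean for 32-bit integers. It is a sketch based on my reading of the opcode names and the B3 docs, not WebKit code: the function names, the `OverflowExit` exception, and the use of an exception to stand in for a JIT exit are all assumptions made for illustration.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Wrap a Python int into the two's-complement 32-bit range."""
    return (x - INT32_MIN) % 2**32 + INT32_MIN


def trunc_div(a, b):
    """C-style division that truncates toward zero (Python's // floors)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def chill_div(a, b):
    """Division that never traps: x/0 gives 0, INT32_MIN/-1 wraps around."""
    if b == 0:
        return 0
    return wrap_int32(trunc_div(a, b))


def chill_mod(a, b):
    """Remainder that never traps: x%0 gives 0, INT32_MIN%-1 gives 0."""
    if b == 0:
        return 0
    return a - b * trunc_div(a, b)


class OverflowExit(Exception):
    """Hypothetical stand-in for bailing out of optimized code."""


def check_add(a, b):
    """Addition that exits instead of silently wrapping on int32 overflow."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowExit(f"{a} + {b} does not fit in int32")
    return result
```

The chill results line up with what JavaScript produces for `(a / b) | 0` and `(a % b) | 0`, which is presumably why a dynamic-language compiler wants them as single opcodes rather than as a division wrapped in explicit branches.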

But isn't it true that some of the things you ended up doing would make sense for LLVM, or would most of them be invalidated by the kinds of optimization passes that are common in LLVM?

E.g. stuff like making the in-memory IR representation more cache-friendly certainly sounds like it's all-upside, and LLVM should just learn from your project.

LLVM's use of a very rich (and hence not as memory efficient) IR is deeply rooted. Phases assume that given any value, you can trace your way to its uses, users, and owners. The LLVM code I've played with assumes this all over the place, so removing the use lists and owner links as B3 does would be super hard. B3 can do it because we started off that way.
There is a tradeoff between having a dense and compact IR and being able to modify it very easily. LLVM tends to focus on the second. Even if there are plans to rebalance this a bit, it will take a lot of time (and benchmarking).
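
To make the tradeoff above concrete, here is a toy Python sketch of the two styles. Neither class is LLVM's or B3's actual data structure; `RichValue`, `CompactValue`, and `compact_replace_all_uses` are hypothetical names chosen only to show where the memory goes and what it buys you.

```python
class RichValue:
    """Value that tracks both its operands and its users (use lists)."""

    def __init__(self, opcode, *operands):
        self.opcode = opcode
        self.operands = list(operands)
        self.users = []  # back-edges, one entry per use of this value
        for operand in operands:
            operand.users.append(self)

    def replace_all_uses_with(self, other):
        # Cheap: only the known users are visited.
        for user in self.users:
            user.operands = [other if o is self else o for o in user.operands]
            other.users.append(user)
        self.users = []


class CompactValue:
    """Value that stores only its operands: smaller, but no back-edges."""

    __slots__ = ("opcode", "operands")

    def __init__(self, opcode, *operands):
        self.opcode = opcode
        self.operands = list(operands)


def compact_replace_all_uses(procedure, old, new):
    # Without use lists, finding the users means scanning every value.
    for value in procedure:
        value.operands = [new if o is old else o for o in value.operands]


# Rich form: the edit is local to the value being replaced.
a, b = RichValue("Const"), RichValue("Const")
add = RichValue("Add", a, a)
a.replace_all_uses_with(b)

# Compact form: the same edit needs the whole procedure.
c, d = CompactValue("Const"), CompactValue("Const")
add2 = CompactValue("Add", c, c)
compact_replace_all_uses([c, d, add2], c, d)
```

The rich form pays for an extra list per value and for keeping both directions in sync on every edit, while the compact form pays with a full scan whenever a pass needs to know who uses a value. Code written against the first style tends to assume the back-edges are always there, which is the part that is hard to retrofit.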
What is memory usage like, in terms of number of allocations and peak RSS, with B3 vs. LLVM?