

by azakai 4346 days ago

Yes, very curious about this too.

As a first guess, I'm not sure what the v8 strategy is here. The new compiler seems to use the "sea of nodes" approach as opposed to SSA form. A comparison of the two is here:

http://static.squarespace.com/static/50030e0ac4aaab8fd03f41b...

The "sea of nodes" approach can give some speedups, but they don't appear huge: 10-20% in that link, and I'm not sure how representative that data is. It is interesting, though, that modern compilers like gcc and LLVM typically use SSA form and not the approach v8 is taking, which is further evidence that the "sea of nodes" is not clearly superior.

Perhaps the v8 designers believe there is some special advantage for JS that the new model provides? Otherwise this seems surprising. But hard to guess as to such a thing. If anything, JS has lots of possible surprises everywhere, which makes control flow complex (this can throw, that can cause a deopt or bailout, etc.), and not the best setting for the new approach.

Furthermore, the "sea of nodes" approach tends to take longer to compile, even as it emits somewhat better code. Compilation times are already a big concern in JS engines, perhaps more than in any other type of compiler.

Perhaps v8 intends to keep crankshaft, and have turbofan as a third tier (baseline-crankshaft-turbofan)? That would let it run the slower turbofan only when justified. But that seems like a path that is hard to maintain (2 register allocators, etc.), and turbofan seems to be in part a cleanup of the crankshaft codebase (no large code duplications anymore, etc.), not a parallel addition.

Overall the Safari and Firefox strategies make sense to me: Safari pushes the limits by using LLVM as the final compiler backend, and Firefox, aside from general improvements, has also focused efforts on particular aspects of code or code styles, like float32 and asm.js. Both of those strategies have proven very successful. I don't see, at first glance, what Chrome is planning here. However, the codebase has some intriguing TODOs, so maybe the cool stuff is yet to appear.

1 comments

rayiner 4346 days ago

The "sea of nodes" approach is just a data structure for representing a program in SSA form. It's orthogonal to anything that has an impact on speed. E.g. GCC uses a tree representation, LLVM uses a CFG, and Hotspot (C2 and Graal) uses a "sea of nodes" representation, but they all represent code in SSA form and that representation is orthogonal to the quality of particular optimizations implemented within the framework.

The speedup reported in that paper is from running constant propagation and dead code elimination at the same time instead of doing them separately, which finds more constants and dead code because the two problems are coupled. The same process can be implemented in a more traditional CFG representation (and generally is--sparse conditional constant propagation).
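To make the coupling concrete, here is a toy sketch in Python (not code from the paper or from any real compiler). The four-block SSA program, the tuple encoding of instructions, and the `analyze` function with its `prune_unreachable` flag are all made up for illustration, and it uses a naive iterate-to-fixpoint loop rather than the sparse worklists a real SCCP implementation would use.

```python
TOP, BOTTOM = "top", "bottom"  # TOP: no evidence yet; BOTTOM: not a constant

def meet(a, b):
    """Lattice meet: TOP is the identity, differing constants fall to BOTTOM."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM

# Hypothetical SSA program: block -> (instructions, terminator).
#   entry: x0 = 1;                      jump head
#   head:  x1 = phi(x0, x3); c = x1 != 1; br c, then, latch
#   then:  x2 = 2;                      jump latch
#   latch: x3 = phi(x1, x2);            jump head
PROGRAM = {
    "entry": ([("x0", "const", 1)], ("jump", "head")),
    "head": ([("x1", "phi", {"entry": "x0", "latch": "x3"}),
              ("c", "ne", ("x1", 1))],
             ("br", "c", "then", "latch")),
    "then": ([("x2", "const", 2)], ("jump", "latch")),
    "latch": ([("x3", "phi", {"head": "x1", "then": "x2"})], ("jump", "head")),
}

def analyze(program, entry, prune_unreachable):
    """Return (values, reachable blocks). With prune_unreachable=False every
    branch is assumed to go both ways, i.e. constant propagation on its own."""
    values, edges, reachable = {}, set(), {entry}
    changed = True
    while changed:
        changed = False
        for name, (insts, term) in program.items():
            if name not in reachable:
                continue
            for dest, op, args in insts:
                if op == "const":
                    new = args
                elif op == "phi":
                    # Only merge values arriving over edges known to execute.
                    new = TOP
                    for pred, src in args.items():
                        if (pred, name) in edges:
                            new = meet(new, values.get(src, TOP))
                else:  # "ne": compare a variable against a literal
                    left = values.get(args[0], TOP)
                    new = left if left in (TOP, BOTTOM) else left != args[1]
                if values.get(dest, TOP) != new:
                    values[dest] = new
                    changed = True
            if term[0] == "jump":
                succs = [term[1]]
            else:
                cond = values.get(term[1], TOP)
                if cond == BOTTOM or not prune_unreachable:
                    succs = [term[2], term[3]]
                elif cond == TOP:
                    succs = []
                else:
                    succs = [term[2] if cond else term[3]]
            for succ in succs:
                if (name, succ) not in edges:
                    edges.add((name, succ))
                    reachable.add(succ)
                    changed = True
    return values, reachable

separate_values, separate_blocks = analyze(PROGRAM, "entry", prune_unreachable=False)
coupled_values, coupled_blocks = analyze(PROGRAM, "entry", prune_unreachable=True)
print(separate_values["x1"], "then" in separate_blocks)
print(coupled_values["x1"], "then" in coupled_blocks)
```

In this toy program the uncoupled run drives `x1` to "not a constant" because the assignment `x2 = 2` feeds back through the loop, so a later dead code pass has nothing to remove. The coupled run never marks `then` as reachable, so `x1` stays at 1. Running the two passes one after the other, even repeatedly, does not get there.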

rayiner 4345 days ago

Too late to edit this, but I should clarify: "data structure" is probably not the right word. To be more precise, "SSA form" is a property of variables in a program IR. It means variables are assigned only once, that defs dominate uses, and value flow is merged at control flow merge points with phi nodes. You can have different program representations that all represent values in SSA form, but differ in how they represent other things. Where the "sea of nodes" representation differs is that it explicitly represents control dependencies. In LLVM, you always have a control flow graph, with basic blocks and edges between them. Control dependencies between instructions are implicit from their placement in particular basic blocks. In a "sea of nodes" IR, there are no basic blocks.[1] Control dependencies are represented explicitly with control inputs to nodes, just as data dependencies are represented explicitly with data inputs.

This makes certain things easier in a "sea of nodes" IR. Normally, during optimization you don't have to worry about maintaining a legal schedule of instructions within and between the basic blocks. You just have to respect the control dependencies. However, in order to get executable code you have to impose a schedule on the nodes, whereas with a more conventional CFG IR, you already have a schedule in the form of the basic blocks and the ordering within them.
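A rough illustration of what "imposing a schedule" means, as a Python sketch with a made-up dependency graph: the node names and the `schedule` and `is_legal` helpers are hypothetical. It only linearizes the nodes so that every input comes before its user. A real scheduler also has to pick a basic block for each node, which this ignores.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical graph for "return (a * b) + a".
# Each node maps to the nodes it depends on, data and control inputs alike.
DEPS = {
    "start": set(),
    "a": {"start"},
    "b": {"start"},
    "mul": {"a", "b"},
    "add": {"mul", "a"},
    "ret": {"start", "add"},  # control input from start, data input from add
}

def schedule(deps):
    """Pick one linear order in which every node follows all of its inputs."""
    return list(TopologicalSorter(deps).static_order())

def is_legal(order, deps):
    """Check that an order respects every dependency edge."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[d] < position[n] for n, ds in deps.items() for d in ds)

order = schedule(DEPS)
print(order, is_legal(order, DEPS))
```

Until this step runs, the optimizer is free to treat `mul` and `add` as unordered relative to anything they don't depend on. That is the flexibility described above, and the scheduling pass is the price for it.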

[1] See Sections 2.2-2.4 of Click's paper: http://paperhub.s3.amazonaws.com/24842c95fb1bc5d7c5da2ec735e.... His IR replaces basic blocks with Region nodes. The only instructions that must be linked to Region nodes are ones that inherently have a control dependency. E.g. an "If" node takes a data input and produces two control outputs, which can be consumed by Region nodes. "Phi" nodes must also have control inputs, so they can properly associate different control flow paths with the data values that are merged along those paths.
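A small Python sketch of that shape, computing abs(x). This is my own simplified modeling, not the paper's data structures: the `Node` class, the `IfTrue`/`IfFalse` nodes standing in for the two control outputs of the If, and the `phi_input_for` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(eq=False)  # identity equality: every node is its own object
class Node:
    op: str
    control: tuple = ()  # explicit control inputs
    data: tuple = ()     # explicit data inputs

start = Node("Start")
x = Node("Param:x")
zero = Node("Const:0")

# Pure computations float freely: data inputs only, no control input.
cond = Node("Less", data=(x, zero))
neg = Node("Neg", data=(x,))

# The If consumes a data input and yields two control outputs.
branch = Node("If", control=(start,), data=(cond,))
if_true = Node("IfTrue", control=(branch,))
if_false = Node("IfFalse", control=(branch,))

# The Region merges the two control paths, taking the place of a basic block.
merge = Node("Region", control=(if_true, if_false))

# The Phi is tied to the Region; data input i pairs with control path i.
result = Node("Phi", control=(merge,), data=(neg, x))

def phi_input_for(phi, path):
    """Return the value the phi selects when control arrives along `path`."""
    region = phi.control[0]
    return phi.data[region.control.index(path)]

print(phi_input_for(result, if_true).op, phi_input_for(result, if_false).op)
```

Note that `neg` has no control input at all, so nothing in the graph says it belongs to the "true" side of the branch. Only the Phi's link to the Region records which path each value comes from.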

azakai 4346 days ago

Thanks for the corrections.