I’m not entirely convinced that stages 2 & 3 are
necessary, maybe we stop at stage 1. It all depends on
what optimizations we’d like to do that can’t be done
because the IR is in the wrong form. Proceeding with
stages 2 and 3 might gain some efficiency in the compiler
itself because then we don’t have to generate the old IR
at all. I suspect that effect will be small, however.
10% slower, which isn't too bad. Let's hope they can now focus on improving compilation speed; having recently upgraded, I miss Go 1.4. It would be really nice to see that actually improve for 1.7 instead of regressing.
SSA, once you understand it, is easier to work with than almost all other forms of instruction sets. I'd argue that it would only accelerate support for new architectures in the long run.
I'm interested in why LLVM was disqualified. Was it simply never considered or is it incompatible with the Go type system, calling convention, etc.?
It's rather easy to add a calling convention to LLVM. If I had to guess, it would be that they thought LLVM was too slow for them. They said from the start that compilation speed was a big point for them.
Also LLVM requires you to either write your IR in SSA or add another expensive optimisation pass to make it SSA (mem2reg). Perhaps they thought writing an SSA generator would be too much of a headache.
> Proven and well tested: clang uses this technique for local mutable variables. As such, the most common clients of LLVM are using this to handle a bulk of their variables. You can be sure that bugs are found fast and fixed early.
In addition to what others said, support for precise garbage collection in LLVM was not ideal at the time. The experimental gc statepoint extension spearheaded by the Azul guys is trying to change that. They've been working on it publicly since late 2014.
And typically longer compile times.
Each variable is split into a new version for every assignment, with "phi" functions merging those versions where control flow joins, and many more costly optimization steps become possible.
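To make the "one version per assignment" idea concrete, here is a minimal sketch in Go for a single straight-line block. The `Assign` type and `renameStraightLine` function are hypothetical names made up for illustration; they are not taken from the Go compiler. Phi functions only show up once control flow joins, so this sketch does not need them.

```go
package main

import "fmt"

// Assign is a hypothetical three-address statement: Dst = Src1 Op Src2.
type Assign struct {
	Dst, Src1, Op, Src2 string
}

// renameStraightLine gives every assignment its own versioned name
// (x_1, x_2, ...) and rewrites each use to the latest version.
// Names never assigned in the block (inputs, constants) stay unversioned.
func renameStraightLine(block []Assign) []Assign {
	version := map[string]int{}
	use := func(name string) string {
		if v, ok := version[name]; ok {
			return fmt.Sprintf("%s_%d", name, v)
		}
		return name
	}
	out := make([]Assign, 0, len(block))
	for _, a := range block {
		// Resolve the uses before bumping the version of the destination,
		// so that "x = x * c" reads the old x and defines a new one.
		s1, s2 := use(a.Src1), use(a.Src2)
		version[a.Dst]++
		out = append(out, Assign{Dst: use(a.Dst), Src1: s1, Op: a.Op, Src2: s2})
	}
	return out
}

func main() {
	block := []Assign{
		{Dst: "x", Src1: "a", Op: "+", Src2: "b"},
		{Dst: "x", Src1: "x", Op: "*", Src2: "c"},
		{Dst: "y", Src1: "x", Op: "+", Src2: "x"},
	}
	for _, a := range renameStraightLine(block) {
		fmt.Printf("%s = %s %s %s\n", a.Dst, a.Src1, a.Op, a.Src2)
	}
	// Prints:
	// x_1 = a + b
	// x_2 = x_1 * c
	// y_1 = x_2 + x_2
}
```

Because every name now has exactly one definition, a pass can find the definition of any use with a direct lookup, which is where the cheaper analyses come from.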
True, an increase in optimizations will likely mean longer compile times. On the other hand, with better-optimized code, the compiler itself (as it's written in Go) will also perform better, which may offset some of the increase in compile time.
One of the alluring things about SSA form is that many optimizations are much faster to execute on it. The costly part is constructing the SSA form in the first place, which in the standard implementation requires building a costly dominator tree.
You don't need to add every optimization known to man to a compiler, so you can sometimes keep a few of the important ones and then skip every other optimization. A priori, I'd guess SSA would speed up the compiler, which means you end up having a better budget for the more expensive optimizations.
As stated in [1] they use a variant of "Simple and Efficient Construction of Static Single Assignment Form" [2], which does not require a dominator tree (or a liveness analysis).
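For anyone curious how construction can work without a dominator tree, here is a rough Go sketch of the core idea as I understand it from that paper: look a variable up in the current block first, otherwise ask the predecessors, and create a phi only where several predecessors meet. This is not the Go compiler's code. `Block`, `Builder`, `WriteVariable` and `ReadVariable` are hypothetical names, and the sketch assumes all predecessors are known before any read and that the entry block has no predecessors. It also leaves out the paper's handling of unsealed blocks and its removal of trivial phis.

```go
package main

import "fmt"

// Block is a hypothetical basic block. Preds must be complete before any
// ReadVariable call (a simplifying assumption made for this sketch).
type Block struct {
	Name  string
	Preds []*Block
	defs  map[string]string // variable -> current SSA value name
}

func NewBlock(name string, preds ...*Block) *Block {
	return &Block{Name: name, Preds: preds, defs: map[string]string{}}
}

// Builder records the phi functions created so far.
type Builder struct {
	nextPhi int
	Phis    map[string][]string // phi name -> operand value names
}

func NewBuilder() *Builder {
	return &Builder{Phis: map[string][]string{}}
}

// WriteVariable records that variable v holds value val at the end of b.
func (bd *Builder) WriteVariable(v string, b *Block, val string) {
	b.defs[v] = val
}

// ReadVariable returns the SSA value of v visible at the end of block b.
func (bd *Builder) ReadVariable(v string, b *Block) string {
	if val, ok := b.defs[v]; ok {
		return val // defined (or already resolved) in this block
	}
	var val string
	switch len(b.Preds) {
	case 0:
		val = "undef" // never assigned on this path
	case 1:
		val = bd.ReadVariable(v, b.Preds[0]) // no phi needed
	default:
		bd.nextPhi++
		val = fmt.Sprintf("phi%d", bd.nextPhi)
		// Record the phi before visiting predecessors so that a loop
		// reaching this block again finds it instead of recursing forever.
		bd.WriteVariable(v, b, val)
		ops := make([]string, len(b.Preds))
		for i, p := range b.Preds {
			ops[i] = bd.ReadVariable(v, p)
		}
		bd.Phis[val] = ops
	}
	bd.WriteVariable(v, b, val) // cache the result for later reads
	return val
}

func main() {
	// if/else diamond: x is assigned in entry and again in the "then" arm.
	entry := NewBlock("entry")
	then := NewBlock("then", entry)
	els := NewBlock("else", entry)
	join := NewBlock("join", then, els)

	bd := NewBuilder()
	bd.WriteVariable("x", entry, "x1")
	bd.WriteVariable("x", then, "x2")

	val := bd.ReadVariable("x", join)
	fmt.Println(val, bd.Phis[val]) // prints: phi1 [x2 x1]
}
```

Note that without the trivial-phi cleanup, a variable that neither arm changes would still get a phi whose operands are all the same value, so a real implementation needs that extra step.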
I think Wirth with his Pascal compiler had this as a rule. If you added an optimization (which takes additional time), it must speed up the compiler enough that compilation times are not longer.
It depends on the compiler. The great thing about SSA conversion is that you only have to do it once, whereas a classical def/use-chaining analysis may have to be redone many times in an old-style optimizer.
SSA is a conversion of the program into a representation of its data flow. I've used it in multiple compilers and found it to be a big win for easing other analyses (e.g., induction variable recognition becomes trivial) and reducing bugs due to the update problem.
See: https://docs.google.com/document/d/1szwabPJJc4J-igUZU4ZKprOr...