Hacker News new | ask | show | jobs
by sanxiyn 1117 days ago
Besides devoting too many pages to parsing, I think compiler practitioners have low opinion of the book because it isn't useful for them. For example, the second edition claims to be updated to modern optimization techniques, and in a sense that's true, but it is useless because the book isn't using SSA. For practitioners, SSA is not optional these days, and everything about optimization in the book needs to be updated to SSA to be useful.

But you are probably not a compiler practitioner. Then I agree it is a good book to learn about compilers.

3 comments

But the dragon book introduces three address code. SSA is a variant of three address code. Take a look here http://www.cs.columbia.edu/~aho/cs4115/Lectures/15-03-25.htm....
3AC/TAC is something fundamentally different than SSA. Sure, both somehow superficially constrain how you write variable assignment statements - if your intermediate representation contains such things and is a list of instructions. 3AC means that your "instructions" have <=3 operands, SSA means each "syntactic" variable is assigned at most once, that is: they are not a variable anymore. In fact, SSA transformation makes an imperative program referentially transparent (excluding any loads/stores from/to memory) and thus makes it a pure functional program (excluding memory). You can see a program in SSA form simply as a functional program where each basic block is a (nested) function and each PHI is a parameter of the function. SSA as a list of instructions/statements as in 3AC is a red herring and doesn't have any relation to why it's used in compilers/for static analysis. SSA is not a variant of 3AC, also because SSA transformation of 3AC requires the same algorithms as SSA transformation of any other common imperative IR does.

In fact SSA makes no constraints on the number of operands in an instruction/operation what so ever.

Three address code:

```

p = a + b

q = p-c

p = q * d

```

Ssa:

```

p1 = a+ b

q1= p1 - c

p2 = q1 * d

```

Given the above examples how is it not straightforward to use ssa instead of three address code using what the dragon book teaches? What is wrong with the above examples?

In fact section 6.2.4 in the dragon book it says:

“Two distinctive aspects distinguish SSA from three-address code. The first is that all assignments in SSA are to variables with distinct names; hence the term static single-assigment.” The second one is that ssa uses a function to combine two definitions

That's exactly what I mean with superficially similar. Sure, you can transform a non-3AC SSA IR into three address code or transform a 3AC IR into SSA form - as you have done in your example. However, you will need an SSA transformation/construction algorithm to do the latter, which is not that simple or straightforward, especially if you want to be somewhat optimal with placing the PHIs. You can also try to apply your existing analyses and program optimizations on the 3AC SSA - chances are high though that you'll have to adapt them to correctly handle PHIs however. If you simply treat them as a unanalyzed function calls you will likely suffer from a huge loss in analysis precision and optimization opportunities.

On the other hand, if you write your analyses and transformations with SSA form as a precondition, you can reduce complexity because SSA form is referentially transparent, giving you a lot of things like use-def chains, dead code analysis, etc for more or less free.

This is what people mean when they say that SSA is essential for modern compiler construction. That quote from the book is reductionist to the maximum possible extent and put's it in the direct comparison with 3AC, to which it only has the relation that both are transformations of IR instructions the result of both have nothing in common.

The example is misleading because it displays the SSA IR also in 3AC - which is unnecessary - and it completely omits PHIs without which SSA properties simply do not hold and for which no 3AC correlation exists at all - they are not representable. In fact, control flow can easily be built such that a PHI has any amount of operands - e.g. a switch which might need significant additional code transformation to be representable as 3AC.

My apologies for saying the following but can you share some books or papers that support your view? The example that I provided was from the dragon book. CMU uses that same example for one of its ssa slides for an compiler optimization course. This makes me think that I can I indeed lower an ast to ssa that looks like the example I gave and still exploit the ssa for optimization. By the way the examples I provided above are from the dragon book.

Perhaps you work on llvm or something professionally?

But yeah if you can please provide support for your view. Thanks

I've worked on a research project using LLVM as a backend but with it's own SSA implementation at the chair of Sebastian Hack. One of his well known scientific contributions was on optimal register allocation with SSA. The chair also published http://dx.doi.org/10.1007/978-3-642-37051-9_6 a very good and easy way to do SSA construction directly from the AST without computing dominance frontiers.

You should probably start on the Wikipedia page for SSA, which - since the dragon book doesn't teach what SSA actually is, is a reasonable starting point https://en.m.wikipedia.org/wiki/Static_single-assignment_for... Note that even though it provides many examples and other comparisons it not once mentions 3AC (and vice versa actually). Under Benefits you can find links to various optimizations some of whose pages reference what SSA helps with. None of them have much if anything to do with 3AC.

That the CMU simply copies an example straight from the book and provides no further explanation and maybe even doesn't mention PHIs (?) doesn't speak for that course at all. Especially if you take a look at the list of compilers that do use SSA on Wikipedia and note that pretty much all major programming language implementations use SSA.

Of course you can lower an AST to SSA that looks like the example (except the example is so rudimentary it's still missing the very necessary PHI functions) and you can even make it have 3AC form and still exploit the SSA properties in optimizations. It's just that the properties and the SSA form still have no real relation except that both modify your list of instructions and claiming that there's one that's somehow meaningful or that 3AC gives you anything that SSA does is at best grossly misleading by a textbook.

Well yes, but you can't use the algorithm in the dragon book to generate three address code to generate SSA, because SSA has additional constraints. The dragon book really dropped the ball here, there is no excuse why it doesn't include any algorithm for SSA construction.
The dragon book is a theoretical book. It's like any other (good) OS book out there: good source of information to know the foundations. After reading it, go out and read something more practical (which you probably won't have issues learning it since you now know the foundations)
Do you recommend a book that’s more up-to-date?
You can try this book if you want something that came out this year https://github.com/IUCompilerCourse/Essentials-of-Compilatio.... Go to the releases to either get the racket version or python version. But I mean cmu uses the dragon book second edition for a graduate level compiler optimization class.
Tiger books include SSA, GC (tracing and RC), OOP, FP,....

https://www.cs.princeton.edu/~appel/modern/