Hacker News new | ask | show | jobs
by munin 3714 days ago
> Implementing SSA construction is hard. To save its users from having to implement it, LLVM provides stack slots. This means that one increment of a variable v will be composed of three LLVM instructions: one load, one add, and one store.

This is just wrong, though! LLVM provides stack slots for a good reason: stack allocated variables can escape. You need a place in memory to stick them. If mem2reg can prove that the stack allocated variable doesn't escape, it gets promoted into an SSA value.

3 comments

Then maybe I did not express myself in the best terms. QBE definitely supports stack slots and their registerization! Minic, a small C frontend shipped with QBE makes use of them.

The difference is that LLVM forces you to use them even when you know your locals do not escape (i.e. your source language is Pascal), QBE doesn't.

So, LLVM makes you use stack slots for two independent problems: 1. Compiling languages like C where locals can escape, and 2. Avoiding to construct SSA form in the frontend. In QBE, you use stack slots (alloc4, alloc8) to solve 1, but to solve 2, you can simply emit non-ssa form and QBE will fixup things for you.

> you can simply emit non-ssa form and QBE will fixup things for you

That's exactly what LLVM does though, except the "non-SSA form" involves loads/stores to alloca'd values. The difference is that in LLVM, the language is always in SSA form, it just has some reads / writes to memory that can be pruned, while QBE alternates between being an SSA language and not an SSA language.

LLVM also doesn't "make" you use stack slots. If you wanted to, you could emit fully pruned programs as LLVM IR. Using allocas for variables is a choice that clang makes.

I think the extra load/stores clutter the IL.

Also, QBE does not really "alternate" SSA/non-SSA, SSA form is built once at the beginning of the compilation pipeline and preserved later.

I don't understand what you mean by "fully pruned programs". Maybe you want to refer to pruned SSA form. And then, here is my point: with LLVM, either you build SSA yourself or you use allocas. QBE offers a convenient third option.

Some CFG transforms are actually much easier if you get out of an SSA first, reshuffle CFG without caring about maintaining your phis, and then simply rebuild an SSA form.
Expected to see you here. Hey, send me an email or something so I can send you interesting stuff without hunting your profile or random comments on HN. Address in my profile. Here's the one I was saving for you to complement the ML and Haskell CPU's I linked.

https://www.info.ucl.ac.be/~pvr/bam_jlp.ps

Cool stuff, eh? That they keep it close to a regular, RISC processor means optimizations of that should carry over. Unlike stuff like Fifth Generation that tried to go way, way the hell to far with Prolog hardware. ;) Should fit nicely into my concept of general-purpose CPU's with purpose-built coprocessors. Also speculate techniques might be helpful on ASIC's meant for today's big-data apps that use things like Datalog for queries. Ya think?

> I don't understand what you mean by "fully pruned programs". Maybe you want to refer to pruned SSA form.

The mem2reg pass in LLVM is the recommended method of constructing pruned SSA form. It just implements the standard algorithm.

> QBE does not really "alternate" SSA/non-SSA, SSA form is built once at the beginning of the compilation pipeline and preserved later.

The QBE IL presented to the user is not in SSA form, but internally within your compiler, an SSA representation is kept. I prefer to not have syntactic sugar in my IL.

Yes! And I would also add that:

> LLVM IL is more cluttered with type annotations and casts.

With LLVM moving to typeless pointers[1], this will be less true in the future than it is now.

[1] https://groups.google.com/forum/#!topic/llvm-dev/vphBegBWyyE

I don't know if that would alleviate the authors concern, because (as I understand that proposal) the type information is still present, it's just bolted into the load/store/GEP instructions now.

Nothing is stopping you from writing LLVM that doesn't use types though! That's entirely a front end decision. If you wanted to, in your front end, you could never emit record or array types and do book keeping on cells in memory entirely on your own using pointer arithmetic / pointertoint / inttopointer. Doing so is entirely at the expense of the speed of the generated code though!

Hi Munin, I'm writing a C compiler on LLVM at the moment, and hitting problems with types and pointers, the complexity of which made me think of just using casts everywhere. What is it that makes it expend speed of the generated code? Does it mean certain LLVM optimizer phases won't work anymore?
Types in C should not be complex. If you're hitting some kind of complexity, you're likely doing it the wrong way. C types are directly translated (one way, of course) into LLVM types.

Casts are evil: they break aliasing analysis, they screw up address spaces, they break more advanced forms of vectorisation (like polyhedral analysis), etc.

Most casts have zero performance cost and will not interfere with alias analysis. The exception for alias analysis is that inttoptr/ptrtoint is discouraged for address calculation, but the analysis can reason about that. Performance wise, I can see casts between float/int to not be free since they may appear in the final output.
I'm not LLVM expert but usually the answer to that kind of question is aliasing...
I take this as a compliment. It means my design choice was totally valid, and maybe even a good one.
qbe allows both stack slots like llvm, or directly inputting non ssa code which is automatically turned to ssa form.