by peterkelly 2853 days ago
> Why not use GCC or LLVM as part of your project?

That's not how the world gets compiler people.

Someone had to write GCC and LLVM, and someone will have to write whatever comes next. Languages and tooling only advance because there are those who choose to take the hard road instead of "just using that thing that someone else made".


The successor to LLVM isn't going to be written by someone who is just hacking on an app and decides they could use a basic compiler. It's going to be written by someone who has the explicit goal of replacing LLVM, fully understands LLVM's design and tradeoffs, and has ideas about an architecture for a successor that improves on the design.

Encouraging people to use LLVM instead of home-growing a toy compiler can still get you compiler people. LLVM is just the backend, not the frontend, so you still need to write the compiler frontend yourself, and once you've learned LLVM, if you're so inclined you may start hacking on LLVM and therefore become a backend compiler person too.

> Hello everybody out there using minix -

> I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones.

—Linus Torvalds, in a post[1] to comp.os.minix, 25 August 1991 (emphasis mine)

[1]: https://groups.google.com/forum/#!msg/comp.os.minix/dlNtH7RR...

> It's going to be written by someone who has the explicit goal of replacing LLVM, fully understands LLVM's design and tradeoffs, and has ideas about an architecture for a successor that improves on the design.

And from where would they get those ideas? Tweaking LLVM is not enough to learn the necessary trade-offs. You need to write a compiler from the ground up. In fact, you need to do it multiple times. Most likely you will not end up writing an LLVM replacement. Most likely you will end up a better programmer.

Most compilers use not one internal IR but a successive series of IRs. LLVM itself has its normal IR layer, plus a machine IR layer for actually generating code for targets, plus another layer (two, actually; you can pick which one you want) for going from LLVM IR to machine IR. Most programming language frontends have their own IR (or two) that they use in between the AST and something like LLVM IR.
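To make the "series of IRs" idea concrete, here is a toy sketch in Python. Nothing in it is LLVM's actual design: the tuple-based AST, the three-address form, the stack code, and every function name are made up for illustration. It only shows one program being lowered through two successive representations before it runs.

```python
# Toy lowering pipeline: AST -> three-address code -> stack-machine code.
# All representations and names here are hypothetical, not LLVM's.

def lower_to_tac(ast):
    """Lower a nested-tuple AST into three-address code.

    Returns (code, result), where code is a list of (dest, op, lhs, rhs)
    and result is the temporary (or literal) holding the final value.
    """
    code = []

    def visit(node):
        if node[0] == "num":
            return node[1]              # literals stay as plain ints
        lhs = visit(node[1])
        rhs = visit(node[2])
        dest = f"t{len(code)}"          # fresh temporary name
        code.append((dest, node[0], lhs, rhs))
        return dest

    return code, visit(ast)


def lower_to_stack(tac, result):
    """Lower three-address code into instructions for a tiny stack machine."""
    def operand(x):
        return ("push", x) if isinstance(x, int) else ("load", x)

    program = []
    for dest, op, lhs, rhs in tac:
        program += [operand(lhs), operand(rhs), (op,), ("store", dest)]
    program.append(operand(result))     # leave the final value on the stack
    return program


def run_stack(program):
    """Execute the lowest-level form."""
    stack, slots = [], {}
    for instr in program:
        name = instr[0]
        if name == "push":
            stack.append(instr[1])
        elif name == "load":
            stack.append(slots[instr[1]])
        elif name == "store":
            slots[instr[1]] = stack.pop()
        else:                           # "add" or "mul"
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs if name == "add" else lhs * rhs)
    return stack.pop()


# 2 + 3 * 4
ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
tac, result = lower_to_tac(ast)
program = lower_to_stack(tac, result)
print(tac)
print(run_stack(program))
```

Each level throws away some structure (nesting first, then named operands) and gets closer to something a machine could execute, which is roughly why real compilers end up with several IRs rather than one.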

The genius of LLVM is that it's a stable, well-defined IR layer in the middle of the compilation stack that lets you switch out either the frontend or the backend without having to do the entire messy, boring, critical implementation of the other piece. And you can support pretty wacky frontend languages or pretty wacky backends (people compile LLVM IR to C source or to FPGAs, for example).
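A minimal sketch of that "IR in the middle" shape, in Python. The IR format, the two frontends (RPN and prefix notation), and the two backends (an evaluator and a C-source emitter) are all invented for this example and have nothing to do with LLVM's real IR or APIs. The point is only that either frontend can be paired with either backend, because they agree on the format in between.

```python
# Shared toy IR: a list of ("const", dest, value) or (op, dest, lhs, rhs),
# where op is "add" or "mul". The last instruction defines the result.
# Everything here is a hypothetical illustration, not LLVM.

OPS = {"+": "add", "*": "mul"}


def frontend_rpn(text):
    """Frontend 1: reverse Polish notation, e.g. '2 3 4 * +'."""
    ir, stack = [], []
    for tok in text.split():
        dest = f"v{len(ir)}"
        if tok in OPS:
            rhs, lhs = stack.pop(), stack.pop()
            ir.append((OPS[tok], dest, lhs, rhs))
        else:
            ir.append(("const", dest, int(tok)))
        stack.append(dest)
    return ir


def frontend_prefix(text):
    """Frontend 2: prefix notation with binary operators, e.g. '(+ 2 (* 3 4))'."""
    tokens = text.replace("(", " ").replace(")", " ").split()
    ir, pos = [], 0

    def parse():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in OPS:
            lhs, rhs = parse(), parse()
            dest = f"v{len(ir)}"
            ir.append((OPS[tok], dest, lhs, rhs))
        else:
            dest = f"v{len(ir)}"
            ir.append(("const", dest, int(tok)))
        return dest

    parse()
    return ir


def backend_eval(ir):
    """Backend 1: interpret the IR directly."""
    env = {}
    for instr in ir:
        op, dest = instr[0], instr[1]
        if op == "const":
            env[dest] = instr[2]
        elif op == "add":
            env[dest] = env[instr[2]] + env[instr[3]]
        else:  # "mul"
            env[dest] = env[instr[2]] * env[instr[3]]
    return env[ir[-1][1]]


def backend_c(ir):
    """Backend 2: emit C source for the same IR."""
    symbols = {"add": "+", "mul": "*"}
    lines = ["int compute(void) {"]
    for instr in ir:
        op, dest = instr[0], instr[1]
        if op == "const":
            lines.append(f"    int {dest} = {instr[2]};")
        else:
            lines.append(f"    int {dest} = {instr[2]} {symbols[op]} {instr[3]};")
    lines.append(f"    return {ir[-1][1]};")
    lines.append("}")
    return "\n".join(lines)


# Any frontend pairs with any backend.
for frontend, source in [(frontend_rpn, "2 3 4 * +"),
                         (frontend_prefix, "(+ 2 (* 3 4))")]:
    ir = frontend(source)
    print(backend_eval(ir))
    print(backend_c(ir))
```

Adding a third frontend or a third backend here means writing one function, not touching the others, which is the property the comment above is praising.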

Not really that much of a genius, given that IBM was already using the same architecture in its RISC compiler research during the 1970s.

Check the papers on the PL.8 compiler architecture.