by cptwunderlich 1467 days ago
Gosh, I really wish GCC had more/better documentation. Especially big picture stuff. E.g., I would like to know what register allocation algorithms it uses (and how certain details are handled), but looking at that code I noped out...
5 comments

It uses something they call IRA and LRA on the RTL representation. But bottom line is: it's graph colouring.
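
For anyone who hasn't seen register allocation framed as graph colouring, here is a toy sketch of the general idea. It is not GCC's IRA or LRA, and every name in it is made up for illustration. Virtual registers are nodes, an edge means two of them are live at the same time, and each node gets the lowest-numbered hard register that no already-coloured neighbour is using. If none is free, the node is marked as a spill.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy illustration only: greedy colouring of an interference graph.
// This is NOT how GCC's IRA/LRA is implemented; all names are hypothetical.
// interference[v] lists the virtual registers that are live at the same
// time as v. The result holds one hard register index per virtual
// register, or -1 when the register has to be spilled.
std::vector<int> colour_registers(
    const std::vector<std::vector<int>>& interference, int num_hard_regs) {
  assert(num_hard_regs >= 0);
  std::vector<int> assigned(interference.size(), -1);
  for (std::size_t v = 0; v < interference.size(); ++v) {
    // Mark the hard registers already used by coloured neighbours.
    std::vector<bool> taken(num_hard_regs, false);
    for (int neighbour : interference[v]) {
      int c = assigned[neighbour];
      if (c >= 0) taken[c] = true;
    }
    // Pick the lowest-numbered free hard register, if there is one.
    for (int r = 0; r < num_hard_regs; ++r) {
      if (!taken[r]) {
        assigned[v] = r;
        break;
      }
    }
    // Otherwise assigned[v] stays -1, meaning "spill to memory".
  }
  return assigned;
}
```

With three mutually interfering virtual registers and only two hard registers, the third one ends up spilled. A real allocator also has to decide which node to spill, deal with register classes, coalesce copies and so on, which is where most of the complexity lives.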

https://gcc.gnu.org/onlinedocs/gccint/RTL-passes.html

There's really quite a lot of documentation and published papers when you actually look:

https://github.com/gcc-mirror/gcc/blob/751f306688508b08842d0...

LLVM: Lots of documentation for frontend authors, not so much for backend stuff.

GCC: Lots of documentation for backend authors, actively made difficult to use for frontends for many years (although things have improved dramatically).

How have things evolved for LLVM backend authors? Has it improved too?

You can read GCC Summit presentations for things like that. The register allocator is called IRA/LRA. It used to have a ball of mud called reload that isn’t worth understanding because it doesn’t make much sense.

GCC’s code style is strange because the original authors wanted to make it look like Lisp for some reason.

Err I always thought it looked strange because it was in C with classes before or after some holy war.

While gcc (and, in general, compiler) plugins are some of the most interesting tech enablers (be it for fuzzing, static analysis, or runtime check injection), 'people competently maintaining gcc plugins' (a sect I'm not a part of anymore, thank dog) are amongst the most patient, devoted, unsung angels of this world.

It’s in C++ now. The weird spacing and functions ending in _p are Lispisms.
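
To make that concrete, here is a small sketch of what the two habits look like together. The types and function names are hypothetical, not taken from GCC's sources. The layout follows GNU style (a space before the opening parenthesis, the return type on its own line), and the _p suffix marks a predicate, i.e. a function that answers a yes/no question, the same way Lisp names integerp or listp.

```cpp
#include <cassert>

// Hypothetical node type for illustration; not GCC's real RTL or tree types.
enum node_kind { KIND_REG, KIND_MEM, KIND_CONST };

struct node
{
  node_kind kind;
};

// "_p" marks a predicate: it returns true or false and nothing else.
bool
reg_p (const node &n)
{
  return n.kind == KIND_REG;
}

bool
mem_p (const node &n)
{
  return n.kind == KIND_MEM;
}

// Small self-check showing how the predicates read at a call site.
void
check_predicates ()
{
  node r = { KIND_REG };
  assert (reg_p (r));
  assert (!mem_p (r));
}
```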

It’s also garbage collected so it’s still not “normal” C++ but neither is LLVM.

Not sure about the plugin API, but C++ is basically impossible to use with plugins because it’s so hard to keep ABI contracts, so it might not have changed.

Well, you basically have to compile the plugins against your gcc's headers anyway, and they're GPL by default (same as wireshark dissectors iirc). No, the pain is all the churn on gcc internals and in plugin APIs over the years. You basically become an ifdef monkey and end up testing myriads of gcc versions...

Isn’t the compiler the one who makes the ABI contract?

And I’ve seen _p and friends all over the place usually to differentiate between a pointer and, umm, not pointer. I thought it was a C++ism to be honest.

C++ has things like the fragile base class problem, meaning you can easily break the ABI by accident. There are issues with throwing exceptions across different libraries on some platforms (maybe just Windows?) but I forget the reason why.
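
A minimal sketch of one form of the fragile base class problem, using hypothetical class names. It shows the behavioural variant, where a base class changes how its own methods call each other and a subclass written against the old version silently breaks. The binary-layout variant (adding a member or virtual function to a base class that already-compiled subclasses depend on) can't really be shown in a single file, so it is left out here.

```cpp
#include <cassert>
#include <vector>

// Version 1 of a hypothetical library class: add_all stores items directly.
struct bag_v1 {
  virtual ~bag_v1() = default;
  virtual void add(int x) { items.push_back(x); }
  virtual void add_all(const std::vector<int>& xs) {
    for (int x : xs) items.push_back(x);
  }
  std::vector<int> items;
};

// Version 2: same interface, but add_all is now implemented via add().
struct bag_v2 {
  virtual ~bag_v2() = default;
  virtual void add(int x) { items.push_back(x); }
  virtual void add_all(const std::vector<int>& xs) {
    for (int x : xs) add(x);  // virtual dispatch reaches the subclass
  }
  std::vector<int> items;
};

// Client subclass, written against version 1, that counts insertions.
template <typename Base>
struct counting_bag : Base {
  int count = 0;
  void add(int x) override {
    ++count;
    Base::add(x);
  }
  void add_all(const std::vector<int>& xs) override {
    count += static_cast<int>(xs.size());
    Base::add_all(xs);
  }
};

void check_counts() {
  counting_bag<bag_v1> a;
  a.add_all({1, 2, 3});
  assert(a.count == 3);  // what the subclass author expected

  counting_bag<bag_v2> b;
  b.add_all({1, 2, 3});
  assert(b.count == 6);  // each element is now counted twice
}
```

Nothing in the subclass changed between the two cases, and the base class interface is identical. Only an implementation detail of the base moved.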

p is short for predicate.

The fragile base class problem affects all languages that offer some kind of inheritance, not only C++.

Maybe the Lispisms are Stallman's legacy. He is a great Lisp proponent after all.

This might be less true now, but for a long time gcc's code was terrible and undocumented on purpose. rms wanted it that way, to make it harder for it to be forked or EEE'd by corporations.

Whether that was a good plan is up for debate, but there you go.

People say that dmd's backend is terrible and undocumented, but I don't know what they're talking about:

https://github.com/dlang/dmd/tree/master/src/dmd/backend

> I would like to know what register allocation algorithms it uses

I'm wondering why you'd like to know. If it is just for your curiosity, that's very good. If you want to participate in the compiler development effort, hat tip!

But if you are thinking about tuning your code to such an internal detail, please don't! Coding to an implementation, rather than an interface, is never a good idea.