by black_knight 1387 days ago
I ran Gentoo back in the good old days. The biggest draw was that, after about a week of compiling, my system ran a lot faster thanks to all the compiler optimisations one could enable, since the binaries only had to work on your own CPU.

I might be misremembering, but I think fastmath was one of the flags explicitly warned against in the Gentoo manual.

5 comments

> I might be misremembering, but I think fastmath was one of the flags explicitly warned against in the Gentoo manual.

It is, here: https://wiki.gentoo.org/wiki/GCC_optimization#But_I_get_bett...
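
For anyone wondering why that flag gets singled out: in GCC, -ffast-math allows floating-point expressions to be reassociated as if they were real-number algebra, and floating-point addition is not associative. Below is a rough Python sketch of the effect, not output from any real compiler. Python has no fast-math mode, so `kahan_sum_reassociated` is a hand-written stand-in (a hypothetical name, like the others here) for what a reassociating optimiser is permitted to reduce compensated summation to.

```python
def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation; depends on strict evaluation order."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - comp
        t = total + y
        # Algebraically this is zero; numerically it captures rounding error.
        comp = (t - total) - y
        total = t
    return total


def kahan_sum_reassociated(values):
    """Hand-simplified stand-in for kahan_sum after algebraic reassociation:
    (t - total) - y is treated as 0, so the compensation disappears."""
    total = 0.0
    for v in values:
        total = total + v
    return total


# One large term followed by many terms that are each too small to
# change it on their own.
values = [1.0] + [1e-16] * 1_000_000

if __name__ == "__main__":
    print("naive:         ", naive_sum(values))
    print("kahan:         ", kahan_sum(values))
    print("reassociated:  ", kahan_sum_reassociated(values))
    print("associative?   ", (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
```

The compensated version recovers the small terms and the other two silently drop them, which is the kind of "faster but wrong" result the warning is about.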

ChromeOS is sort of the successor to Gentoo. The images are built with profile-guided, link-time, and post-link optimization, and they are targeted to the specific CPU in a given Chromebook. Every other Linux leaves a large amount of performance on the table by targeting a common denominator CPU that's 20 years old and not having PGO.
It's not a successor, it's a derivative. And yes, if you're only targeting specific known hardware then you can and probably should optimize for it, but most distributions fully intend to be usable on very nearly any x86(_64) hardware so they can't do that.
It's also a bit less relevant when everything is so fast. I used Gentoo on a cheap-for-the-time Pentium 133MHz. Gentoo was basically the difference between a modestly pleasant system and an unusably slow system if I tried to run a standard still-compiled-for-386 distro on it.

I've long since stopped worrying about it because on the systems I run, which are not top-of-the-line but aren't RPis either, it's not worth worrying about anymore for most programs. At most maybe you should target the one particular program you use that could use a boost.

Yeah, I don't know the breakdown between better hardware, better compiler optimizations (even at the default settings), and less differentiation between processors, but I've done some minor, not-very-scientific tests of compiling packages with -O3 and -march=native/-mtune=native, and in my limited experience it wasn't particularly useful. Not just small benefits, but benefits that were zero or below the noise floor in my benchmarks. Obviously this is super dependent on your workload and maybe hardware; it's an area where, if you care, you have to do your own testing.
Tune for native sometimes makes a difference but not always. Targeting a platform that is known to have AVX2, instead of detecting AVX2 at runtime and bouncing through the PLT, can make a large difference. PGO remains the largest opportunity.
PGO is not something Gentoo does (outside a few packages like Firefox that have it in their build system) because the profiling cannot be done automatically for arbitrary packages.
Apple avoid this problem with their OS by having a separate architecture slice for modern x64 (Haswell+).
glibc also supports loading different libraries based on CPU capabilities and some binary distros make use of that.

Still, there are always (admittedly diminishing) returns if you can target the exact CPU you have - pretty sure no binary distro ever had variants for AMD's TBM instructions [0] but my Gentoo install made use of them (which was made absolutely clear when trying to run any of that on a Zen 2 machine lol).

[0] https://en.wikipedia.org/wiki/X86_Bit_manipulation_instructi...
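
If you want to check ahead of time whether binaries built for one CPU will run on another, a rough sketch is to compare the feature flags the kernel reports. This assumes x86 Linux, where /proc/cpuinfo has a "flags" line. The function names and the two sample strings are made up for illustration, and the samples are heavily abbreviated, not real dumps.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found (e.g. not x86 Linux)


def missing_features(required, cpuinfo_text):
    """List the required flags that the described CPU does not report."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the real file; only meaningful on Linux."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


# Hypothetical, abbreviated samples: an older AMD part that reports tbm,
# and a newer CPU that does not.
SAMPLE_OLDER_AMD = "processor\t: 0\nflags\t\t: fpu sse2 avx bmi1 tbm\n"
SAMPLE_NEWER_CPU = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 bmi1 bmi2\n"

if __name__ == "__main__":
    needed = ["avx", "bmi1", "tbm"]  # what the old build was allowed to use
    print(missing_features(needed, SAMPLE_OLDER_AMD))
    print(missing_features(needed, SAMPLE_NEWER_CPU))
```

On a real machine you would pass `read_cpuinfo()` instead of a sample string. A non-empty result means code compiled with those features enabled can die with an illegal-instruction fault there.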

Relevant: https://funroll-loops.probablymalware.lol/Gentoo-is-Rice.htm...

Though compiling for the exact CPU was definitely not insignificant before the amd64 switch which gave a new baseline - many distros still targeted i586.

It was, and people would still use it because “hey it says fast”.

The CPU flags were less interesting to me than being able to disable features like X.

There was a big warning that it might produce a broken system, IIRC.