by jeffbee 1387 days ago
ChromeOS is sort of the successor to Gentoo. The images are built with profile-guided, link-time, and post-link optimization, and they are targeted to the specific CPU in a given Chromebook. Every other Linux leaves a large amount of performance on the table by targeting a lowest-common-denominator CPU that's 20 years old and by not using PGO.
2 comments

It's not a successor, it's a derivative. And yes, if you're only targeting specific known hardware then you can and probably should optimize for it, but most distributions fully intend to be usable on very nearly any x86(_64) hardware, so they can't do that.
It's also a bit less relevant when everything is so fast. I used Gentoo on a cheap-for-the-time Pentium 133MHz. Gentoo was basically the difference between a modestly pleasant system and an unusably slow system if I tried to run a standard still-compiled-for-386 distro on it.

I've long since stopped worrying about it because on the systems I run, which are not top-of-the-line but aren't RPis either, it's not worth worrying about anymore for most programs. At most maybe you should target the one particular program you use that could use a boost.

Yeah, I don't know the breakdown between better hardware, better compiler optimizations (even in the default settings), and less differentiation between processors, but I've done some minor not-very-scientific tests of compiling packages with -O3 -march=native -mtune=native, and in my limited experience it wasn't particularly useful. Like, not just small benefits, but benefits that were zero or below the noise floor in my benchmarks. Obviously this is super dependent on your workload and maybe hardware; it's an area where if you care, you have to do your own testing.
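
For anyone who wants to do that kind of testing, here is a rough sketch of how one might check whether a rebuild's gain clears the run-to-run noise. The function name `speedup_vs_noise` and the "more than twice the noise" threshold are my own assumptions, not an established method; a proper benchmark would want more runs and a real statistical test.

```python
import statistics

def speedup_vs_noise(baseline, optimized, factor=2.0):
    """Compare two lists of wall-clock timings (seconds) for the same workload.

    Returns (relative_speedup, noise_floor, significant).
    `factor` is an arbitrary illustrative threshold, not a standard value.
    """
    base_med = statistics.median(baseline)
    opt_med = statistics.median(optimized)
    # Positive means the optimized build is faster.
    speedup = (base_med - opt_med) / base_med
    # Noise floor: the larger relative run-to-run spread of the two builds.
    noise = max(
        statistics.stdev(baseline) / base_med,
        statistics.stdev(optimized) / opt_med,
    )
    return speedup, noise, abs(speedup) > factor * noise

# Hypothetical timings: generic build vs. -march=native build.
generic = [10.0, 10.2, 9.8, 10.1, 9.9]
native = [9.9, 10.1, 9.7, 10.2, 10.0]
print(speedup_vs_noise(generic, native))
```

With timings like the made-up ones above, the medians are identical and the result is reported as not significant, which is roughly what "below the noise floor" looks like in practice.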
Tune for native sometimes makes a difference but not always. Targeting a platform that is known to have AVX2, instead of detecting AVX2 at runtime and bouncing through the PLT, can make a large difference. PGO remains the largest opportunity.
PGO is not something Gentoo does (outside a few packages like Firefox that have it in their build system) because the profiling cannot be done automatically for arbitrary packages.
Apple avoids this problem with their OS by having a separate architecture slice for modern x64 (Haswell+).
glibc also supports loading different libraries based on CPU capabilities and some binary distros make use of that.
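
As a rough illustration of what that capability-based selection keys on, here is a sketch that maps a set of CPU flags to an x86-64 microarchitecture level (the names glibc uses for its glibc-hwcaps subdirectories, as far as I know). The flag spellings are assumed to follow Linux /proc/cpuinfo, where "pni" means SSE3 and "abm" covers LZCNT; `x86_64_level` and `read_cpu_flags` are hypothetical helper names, and this is not how glibc itself does the detection.

```python
# Feature sets per level, in /proc/cpuinfo spelling (an assumption; the
# psABI document is the authoritative list). Each level includes the
# requirements of the ones before it.
LEVELS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]

def x86_64_level(flags):
    """Return the highest level whose requirements (and all lower ones) are met."""
    level = "x86-64"  # baseline that every x86-64 CPU provides
    for name, required in LEVELS:
        if not required <= flags:
            break
        level = name
    return level

def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed; Linux on x86 only."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

if __name__ == "__main__":
    try:
        print(x86_64_level(read_cpu_flags()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

Note that extra flags outside these sets (TBM, for example) never raise the level, which is why a per-level binary distro can't use them while a build targeted at the exact CPU can.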

Still, there are always (admittedly diminishing) returns if you can target the exact CPU you have - pretty sure no binary distro ever had variants for AMD's TBM instructions [0] but my Gentoo install made use of them (which was made absolutely clear when trying to run any of that on a Zen 2 machine lol).

[0] https://en.wikipedia.org/wiki/X86_Bit_manipulation_instructi...