Hacker News
by brucethemoose2 1064 days ago
On some Linux distros, the package manager downloads different binaries based on your CPU. Skylake would be x86-64-v3, Zen 4 would be x86-64-v4, for example.

And there are different schemes for multiple architectures in the same program, like hwcaps.


Isn’t this going to get very unmanageable very soon? Intel seems to add extensions every other year or so.
The extensions can be kinda broken down into 4 levels. Basically ancient, old (SSE 4.2), reasonably new (AVX2, Haswell/Zen 1 and up), and baseline AVX512.

https://developers.redhat.com/blog/2021/01/05/building-red-h...
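
A rough sketch of how those four levels could be checked against the flags Linux reports in /proc/cpuinfo. The names `LEVEL_FLAGS`, `x86_64_level` and `read_cpu_flags` are made up for illustration. The flag lists are my reading of the level definitions, spelled the way the kernel spells them, so treat them as an assumption rather than a complete reference. It also assumes the CPU is x86-64 at all, so v1 is taken for granted.

```python
# Flags each level adds on top of the previous one, using /proc/cpuinfo
# spellings (assumption: "pni" is SSE3 and "abm" covers LZCNT).
LEVEL_FLAGS = [
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level whose flags, and all lower levels' flags, are present."""
    level = 1  # baseline x86-64 is assumed
    for candidate, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = candidate
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux box, `x86_64_level(read_cpu_flags())` gives the number to put after `x86-64-v`. Because the levels are cumulative, a CPU with AVX-512 flags but a missing v3 flag still stops at v2.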

There is discussion of a fifth level. Someone in the Intel Clear Linux IRC said a fifth level wasn't "worth it" for Sapphire Rapids because most of the new AVX512 extensions were not autovectorized by compilers, but that a new level would be needed in the future. Perhaps they were thinking of APX, but couldn't disclose it.

AVX10/APX does sound like a good baseline for v5.
Except that it doesn't support full AVX-512, making the whole idea of backward compatibility between these levels meaningless. "It's Intel!!!"
Well that's an even better justification, as an x86-64-v5 level would be needed for the newer CPUs.

We can throw away any hope of v4 being a standard baseline.

It’s easy to fully automate and storage is relatively cheap these days.
I'd think the issue would be more the build infra: every new variant means you have to build the world again.
Again, compute is surprisingly cheap these days.

Work out what it would cost to compile - say - a terabyte of C code at typical cloud spot prices.

A large VM with 128 cores can compile the 100 MB Linux kernel source tree in about 30 seconds. So… 200 MB/minute or 12 GB/hour. This would take a bit over 80 hours for a terabyte.

A 120 core AMD server is about 50c per hour on Azure (Linux spot pricing).

So… about $40 to compile an entire distro. Not exactly breaking the bank.
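
Spelling that back-of-the-envelope math out, as a sketch only: the throughput and price are the figures quoted above, decimal units are assumed (1 TB = 1000 GB), and `compile_cost` is a made-up helper name. It models nothing beyond raw compile throughput.

```python
def compile_cost(source_gb, mb_per_30s=100.0, dollars_per_hour=0.50):
    """Estimate (hours, dollars) to compile source_gb of code at a fixed throughput."""
    # 100 MB per 30 s, with 120 such intervals per hour, is 12 GB/hour.
    gb_per_hour = mb_per_30s * 120 / 1000
    hours = source_gb / gb_per_hour
    return hours, hours * dollars_per_hour


hours, dollars = compile_cost(1000)  # one terabyte
print(f"{hours:.1f} h, ${dollars:.2f}")  # prints "83.3 h, $41.67"
```

Even if the real throughput were ten times worse, the result would still be in the hundreds of dollars per variant.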

You'd have to separate out compiling and linking at a bare minimum to get even a semi-accurate model. Plus, a lot of userspace is C++, which is much, much slower to compile.
Yes. Also, test it.
That can also be largely automated.
LTO does occasionally break things in hard-to-detect ways, but I have never heard of an x86 -march compilation bug.