by hubicka 2746 days ago
There are several independent things here:

- First, how you set the -O2 defaults in your compiler. This is a delicate problem, since you need to find the right balance of code size, compile time, robustness of the generated code (it should not exploit undefined behavior in super evil ways) and of course runtime. In my benchmarks I have found that Clang has a bit of an edge on runtime, which mostly comes from vectorization (on x86-64).

- Selection of the minimal ISA you support. For GCC on x86-64 this is still the original Opteron, but distributions can easily decide on something newer (and some do). AVX is indeed a big win, but for a general-purpose distribution it is still too aggressive. You can provide AVX-optimized libraries where it matters.

- Selection of CPU tuning (e.g. generic vs. intel).
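
To illustrate the point about shipping AVX-optimized code on top of a conservative baseline ISA, here is a minimal sketch of runtime dispatch. It assumes GCC or Clang on x86-64; the names add_baseline, add_avx and select_add are made up for this example, and a real library would more likely use ifunc or function multiversioning than a hand-written selector.

```c
#include <stddef.h>

/* Baseline version: compiled for whatever minimal ISA the build targets. */
static void add_baseline(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same loop, but the compiler is allowed to use AVX in this one function
   only. Whether it actually vectorizes depends on the optimization level. */
__attribute__((target("avx")))
static void add_avx(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
#endif

typedef void (*add_fn)(float *, const float *, const float *, size_t);

/* Pick an implementation once, based on what the running CPU supports. */
static add_fn select_add(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return add_avx;
#endif
    return add_baseline;
}
```

A caller would do add_fn f = select_add(); once and then call f(dst, a, b, n). The rest of the binary stays runnable on the original baseline, and only the hot loop gets the newer ISA.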

So I consider it a mistake that GCC traded vectorization for compile-time speed and reliability at -O2, because it can make an important difference in common workloads these days (not 10 years ago, say).

It is also clearly a bug for GCC to produce AVX instructions when not explicitly asked to :)

I also do testing on Zen, Core and some PowerPC. For the Firefox machine I use a Bulldozer box because I don't mind it spending long nights running builds & benchmarks, and I think this particular problem is not very CPU-specific.