by msla 980 days ago
I don't know why people are so reluctant to just use -O3
6 comments

Because -O3 enables optimizations that break code relying on some lesser-known cases of UB[1]. So people just don't enable -O3, either because they can't be bothered to fix the UB in their codebase or because they think it's a "compiler bug".

There are other reasons for not using -O3 which people already mentioned.

As a matter of fact, Gentoo specifically says in their docs that -O3 breaks some packages[2].

[1] https://stackoverflow.com/questions/57889116/different-evalu...

[2] https://wiki.gentoo.org/wiki/GCC_optimization#-O

UB-breaking optimizations are already enabled at -O1. I don't think -O3 is particularly notable for that: your first link shows code that breaks between -O3 and no optimization at all.

-O3 does more aggressive inlining, plus optimizations that are generally expensive or not necessarily profitable.

I was only saying that -O3 enables _even more_ optimizations, which can lead to UB code that works with -O2 but misbehaves on -O3. And because -O3 is seldom used (for reasons other than code breakage), the code patterns that are actually UB but don't get compiled to bad code at -O2 (by e.g. gcc) are lesser known. It all depends on how willing a given compiler is to make assumptions based on UB, and on what the C spec permits it to optimize. At -O3 it probably makes a lot more of those assumptions, along the lines of "the spec says this is undefined behavior so I will assume it can never happen and optimize based on that". The more aggressive inlining in conjunction with that probably exposes even more UB.

Edit: found an example where code works on -O2 but breaks on -O3, related to using abnormal amounts of stack space: https://stackoverflow.com/questions/47058978/g-optimization-...
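To make the "assume it can never happen" reasoning concrete, here is a minimal sketch in C. It is not taken from either linked question, and the function names are hypothetical. The first check relies on signed overflow wrapping around, which is UB, so an optimizing compiler is allowed to treat `x + 1 < x` as always false. Whether it actually does so, and at which -O level, depends on the compiler and version.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical name. Broken overflow check: when x == INT_MAX, the
 * expression x + 1 overflows a signed int, which is undefined behavior.
 * A compiler may therefore assume the overflow never happens and reduce
 * the whole function to "return false". */
bool will_overflow_ub(int x)
{
    return x + 1 < x;
}

/* Hypothetical name. Well-defined version of the same question: compare
 * against the limit before doing any arithmetic, so no overflow occurs. */
bool will_overflow_ok(int x)
{
    return x == INT_MAX;
}
```

Comparing the generated assembly for the two functions at -O0, -O2 and -O3 is one way to see whether a particular compiler takes advantage of the UB.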

Isn’t the warning that -O3 might break stuff horribly obsolete? It was an issue in 2.95, but that was many years ago.
Because i) not all programs benefit tremendously from -O3 and ii) it adds to the compile time and binary size? It would be great to have some optimization level between -O2 and -O3 so that only portions that have a potential to be improved more than, say, 5% are compiled using -O3. In fact the existence of `#pragma GCC optimize` does suggest that this might be possible today with some heuristics... (Or use PGO, which will have the same effect. But PGO is still a novelty in 2023.)
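A rough sketch of what the `#pragma GCC optimize` approach could look like when applied by hand to one hot function. The function names and the choice of which one counts as "hot" are assumptions for illustration. The pragmas are GCC-specific, so they are guarded here to keep other compilers from warning about them, and the GCC manual itself advises against relying on per-function optimization options in production code.

```c
#include <stddef.h>

/* GCC only: raise the optimization level for the functions defined
 * between push_options and pop_options. Other compilers skip this. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O3")
#endif

/* Hypothetical hot loop: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

/* Hypothetical cold function: compiled with whatever -O level was
 * given on the command line (for example -O2). */
long sum_ints(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

Picking the functions to wrap is still manual here; the heuristic (or profile data) that decides which ones are worth it is the missing piece.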
You can manually set the optimizations (and order of optimizations) you want. -O3 is just a predefined set of optimizations.
But it is not portable among compilers (for example, both GCC and Clang support SLP-based autovectorization but with different flags). -O# is one of the few flags that have roughly the same meaning across many compilers.
Because in the real world code size is often more important than throughput on large inputs in microbenchmarks. There is a point of diminishing returns, and in the opinion of many -O3 is beyond that. If it does the job and the job is worth doing, use it.
Using -O3 was resulting in correctness problems in my work codebase in the past. Nowadays the compiler (ifort) crashes when using -O3 for some reason.
> Using -O3 was resulting in correctness problems

Likely because your codebase had UB in it that didn't show itself until a certain optimization level. The solution is to fix all instances of UB. See my comment above.

I believe in our case it was a compiler bug, because it only happened for a few versions of ifort and was eventually fixed. But it scared us off from using -O3.
How often does O3 vs O2 enable the sort of optimizations that break code that is technically incorrect?
I just gave up and use tcc.

Forces me to use a better algorithm when it matters. Which more often than not, does not.

;-)