| I think it's 'common knowledge' which has outlived its relevance, as I can't recall the last time I found -O2 outperforming -O3. Practically every performance oriented open source program I come across also defaults to -O3 these days, or sometimes -Ofast, which also enables -ffast-math. >-O3 can generate slower code because of the aggressive inlining and loop unrolling enabled. -O3 turns on vectorization and inlining optimizations, but I can't recall any loop unrolling options which are turned on at -O3. -funroll-loops is not turned on at any of the -O levels (including -O3), because it is one of the hardest to get right without any runtime data as a basis (which is why the only option that turns it on is PGO, profile-guided optimization). Note that I'm talking about modern versions of GCC; if you are using GCC 4.2.1 on OS X then this (-O2 > -O3) may still typically be the case. >The situation is better nowadays but still, as far as I know, no major Linux distro uses -O3 as the default for binary packages. I'd say they typically use the upstream optimization settings. |
I can: it was about 4 months ago, with GCC 4.8.0.
>practically every performance oriented open source program I come across also defaults to -O3 these days
How large is your sample size there? I have only seen -O3 in the default makefiles of audio/video encoders. Those tend to be a natural fit for -O3. In contrast, here is the current makefile of my favorite "performance oriented" FOSS program:
http://repo.or.cz/w/luajit-2.0.git/blob_plain/HEAD:/src/Make...
CCOPT= -O2 -fomit-frame-pointer
# Note: it's no longer recommended to use -O3 with GCC 4.x.
# The I-Cache bloat usually outweighs the benefits from aggressive inlining.
>I can't recall any loop unrolling options which are turned on at -O3.
You are right (I just looked it up). Guess my memory failed me there.
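For anyone who would rather check this on their own compiler than trust either of our memories, here is a rough sketch that asks GCC which optimizer flags it reports at -O2 and at -O3 and lists the ones that only show up as enabled at -O3. It assumes a `gcc` on your PATH that supports `-Q --help=optimizers` and prints flags as `-fsomething [enabled]` or `[disabled]`; the helper names (`parse_optimizers`, `newly_enabled`, `query`) are my own, and the output format may differ between GCC versions, so treat the parsing as an assumption.

```python
import re
import subprocess

# Assumed line shape: "  -funroll-loops        [disabled]".
# Flags that take a value (e.g. "-fvect-cost-model=...") are skipped.
FLAG_LINE = re.compile(r"^\s+(-[fm][\w=+-]+)\s+\[(enabled|disabled)\]\s*$")


def parse_optimizers(text):
    """Map each on/off flag in `gcc -Q --help=optimizers` output to a bool."""
    flags = {}
    for line in text.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            flags[match.group(1)] = match.group(2) == "enabled"
    return flags


def newly_enabled(lower, higher):
    """Flags reported as enabled at the higher level but not at the lower one."""
    return sorted(
        flag for flag, on in higher.items() if on and not lower.get(flag, False)
    )


def query(level, cc="gcc"):
    """Run the compiler and parse what it reports for one -O level."""
    result = subprocess.run(
        [cc, "-Q", level, "--help=optimizers"],
        capture_output=True, text=True, check=True,
    )
    return parse_optimizers(result.stdout)


if __name__ == "__main__":
    try:
        o2, o3 = query("-O2"), query("-O3")
    except (OSError, subprocess.CalledProcessError) as exc:
        print("could not query gcc:", exc)
    else:
        for flag in newly_enabled(o2, o3):
            print(flag)
        # None here means the flag was not listed in a form the parser knows.
        print("-funroll-loops at -O3:", o3.get("-funroll-loops"))
```

This only tells you what the compiler says it enables; whether -O3 is actually faster for a given program still has to be measured.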
>I'd say they typically use the upstream optimization settings
I wish! Packagers love to fool around with the upstream sources and makefiles to make them conform to whatever "standards" they have.