Hacker News
by gillianseed 5202 days ago
I'd say that in 9 out of 10 cases GCC creates faster code than LLVM/Clang (I typically see a 5-10% difference in performance-oriented code). Add to this that LLVM/Clang lacks strong special optimization strategies like PGO (profile-guided optimization), and it's a clear win for GCC. GCC also supports more languages and architectures than Clang, which simply mirrors the needs of Apple (ObjC, C, C++). If you are on OS X then yes, there's likely little reason for you to use GCC, since OS X ships with a (5-year?) old GCC version and also, obviously, because Clang/LLVM integrates much better with Apple's proprietary Xcode.

That said, I use both, and at work we test our code against both toolchains (and some other compilers as well). The static analyser in Clang is a welcome addition and the error diagnostics/reporting is top notch, so it certainly has strong features even though it falls behind GCC in code optimization.

3 comments

Since LLVM 3.0 added greedy register allocation and LLVM SVN just added better intrinsics for SSE and AVX, I've seen LLVM SVN builds from the last few weeks pull ahead of GCC 4.6.2 in quite a few personal high-performance C++ projects, whereas before that, as you say, GCC was pretty much always ahead.

However, GCC 4.7's just around the corner and I haven't tried that yet...

PGO will probably come to LLVM soon, for what it's worth. They added branch probability and basic block frequency support in 3.0 with an eye towards it, anyway.
Great. Yes, there have been proposals for it for quite some time but no actual code, so I was almost thinking it wouldn't happen; here's hoping it will happen now. Every other major compiler I can think of has it (GCC, ICC, Open64, MSVC), and the optimization often has a great impact on performance-dependent code.
If your only reason for using gcc is speed and you're on a Linux platform, you should really try out icc. The loop unrolling and vector op generation are much better than gcc's (at least for code we've tested on, YMMV) and can result in some really big speedups.

Though, admittedly, if speed is an issue you probably have already manually loop-unrolled and used the gcc compiler intrinsics.
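
For anyone unfamiliar with what "manually loop-unrolled" means here, a minimal sketch in C follows. The function names and the unroll factor of 4 are arbitrary choices for illustration, not anything from the code discussed in this thread. Whether it beats what the compiler does on its own depends on the compiler, flags, and target, so measure before keeping it.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward reduction: one dependent add per iteration. */
float sum_simple(const float *a, size_t n) {
    assert(a != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Hand-unrolled by 4 with independent accumulators, so the four adds in
 * each iteration do not depend on one another. This changes the order of
 * the floating-point additions, so the rounding of the result can differ
 * from sum_simple on general input. */
float sum_unrolled4(const float *a, size_t n) {
    assert(a != NULL || n == 0);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    float s = (s0 + s1) + (s2 + s3);
    /* Remainder loop handles the last n % 4 elements. */
    for (; i < n; i++) {
        s += a[i];
    }
    return s;
}
```

The reordering is also why a compiler will not normally make this transformation on floats by itself unless it is told that relaxed floating-point semantics are acceptable.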

I have, and yes, ICC generally won on our tests, but:

A) It's proprietary; I have no interest in relying on a proprietary toolchain (and from what I gather, neither does the company I work for).

B) It supports a very limited range of CPU architectures; not only is it directly tailored for Intel CPUs, it even has a history of selecting poor code paths for AMD CPUs.

As for manual loop unrolling, for a lot of code PGO does a great job here by unrolling based on the statistics gathered during the first pass. In fact, GCC's PGO seems to do a better job than ICC's PGO implementation. ICC's LTO beats GCC's, on the other hand, and of course ICC does a better job at vectorization and has better-optimized math functions.
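
For readers who haven't used it, the two-pass PGO workflow looks roughly like the sketch below with GCC. The `-fprofile-generate` and `-fprofile-use` options are GCC's own; the file names `hot.c` and `driver.c`, the function, and the idea of a separate driver that feeds it representative input are all made up for illustration. Any gain depends entirely on the code and on how representative the training run is.

```c
/* hot.c (hypothetical file name)
 *
 * Two-pass build, assuming a driver.c that calls count_nonprintable()
 * on input resembling real workloads:
 *
 *   gcc -O2 -fprofile-generate hot.c driver.c -o app   (instrumented build)
 *   ./app                                              (training run; writes .gcda profile data on exit)
 *   gcc -O2 -fprofile-use hot.c driver.c -o app        (rebuild using the recorded profile)
 */
#include <assert.h>
#include <stddef.h>

/* Counts bytes outside printable ASCII. On typical text the "bad" branch
 * is rarely taken; the training run records that, along with how many
 * times the loop usually iterates, and the second build can use both
 * when deciding on code layout and unrolling. */
size_t count_nonprintable(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    size_t bad = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] < 0x20 || buf[i] > 0x7e) {
            bad++;
        }
    }
    return bad;
}
```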