by near 4477 days ago
I go with:

g++ -std=c++11 -O3 -fomit-frame-pointer -fwrapv

-fwrapv turns off some "bad" optimizations around signed integer overflow (too likely to cause harm, too unlikely to make a significant performance difference in most cases).
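To illustrate the kind of code at stake: an overflow check written as a + b < a only works if signed overflow wraps, which is what -fwrapv guarantees; without the flag the compiler may assume the overflow never happens and drop the check. A minimal sketch of a check that holds either way (add_would_overflow and checked_add are hypothetical names, not from any library):

```cpp
#include <cassert>
#include <climits>

// True when a + b is not representable as int.
// Only uses subtractions that cannot themselves overflow.
inline bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  // INT_MAX - b stays in range for b > 0
    if (b < 0) return a < INT_MIN - b;  // INT_MIN - b stays in range for b < 0
    return false;                       // adding zero never overflows
}

// Adds two ints, asserting first that the sum is representable.
inline int checked_add(int a, int b) {
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

This does not depend on -fwrapv at all, so it behaves the same with or without the flag.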

I also use a lot of asserts to verify the behaviors too costly to not rely on for what I do (low-level CPU simulation and such): linear A-Z, 8-bit char, twos-complement math, arithmetic shift right on signed types, int > 16-bits, etc.
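Roughly what such asserts can look like, as a sketch only: most of these can be compile-time checks in C++11. The function names are made up for illustration, and right-shifting a negative value is implementation-defined before C++20, which is exactly why it is worth asserting.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "char must be 8 bits");
static_assert(sizeof(int) * CHAR_BIT > 16, "int must be wider than 16 bits");
static_assert((-1 & 3) == 3, "signed integers must be twos-complement");
static_assert((-1 >> 1) == -1, "signed right shift must be arithmetic");
static_assert('Z' - 'A' == 25, "A-Z must span exactly 26 code points");

// Runtime check that every letter A-Z sits at consecutive code points
// (the static_assert above only checks the two endpoints).
inline bool alphabet_is_linear() {
    const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    for (int i = 0; i < 26; ++i) {
        if (letters[i] != 'A' + i) return false;
    }
    return true;
}

// Hypothetical startup hook: abort early in debug builds if an assumption fails.
inline void verify_platform_assumptions() {
    assert(alphabet_is_linear());
}
```

If any static_assert fails, the build stops on that platform instead of producing a binary that silently misbehaves.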

I'm sure my views won't be popular, and I'm not encouraging anyone to follow what I do, just stating my preferences.

I have a love-hate relationship with warnings. My problem is that you end up with false positives that amount more to "how the compiler authors think you should style your code" instead of reporting legitimate issues. When combined with -Werror, it's a show-stopper for no reason.

Clang is much more naggy than GCC. For instance, I frequently switch on boolean variables. Clang doesn't even have a "-Wno-" flag I can pass to temporarily disable this.

But there's nothing at all illegal about switching on a boolean value. It annoys me that I need to go back and add unnecessary explicit casting in 100 places in my project to keep Clang quiet, or face real warnings being lost in a sea of false warnings every single time I build my project. I know you can do if(var) { ... } else { ... } ... I don't care. I want to use switch, and I legally am allowed to. Don't bug me about it, Clang.

It also really hates empty statements, e.g. while(do_something()); warns that there's nothing inside the while loop. I know, the important part is do_something() and its return value. Same for ifs, fors, etc. It wants me to put the ; on its own line. Uh, no. That's not my style at all.

And at the same time ... Clang caught a few bugs that GCC overlooked.

So, my current strategy is to build WIPs with GCC at default warnings and Clang with the sledgehammer of -w; and then before any releases, build with maximum warnings on both compilers and analyze each one for legitimate issues. I also run with valgrind to catch many other types of issues, like using uninitialized variables and memory leaks.


From clang's source code:

    // switch(bool_expr) {...} is often a programmer error, e.g.
    // switch(n && mask) { ... } // Doh - should be "n & mask".
    // One can always use an if statement instead of switch(bool_expr).

I agree that clang should not warn on this without asking for warnings and even then it should be possible to disable it. Maybe it would be a good idea to file a bug report.

That being said, I can't imagine why you would ever choose to switch on a bool, and even go to the trouble to add a cast instead of just writing an if statement.

Also, my clang and g++ with "-Wall -Wextra" does not warn on an empty while statement.

Thanks for digging that up. Interesting to see their rationale. Seems like they could extend the test to not warn if it's just switch(var) only, but it might have to happen at a different eval level for that to work.

> I can't imagine why you would ever choose to switch on a bool

In my case, it's for opcode execution. They consist of various bit fields that control the behavior, some are only 1-bit wide, some are 2-bits to 5-bits wide. The implementation is a series of switch statements, one after the other, that have cases for all possible values. So by using switch() in all cases, the code looks consistent.
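A rough sketch of that pattern, with a made-up 8-bit opcode layout (bit 7 is a 1-bit direction field, bits 5-6 a 2-bit size field; none of this is from a real instruction set). The static_cast on the 1-bit field is the kind of cast needed to keep Clang quiet:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of the illustrative opcode.
struct Decoded {
    int step;   // +1 or -1, from the 1-bit direction field
    int bytes;  // operand width, from the 2-bit size field
};

inline Decoded decode(std::uint8_t opcode) {
    Decoded d{0, 0};
    const bool direction = ((opcode >> 7) & 1) != 0;
    const unsigned size = (opcode >> 5) & 3;

    // 1-bit field: same switch shape as the wider fields below.
    switch (static_cast<unsigned>(direction)) {
    case 0: d.step = +1; break;
    case 1: d.step = -1; break;
    }

    // 2-bit field: one case per possible value.
    switch (size) {
    case 0: d.bytes = 1; break;
    case 1: d.bytes = 2; break;
    case 2: d.bytes = 4; break;
    case 3: d.bytes = 8; break;
    }

    assert(d.step != 0 && d.bytes != 0);  // every field value was handled
    return d;
}
```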

So this is kind of what I dislike about style choice warnings. It's easy to presume there's no valid use case, until you actually find one later on. I get that people can make mistakes, but when I know for certain that I haven't made a mistake, I don't like having to change my code anyway.

Could I force the boolean values to unsigned types even though they're only 1-bit? Yes. Could I just do if/else anyway here? Yes. But I don't want to. I'm happy with my code, and received no warnings at all when I wrote it with GCC several years ago. It's only a "problem" now that I need Clang to target OS X.

> Also, my clang and g++ with "-Wall -Wextra" does not warn on an empty while statement

Hmm, good to hear. If I get to my dev PC before the post is buried, I'll post the output I was getting.

Looks like somebody added -Wswitch-bool just now: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-2...

Oh wow, I wonder if someone saw this thread, or if that was just a coincidence.

Either way, very cool! Thanks for the heads up! I can't use it now obviously, but I can add a diagnostic disable line that'll work in the future.
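Something like the following is what such a diagnostic disable line could look like. This is a sketch that assumes the flag ships under the name -Wswitch-bool as in the linked commit; step_for and self_check are hypothetical names.

```cpp
#include <cassert>

#if defined(__clang__)
#pragma clang diagnostic push
// Assumes a Clang that knows -Wswitch-bool; older versions may instead
// complain about an unknown warning group.
#pragma clang diagnostic ignored "-Wswitch-bool"
#endif

// Switch directly on a bool, with no cast.
inline int step_for(bool decrement) {
    switch (decrement) {
    case false: return +1;
    case true:  return -1;
    }
    return 0;  // not reached; avoids "control reaches end of function" warnings
}

#if defined(__clang__)
#pragma clang diagnostic pop
#endif

// Small sanity check in the spirit of the asserts mentioned earlier.
inline void self_check() {
    assert(step_for(false) == +1);
    assert(step_for(true) == -1);
}
```

The push/pop pair keeps the suppression local, so the warning stays active for the rest of the translation unit.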

1. -fomit-frame-pointer is implied by -O3 on most platforms now

2. "too likely to cause harm, too unlikely to make a significant performance difference in most cases."

Please define "most cases". Without this, GCC will have significant trouble being able to derive the bounds of most loops, and in turn, will not be able to vectorize, unroll, peel, split, etc.

Saying "unlikely to make a significant performance difference in most cases" is probably very wrong for most people. The last benchmarks I saw across a wide variety of apps showed the perf difference was 10% in most cases, and a lot more in others.
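A small sketch of the loop-bound issue (sum_inclusive is a made-up example, not taken from any benchmark). With signed overflow undefined, the compiler may assume i never wraps, so for n >= 0 the loop runs exactly n + 1 times. Under -fwrapv it also has to consider n == INT_MAX, where i <= n is always true and the trip count is no longer computable.

```cpp
#include <cassert>

// Sums data[0..n] inclusive. The caller must guarantee data has n + 1 elements.
inline long long sum_inclusive(const int* data, int n) {
    assert(data != nullptr);
    long long total = 0;
    // Without -fwrapv: trip count is n + 1, so the loop can be unrolled or vectorized.
    // With -fwrapv: ++i may wrap from INT_MAX to INT_MIN, so the bound is unknown.
    for (int i = 0; i <= n; ++i) {
        total += data[i];
    }
    return total;
}
```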

1. that's good to know. I'm all for shortening my cflags line since I don't squelch my Makefile rules.

2. I always get bitten when I try and generalize. I tested this in all of my software, and was not able to detect any performance difference with or without -fwrapv (that is to say, < 1% difference, too small of a difference to make any conclusions.)

I know you can create extreme edge cases where there's a huge difference, just as you can probably make up one that's slower without -fwrapv if you really wanted to.

But yeah, maybe I just don't write code that lends itself to benefiting heavily from these types of assumptions. I also tend to not really rely on signed integer overflow following twos-complement. But all the same, I will take well-defined behavior over the crazy stuff GCC can produce any day, even at the cost of a bit of performance. Of course, going all the way to -O0 is way too extreme. So a case where I see no perceptible performance impact and gain defined behavior? Win-win.

2. You must not write software that is very amenable to these optimizations.

Fun fact btw: GCC and LLVM are the only compilers I know of to assume loops can overflow at all when optimizations are on.

Compilers like XLC will actually even assume unsigned loop induction variables will not overflow at -O3, unless you give them special flags.

:)

> Compilers like XLC will actually even assume unsigned loop induction variables will not overflow at -O3

That's not exactly fair. The C standard guarantees that unsigned variables will overflow by wrapping, so if the compiler assumes such a loop won't terminate, it is not conformant.

That's exactly the point: they cheat, because it makes code faster except in the small percent of code it breaks.

Let us all now bow our heads to the almighty SPEC gods ...

I'm curious if there is a way to rewrite your loops which use unsigned types to somehow communicate to the compiler that you will not overflow.

You'd have to use annotations or asserts, or rely on literal whole-program analysis to prove upper bounds of parameters etc. (which still may not be possible statically).

Otherwise, given something as simple as

for (unsigned i = 0; i < N; i+=2)

You can't say it iterates N/2 times, because i += 2 can wrap around before i < N ever becomes false.
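The wrap is easiest to see with a narrow type. The sketch below uses an 8-bit counter so the wrap happens quickly; count_iterations and its cap parameter are hypothetical, and the cap exists only so the demonstration terminates.

```cpp
#include <cassert>
#include <cstdint>

// Counts iterations of `for (i = 0; i < n; i += 2)` with an 8-bit unsigned i,
// giving up once `cap` iterations have been seen.
inline unsigned count_iterations(std::uint8_t n, unsigned cap) {
    assert(cap > 0);
    unsigned count = 0;
    for (std::uint8_t i = 0; i < n; i = static_cast<std::uint8_t>(i + 2)) {
        if (++count == cap) break;  // the loop would otherwise never end for n == 255
    }
    return count;
}
```

For n == 10 this returns 5 and for n == 254 it returns 127. For n == 255 the counter steps 252, 254, then wraps to 0, never reaches 255, and only the cap stops it. The same thing happens with a plain unsigned and N == UINT_MAX, which is why the compiler cannot simply assume a fixed trip count.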

I took a couple examples where gcc failed vectorization for unsigned indexes and tried to rewrite them in a way that didn't sacrifice 1/2 of the type's range just to satisfy the optimizer. In the first example I could just change a "<= n" to "!= n + 1", and the second example, which was based on your comment, could be solved by using pointers instead of indexing. I still wonder how many examples can't be solved without using signed types.

The results: http://ideone.com/7vidIs

Generated code http://tinyurl.com/le72k4o

In reality I'm not sure what kind of loop would have an index only fitting in an unsigned type.

On a 32-bit machine an unsigned array index means one object using more than half the address space. It's sensible to use unsigned 64-bit for file sizes, but I think it's quite odd that C programmers would use it for a loop or array index. Wrong, but defined, behavior is worse than undefined behavior, you know.

On a 64-bit machine, well, it shouldn't be a problem to use signed long long.

If you change <= n to != n + 1, you have just written broken code.

Think of n == UINT_MAX. n + 1 will overflow and wrap to 0.
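A sketch of that edge case (count_rewritten is a made-up helper; the cap just bounds the demonstration):

```cpp
#include <cassert>
#include <climits>

// Counts iterations of the rewritten loop `for (i = 0; i != n + 1; ++i)`,
// giving up once `cap` iterations have been seen.
inline unsigned long long count_rewritten(unsigned n, unsigned long long cap) {
    assert(cap > 0);
    const unsigned end = n + 1u;  // wraps to 0 when n == UINT_MAX
    unsigned long long count = 0;
    for (unsigned i = 0; i != end; ++i) {
        if (++count == cap) break;
    }
    return count;
}
```

For n == 3 the loop runs 4 times, matching the <= n version. For n == UINT_MAX the bound wraps to 0 and the loop runs zero times, whereas the original i <= n form would never terminate because i <= UINT_MAX is always true. Neither form gives the intended UINT_MAX + 1 iterations.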

Right, but the same is true of signed as well.

We (as an industry) need to stop using -fomit-frame-pointer, at least by default. I'd be interested to see if there's any real-world workload (not a benchmark) where it makes even a measurable difference, let alone a significant one. The problem, of course, is that it destroys the ability to examine performance in production with tools like DTrace and the like. A one-time couple-of-percent improvement in some cases (which, again, would be surprising to see anyway) is not worth losing the ability to gain more performance improvements for the rest of the software's lifetime.

It was very beneficial on the register-starved x86, but I notice less impact on amd64.

I definitely also have a debug-mode that builds with -g and without -s -O3 -fomit-frame-pointer.

amd64 has enough registers that using one for the frame pointer isn't too bad.

On the other hand, some platforms don't need a frame pointer for debugging; if you emit correct unwind tables that's enough for the debugger to construct a backtrace. I haven't looked lately but am pretty sure amd64 ELF is one of them.

Also, modern gcc generates okay debug info for optimized programs such that it's much more likely you can read variables out of a crash in gdb.

> On the other hand, some platforms don't need a frame pointer for debugging; if you emit correct unwind tables that's enough for the debugger to construct a backtrace. I haven't looked lately but am pretty sure amd64 ELF is one of them.

Working with unwind tables is more complicated and many useful debugging tools don't do so.

It's good to have it at least in dev, but many bugs only show up in production, and performance work can only usefully be done on the shipping, optimized code.

> Clang is much more naggy than GCC. For instance, I frequently switch on boolean variables. Clang doesn't even have a "-Wno-" flag I can pass to temporarily disable this.

I don't know how responsive they are, but you might try filing a bug.

Why would you want to switch on a boolean statement?

EDIT: Ignore me. I see your explanation below.