by bokglobule 2943 days ago
Amen. IMO the root of all evil is compiler "optimization" of code. If you really, really need the wee bit of extra performance through optimization, hire an expert at ass'y language and optimize the bit that matters. Rarely is this the case, especially with business apps.

I've seen my fair share of bugs from the compiler incorrectly hoisting variables from loops, mis-using registers, etc. I prefer to turn off compiler optimization as one of the first steps in a new project, but that's me.


> If you really, really need the wee bit of extra performance through optimization, hire an expert at ass'y language and optimize the bit that matters. Rarely is this the case, especially with business apps.

If execution efficiency is not a concern and you're using C, then you're most likely using the wrong language. Especially for "business apps".

> If execution efficiency is not a concern and you're using C

Two things.

Thing one: Execution efficiency on superscalar processors is very unlike what you assume it to be. For instance, you assume alignment makes your program fast, when in fact it makes it slower due to cache pressure.

Thing two: On lower end processors what's important is not speed but code size.

Thing three: (off by one error natch) The amount of critical C code is tiny tiny tiny compared to the amount of C code where execution speed just does not matter because the code is almost never executed. The bleeding hot sections are often rewritten in assembly (encryption cough decryption)

> Execution efficiency on superscalar processors is very unlike what you assume it to be.

It's not. What do you think I "assume execution efficiency to be"?

> On lower end processors what's important is not speed but code size.

That's why I said "most likely", not "always". Besides, those tiny microprocessors typically don't run the "business apps" which bokglobule was talking about.

> The amount of critical C code is tiny tiny tiny compared to the amount of C code where execution speed just does not matter because the code is almost never executed.

Not true for most software. Sure, most applications spend most of their execution time in a small part of the code, but it's typically not so small that you could easily rewrite it in assembly.

> The bleeding hot sections are often rewritten in assembly (encryption cough decryption)

Cryptography kernels are written in assembly to ensure that they have a constant execution time to prevent timing attacks. Not so much for performance.
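To make the timing-attack point concrete, here is a minimal sketch of what "constant time" means at the C level. The function name `ct_equal` is hypothetical and not taken from any particular library. It compares two buffers without an early exit, so the loop does the same work wherever the first mismatch is. C itself makes no promise that a compiler will preserve this property, which is the argument for dropping to assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffers match, 0 otherwise.
   Every byte is always examined; there is no data-dependent early return. */
static int ct_equal(const unsigned char *a, const unsigned char *b, size_t len) {
    assert(len == 0 || (a != NULL && b != NULL));
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++) {
        /* Accumulate differences instead of branching on them. */
        diff |= (unsigned char)(a[i] ^ b[i]);
    }
    return diff == 0;
}
```

A plain `memcmp` is allowed to stop at the first differing byte, which is what can leak how much of a secret an attacker has guessed correctly.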

> I "assume execution efficiency to be"?

You assume that modern high-performance processors execute instructions. They do not: they analyze, optimize, and emit their own internal instruction stream and then execute that. That extra step is what makes fiddly UB-type 'optimizations' worthless. And since you don't know what processor is going to execute the code and how it's optimized, the speed gains are basically 'noise'.

> Besides, those tiny microprocessors typically don't run the "business apps" which bokglobule was talking about.

"Business apps" are usually network- and I/O-bound more than anything.

> Not true for most software.

Dan Bernstein says you're wrong.

> Cryptography kernels are written in assembly to ensure that they have a constant execution time to prevent timing attacks. Not so much for performance.

Dan Bernstein says you're wrong here too. Performance is everything with encryption. Consider video streaming over an encrypted connection. Yeah, that.

> You assume that modern high performance processors execute instructions.

Yeah, stop making stuff up about me.

> they analyze, optimize, and emit their own internal instruction stream and then execute that.

I know how an out-of-order, superscalar processor works. I also know that they can't do magic, because their optimizations are severely limited by time and scope constraints (although they do have access to some information that isn't available at compile time).

> That extra step is what makes fiddly UB type 'optimizations' worthless.

I'm not arguing for optimizations based on UB. This subthread is about compiler optimizations in general, which bokglobule claimed to be nearly always unnecessary.

> Dan Bernstein says you're wrong.

Do you have a study you can cite? Do you think just dropping a name will convince anyone?

> Performance is everything with encryption.

I never claimed it wasn't. I claimed that performance isn't the reason why cryptography kernels are rewritten in assembly, and that's because C + optimizing compiler is already fast enough and the small performance gain alone doesn't justify the switch to assembly.

You can care enough about efficiency to use C, and still not care about the last 0.1% of efficiency. You can especially not care about the last 0.1% of efficiency in 99% of your code.

If you're compiling C with -O0 (as OP implies)... it's not just the last 0.1% of efficiency you're missing out on. Modern C compilers generate really, really crappy code at -O0, and you're looking at a 10x to 20x slowdown by forgoing optimizations. At those slowdowns, using an interpreted language that lacks a JIT starts to look competitive in performance.

Which is why it's so important to have ultra-clear guidelines (have you looked at the clang documentation or the GCC man pages?) on how to disable optimizations that can be dangerous or unpredictable (strict aliasing comes to mind) "within" -O1, -O2, or -Os.

One shouldn't need to throw the baby out with the bathwater (-O0) in order to get some semblance of semantics that won't pull the rug out from under your feet when you're not looking.

In practice, what these requests tend to boil down to is requests for the compiler to read the programmer's mind (strict aliasing is pretty much the biggest exception). What optimizations do you think are "dangerous/unpredictable"? Demanding that things like traps happen predictably means that you heavily constrain the ability to do dead-code elimination (can't eliminate code that could trap!) or loop-invariant code motion, two of the biggest performance wins, especially for things that could trap such as memory loads and stores, which are the code you most want to avoid whenever possible for performance.
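As a rough illustration of loop-invariant code motion, here is the transformation written out by hand. Both function names are made up for this sketch. In the first version the load of `*scale` sits inside the loop as written. The second shows the hoisted form. The `n == 0` guard matters: without it, the hoisted version would perform a load (and possibly trap) in a case where the original never touches `*scale` at all.

```c
#include <assert.h>
#include <stddef.h>

/* As written: *scale is (nominally) loaded on every iteration. */
static long scaled_sum(const int *values, size_t n, const int *scale) {
    assert(n == 0 || (values != NULL && scale != NULL));
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += (long)values[i] * *scale;
    }
    return total;
}

/* Hand-hoisted: the invariant load is done once, before the loop. */
static long scaled_sum_hoisted(const int *values, size_t n, const int *scale) {
    assert(n == 0 || (values != NULL && scale != NULL));
    if (n == 0) {
        return 0; /* the original never reads *scale in this case */
    }
    const long s = *scale;
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += (long)values[i] * s;
    }
    return total;
}
```

The stricter the rules about exactly where a potential trap must occur, the fewer places an optimizer can legally move a load like this one.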

Undefined behavior essentially says that compilers don't have to care about what happens in the cases that would constitute undefined behavior. This doesn't manifest in the compiler as if (undefined_behavior()) { destroy_users_code(); }, contrary to popular opinion. It instead tends to manifest as logic like "along this control-flow path, this condition is true, so we can now thread the jump from block A to block C since you're redundantly checking a known-true condition" and only after unwrapping several layers of computed assumptions do you find the "we assumed overflow cannot occur" at the bottom.
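A small sketch of the kind of "known-true condition" described above, using signed overflow. The function names are hypothetical. The first check only works if `a + 1` wraps around. Because signed overflow is undefined, an optimizer may reason that `a + 1 < a` can never be true and fold the whole function to `return false`. The second check asks the same question without ever overflowing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Fragile: relies on wraparound that the language does not define.
   A compiler may legally reduce this to "return false". */
static bool increment_overflows_fragile(int a) {
    return a + 1 < a;
}

/* Robust: compares against the limit before doing any arithmetic. */
static bool increment_overflows(int a) {
    return a == INT_MAX;
}

/* Example use: the precondition is checked without invoking overflow. */
static int checked_increment(int a) {
    assert(!increment_overflows(a));
    return a + 1;
}
```

Nothing in the fragile version looks like "exploit UB here" to the compiler; it is ordinary comparison folding applied under the assumption that overflow does not happen.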

Exactly this.

There seems to be some idea that there is some "evil pass" in the compilers which looks for undefined behavior and then maliciously optimizes surrounding code to do something the programmer didn't expect.

Not at all. Even all the examples of UB leading to unexpected optimizations usually involve a chain of events, with very straightforward and necessary optimizations being involved, like value propagation, inlining, dead-code elimination, etc - and these aren't inherently related to "exploiting UB". You could (in some compilers) disable one of the things in the chain and perhaps avoid the problem for that example, but you'd also hurt a lot of other code that relied on the optimization.

That's one of the problems with people asking for specific small snippets of code where the UB-related transformation produced a big gain: you can certainly find these (but they'll often be picked apart) - but the bigger problem is not with the specific examples: it's that whatever optimization you disable to make the small example work as you'd expect might then produce worse code across your application.

So people who want to disable the optimization that does "that" are often incorrectly assuming there is a small simple optimization which leads to "that" in the first place.

Still, I definitely agree that the situation regarding UB is depressing in many respects. Many of the decisions made by the C committee in the past haven't aged well. If you take a look at the low-level optimizations afforded by the largely-deterministic Java specification, they are largely at the same level as C - but Java had the benefit of coming along a couple of decades later where many open questions in C's time, such as integer overflow behavior, integer sizes, shift behavior, pointer models, etc, had largely been resolved. Platforms that don't conform to the JVM's model of an ideal machine will just have to generate slow code in some cases.

I'm not requesting the compiler to read my mind, I'm just asking for dead obvious and simple guidelines that allow me to perform a cost-benefit analysis and tune the compiler's behavior to what I consider acceptable.

Examples:

I don't care at all about optimizations that are a result of treating signed integer overflow as undefined behavior. I'll go for predictable, deterministic behavior every single time.
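For what it's worth, GCC and Clang both accept `-fwrapv`, which makes signed overflow wrap. Where a flag isn't an option, wrapping can also be spelled out in portable C by doing the arithmetic in `unsigned`. The helper name below is made up for this sketch, and it assumes `UINT_MAX == 2 * INT_MAX + 1`, which the code asserts.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: two's-complement wrapping addition with no UB. */
static int wrapping_add(int a, int b) {
    /* Assumption of this sketch: unsigned has exactly one more value bit. */
    assert(UINT_MAX == 2u * (unsigned)INT_MAX + 1u);

    /* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
    unsigned sum = (unsigned)a + (unsigned)b;

    /* Map the result back to int without an out-of-range conversion. */
    if (sum <= (unsigned)INT_MAX) {
        return (int)sum;
    }
    return -(int)(UINT_MAX - sum) - 1;
}
```

The conversion back is written out longhand because converting an out-of-range unsigned value to `int` is implementation-defined in older C standards.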

Same for strict aliasing rules. I find it absolutely insane and mind boggling that -O2 enables strict aliasing amongst who knows what else (50+ other flags). Why can't the impact of these optimization flags be easier to deduce? Why do I feel like I need to be a compiler developer just to get some measure of confidence in what the optimizers are doing? It's insane that there are people who treat this sort of unwarranted, dangerous complexity as a rite of passage and don't push for something that's better. Most importantly, it's terrifying that a significant chunk of C programmers _are not even aware of these issues_.
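For readers who haven't run into it, a minimal sketch of the aliasing issue. GCC and Clang both accept `-fno-strict-aliasing` to turn the assumption off. The other route is to avoid the cast entirely and copy the bytes. The function name is hypothetical, and the sketch assumes a 32-bit `float` (the test values additionally assume IEEE 754).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads the bit pattern of a float.
   The tempting version, *(uint32_t *)&f, accesses a float object through
   an incompatible pointer type, which strict aliasing lets the optimizer
   assume never happens. memcpy has no such problem. */
static uint32_t float_bits(float f) {
    assert(sizeof(float) == sizeof(uint32_t)); /* assumption of this sketch */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Optimizing compilers typically turn a fixed-size `memcpy` like this into a plain register move, so the safe form need not cost anything once optimizations are on.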

C will stick around for decades and will continue to be picked up by newcomers. It doesn't have to stay as dangerous and uncompromising as it is now.

EDIT: There is some progress with things like ubsan and asan but alas they're fairly limited platform-wise. What's worrying is that there hasn't been a clear shift in the mentality of those who control the language as the OP indicates in his report.

Every once in a while I have to compile a kernel without optimizations. The performance penalty you pay is nowhere near 0.1%; it is very noticeable at first glance. There might be projects where even that does not matter, but I would not think those are the majority.

An order of magnitude slower on non-trivial computation-heavy code (i.e., not just doing a lot of IO or calls into libraries/the kernel) is a pretty good rule of thumb. Sometimes better, sometimes much worse.

The more layers of abstraction, the worse the penalty for not optimizing, so C++ is usually more heavily affected than C, for example: many of the so called "zero-cost" abstractions in C++ rely on a good optimizer.

C is nowadays foremost a system programming language. Try compiling your operating system with -O0 and see if you still don’t need those optimizations. You’d be surprised.

there are tenuous, difficult-to-reason-about optimizations, and then there's removing all of the completely useless loads and stores of intermediate results that occur in the default simplistic translation.

i might be with you on the former, but that first pass really cuts down on executable (icache) size, nets a huge performance win and actually makes the generated code a lot easier to read.

some of those analyses can get pretty involved. but i don't think we should be discouraging people from investigating them - or from providing nice safe defaults and flags to turn them on.

one other speed-independent factor here is that like you I usually start without any optimizations. and usually when I turn them on I find a few bugs in my own code right away. that's pretty helpful.

If the project will eventually be compiled with optimizations in production, it's probably better to hit those problems earlier than later.