Hacker News
by dataflow 1915 days ago
> my personal rule of thumb is that my software must be useable at -O0 with address sanitizers on my desktop

The trouble with this criterion is that it fundamentally alters the language from the ground up: it forces you to optimize the source code structure for this too, not just run-time performance. Specifically, one of the core strengths of C++ is that no matter how many (practical) levels of wrapping and forwarding you do, as long as they're simple, they can generally all get flattened and go away with optimizations like inlining. But if you don't enable optimizations, now every indirection in your source code will cost you, even absolutely trivial things like std::move() or std::forward() that should be 100% free. This obviously hampers your ability to design good C++ abstractions and basically turns C++ into a different language (like JavaScript or Python). It seems rather suboptimal. (Do you not encounter these issues in your particular application?)

What I would probably prefer in your situation is to change the criteria somewhat: keep ASAN and enable some debug-mode facilities (like _ITERATOR_DEBUG_LEVEL=1 for MSVC), but also enable some optimizations for inlining and such so that you don't fundamentally alter the language like this. And/or you can just slow down your CPU when testing (in Windows you can just set the max CPU speed in Advanced Power Options).

2 comments

Presumably they still optimize and write for -O3; they just run a far slower version.

Without any manual optimization targeting -O0.

(The main downside is that a performance degradation that appears at -O3 but not at -O0 may be harder to notice.)

> it forces you to optimize the source code structure for this too,

I thought that it would, but on my dev machine (a Broadwell 6900K, still pretty good but definitely not top of the line) I actually have to push it a fair bit for this to be an issue (which is why it is important to do it! Low-power computers are really low-power compared to that), so this question definitely does not come up during design (which in my case is generally very template-y and subject to the issues you mention). For reference, the app in question is https://ossia.io

The cases where doing this led to changes in code were more along the lines of "welp, looks like this algorithm I implemented for rendering waveforms is damn inefficient", "gonna have to think if I can redraw this widget less", "I should really cache the results of this computation", etc.

Interesting, I guess it depends on your application. :-) You made me go back and double-check this on an actual program I had; here's what it is as a comparison point:

So I have an application in front of me right now that I've already optimized the heck out of (and it's as close to single-pass as can be), and turning off optimizations in release mode makes a basic 0.27-second task take 2.4 seconds... almost an order of magnitude difference.

And when I try to break into the code to see where it stops, it's almost always within traditionally-very-cheap operations like std::vector::emplace_back:

  1 std::vector::emplace_back
  2 std::vector::_Emplace_back_with_unused_capacity
  3 std::_Default_allocator_traits::construct
  4 T::T
  5 U::V::w
and std::lower_bound:

  1 std::lower_bound
  2 std::lower_bound
  3 std::_Seek_wrapped
  4 std::_Vector_const_iterator::_Seek_to
which have suddenly become incredibly expensive due to lack of optimizations like inlining. And notice this is all in the standard library, not within my own (template-light) code.

Going from 0.27 seconds (near-instantaneous for the user) to 2.4 seconds (a huge lag) is enough to make the program incredibly frustrating. Whether it's still "usable" at that point I guess is a matter of debate (some devs just put up with any amount of lag you throw at them!), but I feel pretty safe in saying the task I'm trying to accomplish simply would not be possible without optimizations.

So I'm guessing your performance targets & constraints are quite different, and that's probably why this isn't such a big deal in your case.