by rmcclellan 2731 days ago
I work using C++17 for high performance applications, and I can relate to a lot of these gripes. I think it's a fair point that C++ is unreasonably complex as a language, and it's been a serious problem in the community for a long time.

One part that really struck me as odd is the focus on non-optimized performance. To me, this is an important consideration, but not nearly as important as optimized performance. Using techniques like ranges can definitely slow down debug performance, but much of the time it _dramatically increases_ optimized performance vs. naive techniques.

How do ranges speed up optimized builds? One of the best techniques for very high performance code is separating the specification of the algorithm from the scheduling of the computation. By this I mean libraries like [Eigen](http://eigen.tuxfamily.org/index.php?title=Main_Page) and [Halide](http://halide-lang.org), where you control _what_ gets done and _how_ it gets done separately. Being able to modify execution order like this is critical for making efficient use of your single-core parallelism and cache space. This sort of control is exactly what you get out of range view transformers.
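To make the "what" versus "how" split concrete, here is a minimal C++17 sketch. It is not the range-v3, Eigen, or Halide API; every name in it (`VecView`, `MapView`, `map_view`, `sum_all`, `sum_blocked`) is made up for illustration. The pipeline describes the computation lazily, and two different traversal functions decide the execution order without touching the pipeline.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// All names here are hypothetical; this only illustrates the idea of
// lazy views, not any real library's interface.

// Non-owning view over a vector of doubles.
struct VecView {
    const double* data;
    std::size_t n;
    std::size_t size() const { return n; }
    double operator[](std::size_t i) const { return data[i]; }
};

inline VecView view_of(const std::vector<double>& v) {
    return VecView{v.data(), v.size()};
}

// The "what": a lazy element-wise transform. Nothing is computed until
// an element is requested, and no temporary buffer is allocated.
template <typename Source, typename Fn>
struct MapView {
    Source source;  // views are cheap to copy, so store by value
    Fn fn;
    std::size_t size() const { return source.size(); }
    double operator[](std::size_t i) const { return fn(source[i]); }
};

template <typename Source, typename Fn>
MapView<Source, Fn> map_view(Source source, Fn fn) {
    return MapView<Source, Fn>{std::move(source), std::move(fn)};
}

// The "how", schedule 1: a single linear pass over any view.
template <typename View>
double sum_all(const View& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// The "how", schedule 2: the same computation walked in fixed-size blocks.
template <typename View>
double sum_blocked(const View& v, std::size_t block) {
    assert(block > 0);
    double total = 0.0;
    for (std::size_t start = 0; start < v.size(); start += block) {
        const std::size_t end =
            start + block < v.size() ? start + block : v.size();
        double partial = 0.0;
        for (std::size_t i = start; i < end; ++i) partial += v[i];
        total += partial;
    }
    return total;
}
```

Building `map_view(map_view(view_of(xs), square), add_one)` and passing it to either `sum_all` or `sum_blocked` evaluates both transforms in one fused pass per element. Whether this beats a hand-written loop in an optimized build depends on the compiler inlining the view calls, which is also why an unoptimized build of this style of code tends to be slow.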

3 comments

> I work using C++17 for high performance applications

> One part that really struck me as odd is the focus on non-optimized performance

I'm guessing your high performance applications aren't interactive? When your application has to respond to user input in real time, a binary that is 100x slower than real time is completely useless. You can't play a game at 0.3 frames per second.

I would be interested in seeing an example of how Halide-like techniques can be used with C++ ranges. I am skeptical that you could get the kind of performance improvements that Halide can achieve. And of course you won't get the GPU or DSP support that is really useful for that kind of computation.

This is what RelWithDebInfo builds are for.

Don't make my Debug build into a RelWithDebInfo build or it makes it a huge pain to track down subtle bugs/errors in non-performance-critical unit tests.

This is dealt with in the article - debugging optimised code is a pain, even when you know what you're doing. The source-level debugging often doesn't work, variable watches often don't work (and this even though DWARF has a whole pile of features specifically so that this stuff can work...), and debugging at the assembly language level is a chore.

GCC has -Og (optimize without harming debugging), which is supposed to avoid these problems.

Well, you aren’t rebuilding your binary every frame, are you? I might be missing something.

Also, I think build time is super important in most contexts - what I think is less important is runtime speed when you’ve disabled all optimizations.

It’s not the speed of compilation. It’s the speed at which the program runs in a debug build. So runtime speed.

And for games you need decent runtime speed. If you cannot run your game in a debug build, you are stuck with good old printf debugging. And yes, if you cannot actually play the game (as in over 10fps), you effectively cannot run it in a debug build.

Are you confusing build time with performance of the resulting binary? I'm talking about the latter. Both are important and both are lacking with modern C++ in debug mode.

Edit: I see, I carelessly used the word "build" to mean a compiled binary, which was ambiguous. I've changed it.

Thanks for the clarification. I guess like all trade-offs, it's context dependent. I see the advantages of having a realtime usable non-optimized build for debugging. Since I use modern libraries like Eigen, that option has not been available to me for some time.

With "modern" techniques, the performance ceiling is a bit higher - whether that benefit is worth it depends on a lot of factors.

If you're doing Linear Algebra, you're kind of in a C++ sweet-spot, I think.

In particular, you can always debug a tiny version of whatever problem you're trying to solve, so you don't really care that much about non-optimized performance, and a lot of times you're willing to eat a long compile time if it means you squeeze out that last couple percent. Conversely, you care a lot about cache micro-optimizations and talking to GPUs and stuff like that, and generally you want to be just banging on some piece of memory you got from the OS, all things that non-C++ languages make extraordinarily painful.

Even Fortran, which the haters were trying to push as "just better" than C++ for linear algebra, has really disappointed me.

> Using techniques like ranges [...] _dramatically increases_ optimized performance vs. naive techniques.

This claim will require some evidence. In my experience, it's extremely common for novice engineers to trade orders of magnitude in build time overhead chasing negligible runtime performance improvements.
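For anyone who wants to gather that evidence, a rough sketch of a measurement harness follows. The function names (`sum_of_squares_two_pass`, `sum_of_squares_fused`, `time_ms`) are made up for illustration, and the workload is an assumption, not something taken from the article. It compares a version that materializes a temporary with a fused single pass, and makes no claim about which one wins on a given compiler or flag set.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical workload: sum of squares, written two ways.

// Naive version: materializes a temporary vector, then sums it.
double sum_of_squares_two_pass(const std::vector<double>& xs) {
    std::vector<double> squares(xs.size());
    for (std::size_t i = 0; i < xs.size(); ++i) squares[i] = xs[i] * xs[i];
    double total = 0.0;
    for (double s : squares) total += s;
    return total;
}

// Fused version: one pass, no temporary.
double sum_of_squares_fused(const std::vector<double>& xs) {
    double total = 0.0;
    for (double x : xs) total += x * x;
    return total;
}

// Runs fn `repeats` times and returns the best wall-clock time in
// milliseconds. The caller should store fn's result somewhere observable
// (for example a volatile variable) so the optimizer cannot drop the work.
template <typename Fn>
double time_ms(Fn fn, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}
```

Timing both functions on the same large input, once in an unoptimized build and once in an optimized one, and recording build time alongside, would give numbers for both sides of this argument.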