| HN Mirror



	by CyberRabbi 1152 days ago
> Given an optimizing compiler, the first function (count(a)) is likely to just immediately return the size of the backing vector. The function is nearly free.

The compiler is able to do that with count_inheritance() as well if it's able to prove which implementation of iter_base is used in the call. I suppose even many experienced C++ developers are not aware of this. This optimization is known as "devirtualization" and is fairly well-implemented in Clang and GCC. It's even more effective since the advent of LTO. Some more info:

https://quuxplusone.github.io/blog/2021/02/15/devirtualizati...

https://blog.llvm.org/2017/03/devirtualization-in-llvm-and-c...
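To make the claim concrete, here is a minimal sketch of the two functions. Only the names count, count_inheritance and iter_base come from the quoted text; the method names has_next/next, the vector_iter class and the count_known_type helper are assumptions made up for illustration, not the article's actual code. Whether a given compiler really devirtualizes the last function depends on the compiler, flags and inlining decisions, so treat the comments as what is possible rather than what is promised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interface: only the name iter_base comes from the thread.
struct iter_base {
  virtual ~iter_base() = default;
  virtual bool has_next() const = 0;
  virtual int next() = 0;
};

// Hypothetical concrete iterator over a vector of ints.
struct vector_iter final : iter_base {
  explicit vector_iter(const std::vector<int>& v) : v_(v) {}
  bool has_next() const override { return pos_ < v_.size(); }
  int next() override { return v_[pos_++]; }

 private:
  const std::vector<int>& v_;
  std::size_t pos_ = 0;
};

// Non-virtual version: an optimizer can usually reduce this to a.size().
inline std::size_t count(const std::vector<int>& a) {
  std::size_t n = 0;
  for (int x : a) {
    (void)x;  // the element value is irrelevant, we only count
    ++n;
  }
  return n;
}

// Virtual version: every step goes through iter_base's virtual functions
// unless the compiler can prove the dynamic type of `it`.
inline std::size_t count_inheritance(iter_base& it) {
  std::size_t n = 0;
  while (it.has_next()) {
    it.next();
    ++n;
  }
  return n;
}

// Here the dynamic type is visible at the call site, so once
// count_inheritance is inlined the virtual calls can become direct calls.
inline std::size_t count_known_type(const std::vector<int>& a) {
  vector_iter it(a);
  return count_inheritance(it);
}

// Usage: both paths must agree on the result.
inline void demo_counts() {
  std::vector<int> a{1, 2, 3};
  assert(count(a) == 3);
  assert(count_known_type(a) == 3);
}
```

Comparing the generated assembly for count and count_known_type at -O2 is one way to check what a particular compiler actually does.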


munificent 1152 days ago

That's true, but devirtualization optimizations tend to be pretty brittle, and it's very easy to fall off the optimizer's blessed path and end up back at a virtual call without realizing it.

Worse, once the devirtualization optimization has failed, any further optimizations you would get from inlining the call will also fail.

If you're programming in C++, you probably do care about this level of performance, and in that case, it's nice to program in a style that guarantees it instead of hoping for a sufficiently smart compiler.
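As a rough sketch of what such a style could look like, the iterator type can be made a template parameter so the call targets are fixed at compile time. The names plain_vector_iter, count_static, has_next and next are hypothetical, chosen only for this example. Note the limits of the claim: the language still does not promise inlining, but there is no virtual dispatch left for the optimizer to remove in the first place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator with no virtual functions and no base class.
struct plain_vector_iter {
  explicit plain_vector_iter(const std::vector<int>& v) : v_(v) {}
  bool has_next() const { return pos_ < v_.size(); }
  int next() { return v_[pos_++]; }

 private:
  const std::vector<int>& v_;
  std::size_t pos_ = 0;
};

// Iter is a template parameter, so it.has_next() and it.next() name
// concrete member functions when the template is instantiated.
// There is no vtable lookup that devirtualization would have to undo.
template <typename Iter>
std::size_t count_static(Iter it) {
  std::size_t n = 0;
  while (it.has_next()) {
    it.next();
    ++n;
  }
  return n;
}

// Usage: the iterator type is deduced from the argument.
inline void demo_static() {
  std::vector<int> a{7, 8, 9};
  assert(count_static(plain_vector_iter(a)) == 3);
}
```

The trade-off is the usual one for templates: each iterator type gets its own instantiation of count_static, and the function body has to be visible wherever it is used.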


kaba0 1151 days ago

Unless you are in a hot loop (where you may not use virtual methods to begin with), I don’t think that performance difference is significant. Virtual calls have a slight overhead, but it is far from serious. Similarly, not inlining something that you call only once is not the end of the world.


munificent 1151 days ago

The problem with not inlining is less with the overhead of the function call itself, and more the loss of further optimization opportunities. Consider this (trivial) example:

    // Defined before main() so the call below compiles.
    int foo() {
      return 5;
    }

    int main() {
      int x = foo() + 3;
    }

Without inlining you have both the overhead of the call and the arithmetic addition. If you can inline the call then you get:

    int main() {
      int x = 5 + 3;
    }

But more importantly, the optimizer can now eliminate the addition as well:

    int main() {
      int x = 8;
    }

This is obviously a trivial example, but in real-world code, the optimization options opened up after inlining are important.


CyberRabbi 1152 days ago

> If you're programming in C++, you probably do care about this level of performance, and in that case, it's nice to program in a style that guarantees it instead of hoping for a sufficiently smart compiler.

Neither implementation guarantees any particular sequence of assembly instructions. Both require hoping that a sufficiently smart compiler will compile it to a sufficiently optimal sequence of instructions.


munificent 1152 days ago

Yes, in principle a compiler is free to generate arbitrarily horrendous code regardless of what you ask it to do.

In practice, non-virtual function calls are reliably compiled to fairly efficient code while virtual calls are much less reliable.


CyberRabbi 1152 days ago

> In practice, non-virtual function calls are reliably compiled to fairly efficient code while virtual calls are much less reliable.

Like I said, this echoes the conventional wisdom that most C++ developers seem to retain. The compiler landscape has changed since that wisdom was formed, since the advent of LTO and devirtualization optimizations.


asdfws 1152 days ago

So has programmer methodology. Throwing virtual calls everywhere is a relic of the past from the era of OOP fetishism. If you actually have statically verifiable leaves for virtual calls, then you didn't even need a virtual call to begin with. It's code slop.


i-use-nixos-btw 1152 days ago

Strongly agree on the philosophy - but alas my experience with this specific case has not been great.

I’ve seen it simplify patches of code here and there, and that does apply to the trivial examples given in the post this topic links to - a function call involving a known child class. But add some basic real-world complexity and it quickly gets too complex for the optimiser to prove that it knows for sure what child class it is dealing with.
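A small sketch of the difference, using hypothetical names (base, child_a, child_b, make_child) that do not come from the linked post. In the first function the child class is known at the call site. In the second it depends on a run-time value, so there is no single target for the optimizer to prove. Some compilers may still guess a likely target and guard it with a check, so this illustrates the general shape of the problem, not what any specific compiler emits.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class hierarchy for illustration only.
struct base {
  virtual ~base() = default;
  virtual int value() const = 0;
};

struct child_a final : base {
  int value() const override { return 1; }
};

struct child_b final : base {
  int value() const override { return 2; }
};

// Known child class: obj is visibly a child_a, so ref.value() can be
// resolved to child_a::value at compile time.
inline int known_type() {
  child_a obj;
  const base& ref = obj;
  return ref.value();
}

// The concrete class is picked by a run-time flag.
inline std::unique_ptr<base> make_child(bool pick_a) {
  if (pick_a) return std::make_unique<child_a>();
  return std::make_unique<child_b>();
}

// Unknown child class: the call target depends on pick_a, so the call
// through the base pointer generally cannot be proven to have one target.
inline int unknown_type(bool pick_a) {
  return make_child(pick_a)->value();
}

// Usage: both forms return the right value; only the dispatch differs.
inline void demo_types() {
  assert(known_type() == 1);
  assert(unknown_type(false) == 2);
}
```

Real code tends to look like the second function, with the object created in one translation unit, stored in a container, and used somewhere else entirely.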
