	by dataangel 1302 days ago
	They’re nowhere near free. Branch prediction table has finite entries, instruction cache has finite size, autovectorizing is broken by bounds checks, inlining (the most important optimization) doesn’t trigger if functions are too big because of the added bounds checking code, etc. This is just not great benchmarking — no effort to control for noise.

5 comments

dahfizz 1302 days ago

> autovectorizing is broken by bounds checks

This is the big one. You pay a 50% penalty for actual CPU-bound, iteration-heavy code with bounds checking enabled.

https://github.com/matklad/bounds-check-cost


zozbot234 1301 days ago

The proper way of addressing that is to manually hoist bound checks out of "hot" loops. Not just remove them altogether.
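A minimal sketch of what manual hoisting can look like in Rust. The function names are hypothetical, and whether the optimizer already removes the per-iteration check in the first version depends on the compiler version and optimization level:

```rust
/// Indexes inside the loop: each `data[i]` carries its own bounds check
/// unless the optimizer can prove `i < data.len()`.
fn sum_prefix_indexed(data: &[u32], n: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Hoisted version: one bounds check when the sub-slice is taken
/// (it panics if `n > data.len()`), then the loop iterates over a
/// slice whose length is known, so no per-element check is needed.
fn sum_prefix_hoisted(data: &[u32], n: usize) -> u32 {
    let head = &data[..n];
    let mut total = 0u32;
    for &x in head {
        total = total.wrapping_add(x);
    }
    total
}
```

Both functions still panic on an out-of-range `n`, so the check is moved rather than removed.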


camkego 1301 days ago

This should be the article.

Running this with 1.65 on an Intel 12400 gets a nearly 4x speedup when bounds checking is not needed. Just wow.

Avoiding bounds checks is important when they become a significant chunk of your hot path.


moloch-hai 1302 days ago

For real programs, you should demand that the compiler hoist such checks out of the loop, which may then be vectorized the usual way.

If the compiler can't do that by itself, a library should do it.

The real issue is whether the information about the true size of the memory region involved is available at the point where it is needed. This may come down to how good the language is at capturing desired semantics in a library. Rust still has a long way to go to catch up with C++ on this axis, and C++ is not waiting around.

Rust claims responsibility for enforcing safety in the compiler, with libraries using "unsafe" to delegate some of that to themselves. Users then trust the compiler and libraries to get it right. In C++, the compiler provides base semantics while libraries take up the whole responsibility for safety. Users can trust libraries similarly as in Rust, to similar effect.

Modern C++ code typically does no visible operations with pointers at all, and most often does not index directly in arrays, preferring range notation, as in Rust, achieving correctness by construction. A correct program is implicitly a safe program.
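For the Rust side of that comparison, a small sketch of iteration without explicit indexing. `dot_zip` is a hypothetical name, and wrapping integer arithmetic is an assumption made only to keep the example free of overflow panics:

```rust
/// Dot product written with iterators instead of indices.
/// `zip` stops at the shorter slice, so there is no index that
/// could go out of range and no per-element bounds check to emit.
fn dot_zip(a: &[u32], b: &[u32]) -> u32 {
    a.iter()
        .zip(b.iter())
        .fold(0u32, |acc, (x, y)| acc.wrapping_add(x.wrapping_mul(*y)))
}
```

Note that silently truncating to the shorter input is a design choice; a caller that requires equal lengths would need to check that separately.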


varajelle 1301 days ago

> This may come down to how good the language is at capturing desired semantics in a library. Rust still has a long way to go to catch up with C++ on this axis, and C++ is not waiting around.

What catching up does Rust need to do?

Rust has slices that know the size of their data built into the language, while C++ doesn't. And Rust has stricter const and mutability rules that facilitate optimizations.

As for the implementation, Rust uses LLVM, which is also the backend used by one of the popular C++ compilers.


moloch-hai 1301 days ago

I am talking about language features that library authors can use to capture and express semantics in their libraries... but only if the language implements those features. C++ just has a lot more of them.


xyzzyz 1301 days ago

Like what, for example? To the contrary, I think that, other than constness, C++ has rather few facilities to communicate semantic invariants to the compiler.


varajelle 1301 days ago

And even const can't in general be used for optimizations (because there can be another reference to the same location, or one can just const_cast).


grogers 1301 days ago

If the thread you are on doesn't modify the variable (e.g. by const_cast), and that variable isn't atomic or volatile, the compiler should be allowed to treat it as invariant. Whether it does in practice probably depends on a lot of things though.


nextaccountic 1301 days ago

> For real programs, you should demand that the compiler hoist such checks out of the loop, which may then be vectorized the usual way.

LLVM sometimes does this, but when it doesn't, you may insert asserts to guide the optimizer, as explained here https://news.ycombinator.com/item?id=33808853

I think this technique works in C and C++ too (if you use clang or gcc).
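A rough Rust sketch of the assert technique. The function name is hypothetical, and nothing guarantees the optimizer will use the hint; it only gives it a fact it can rely on:

```rust
/// A single up-front assert states the relationship between `n` and
/// both slice lengths. After it, the optimizer is free to conclude
/// that `a[i]` and `b[i]` are in range for every `i < n`, which may
/// let it drop the per-iteration checks and vectorize the loop.
fn dot_with_assert(a: &[u32], b: &[u32], n: usize) -> u32 {
    assert!(n <= a.len() && n <= b.len());
    let mut acc = 0u32;
    for i in 0..n {
        acc = acc.wrapping_add(a[i].wrapping_mul(b[i]));
    }
    acc
}
```

The assert is a real runtime check, so the function stays safe; inspecting the generated assembly is the only way to confirm what the compiler actually did with it.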


moloch-hai 1301 days ago

Sometimes a __builtin_assert(c) may help (which is not the same as the normal assertion, which won't). Other times, you need to make a private copy of a value that the compiler could not otherwise assume will not be clobbered.


pjmlp 1301 days ago

Unfortunately I only see Modern C++ in C++ conference talks and in my hobby projects.

Most of the stuff I see at work is quite far from this ideal reality, starting with Android's codebase, or the various ways C++ gets used in Microsoft frameworks.


moloch-hai 1300 days ago

There are choices for places to work. Maybe try another one?


kelnos 1301 days ago

At the risk of moving the goalposts: so what? The vast majority of applications running out there would not be impacted meaningfully in the least by taking that performance hit.

Bounds checking should be the default. Only when someone has shown through benchmarking and profiling that it's actually a problem for their application should they even consider turning it off.


imtringued 1301 days ago

Bounds checks are the easiest type of code to branch predict. You just assume they never trigger, and suddenly you have a 99.99% hit rate on them. When they do trigger, you don't care about the branch misprediction at all, because the program is already busted and security is more important.


TylerE 1302 days ago

Yeah, I didn’t find it compelling either.

If your conclusion is “no signal, just noise”, boost the input until the signal becomes apparent. If that means writing such a massive loop that the program takes an hour to run, fine.
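One way to scale the workload up, sketched in Rust. The names `sum_indexed` and `time_repeated` are hypothetical, and this is a bare timing loop rather than a proper benchmark harness:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// The code under test: an indexed loop over a slice.
fn sum_indexed(data: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..data.len() {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Runs the workload `reps` times and returns a checksum plus the
/// elapsed wall-clock time. `black_box` discourages the compiler from
/// hoisting or deleting the repeated calls; the checksum keeps the
/// results observably used.
fn time_repeated(data: &[u64], reps: u32) -> (u64, Duration) {
    let start = Instant::now();
    let mut checksum = 0u64;
    for _ in 0..reps {
        checksum = checksum.wrapping_add(sum_indexed(black_box(data)));
    }
    (checksum, start.elapsed())
}
```

Raising `reps` or the input length until the measured difference clearly exceeds run-to-run variation is the idea; a dedicated benchmarking library would additionally handle warm-up and statistics.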
