Hacker News
by killingtime74 1302 days ago
Can someone smarter than me enlighten me when you would consider disabling bounds checking for performance? In ways the compiler is not already doing so? The article starts with a bug that would have been prevented by bounds checking. It's like talking about how much faster a car would go if it didn't have to carry the extra weight of brakes.
10 comments

I think the point of the article is the other way around: when starting from a language like C that doesn't have bounds checking, moving to Rust will involve adding bounds checks, and then an argument will be made that this will regress performance. So to test that hypothesis you start with the safe Rust code, and then remove the bounds checks to emulate what the C code might be like. If, as in the article, you find that performance is not really affected, then it makes a C-to-Rust migration argument more compelling.
Making the migration in order to find out if it was worth it appears to be quite an expensive test of a hypothesis.
What's the alternative, really?
Sometimes the extra speed can be relevant. Knowing what the upside _can_ be can help inform the choice whether it's worth it.

Secondly, even assuming you want runtime bounds checking everywhere, this is still a useful analysis, because if you learn that bounds checking has no relevant overhead - great! No need to look at that if you need to optimize. But if you learn that it _does_ have an overhead, then you have the knowledge to guide your next choices - is it enough to be worth spending any attention on? If you want the safety, perhaps there are specific code paths you can restructure to make it easier for the compiler to elide the checks, or for the branch predictor to make them cheaper? Perhaps you can do fewer indexing operations altogether? Or perhaps there's some very specific small hot path you feel you can make an exception for; use bounds checking 99% of the time, but not in that spot? All of these avenues are only worth exploring if there's anything to gain here in the first place.
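
To make the "restructure so the compiler can elide the checks" idea concrete, here is a minimal sketch. The function name `dot_prefix` is made up for illustration, and whether the per-iteration checks actually disappear depends on the compiler version and optimization level, so treat it as a pattern to measure rather than a guarantee.

```rust
/// Hypothetical example: dot product over the first `n` elements.
/// Panics if either slice is shorter than `n`.
pub fn dot_prefix(a: &[f32], b: &[f32], n: usize) -> f32 {
    // Pay for one bounds check per slice, up front. After this,
    // both slices are known to have length exactly `n`.
    let a = &a[..n];
    let b = &b[..n];

    let mut acc = 0.0f32;
    for i in 0..n {
        // Still checked indexing, so this stays safe. The optimizer
        // can now see that `i < n` implies `i < a.len()` and
        // `i < b.len()`, which gives it a chance to drop the checks.
        acc += a[i] * b[i];
    }
    acc
}
```

Nothing here is `unsafe`: if `n` is too large the re-slicing panics before the loop starts, so the safety property is kept and only the place where the check happens changes.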

And then there's the simple fact that having a good intuition for where machines spend their time makes it easier to write performant code right off the bat, and it makes it easier to guess where to look first when you're trying to eke out better perf.

Even if you like or even need a technique like bounds checking, knowing the typical overheads can be useful.

I've seen bounds checks being compiled to a single integer comparison followed by a jump (on x86 at least). This should have a negligible performance impact for most programs running on a modern, parallel CPU. However, for highly optimized programs that constantly saturate all processor instruction ports, bounds checks might of course become a bottleneck.
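
For anyone who hasn't looked at it, this is roughly what that comparison and jump correspond to at the source level. It's a sketch of the semantics, not the actual standard library implementation, and the function names are made up.

```rust
/// Hypothetical helper spelling out what `v[i]` means:
/// one comparison plus a branch that is almost never taken.
pub fn checked_read(v: &[u32], i: usize) -> u32 {
    if i < v.len() {
        // In-bounds path: the load the programmer asked for.
        v[i]
    } else {
        // Out-of-bounds path: panic instead of reading stray memory.
        panic!("index {} out of bounds for length {}", i, v.len());
    }
}

/// The non-panicking variant: `get` does the same comparison
/// but hands the result back as an `Option`.
pub fn read_or_zero(v: &[u32], i: usize) -> u32 {
    v.get(i).copied().unwrap_or(0)
}
```

Since the failing branch is cold, a branch predictor should handle it well in the common case, which fits the "negligible for most programs" observation.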

I think the most preferable solution (although not always possible) would be to use iterators as much as possible. This would allow rustc to "know" the entire range of possible indexes used at runtime, which makes runtime bounds checking redundant.
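
A small sketch of the difference, with made-up function names. The indexed version asks for a bounds check on every access (which the optimizer may or may not remove); the iterator versions never form an index at all, so there is nothing to check.

```rust
/// Index-based loop: each `v[i]` is a checked access.
pub fn sum_indexed(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        total += v[i];
    }
    total
}

/// Iterator-based: walks the slice directly, no index involved.
pub fn sum_iter(v: &[u64]) -> u64 {
    v.iter().sum()
}

/// Two slices at once: `zip` stops at the shorter one, so neither
/// side can be read or written out of bounds.
pub fn add_into(dst: &mut [u64], src: &[u64]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}
```

Note that `zip` silently truncates to the shorter slice, so if mismatched lengths are a bug in your program you'd want an explicit length assert before the loop.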

Some old benchmarks here: https://parallel-rust-cpp.github.io/v0.html#rustc

In Rust you can usually use iterators to avoid bounds checks. This is idiomatic and the fast way to do things, so usually when using Rust you don't worry about this at all.

But, occasionally, you have some loop that can't be done with an iterator, AND it's part of a tight loop where removing a single conditional jump matters to you. When that matters, it is really easy to use an unsafe block to index into the array without the check. The good news is that then, at least in your 1 million line program, the unsafe parts are only a few lines that you are responsible for making sure are correct.

In tight numeric code that benefits from autovectorization.

Bounds checks prevent some optimizations, since they're a branch with a significant side effect that the compiler must preserve.

I don't really see this as a person who wants to turn off the bounds checking for any real reason, but as someone who just wants to have an idea of what the cost of that bounds checking is.
Sometimes the programmer can prove that bounds checks are unnecessary in a certain situation, but the compiler can't prove that itself, and the programmer can't communicate that proof to the compiler. Bounds checks can result in lost performance in some cases (very tight loops), so unsafe facilities exist as a workaround (like `get_unchecked`).
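
A minimal sketch of that workaround, assuming a gather-style access pattern where the indices come from data the compiler can't reason about. The function name `gather_sum` is made up. Validating right before the loop, as done here, just moves the cost around; it only pays off when the validation is amortized, for example when the same index list is reused many times.

```rust
/// Hypothetical example: sums `data[i]` for every `i` in `idx`.
/// Returns `None` if any index is out of range.
pub fn gather_sum(data: &[u32], idx: &[usize]) -> Option<u64> {
    // The "proof" the compiler cannot derive on its own:
    // every index is checked against `data.len()` once, here.
    if idx.iter().any(|&i| i >= data.len()) {
        return None;
    }

    let mut total = 0u64;
    for &i in idx {
        // SAFETY: all indices were verified to be < data.len() above,
        // and neither slice can change while we hold the borrows.
        total += u64::from(unsafe { *data.get_unchecked(i) });
    }
    Some(total)
}
```

The function itself stays safe to call: the `unsafe` block is a single expression, and the check that justifies it sits a few lines above it.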
> It's like talking about how much faster a car would go if it didn't have to carry the extra weight of brakes.

And there’s folks who do exactly that.

>Can someone smarter than me enlighten me when you would consider disabling bounds checking for performance?

Because it is faster. Worst case you are triggering a branch misprediction, which is quite expensive.

>It's like talking about how much faster a car would go if it didn't have to carry the extra weight of brakes.

So? Every wasted CPU cycle costs money and energy. Especially for very high performance applications, these costs can be very high. Not every car needs brakes: if it doesn't need to stop by itself and crashing hurts nobody, they are just waste.

In some very hot code (mostly loops doing math, where the check prevents some optimizations and/or the compiler fails to eliminate it), disabling bounds checking can lead to relevant performance improvements.

Because of this you might find some Rust code which opts out of bounds checks for such code using unsafe code.

But this code tends to be fairly limited and often encapsulated into libraries.

So I agree that for most code doing so is just plain stupid. In turn I believe doing it on a program or compilation unit level (instead of a case by case basis) is (nearly) always a bad idea.