int mid =(low + high) / 2;
> Specifically, it fails if the sum of low and high is greater than the maximum positive int value (2^31 - 1).I would really challenge calling this kind of effects "bug" or "breakage". It's like calling Newtons law of gravity broken, because it's not accurate at predicting how galaxies move. Things are engieered and tested for a certain scale. Knowing which tools to use at which sacle is part of the craft of engineering. |
More often they’re engineered and tested for an arbitrary scale. The limits aren’t considered, behavior at the edges isn’t accounted for, and it’s assumed it will be good enough for real world inputs.
The use of `int` tends to be a dead giveaway. There are some cases where it’s clearly correct: where the spec says so (like argv), where you’re starting from a smaller type and it’s impossible for the calculations to overflow in an int (like adding two uint8), that sort of thing. And there are cases where it’s subtly correct, because you know the range of the value is sufficiently limited, either by mechanics or by spec.
But most of the time, int gets chosen because it’s the apparent default and it’s easy to type. No analysis has been done to see if it’s correct or if you want to declare your code to only support inputs in a certain range.
It’s really clear if you’ve written a binary search (or anything else that works on general arrays) in C and you use int as the index type. There’s pretty much no scenario where that makes sense. In theory you could analyze the entire program and prove that over-large arrays are never passed in, and keep doing it to make sure it stays that way, but that’s not realistic. If the programmer actually took one second to think about the appropriate data types, they’d use size_t rather than int.
You can still have this bug with size_t, of course. But it won’t be “this falls apart with arrays over 1G elements on 64-bit systems that can easily handle them.” If you declare that you wrote the obvious midpoint calculation with size_t because you didn’t intend to support byte arrays larger than half the address space, it’s at least plausible.