Hacker News new | ask | show | jobs
by caladin 1561 days ago
Binary search is deceptively simple. It is quite fertile ground for off-by-one errors, integer overflow, infinite loops, etc.

Wikipedia has an entire section dedicated to this here: https://en.wikipedia.org/wiki/Binary_search_algorithm#Implem...

  When Jon Bentley assigned binary search as a problem in a course for professional programmers, he found that ninety percent failed to provide a correct solution after several hours of working on it, mainly because the incorrect implementations failed to run or returned a wrong answer in rare edge cases. A study published in 1988 shows that accurate code for it is only found in five out of twenty textbooks. Furthermore, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contained an overflow error that remained undetected for over twenty years. The Java programming language library implementation of binary search had the same overflow bug for more than nine years.
5 comments

I agree that binary search is prone to errors, but the overflow error is just calculating the midpoint as

    (low + high) / 2
I mean, that's technically true because it might crash for huge arrays, but if this is a bug then the follow implementation of a function that adds to integers is also buggy, since it will overflow when x + y doesn't fit in an int:

    int add(int x, int y) { return x + y; }
An important difference between your add method and binary search is that the signature of your add method already implies the contract that the sum of the two integers must fit into an int because there simply is no correct value of the specified return type that could be returned otherwise.

There's nothing about the signature of an ordinary binary search method that would imply that it only works for arrays that have less than MAX_INT/2 elements.

True, but you can make the same case for the addition operator which might have a type annotation somewhere, but it's certainly not visible to most programmers.

Simply put, in many languages there is no addition operator which does mathematically correct addition and that is a sad state of affairs.

This is a modern phenomenon.

In the first few decades of electronic computers and programming languages, having an addition or any other arithmetic operation that would not signal correctly the overflow exceptions would have been considered as completely unacceptable.

Computers without the right hardware implementation appeared initially among the so-called minicomputers and microcomputers, i.e. the cheapest computers, which initially were intended for things like industrial control or peripherals for larger computers, where it was supposed that competent programmers will take care to use appropriate workarounds for the hardware limitations.

Unfortunately, this kind of implementation of the arithmetic operations, without appropriate means for detecting overflows, intended initially only for the cheapest products, has spread over the years to all CPUs.

Even if from time to time there are news about some horror story caused by a combination of weak hardware with the lack of appropriate software checks, it appears that there is no hope that this unfortunate hardware design fashion will ever be reversed.

The processors I've used all have an overflow flag that will tell you if an addition result exceeded the size of the register. But I'm not aware of any compilers that will use the flag, because it adds overhead that isn't wanted or needed 99.99% of the time.
> Simply put, in many languages there is no addition operator which does mathematically correct addition and that is a sad state of affairs.

I read a book on Clojure when it was fairly new containing a spirited defense of the fact that arithmetic operators like + and - always returned the correct result. This was slower, because they needed to do bounds checking, but the result was always correct. If you wanted faster arithmetic with bugs, you'd use the explicit operators +. or -. (or *. or, presumably, /. -- I'm not sure how division was handled).

Shortly after that, Clojure reversed its policy and + will give you fast addition with bugs.

None, I think. Best case you have somewhat graceful handling on out of memory, but that handling isn't going to give you the result of the addition.
I don't think this is an unfair gotcha. If you have u32 for low and high, and someone gives you an array with 4 billion entries, you can return the correct value if you compute the midpoint correctly.
It's more comparable to:

  int add(int x, int y) { return (2*x + 2*y) / 2; }
If ints are 32 bit you would expect to be able to add 1 to 2^20, but that causes an overflow with this implementation. (low + high) / 2 is arguably worse because you turn the algorithm from working on any array into one that only works on arrays that are small enough not to cause the overflow.
Instead of

   (low + high) / 2
I use

   low + (high-low) / 2
to avoid any possible overflow.
I think that may actually be considered buggy, and the actual definition should be

   long add(int x, int y) { return x + y; }
with the caveat of course that `long` is larger than an `int`, which I can't recall if it is valid on all platforms. So mathematical u32 + u32 should be u64, but you can define an `add` function that has strict overflow behavior.
Shouldn't you write

    return (long)x + y;
to ensure the addition happens after the conversion to long?
Yes, this is essentially the case I'm making: if this implementation of binary search is buggy, then so is the addition operator in C (and many related languages).
They're not comparable. Trying to compare them suggests that you don't understand what the issue is with the `(low + high) / 2` case, or you don't want to acknowledge it.

Simple addition will fail if the result of the expression should fall outside the range of integers expressible for a given integer width—because it can't fit. That's not insightful. It's facile. The criticism of the naive midpoint computation is not facile.

The problem with using `(low + high) / 2` to compute the midpoint (instead of `(high - low) / 2 + low`) is that it will give invalid results even when the data type from the signature is wide enough that the expected result can fit.

No, it's just slightly unintuitive. That's not the same thing.
That might not be enough. On 64-bit linux, both int and long are 32 bits.
You meant on 64-bit Windows.

On most 64-bit UNIX-like systems, long is the same with long long.

Microsoft are the kings of backward compatibility. Since long has been 32 bits for the last 20 years, it shall remain so until the end of time. Linux compilers are free to be a little more pragmatic. The standards are flexible enough to allow either behavior.
In Microsoft C++ long and int are both the same size, 32 bits. Even when size_t is 64 bits.
Also, no integer overflow in Python, so no bug here.
Very true! Which is also why I used a C-like syntax in the snippet ;)
Tim Bray wrote a post, "On the Goodness of Binary Search," in which he stated, "Simplicity is good, and binary search, coded properly, (see below) is awfully damn simple" (emphasis in original):

https://www.tbray.org/ongoing/When/200x/2003/03/22/Binary

In an update to his post, Tim linked to Joshua Bloch's "Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken" post about the problems you describe:

https://ai.googleblog.com/2006/06/extra-extra-read-all-about...

> he found that ninety percent failed to provide a correct solution after several hours of working on it

I wonder how many university algorithms engineers would run into a similar situation with. I'm not very far out of university, so I have an unfair advantage, but take some engineers with at least 10 year experience and have them implement Dijkstra's by memory. Not saying this is necessarily a bad thing. If you're in industry for years, you have gained a lot more applicable knowledge to your work, kind of like how runners have a different kind of strength and muscle than weight lifers - they both meet their needs.

That said, I'm surprised at 90% overflowing code. Is it all in that midpoint calculation?

Yes, it's always the midpoint calculation. Everybody forgets that (low+high) could overflow. It's trivial to work around once you know - just replace mid=(low+high)/2 with mid=low+(high-low)/2.
I agree - the best binary search algorithm is the one someone else wrote, preferably in the standard library of the language you are using.

I've written binary search algorithms maybe a half a dozen times, and every time I get tripped up by some boundary condition or other and what is only a few lines of code takes a really long time to get exactly right for all the corner cases.

You'd be surprised how often standard libraries can be broken. The integer overflow bug for binary search has been found in many of them.
That passage was the inspiration for someone's blog to present a challenge and see how many people could pass the test.

https://reprog.wordpress.com/2010/04/19/are-you-one-of-the-1...

Covered by HN at the time: https://news.ycombinator.com/item?id=1277459

My own solution was in Python, so it didn't fall prey to the very common integer overflow error. Python has the amazing and handy property that integers don't overflow.