Hacker News new | ask | show | jobs
by mlthoughts2018 2502 days ago
binary search is very widely misimplemented,

- https://en.m.wikipedia.org/wiki/Binary_search_algorithm#Impl...

> “When Jon Bentley assigned binary search as a problem in a course for professional programmers, he found that ninety percent failed to provide a correct solution after several hours of working on it, mainly because the incorrect implementations failed to run or returned a wrong answer in rare edge cases. A study published in 1988 shows that accurate code for it is only found in five out of twenty textbooks. Furthermore, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contained an overflow error that remained undetected for over twenty years. The Java programming language library implementation of binary search had the same overflow bug for more than nine years.”

1 comments

The bug only manifests if the range you're searching is within MAX_INT/2 (or whatever numeric datatype you're using). I'm not sure I understand the point you're trying to make.
I didn't understand what you were talking about offhand and thought range might be the value of the stored/searched data, not a pointer arithmetic issue.

When the array/list consumes a memory footprint larger than ptrdiff_t it was possible that the specific implementations might fail. It is crucial to remember that the difference between memory address is signed and that the size of objects (even if bare machine integers of some size) in the set might be displayed as a number smaller than one of the powers of two many recall.

On a 32 bit machine this wouldn't be approx 2 billion but rather often 500M 32 bit words or 250M-ish 'double' precision numbers. Let alone if the search is over more complex structures representing objects. For a smaller 16-bit embedded system I could even more readily see this occurring.

You make a good point. I was actually speaking about the general case, where you're simply pulling the bounds of x_min and x_max closer until f(x) reaches a target value or the bounds converge to the same value. f could be an array and x could be an index, but it doesn't have to be. The bug still exists in this general case. All you need is a numerical datatype that can overflow, and most of them can.
It seems you’re deliberately not trying to understand.
I assure you I'm being sincere.
Yet you focus only on one aspect of the linked discussion on the high rate of misimplementations, an aspect that represents the least relevant part, and then act as if this invalidates the point?

That’s not sincere, it’s disingenuous.

I didn't say it invalidates the point, because I honestly don't know what your point was to begin with. Which is why I asked the question originally. I'm still no closer to knowing, and it seems like you're not interested in elaborating. Suit yourself.
This is an extremely disingenuous reply as well. You started off your first reply by saying “the bug only manifests ...” already pre-supposing that the scope of my original comment was restricted to that one thing, and tacitly ignoring all the other aspects of the link.

The (obvious) point is that binary search is widely misimplemented even by professionals writing textbooks (and this had little to do with the specific Java bug responsible for that one example), and this is all stated directly in the link you didn’t read.

It’s comically ill-suited for time based tech screens as even experts teaming with editors often still get it wrong.