Hacker News new | ask | show | jobs
Nearly all binary searches and mergesorts are broken (2006) (ai.googleblog.com)
163 points by finnlab 1326 days ago
24 comments

Fun fact, there are some other lessons here: it can sometimes pay off to (1) generalize your function, and (2) respect the mathematical axioms you're supposed to be following. This (obviously) isn't to say you should always generalize everything, but you should at least consider what would happen if you did so, and if the difference is small, perhaps do it. The benefit of doing so being that it can avoid problems that aren't otherwise obvious—sometimes by design, sometimes by accident.

In particular, (x + y) / 2 is the wrong implementation of midpoint in general, because it would fail to even compile on objects you can't add together. But midpoint is well-defined on anything you can subtract (i.e. anything you can define a consistent distance function for)—and it doesn't require addition to be well-defined between those objects!

One obvious (in C/C++, and not-so-obvious in Java) counterexample here is pointers/iterators. You can subtract them, but not add them. And, in fact, if you implement midpoint in a manner that generalizes to those and respects the intrinsic constraints of the problem, you end up with the same x + (y - x) / 2 implementation, which doesn't have this bug.

Interesting. Another example is datetimes. You can't add datetimes. You can add a datetime and a time delta, and the difference of two datetimes is a timedelta.

I guess in maths this is called a generating Lie algebra (maybe someone can comment on this?)

I think the concept you are looking for is a ["torsor"](https://en.wikipedia.org/wiki/Principal_homogeneous_space).

Basically,

1. You have a 0 time delta, and you can add and subtract them satisfying some natural equations. (time deltas form a group)

2. You can add time deltas to a datetime to get a new datetime, and this satisfies some natural equations relating to adding time deltas to each other (time deltas act on datetimes).

3. You can subtract two datetimes to get a time delta satisfying some more natural equations (the action is free and transitive).

The term I was looking for was affine structure, as I commented to someone else. But from your link, which I can't understand entirely, I get the sense that a torsor is an even bigger generalization.
> The term I was looking for was affine structure, as I commented to someone else. But from your link, which I can't understand entirely, I get the sense that a torsor is an even bigger generalization.

An affine space is a torsor under a vector space, and you can have instead a torsor under any group. This loses a bit of structure, in the sense that you can take convex combinations in an affine space but not in an arbitrary torsor; but otherwise it is a proper generalisation. But the convex combination $(a + b)/2$ used to obtain a midpoint is exactly what we want here!

Indeed, torsors have exactly the properties you describe, but notably not the ability to find the midpoint between two points (that would involve extracting square roots in a group, which is not guaranteed possible, or uniquely defined when possible).
And in fact finding the midpoint is not possible half the time in the space we're interested in (https://news.ycombinator.com/edit?id=33497270). So what is the algebraic structure that underlies the binary-search algorithm, since evidently it isn't really the torsor of a group?
> So what is the algebraic structure that underlies the binary-search algorithm, since evidently it isn't really the torsor of a group?

Though it pains me to say so as an algebraist, I think that it probably just isn't a problem most usefully modelled with a more abstract algebraic structure. Although it would be easy to cook up a structure permitting "division with rounding" … maybe a Euclidean domain (https://en.wikipedia.org/wiki/Euclidean_domain) is something like the right structure?

> I guess in maths this is called a generating Lie algebra

This is often called an affine structure.

This is the term I was looking for, thank you.
Not all metric spaces have midpoints (or unique midpoints) so it’s not true you can compute a midpoint any time you have a distance function (you are right you can define it but that’s kind of useless computationally since it doesn’t give you an algorithm).
If we're going the pedantic route, note that you don't need (and in fact half the time cannot have) uniqueness in our case anyway. There isn't really a unique midpoint for {0, 1, 2, 3}; both 1 and 2 are valid midpoints, even for binary search. We just pick the first one arbitrarily and work with that.

But note that that sentence was just about calculating midpoints, not about the larger binary search algorithm. And in any case, I was just trying to convey layman intuition, not write a mathematically precise theorem.

This should also be obvious after a bit of thought to anyone who has worked with timestamps, and is also well-known in e.g. animation where midpoint is just a special case of p=0.5.
Not sure about midpoint being well-defined on anything we can subtract.. Z and R are infinite.. there are a lot of values in there that don't compute.

To vary your point here, the axioms for twos complement and IEEE floating-point aren't well known or observed.

There are countably infinite turing machines and there is one for every element in Z. But there are uncountably infinite real numbers, so we’re out of luck for almost all of them.
The bug in question is trying to compute an average as

    avg = (x + y) / 2
which fails both for signed ints (when adding positive x and y overflows maxint) and for unsigned ints (when x + y wraps around 0). Note that this can only be considered a bug for array indices x,y when these are 32 bit variables and the array can conceivably grow to more than 2 billion elements.

I wonder what is the simplest fix if the ordering between x and y is not known (e.g. in applications when x and y are not range bounds) and the language has no right-shift operation...

M = L + (R - L)/2

looks fairly simple to me. note this works of any ordering of R and L if the data type is signed.

But doesn't this have the same overflow issue, e.g. if R is a large positive number and L is a large negative one?
In binary search and heapsort, neither L nor R are negative - they are the number of elements in the array.
Binary search can be done on anything, not just arrays. Often you apply it to an algorithm and there isn't a collection at all, you just know the right answer is between some numbers so binary search lets you find it in logarithmic number of tries. If computing the number is costly then binary search is necessary to compute the result at all in those cases.
Yes, if you want a solution to a related problem then the implementation will need a better midpoint calculation.

However, the linked-to binary search starts from 0:

  1:     public static int binarySearch(int[] a, int key) {
  2:         int low = 0;
  3:         int high = a.length - 1;
and the promoted fix is:

  6:             int mid = low + ((high - low) / 2);
The aforementioned Wikipedia binary sort also takes a length, rather than start/end:

  function binary_search(A, n, T) is
      L := 0
      R := n − 1
You can use unsigned division and addition and then go back to signed and it is still correct.
Only on 2s-complement.
Which is basically everything.
Even if the data type is unsigned it suffices as long as the R - L term is signed (so the division by 2 is an arithmetic shift, not a logical shift).
Guess it depends on how you define "simplest"?

x / 2 + y / 2 + ((x & 1) + (y & 1)) / 2

Or

    x / 2 + y / 2 + (x & y & 1)
Edit: This is the same you wrote, but it gets the wrong number for negative values, for negative ints the rounding will go up and not down.
And this is exactly why I like to use higher level programming languages. Let someone smart figure all this out for me, and give me (grug) a generic binary search routine that works on arbitrary collections of arbitrary ordered things.
Very much relevant talk about std::midpoint in C++:

https://youtu.be/sBtAGxBh-XI

In general (x + (y - x) / 2) is more general than (x + y) / 2. If x and y are not in some group, but rather in the torsor of some group, you can't really sum them. Any attempt to do so involves introducing some arbitrary reference point. You can always do this, but once you do, you're at risk of your calculation results depending on the choice of arbitrary reference point and hence being meaningless.

The difference of two elements of the torsor of some group G is an honest-to-God group element of G, though, and so you have an honest-to-God identity element. You may or may not have an honest-to-God division or halving operator (which computes e given (e + e)) but in cases where G is the additive group of some field you do.

However, in this case our array indices are drawn from something like ℤ/2³²ℤ, and we might be trying to halve odd numbers, so none of this is justifiable! We want something different from our halving operator.

https://math.ucr.edu/home/baez/torsors.html

I see dataflow and maxiepoo were already talking about this: https://news.ycombinator.com/item?id=33493149

> The difference of two elements of the torsor of some group G is an honest-to-God group element of G, though, and so you have an honest-to-God identity element. You may or may not have an honest-to-God division or halving operator (which computes e given (e + e)) but in cases where G is the additive group of some field you do.

… some field of characteristic ≠ 2, of course.

Right, in GF(2ⁿ), e + e is always 0. Thank you for the correction.
The simplest fix is obviously to use 64 bit ints and call it a day.

  (x^y)/2 + (x&y)
Related:

Google Research Blog: Nearly All Binary Searches and Mergesorts Are Broken - https://news.ycombinator.com/item?id=16890739 - April 2018 (1 comment)

Nearly All Binary Searches and Mergesorts Are Broken (2006) - https://news.ycombinator.com/item?id=14906429 - Aug 2017 (86 comments)

Nearly All Binary Searches and Mergesorts Are Broken (2006) - https://news.ycombinator.com/item?id=12147703 - July 2016 (35 comments)

Nearly All Binary Searches and Mergesorts are Broken (2006) - https://news.ycombinator.com/item?id=9857392 - July 2015 (43 comments)

Read All About It: Nearly All Binary Searches and Mergesorts Are Broken - https://news.ycombinator.com/item?id=9113001 - Feb 2015 (2 comments)

Nearly All Binary Searches and Mergesorts are Broken (2006) - https://news.ycombinator.com/item?id=7594625 - April 2014 (2 comments)

Nearly All Binary Searches and Mergesorts are Broken (2006) - https://news.ycombinator.com/item?id=6799336 - Nov 2013 (46 comments)

Nearly All Binary Searches and Mergesorts are Broken (2006) - https://news.ycombinator.com/item?id=1130463 - Feb 2010 (49 comments)

Google Research Blog: Nearly All Binary Searches and Mergesorts are Broken [2006] - https://news.ycombinator.com/item?id=621557 - May 2009 (9 comments)

This was always my go to interview question when I wanted to smugly prove to someone I’m smarter than them because I knew in fact they were smarter than me and I was feeling insecure. Good to see others use overflow gotchas too.
My favorite was; write a function that determines the number of games necessary to be played in a single elimination tournament with N participants. It’s interesting to watch how many go off into recursion land when they get into the mind set of solving these Leet Code puzzles.
My favorite is when interviewers expect you to know sportsball stuff like tournament elimination rules when interviewing programmers who clearly don’t care about sportsball
Could be chess.
Real nerds don’t compete, they program
N-1 games?
I hate when interviewers rely on niche recall-only interview questions...
Eh, I don't think integer overflow is a recall-only type question

This type of issue is pretty common to encounter and I make at least a few fixes a year specifically addressing integer overflow across many companies

the right solution is to parametrize the search region as (offset, length) instead of (start, end). then the midpoint is just offset+length/2.

you can also remove that unpredictable branch in the loop if you want.

  whatever_t *bisect(whatever_t *offset, size_t length, whatever_t x) {
    while(size_t midpoint = length / 2) {
      bool side = x < offset[midpoint];
      midpoint &= side - 1;
      length >>= side;
      offset += midpoint;
      length -= midpoint;
    }
    return offset;
  }
(offset, length) was how I coded it in the 90s, too, precisely because it made correctness clearer. "Nearly all" broken, hmph.
Here is the approach taken in Go's sort.Search()

Do the sum using signed int.

Then cast to unsigned int before the division (i.e., use a non-arithmetic shift low).

Then cast back to signed int.

  func Search(n int, f func(int) bool) int {
      // Define f(-1) == false and f(n) == true.
      // Invariant: f(i-1) == false, f(j) == true.
      i, j := 0, n
      for i < j {
          h := int(uint(i+j) >> 1) // avoid overflow when computing h
          // i ≤ h < j
          if !f(h) {
              i = h + 1 // preserves f(i-1) == false
          } else {
              j = h // preserves f(j) == true
          }
      }
      // i == j, f(i-1) == false, and f(j) (= f(i)) == true  =>  answer is i.
      return i
  }
If you care about stuff like this you may enjoy the puzzle "Upside-Down Arithmetic Shift":

https://bugfix-66.com/76b563beb6f4e61801fce4e835be862fb3dbbe...

The solution here is not really interesting except from a language design perspective. Go avoids this problem by having the maximum array length be int, but doing the math in uint. This won’t work in languages that lack uints (Java) or have maximum array sizes in uint (C/C++).
Java lacks a distinct uint type, but (since Java 8) allows you to perform unsigned operations on a regular int, effectively treating it as a uint.

It doesn't help that almost nobody knows this, though.

At the point where you're writing `>>>` to, ironically, do proper arithmetic - you should probably write a correct version without a shift instead.
This wouldn't work for C/C++ because in these languages signed integer overflow is undefined behavior.
That is correct. A serious mistake in C.

Go was designed by (among others) the father of Unix Ken Thompson, with an understanding of the mistakes of C and C++.

Another example is that Go requires explicit integer casts (disallowing implicit integer casts) to avoid what is now understood to be an enormous source of confusion and bugs in C.

You can understand Go as an improved C, designed for a world where parallel computing (e.g., dozens of CPU cores) is commonplace.

> A serious mistake in C.

Well, that's arguable. This "mistake" could be fixed in C tomorrow without breaking the semantics of any existing C code, but notice that this hasn't been "fixed" in any of the latest C standards, so perhaps it's still there for a reason.

And the reason it hasn't been "fixed" is that compilers can optimize code better if they can assume that signed addition won't overflow.

So it's more of a trade-off rather than strictly being a disadvantage.

It's also something you can "fix" in your own code if you want to, by passing a compiler flag (-fwrapv in gcc), although arguably, at that point your code wouldn't be strictly C-standard compliant anymore. Or by using some library that handles arithmetic overflow by wrapping around, which could be implemented in standard C.

> Another example is that Go requires explicit integer casts (disallowing implicit integer casts) to avoid what is now understood to be an enormous source of confusion and bugs in C.

I agree with you on this, although forcing explicit casts also makes the code more verbose and can make it harder to understand what's going on.

I think a balanced approach is requiring explicit casts only for the "tricky" cases, i.e. when the values might become truncated, and possibly also when sign-extension might be necessary and therefore might result in something the programmer didn't expect.

But if I were to design a language I'm not sure that I would require explicit casts for e.g. promoting a uint16_t to a uint32_t...

> You can understand Go as an improved C, designed for a world where parallel computing (e.g., dozens of CPU cores) is commonplace.

That's a bit of a hot take :) Let me know when the Linux kernel starts to get rewritten in Go ;)

> And the reason it hasn't been "fixed" is that compilers can optimize code better if they can assume that signed addition won't overflow.

Everyone says that and they have no proof.

> Everyone says that and they have no proof.

Here's an entire blog post with proof:

https://kristerw.blogspot.com/2016/02/how-undefined-signed-o...

You could write the same approach in C as `(size_t)i+(size_t)j` without UB. The real reason it doesn't work in C is because a memory region can be large enough to still overflow in that case.
That's not exactly the same approach, because you're doing unsigned addition while the Go code is doing signed addition.

And technically speaking, I think C doesn't guarantee that 'size_t' is at least as large as a 'signed int' (even though this is true on all platforms that I know of), so your approach would fail if that weren't the case. Although, you could use 'ssize_t' instead of 'int', or 'unsigned int' instead of 'size_t' to fix that.

> The real reason it doesn't work in C is because a memory region can be large enough to still overflow in that case.

The Go code we are discussing has nothing to do with memory regions, it's a generic binary search function, so it can be used for e.g. bisecting git commits. It doesn't require the calling function to use arrays.

Although yes, if the calling code were trying to do a binary search on an array, conceptually it could fail, but in that case you could argue the bug would be in the calling function, because it would be trying to pass the array length into a binary search function which only accepts an `int` or `ssize_t` function parameter, which could result in the array length being truncated. But strictly speaking, this would not be an arithmetic overflow issue.

That said, I would just fix the code so that it works for the full 'size_t' range, since the most common use case of a binary search function is indeed to do searches on arrays. In that case, the Go approach wouldn't work indeed.

> I think C doesn't guarantee that 'size_t' is at least as large as a 'signed int'

That doesn't matter, because size_t is large enough to hold any array index (that's kind of[0] the defining property of size_t), so any array index in a signed int can be safely converted to size_t. The real problem is that (using 16-bit size_t for illustrative purposes) if you have, say, x = (size_t)40000 and y = (size_t)50000 into a 60000-element array, x+y = (size_t)90000 = (size_t)24464, which means (x+y)/2 = 12232, which is the completely wrong array element.

0: Technically, size_t is large enough to hold any object size, but array elements can't be smaller than char (sizeof can't be less than 1), so a array can't have more elements than it's sizeof.

> > I think C doesn't guarantee that 'size_t' is at least as large as a 'signed int'

> That doesn't matter, because size_t is large enough to hold any array index (that's kind of[0] the defining property of size_t), so any array index in a signed int can be safely converted to size_t.

Well, the Go code we're discussing has nothing to do with arrays or array indices, so `size_t` doesn't help here.

Go look at the code :) It's a generic function for doing binary search, which accepts an `int` as a function argument, specifying the search size.

The code is then doing:

  h := int(uint(i+j) >> 1) // avoid overflow when computing h
Replacing the Go expression `uint(i+j)` with `(size_t)i+(size_t)j` in C like morelisp proposed would not work correctly if `size_t` is smaller than `int`.

That's the point I was making.

Pretty sure that’s not the case for 64 bit systems since you can “only” allocate about 48 bits of address space (maybe slightly more on newer systems).

For 32 bit systems using 64bit instead of size_t would similarly solve the problem.

Well, that's not something the C standard (or POSIX, etc) guarantees, is it?

Conceptually, a 64-bit kernel today could allow your program to allocate (almost) the entire 64-bit address space, assuming it does memory overcommit (like Linux) and/or uses some kind of memory compression (like Linux supports as well).

There might be some MMU limitations on today's mainstream systems, but this doesn't mean that all 64-bit systems have those limitations or that those limitations will remain there in the future.

So your code would break as soon as a new system comes along without those limitations.

Also, this would be even more true if the code and stack would be stored in different address spaces, as theoretically that would even allow you to allocate the entire address space, I think.

The system you describe simply doesn’t exist, standards or no. A 64-bit kernel can’t hand out 64-bits worth of addresses because no CPU built today supports it.

A 48-bit index to an array can represent >240TBytes of RAM minimum - if your records are > 1 byte, you have significantly higher storage requirements. The largest system I could find that’s ever been built was a prototype that has ~160TiB of RAM [1]. Also remember. To make the algorithm incorrect, the sum of two numbers has to exceed 64bits - that means you’d need >63-bits of byte-addressable space. That just simply isn’t happening.

Now of course you might be searching through offline storage. 2^63 bits is ~9 exabytes of an array where each element is 1 byte. Note that now we’re talking scales of about about the aggregate total storage capacity of a public hyperscaled cloud. Your binary search simply won’t even finish.

So sure. You’re technically right except you’d never find the bug on any system that your algorithm would ever run on for the foreseeable future, so does it even matter?

As an aside, at the point where you’re talking about 48-bits worth of addressable bytes you’re searching, you’re choosing a different algorithm because a single lookup is going to take on the order of hours to complete. 63-bits is going to take ~27 years iff you can sustain 20gib/s for comparing the keys (sure binary search is logarithmic but then you’re not going to be hitting 20gib/s). Remember - data doesn’t come presorted either so simply getting all that data into a linearly sorted data structure is similarly impractical.

> For 32 bit systems using 64bit instead of size_t would similarly solve the problem.

~~C does not guarantee a 64 bit type exists.~~ This is not really correct these days.

> C does not guarantee a 64 bit type exists.

Isn't (signed/unsigned) 'long long int' mandatory since C99?

It says: 'There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int'.

My cursory search for 'long long' in the standard didn't find anything about it being optional...

There are still edge cases here - various posters here have mentioned them.

The proper method is to type promote first - not just to unsigned but to a wider variable type - 32 to 64 bits or from 64 to 128 bits. Unsigned simply gives a single extra bit, while erasing negative semantics. Promoting to twice the size works for either addition or multiplication. The benefits are correctness and the ability to be understood at a glance.

> There are still edge cases here - various posters here have mentioned them.

Are you sure? What's an example of an array.length that would trigger a remaining edge case here? (Keep in mind array.length is 32-bit in Java.)

Calling binary search and mergesort implementations "broken" does the author no service with his argument. If the key lesson is to "carefully consider your invariants" then the proper takeaway is that binary search and mergesort implementation lose generality with large arrays.

The implementation shown works perfectly for arrays on the order 2^30. Calling them broken is like saying strlen is broken for strings that aren't null terminated.

Mergesort and binary search have a contract which defines:

- Which inputs are valid.

- For a valid input, what the constraints on the return value are.

You'd have a point if the implementations had as an input constraint: "array must be less than 2^30". But they didn't.

Otherwise, nothing is broken unless it never returns the right answer. Take:

fn add(x: u32, y:u32) -> u32 { return 1; }

This implementation works perfectly for numbers that add to 1. It just loses generality outside of that.

fn add(x: u32, y: u32) -> u32 { return (x + y) - (x >> 10) - (y >> 10); }

This implementation works for x < 2^10 and y < 2^10. Arguably this implementation is much worse than the previous one because it fails unexpectedly. At least the previous implementation would be much more obviously broken.

But these are both broken because they don't fulfill the (implicit) contract for add. You can't just say "well, it's implied that my add function only takes inputs that add to 1" unless you actually write that somewhere and make it clear.

I get what you're saying but I don't think they're analogous. If nothing else, strlen is defined only with null-terminated strings; this comes in both the spec itself, as well as the documentation of pretty much every implementation you find. Whereas most binary search implementations don't claim they only work under some particular inputs. (I think there are likely more differences too, but this is sufficient to make my point.)

More generally, I feel like the thought process of "it's not broken if it works fine for inputs that occur 99% of the time" is an artifact of how little attention we pay to correctness, not something that is intrinsically true. If your function breaks for inputs that are clearly within its domain without any kind of warning... it's broken, as much as we might not want to admit it. We're just so used to this happening near edge cases that we don't think about it that way, but it's true.

> most binary search implementations don't claim they only work under some particular inputs

They do implicitly. It's just common sense. When you read a recipe in a cookbook, it usually doesn't mention that you're expected to be standing on your legs, not on your arms. Reader is expected to derive these things themselves.

A lot of generic algorithm implementations will start acting weird if your input size has the order of INT_MAX. Instances this big will take days or weeks or process on commodity CPUs, so if you're doing something like that you would normally use a specialized library that takes these specifics into account.

>> most binary search implementations don't claim they only work under some particular inputs

> They do implicitly. It's just common sense.

That's neither how language specifications work, nor true in this case even if it's true in other cases. Providing one more of the same kind of input that already works is in no way the same thing as changing something totally unrelated.

> When you read a recipe in a cookbook, it usually doesn't mention that you're expected to be standing on your legs, not on your arms.

I don't think this binary search was breaking because of people standing on their arms either.

> A lot of generic algorithm implementations will start acting weird if your input size has the order of INT_MAX. Instances this big will take days or weeks or process on commodity CPUs,

It's incredibly strange to read this from someone in 2022. I don't know of any standard library algorithm that would take "days or weeks" for inputs of size 2^31 now, let alone the majority of them being like this. In fact I don't think this was the case back when the article was written either.

Ok, I looked at it closer and I admit that quicksort implemented in C won't take days on an input of 2³¹ elements. It will take less than 1-2 hours, I think. Something that is a bit worse than O(n log n) or has a 20× bigger constant hidden in O(·) will take days though.

I don't see my other arguments being convincingly refuted, so they still hold.

> Ok, I looked at it closer and I admit that quicksort implemented in C won't take days on an input of 2³¹ elements. It will take less than 1-2 hours, I think.

How ancient is your machine? Quicksorting (2^31 - 16) elements (because that's Java's hardcoded limit) takes < 11 seconds on my machine, and a big chunk of that time is taken in the random number generation to create the input...

  # Temp.java
  import java.util.*;
  public class Temp { public static void main(String[] args) { byte[] arr = new byte[0x7FFFFFF0]; new Random().nextBytes(arr); Arrays.sort(arr); } }

  $ javac Temp.java && time -p java Temp
  real 10.30
No, wait. The QuickSort from the article is O(n²), in fact. So that one specifically will take weeks, or even months to run — especially in Java. Feel free to test it and get back to me if you think I'm wrong.
> Calling binary search and mergesort implementations "broken" does the author no service with his argument.

Very much on brand for the FAANG r/iamverysmart crowd though.

What on Earth are you talking about? There's nothing "iamverysmart" about the blogpost at all. The guy literally cites an example where the code broke in production, it isn't an esoteric hairsplitting point at all.
They tried to implement a standard algorithm themselves and failed. Doesn't mean that almost all binary searches are wrong. C++ standard library had a correct implemented binary search with more flexible signature when this article was written, they could just have used that one instead.
This blog post predates r/iamverysmart. There was a way of talking and discourse in 2006 that this is very much as example of. One has to take things from the time they were written.
r/iamverysmart was created in response to a particular human behavior, not the other way around.
it literally threw an exception in production on a valid input so yes it is fair to say it was broken
"It is not sufficient merely to prove a program correct; you have to test it too."

Well in fact it is exactly the contrary.

I took it as a reference to Knuth: "Beware of bugs in the above code; I have only proved it correct, not tried it."

https://staff.fnwi.uva.nl/p.vanemdeboas/knuthnote.pdf [PDF] page 7 of the PDF, 5 of the classroom note.

It’s a clumsy formulation, but if what he means is that you need to be assured that the model you’re proving in accurately reflects the behavior of what is being modeled then he is correct at least sometimes. For example a naive Z3 proof of the mid procedure would be valid since Z3 ints are unbounded. The issue isn’t that the proof is wrong, it’s that the model is.

If the system has a well written formal specification then your model can be built from that without error if done diligently. One real world example is the first Algol 60 compiler, which was built to a formal specification. On the other hand if there is no useful spec or no spec at all then you end up needing to experiment, ie test, and get your model as close as you can.

No, what you've observed is the (IIRC the terminology) converse, namely:

It is not sufficient merely to test a program; you have to prove it correct too.

In addition, it is not sufficient merely to prove a program correct; you have to test it too.

In summary, you have to both prove a program correct, and test it; skipping either will result in buggy garbage.

Grandparent is correct. If you've proven the behavior correct, you don't need to test. The proof is the test. This is usually only true in languages-that-are-proof-assistants (idris). In the cases above, they hadn't actually formally proven the behavior correct.
If instead of 'int' you were to use 'size_t' (or the equivalent of that provided by your programming language of choice), then there should be no issues in practice. Then you would only see overflows if your elements were 1 byte in size, and the input spans more than half of the virtual address space. This is unlikely for two reasons:

1. If you only have single byte elements, you'd better use counting sort.

2. There always tend to be parts of the virtual address space that are reserved. On x86-64, most userspace processes can only access 2^47 bytes of space.

> input spans more than half of the virtual address space

Not only that, but in practice most general purpose operating systems are designed with higher-half kernels[0].

[0] https://wiki.osdev.org/Higher_Half_Kernel

32-bit Mac OS X was not (it had a 4/4 scheme).

Though even then I'm not sure you could reliably allocate two gigs of contiguous virtual space without running into some immovable OS-provided thing.

It is unfortunate that the language doesn't have a built-in "average between two ints" function. It is a common operation, people often get it wrong, as shown by this article, and it may have a really simple and correct assembly representation that the compiler may take advantage of.

Such a function, even if it seems trivial, has some educative value as it opens an opportunity to explain the problem in the documentation.

I feel that it’s so simple that many people will overlook that it even exists. In languages that have both, it’s hard for functions to compete with operators. I don’t think that this is the best design to promote correctness.
Maybe, but providing simple functions for "obvious" operations, to promote correctness, make it easier for the compiler, or just for convenience is not uncommon at all. Most languages have a min/max function somewhere, sometimes built-in, sometimes in the standard library, even though it is trivial to implement. C is a notable exception, and it is a problem because, you have a lot of ad-hoc solutions, all with their own issues.

If you look at GLSL, it has many function that do obvious things, like exp2(x) that does the same thing as pow(2,x), and I don't think anyone has any issue with that. It even has a specific "fma" operation (fma(a,b,c) = a*b+c, precisely), that solves a similar kind of problem as the overflowing average.

Knuth’s section on binary search in The Art of Computer Programming is enlightening. One historical curiosity that he notes is that it took something like a decade from the discovery of the algorithm to an implementation that was correct for all inputs.

I briefly tried using binary search as a weeder problem and quickly abandoned it when no one got it right.

This article is poorly/incompletely reasoned.

Suppose your high, low and mid indexes are as wide as a pointer on your machine: 32 or 64 bits. Unsigned.

Suppose you're binary searching or merge sorting a structure that fits entirely into memory.

The only way (low + high)/2 will overflow is if the object being subdivided fills the entire address space, and is an array of individual bytes. Or else is a sparsely populated, virtual structure.

If the space contains distinct objects from [0] to [high-1], and they are more than a byte wide, this is a non-issue. If the objects are more than two bytes wide, you can use signed integers.

Also, you're never going to manipulate objects that fill the whole address space. On 32 bits, some applications came close. On 64 bits, people are using the top 16 bits of a pointer for a tag.

> Suppose your high, low and mid indexes are as wide as a pointer on your machine: 32 or 64 bits. Unsigned.

Yeah, if you suppose that, you can correctly conclude that you only run into overflow if the object is a byte array that fills more than half the address space (though not the entire address space as you say). And that's why this problem remained unnoticed from 01958 or whenever someone first published a correct-on-my-machine binary search until 02006.

But suppose they aren't. Suppose, for example, that you're in Java, where there's no such thing as an unsigned type, and where ints are 32 bits even on a 64-bit machine. Suddenly the move to 64-bit machines around 02006 demonstrates that you have this problem on any array with more than 2³⁰ elements. It's easy to have 2³⁰ elements on a 64-bit machine! Even if they aren't bytes.

Does anyone know why the bitshift method works?

Is it that low and high are both floating point, so you're not constrained by int precision and so you don't get an overflow error. The article makes it sound like sign switching is the issue, but this is just a general overflow problem, right?

The ">>>" operator works, the ">>" operator doesn't. The reason the former works is that it basically performs unsigned division by a power of 2; the latter does it signed. There's no floating-point.
What do the five >s and the , mean in this comment?
>>> is bitwise right shift (fills in with zeros), >> is arithmetic right shift (fills in with the sign bit).
> >>> is bitwise right shift

Well, they're both bitwise right shifts, the ">>>" is specifically a logical or unsigned right shift.

Whoops yes I meant logical.
No, it's because the reason that integers overflow is that negative numbers are technically stored as larger than positive numbers in the Two's complement representation most computers use to store integers. Neither low and high are floats.

Example with 8-bit integers (from wikipedia):

Bits, Unsigned value, Signed value

0000 0000, 0, 0

0000 0001, 1, 1

0000 0010, 2, 2

0111 1110, 126, 126

0111 1111, 127, 127

1000 0000, 128, −128

When the logical bit shift is conducted on -128, -128 is treated as an unsigned integer. Its sign bit gets shifted such that the integer becomes 0100 0000, aka 64.

Oh I see, this is very helpful. Thank you.
16 years later, it's still incorrect on Wikipedia

https://en.wikipedia.org/wiki/Binary_search_algorithm#Proced...

But this is pseudocode. For all you know, it could be implemented in a language whose integers are arbitrary precision, in which case it is perfectly correct and appropriate.
> language whose integers are arbitrary precision

I’m not sure what this could mean. Could you please share some examples?

Python, for example, has arbitrary precision integers. That means that it is theoretically possible to represent any whole number in Python, at least assuming your computer has enough memory to support it. Under the hood, the `int` object can have several different implementations depending on how large the number is. So small numbers will be represented one way, and larger numbers might be implemented as 64-bit integers, but very large numbers are implemented as an array of other integers that can grow arbitrarily large. You can think of the array as being like base-10 representation (so 17,537 might be represented as [1, 7, 5, 3, 7]), although in practice much larger bases are used to make the calculations quicker.

Obviously maths with the smaller representations will be quicker than with this array representation, so the interpreter does some work to try and use smaller representations where possible. But if you tried to, say, add two 64-bit signed ints together, and the result would overflow, then the interpreter will transparently convert the integers into the array representation for you, so that the overflow doesn't happen.

So the first poster said that the default merge sort implementation on Wikipedia was buggy, because it doesn't protect against overflows (assuming that the implementation used fixed-sized integers). The second poster pointed out that if the implementation used these arbitrary precision integers, then there is no chance of overflow, and the code will always work as expected.

You can look up "bigint" which seems to be the term of art for implementations of arbitrary precision integers in most languages. You can also read a bit about how they're implement in Python here: https://tenthousandmeters.com/blog/python-behind-the-scenes-...

> Python, for example, has arbitrary precision integers.

In the spirit on nitpicking on edge cases: It does, but quiet often you pass a number to some C library (other than stdlib) and C does not honour this arrangement.

Then an exception is generated. Errors do not pass silently in Python.
This means that the instead of fixed width integer types that have a finite maximum due to being 32-bit or 64-bit etc…, the language could use an integer type that can grow to be as many bytes as is needed to store the number. This is called a BigInt in JavaScript for instance.
Python does fine with (2**1024 + 3**768) // 2

For example.

Haskell has arbitary-precision integers.

Until you run out of memory, but yeah.

yep, so does Mathematica
Integers that are represented by a dynamic number of bits -- as in Python or Javascript BigInt
Python3 doesn't have a maximum integer and therefore cannot experience overflow when adding two integers, for example. You can keep adding one forever.
Take Python for example
Lisp has arbitrary-precision integers by default.
not all languages tie their "integers" to a fixed-bit-length value
see "implementation issues" in the same article, with

  M = L + (R - L)/2
I'm aware (plus the fact that the algorithm is correct in Python). It's very unlikely that this is an argument I can win.

I'm taking a pragmatic perspective: like it or not, people are going to skim the article and copy & paste the pseudocode.

Given that the pseudocode is buggy in the vast majority of programming languages and the user isn't informed about this in the pseudocode, it's going to lead to unnecessary bugs.

> people are going to skim the article and copy & paste the pseudocode.

Heh. But then again, these kind of people will create way worse problems than last-bit overflows.

They do discuss it though at the end of the article

https://en.wikipedia.org/wiki/Binary_search_algorithm#Implem...

And as other mentionned, this is pseudo code and not implementation. But if you think it's incorrect, feel free to correct it.

Spoiler: If you are using Javascript, this bug only affects you if your arrays have more than Number.MAX_SAFE_INTEGER/2 entries, which is about 2^52. In other words, don't waste your time with fixing this bug.
Unless you're binary searching something other than a data structure. Fascinatingly, binary search works just fine in optimization problems where the function to optimize is monotonic.
Anyone dealing with arrays containing a billion elements or more really ought to be using 64 bit arithmetic to avoid problems like this. Certainly better to do this the right way though.
Is there any reason not to use 64bit arithmetic anyway?
This is a great example of how good algorithms are software plus hardware. The idea that a pure mathematical idea can be naively implemented on any hardware has never truly materialized.

Yes, we are a long way from flipping switches to input machine code, but there are still hardware considerations for correctness and performance, e.g. the entire industry of deep learning running somewhat weird implementations of linear algebra to be fast on GPUs.

Oh boy, in 2022 you could not afford writing a broken binary search in any serious coding interview. Back before 2006 apparently PhD students in CMU could not.
Are you kidding? If you were asked in a coding interview to write a binary search, and you wrote the broken version in the post on a whiteboard, you'd be in the top 5% of applicants. Most applicants can barely write a for loop on the board.
Can't it simply be written like this?

    mid = low/2 + high/2
While that is true it is not relevant here, since this example does not involve associativity.

What is relevent here is that integer division is not distributive over addition.

Nope, try low = 1, high = 1 and you get mid = 0.
i think you can fix it with: (low >> 1) + (high >> 1) + (low & 1 & high)

for unsigned numbers. not sure if it works for signed numbers.

For low=3, high=5 case, this gives mid=3.
My data structures professor took off points for that in an assignment once :(
I think this might be better for c/c++ though admittedly a bit more cryptic:

(x>>1) + (y>>1) + (0x01 & x & y)

On mobile, this site is broken too. Text doesn't wrap and scrolling seems to be disabled.
It's a post from before the iPhone came out, try reading the WAP version of the blog on your Cingular connection.
The blog is still active though. Somehow they fixed their layouts but kept old posts on the old layout?
Yeah I had to use reader mode.
I really dislike when devs disable mobile scrolling without knowing for sure their content is wrapping properly.
I'd put the blame on languages that don't allow exceptions, and whose return value in case of errors belong to the same domain as the solution.

I've coded binary searches and sorts tons of times in C++, and yet none was succeptible to this bug. Why? Because, whenever you're talking indices, you should ALWAYS use unsigned int. Since an array can't have negative indices, if you use unsigned ints the problem is solved by design. And, if the element is not found, you throw an exception.

Instead, in C you don't have exceptions, and you have to figure out creative ways for returning errors. errno-like statics work badly with concurrency. And doing something like int search(..., int* err), and setting err inside of your functions, feels cumbersome.

So what does everyone do? Return a positive int if the index is found, or -1 otherwise.

In other words, we artificially extend the domain of the solution just to include the error. We force into the signed integer domain something that was always supposed to be unsigned.

This is the most common cause for most of the integer overflows problems out there.

When you’re talking indices, you should NEVER use int, unsigned or not. The world is 64-bit these days and int is stuck at 32 bits almost everywhere. And even on 32-bit systems indexing with unsigned int may not be safe unless you think about overflow, as this bug demonstrates (at least unsigned overflow is not immediate UB in C and C++ like signed overflow is…)

C has size_t. Use it.

To be fair, size_t doesn't solve this particular problem; you also need to use correct array slice representation (ptr,len) not (start,end), and calculate the midpoint accordingly (ie (ptr,len/2) or (ptr+len/2,len-len/2)).

(And because C doesn't mandate correct handling of benign undefined behavior, you still have a problem if you `return ptr-orig_ptr` as a size_t offset (rather than returning the final ptr directly), because pointer subtraction is specified as producing ptrdiff_t (rather than size_t), which can 'overflow' for large arrays, despite that it's immediatedly converted back to a correct value of size_t.)

The problem is not solved by using unsigned ints though, because it stems from integer overflow. I'm afraid your implementations are, alas, also incorrect.
Confused, how does using unsigned integers not solve this particular problem? Doesn't the article itself show solutions with unsigned integers?
Example using 16-bit size_t for convenience:

  char array[60000]; // 5KB left for code and stack if not segmented
  size_t i = 40000;
  size_t j = 50000;
  size_t mid = (i+j)/2; // should be 45000
  // i+j = (size_t)90000 = 24464
  // mid = 24464/2 = 12232 != 45000
Larger integers make the necessary array size bigger, but don't change the overall issue.
Unsigned int is 32-bit, most address spaces are 48-bit or more.
Array.length is 32-bit in Java. This is from 2006.
Or you can return an unsigned int which is the highest valid index+1.

    ['a','b','c'].indexof('b') == 1 // found - return index
    ['a','b','c'].indexof('w') == 3 // not found - return size of array
There should be a pointer type that is not an int and whose size depends on the host platform.

Casts from int to pointer should be explicit.

We are enamoured with programmer convenience at the expense of the safety of our systems. It’s unprofessional and we should all aim to fix it.

The type that’s meant for indexing in C and C++ is called `size_t`. It is pointer-sized. In Rust it’s called `usize` and Rust does not have implicit conversions, so if you accidentally use too narrow an integer type to compute an index, at least Rust forces you to add an explicit cast somewhere.
I've seen libraries that add a size_t type as an alias to int on certain systems. Rust gets it right here.
Seeing the new erroneous assumption being made in this post reminded me of Linus Torvalds' rant about C++ http://harmful.cat-v.org/software/c++/linus
For most practical purposes, an int64 index can include a universe of negative return codes with no loss of functionality.

The problems here are about using integers that are too narrow and not properly doing arithmetic to prevent overflow from impacting the result.

> For most practical purposes, an int64 index can include a universe of negative return codes with no loss of functionality.

Isn't this article a counterexample to that? Where using signed instead of unsigned actually does result in a loss of functionality?

No. This article explicitly mentions the "int" type, which in C, C++, and Java is 32 bits long. 32-bit ints are not large enough for this purpose: they can only index 2 billion items directly (which will overflow a lot given that a standard server now has 256-512 GB of RAM), and this average calculation hits problems at around 1 billion items. Overflows on 64-bit ints (when used to store 63-bit unsigned numbers) are not going to happen for a very long time.
Wasn't Array.length 32-bit on Java when the article was written? In fact, isn't it 32-bit even now?

Moreover I don't see how you deny that using signed would lose functionality in this case—it's pretty undeniable that it gives the wrong answer in cases where unsigned would give the correct answer; the code is right in the article and you can test it out. This is true irrespective of any other cases that neither might handle correctly (assuming you believe any exist, but see my previous paragraph).

I didn't say that a signed int would be fine. I said that a signed 64-bit int would be fine.

Moreover, it is trivial to convert from a 32-bit signed or unsigned type to a 64-bit int, so you are not constrained by the size type of Java.

The naive (x+y)/2 returns the wrong number for x=UINT_MAX and y=UINT_MAX, for a trivial counter example.
"Because, whenever you're talking indices, you should ALWAYS use unsigned int."

sounds like a lot of your code is in fact broken...

I think it'd be nice if you give some examples of how using unsigned integers for indices breaks code in cases where signed integers don't, because otherwise your comment is very unilluminating.
you need to google why size_t exists. size_t guarantees you to be able to represent any array index. unsigned int can be arbitrarily tiny and a bigger loop may cause unsigned overflow during your loop on some architectures. in other words, your code would be broken. size_t will make it work correctly everywhere. it is the correct way of representing array indices.
C & C++ allow exceptions on signed integer overflow.