Hacker News new | ask | show | jobs
by froh 1326 days ago
M = L + (R - L)/2

looks fairly simple to me. note this works of any ordering of R and L if the data type is signed.

2 comments

But doesn't this have the same overflow issue, e.g. if R is a large positive number and L is a large negative one?
In binary search and heapsort, neither L nor R are negative - they are the number of elements in the array.
Binary search can be done on anything, not just arrays. Often you apply it to an algorithm and there isn't a collection at all, you just know the right answer is between some numbers so binary search lets you find it in logarithmic number of tries. If computing the number is costly then binary search is necessary to compute the result at all in those cases.
Yes, if you want a solution to a related problem then the implementation will need a better midpoint calculation.

However, the linked-to binary search starts from 0:

  1:     public static int binarySearch(int[] a, int key) {
  2:         int low = 0;
  3:         int high = a.length - 1;
and the promoted fix is:

  6:             int mid = low + ((high - low) / 2);
The aforementioned Wikipedia binary sort also takes a length, rather than start/end:

  function binary_search(A, n, T) is
      L := 0
      R := n − 1
C++ has had a correct binary search in its standard library since c++98, and it works on pointers, integers etc, both signed and unsigned without overflows. I'm not sure why they say that this doesn't exist.

https://en.cppreference.com/w/cpp/algorithm/lower_bound

I've lost track of the thread, because I don't know what you are talking about. Who is saying what doesn't exist?

The lower_bound you pointed to takes start/end iterators defining the partially-ordered range to examine.

The linked-to essay and the Wikipedia take array + size.

These are different APIs.

The latter - which is what this thread is about, IMO - is easier to implement because the sizes are always non-negative.

You can use unsigned division and addition and then go back to signed and it is still correct.
Only on 2s-complement.
Which is basically everything.
Even if the data type is unsigned it suffices as long as the R - L term is signed (so the division by 2 is an arithmetic shift, not a logical shift).