Hacker News new | ask | show | jobs
by mqus 1326 days ago
But doesn't this have the same overflow issue, e.g. if R is a large positive number and L is a large negative one?
2 comments

In binary search and heapsort, neither L nor R are negative - they are the number of elements in the array.
Binary search can be done on anything, not just arrays. Often you apply it to an algorithm and there isn't a collection at all, you just know the right answer is between some numbers so binary search lets you find it in logarithmic number of tries. If computing the number is costly then binary search is necessary to compute the result at all in those cases.
Yes, if you want a solution to a related problem then the implementation will need a better midpoint calculation.

However, the linked-to binary search starts from 0:

  1:     public static int binarySearch(int[] a, int key) {
  2:         int low = 0;
  3:         int high = a.length - 1;
and the promoted fix is:

  6:             int mid = low + ((high - low) / 2);
The aforementioned Wikipedia binary sort also takes a length, rather than start/end:

  function binary_search(A, n, T) is
      L := 0
      R := n − 1
C++ has had a correct binary search in its standard library since c++98, and it works on pointers, integers etc, both signed and unsigned without overflows. I'm not sure why they say that this doesn't exist.

https://en.cppreference.com/w/cpp/algorithm/lower_bound

I've lost track of the thread, because I don't know what you are talking about. Who is saying what doesn't exist?

The lower_bound you pointed to takes start/end iterators defining the partially-ordered range to examine.

The linked-to essay and the Wikipedia take array + size.

These are different APIs.

The latter - which is what this thread is about, IMO - is easier to implement because the sizes are always non-negative.

random access iterators provide exactly the pointer into array semantics, with a difference type and jumping to a location in the range.

https://en.cppreference.com/w/cpp/named_req/RandomAccessIter...

they were conceived for c++'98 and transmogrified to a trait for c++20

the run time in GPs lower_bound link is log(n) for random access iterators and liner for non-random access.

You can use unsigned division and addition and then go back to signed and it is still correct.
Only on 2s-complement.
Which is basically everything.