Hacker News new | ask | show | jobs
by dataflow 1326 days ago
>> most binary search implementations don't claim they only work under some particular inputs

> They do implicitly. It's just common sense.

That's neither how language specifications work, nor true in this case even if it's true in other cases. Providing one more of the same kind of input that already works is in no way the same thing as changing something totally unrelated.

> When you read a recipe in a cookbook, it usually doesn't mention that you're expected to be standing on your legs, not on your arms.

I don't think this binary search was breaking because of people standing on their arms either.

> A lot of generic algorithm implementations will start acting weird if your input size has the order of INT_MAX. Instances this big will take days or weeks or process on commodity CPUs,

It's incredibly strange to read this from someone in 2022. I don't know of any standard library algorithm that would take "days or weeks" for inputs of size 2^31 now, let alone the majority of them being like this. In fact I don't think this was the case back when the article was written either.

1 comments

Ok, I looked at it closer and I admit that quicksort implemented in C won't take days on an input of 2³¹ elements. It will take less than 1-2 hours, I think. Something that is a bit worse than O(n log n) or has a 20× bigger constant hidden in O(·) will take days though.

I don't see my other arguments being convincingly refuted, so they still hold.

> Ok, I looked at it closer and I admit that quicksort implemented in C won't take days on an input of 2³¹ elements. It will take less than 1-2 hours, I think.

How ancient is your machine? Quicksorting (2^31 - 16) elements (because that's Java's hardcoded limit) takes < 11 seconds on my machine, and a big chunk of that time is taken in the random number generation to create the input...

  # Temp.java
  import java.util.*;
  public class Temp { public static void main(String[] args) { byte[] arr = new byte[0x7FFFFFF0]; new Random().nextBytes(arr); Arrays.sort(arr); } }

  $ javac Temp.java && time -p java Temp
  real 10.30
No, wait. The QuickSort from the article is O(n²), in fact. So that one specifically will take weeks, or even months to run — especially in Java. Feel free to test it and get back to me if you think I'm wrong.
> No, wait. The QuickSort from the article is O(n²),

What article are you referring to? This article is about binary search and mergesort, not quicksort?

And which quicksort has O(n^2) typical behavior? That's the worst-case behavior you normally only get on adversarial inputs, not the typical one. (And standard libraries have workaround for that anyway.)

Sorry, I indeed switched to a wrong browser tab and skimmed instead of reading when started this discussion. Please disregard the quicksort discussion.

I still think that it's normal for programs to behave weird when the input parameters are getting close to INT_MAX. Sometimes it's unavoidable. And if it's not specified in the function docs, you should go and check the implementation as a programmer. For binary search it is avoidable, so the criticism of the linked implementation is fair.