|
|
|
|
|
by rcoh
4878 days ago
|
|
I became irritated when quick sort was claimed be O(n^2). Not only is that not true in practice, it's just wrong for all but the simplest implementations of the algorithm. With minor tweaks to the selection of partitions, quick sort can be made O(n lg n). Specifically, if combined with linear time median finding, we can pick perfect partitions every time. (Of course, this algorithm is painfully slow in practice). All this said, the idea of analytic combinatorics is interesting for a deeper understanding of algorithm performance. It quantifies behavior that good engineers understand is swept under the rug by Big-O notation. Of course, if you really want good performance, you'll have to understand the algorithm beyond just it's combinatorial properties -- caching locality and coherence, branch mispredictions, function call-overhead etc. play an enormous role when it comes time to really make code fast. To write fast code, you need the whole picture. |
|
I have found way too many implementations in the wild, with various optimizations that STILL had an O(n^2) worse case, one that was actually triggered by real data.
> With minor tweaks to the selection of partitions, quick sort can be made O(n lg n). Specifically, if combined with linear time median finding, we can pick perfect partitions every time. (Of course, this algorithm is painfully slow in practice).
You are mostly contradicting yourself here. It is painfully slow in practice, therefore, no one does linear time median finding; therefore, partitions aren't equal sized, and the O(n lg n) guarantee CANNOT be made.
Have you ever seen a widely used quicksort implementation that actually has an O(n lg n) guarantee? I haven't. Closest I've seen is median-of-five-random-elements, which probabilistically is excellent, but is STILL not an O(n lg n) - if you have an adversary, and they build something like http://www.cs.dartmouth.edu/~doug/aqsort.c adjusted to your algorithm, you'll get a O(n^2).
Also, it is quite surprising how many quicksort implementations out there will do O(n^2) if they get a vector of ALL EQUAL VALUES. I have seen that happen in practice (including the Java standard library at the time I tested it), and I haven't seen one text book mention that the partitions should be (all < pivot) (all == pivot) (all > pivot), which is the only practical way to avoid it.