Hacker News new | ask | show | jobs
by marte 5825 days ago
Some examples:

Even though Winograd's algorithm (matrix multiplication) is theoretically faster than Strassen's, the constant is so high that Winograd's is only faster in matrices so large you can't practically compute in the first place.

Quicksort is a more familiar example. Although its worst case complexity is O(n^2), many techniques have been invented to avoid the worst cases and execute in just O(n log n) (average case), and it's usually faster than merge sort in practice.

1 comments

Technically, you can't ever guarantee that you'll avoid the worst case of quicksort. All the techniques to make it nlogn are probabilistic, and you can always engineer a pathogenic input that will make quicksort run quadratic. Mathematically this is expected, but I'm not disagreeing that quicksort is in practice even faster than the true nlogn comparison sorts.

I don't consider this a limitation of big O notation at all, but rather a common misconception among students when they first learn about the notation.

Is it true that quicksort is worst-case O(n lg n) if you always select the median? What if there are a lot of duplicate values at the median? My understanding has always been that for any quicksort algorithm on given hardware you can construct a pathogenic input that will be sorted in quadratic time.
Hardware is irrelevant. The analysis is mathematical and abstract. The model that people typically work in when doing this analysis is the comparison model: one < comparison costs 1 (http://en.wikipedia.org/wiki/Comparison_sort). There are other models that more closely resemble computers. For example, in the cell probe model (http://www.itl.nist.gov/div897/sqg/dads/HTML/cellProbeModel....) you have a word size (e.g. 32 bits, but the exact number is irrelevant) and you can construct radix sort in that model.

If all values in your set of numbers to sort are unique, then quicksort is indeed O(n lg n) worst case if you pivot about the median. If there are duplicate values, can you think of a way to still guarantee worst case O(n lg n)?