Hacker News new | ask | show | jobs
by xvector 2349 days ago
I see. Then Radix Sort never really was faster than traditional sorts if we go by Big-O notation alone, right?

If I understand your post correctly, Radix sorts in O(log(maxsize) * N) instead of O(NlogN). So why do people say Radix Sort is O(N)?

2 comments

People say that because it's true, for a bounded key size, and in practice many radix sort implementations have a large fixed key size, so the log factor just doesn't come up.

E.g., if you use a 64-bit key, with a default implementation, you support up to 2^64 elements with your O(n) sort. No one has a machine with more than 2^64 elements today, so the the sort is in practical terms very much like O(n).

OTOH the O(n log n) of comparison based sorts are very real: the log term is in n, the number of inputs, and it is very obvious when you plot sort performance.

What is interesting though is when you start optimizing radix sort in certain ways, e.g., eliminating passes based on the high bits being zero (in an LSD sort) or using an MSD sort which stops when the bucket size gets smaller than some threshold, the log n factor actually appears again: since you often need ~log n passes to complete the sort rather than say a fixed number like log 64 (base = digit bits) = 8 for 64-bit keys. The base of the logarithm is larger though (2^radix bits), so the hidden constant factor is lower than comparison sort where the base is generally 2.

One answer is that if you allow duplicates then it can handle larger amounts of items without any slowdown.

Another answer is that people are silly and constant vs. log factors aren't intuitive.