Hacker News new | ask | show | jobs
by bloomer 2283 days ago
I haven’t read this in detail, but a few things stand out. In the write up, you say it compares better than bitwise radix sort. You need to compare against an optimized byte-wise radix sort which is by far the fastest sort for decent sized arrays of integers. The second claim that skyline sort somehow sorts an array of size a faster than two arrays of size n does not make sense. That implies that the time per element goes to zero as the size goes to infinity. That would be faster than O(n), which is what radix sort is, i.e. constant time per element for larger arrays.

It sounds like this is just a variation of counting sort or bucket sort optimized for a small number of unique values.

2 comments

I know I need to compare it head to head against bytewise radix sort. I might do that and get back with the results. In the meantime, let me clarify. EDIT: Blas Mena and I did test it against radix sort, you can find it https://github.com/daniel-cussen/skylinesort/blob/master/src..., it tests with 4-bit radix sort and radix sort comes out 3% slower.

When I say that it can sort an array of size 2n faster than two arrays of size n it is because if you plot the asymptotic time, you find it's second derivative is negative (concave-up). This is because of the subtraction in the asymptotic time, O(k u - u log u + n). A Stanford professor was reluctant to accept an asymptotic time with a subtraction in it, but I explained it was really the logarithm of a division, ku-ulogu was actually u log(K/u) where K is 2^k, the total number of elements in the empty auxiliary array. log(K / u) is the same as log K - log u and log(K)=k, so you end up with k u - u log u, which has a negative second derivative when k is fixed.

When log u = k, skylinesort does indeed run in linear time, but it runs slower that linear time before that, and log u is never greater than k, by definition.

Daniel also addressed this some, but I want to highlight that sorting one array of size 2n faster than two arrays of size n does not imply per element time decreasing to zero. E.g. if the runtime were n-2^(-n) you would have that property even though the per element runtime tends toward 1.

<rant> That kind of phenomenon (reading too far into an infinite process) is ubiquitous once you start noticing it. Infinitely many universes don't imply the existence of every possible universe (if you subscribe to such theories), pie having infinitely many digits doesn't imply that it encodes every possible message somewhere in those digits (most real numbers have that property, but iirc it's still an open question for pi), and so on. </rant>