Hacker News new | ask | show | jobs
by stevvooe 4845 days ago
I'm not sure. I haven't closely studied the difference between each algorithm. My guess would be that sort -u would perform better as the data set gets larger with a good block size setting because it does do an external sort. Cardinality would also affect the performance. If the unique set handily fits in memory, an external sort on a large data set wouldn't be very efficient.