Hacker News
by enigmo 4845 days ago
What about 'sort -u' ?
1 comment

I'm not sure; I haven't closely studied the differences between the algorithms. My guess is that sort -u would perform better as the data set grows, given a good block size setting, because it does an external sort. Cardinality would also affect performance: if the unique set fits comfortably in memory, an external sort over a large data set wouldn't be very efficient.
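
To make the trade-off concrete, here is a rough Python sketch of the two approaches. It is not how sort -u is implemented. The function names are made up for illustration, and the in-memory sorted() call only stands in for a real external merge sort that would spill to disk.

```python
from itertools import groupby


def unique_by_hash(lines):
    """Stream the input once and yield each line the first time it appears.

    Memory grows with the number of distinct lines (the cardinality),
    not with the total size of the input.
    """
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def unique_by_sort(lines):
    """Sort all lines, then collapse runs of adjacent duplicates.

    Assumption: sorted() here is a stand-in for an external sort; the
    whole input is ordered before any duplicate is dropped.
    """
    for key, _run in groupby(sorted(lines)):
        yield key


data = ["b", "a", "b", "c", "a"]
print(list(unique_by_hash(data)))  # ['b', 'a', 'c'] (first-seen order)
print(list(unique_by_sort(data)))  # ['a', 'b', 'c'] (sorted order)
```

The hash version only needs room for the unique lines, so it should do well when cardinality is low. The sort version has to order the entire input, duplicates included, which is the extra work I'd expect to hurt when the unique set is small.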