Hacker News
by f311a 238 days ago
People often use sort | uniq when they don't want to load a bunch of lines into memory. That's also why it's slow: sort spills sorted chunks to temporary files once its in-memory buffer fills up, so memory use stays bounded no matter how big the input is. The upside? You can sort hundreds of gigabytes of data.

This Rust implementation uses a hashmap, so if you have a lot of unique values, you will need a lot of RAM.
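
To make the memory point concrete, here is a minimal sketch of the hashmap approach. It is not the actual code from the project; the function name `count_unique` is made up for illustration. Every distinct line is kept in memory as a key, so RAM use scales with the number of unique lines rather than the total line count.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// Hypothetical helper: count how often each distinct line occurs,
/// similar in spirit to `sort | uniq -c` but without sorting.
/// Memory grows with the number of *unique* lines, not the total line count.
pub fn count_unique<R: BufRead>(reader: R) -> io::Result<HashMap<String, u64>> {
    let mut counts: HashMap<String, u64> = HashMap::new();
    for line in reader.lines() {
        // Each distinct line is stored once as a key; repeats only bump a counter.
        *counts.entry(line?).or_insert(0) += 1;
    }
    Ok(counts)
}
```

You could feed it `std::io::stdin().lock()` to use it in a pipe. Note the output order is arbitrary, unlike with sort | uniq.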

1 comment

Yeah, definitely, it's always a trade-off. In many of the cases where I use it, though, the number of unique values is actually not that high (they take up much less than the available RAM), while the number of lines is crazy high.

So in those settings, I think it's absolutely worth it.