by PaulHoule 5215 days ago
If you're handling a lot of data, it makes sense to hash-partition it on some key and spread it out over a large number of files.

In that case you might have, say, 512 partitions and you can farm out compression, decompression and other tasks to as many CPUs as you want, even other machines in a cluster.
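
For what it's worth, here is a rough Python sketch of that idea, not a definitive implementation. The function names, the `part-NNNN.tsv` file naming, the tab-separated record format, and the choice of CRC32 and gzip are all assumptions made for illustration; only the figure of 512 partitions comes from the comment above. CRC32 is used rather than Python's built-in `hash()` because the built-in is randomized per process for strings, which would send the same key to different partitions on different machines.

```python
import gzip
import os
import shutil
import zlib
from concurrent.futures import ProcessPoolExecutor

NUM_PARTITIONS = 512  # figure from the comment above; tune for your data


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a string key to a partition number, stable across processes and machines."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def write_partitions(records, out_dir, num_partitions=NUM_PARTITIONS):
    """Write each (key, value) record to its partition's file; return the paths written.

    Keeps one open handle per non-empty partition, so the OS open-file limit
    must be higher than num_partitions.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        for key, value in records:
            p = partition_for(key, num_partitions)
            if p not in handles:
                path = os.path.join(out_dir, f"part-{p:04d}.tsv")  # hypothetical naming
                handles[p] = open(path, "w", encoding="utf-8")
            handles[p].write(f"{key}\t{value}\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handle.name for handle in handles.values())


def compress_file(path):
    """Gzip one partition file; needs nothing from any other partition."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def compress_all(paths, workers=None):
    """Compress partitions in parallel, one task per file (workers=None uses all CPUs)."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_file, paths))


if __name__ == "__main__":
    demo = ((f"user-{i}", str(i)) for i in range(10_000))
    written = write_partitions(demo, "partitions")
    print(len(compress_all(written)), "partitions compressed")
```

Because each partition is a separate file, `compress_file` is a self-contained unit of work. The same function could be handed to a job queue or run on other machines in a cluster instead of a local process pool, as long as every machine uses the same hash and partition count.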