by viraptor 367 days ago
What's the reason for using bz2 here? Wouldn't it be faster to do a one-off conversion to zstd? As far as I know, zstd at its higher compression levels beats bzip2 on every metric.
2 comments

Common Crawl delivers the data as bz2. Indeed, I store intermediate data in zstd with ZFS.
That assumes you're processing the data more than once.
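
To put rough numbers on that trade-off, here is a minimal sketch of a break-even estimate. It is an illustration only: the function name, its parameters, and the cost model (a fixed one-off conversion time plus a per-pass read time for each format) are assumptions, not anything measured in this thread.

```python
import math


def break_even_passes(bz2_pass_s, zstd_pass_s, convert_s):
    """Return the smallest number of passes for which converting first is cheaper.

    All three arguments are hypothetical timings in seconds:
      bz2_pass_s  - time for one processing pass reading the bz2 file directly
      zstd_pass_s - time for one processing pass reading the zstd copy
      convert_s   - one-off cost of the bz2 -> zstd conversion
    Returns None if the zstd copy is not faster per pass, since the
    conversion then never pays for itself.
    """
    saving_per_pass = bz2_pass_s - zstd_pass_s
    if saving_per_pass <= 0:
        return None
    # Converting wins when convert_s + n * zstd_pass_s < n * bz2_pass_s,
    # i.e. when n > convert_s / saving_per_pass.
    return math.floor(convert_s / saving_per_pass) + 1


if __name__ == "__main__":
    # Made-up timings: 100 s per bz2 pass, 20 s per zstd pass, 120 s to convert.
    print(break_even_passes(100, 20, 120))
```

Under this model the conversion has to decompress the bz2 file at least once, so it costs at least as much as one bz2 pass and cannot pay off if the data is read only a single time. Fusing the conversion into the first processing pass would change the estimate.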