	by joelthelion 1508 days ago
	In the Era of zstandard, do we really need this?

4 comments

wongarsu 1507 days ago

I find it somewhat telling that they don't benchmark themselves against zstd.

Right now I'm almost exclusively using zstd (general stuff) or lzma2/xz (high compression where read speed doesn't matter). And of course gz and zip for data interchange where compatibility is key. From the information presented bzip3 won't replace any of those use cases for me, but that's fine. Maybe it fits somebody else's use case, or maybe it's the foundation for the next great algorithm that we all end up using.

palaiologos 1507 days ago

  zstd -19 linux.tar 462.58s user 0.76s system 100% cpu 217M memory 7:42.56 total

  % wc -c linux.tar.zst linux.bz3
  134980904 linux.tar.zst
  129255792 linux.bz3

149764 1507 days ago

  # compression

  bzip3 -j 4 -e linux-5.18-rc6.tar linux-5.18-rc6.tar.bz3 
    user: 345.48s system: 0.59s cpu: 373% total: 1:32.75

  zstd -19 --long -T4 -f linux-5.18-rc6.tar
    user: 1270.48s system: 0.89s cpu: 376% total: 5:37.9

  > du -b linux-5.18-rc6.tar.* | sort -rn | reln
  1.000000  130907738  linux-5.18-rc6.tar.zst
  0.994715  130215881  linux-5.18-rc6.tar.bz3

With the additional ‘--ultra -22’ flags the tar.zst is smaller, but the compression time skyrockets.

  # decompression 

  bzip3 -j 4 -d linux-5.18-rc6.tar.bz3 linux-5.18-rc6.tar 
    user: 222.57s system: 0.92s cpu: 362% total: 1:01.69

  bzip3 -d linux-5.18-rc6.tar.bz3 linux-5.18-rc6.tar 
    user: 141.29s system: 0.89s cpu: 99% total: 2:22.19

  zstd -d -T4 -f linux-5.18-rc6.tar.zst 
    user: 2.26s system: 0.84s cpu: 99% total: 3.102

zstd doesn’t seem to support parallel decoding, but it is still about 20x faster.
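For anyone who wants to check the "20x" figure, here is a minimal sketch that converts the wall-clock totals quoted above into seconds and divides them. The helper name `to_seconds` is made up for this illustration; the three timings are copied from the decompression runs in this comment.

```python
def to_seconds(stamp):
    """Convert a wall-clock string such as '1:01.69' or '3.102' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + float(part)
    return seconds


# Wall-clock totals quoted in the decompression runs above.
zstd_total = to_seconds("3.102")
bzip3_4_threads = to_seconds("1:01.69")
bzip3_1_thread = to_seconds("2:22.19")

speedup_vs_4_threads = bzip3_4_threads / zstd_total
speedup_vs_1_thread = bzip3_1_thread / zstd_total

print(f"zstd vs bzip3 -j 4: {speedup_vs_4_threads:.1f}x")
print(f"zstd vs bzip3 (single thread): {speedup_vs_1_thread:.1f}x")
```

The ratio against the four-thread bzip3 run is what rounds to roughly 20x; against the single-threaded run the gap is larger.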

codewiz 1507 days ago

Is reln a command to add a column of relative numbers to the left? Neat.

149764 1507 days ago

Yes, it's a small Python script.
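The actual script was not posted, so the following is only a guess at what such a helper might look like: it takes lines whose first field is a number and prefixes each one with that number divided by the first line's number. The function name `reln` is borrowed from the command above; everything else, including the output spacing, is an assumption modelled on the output shown earlier.

```python
def reln(lines):
    """Prefix each line with its leading number relative to the first line's number."""
    # Split each non-empty line into (number, rest-of-line).
    rows = [line.split(None, 1) for line in lines if line.strip()]
    if not rows:
        return []
    base = float(rows[0][0])  # the first line is the reference (ratio 1.0)
    out = []
    for fields in rows:
        value = fields[0]
        rest = fields[1].strip() if len(fields) > 1 else ""
        out.append(f"{float(value) / base:.6f}  {value}  {rest}".rstrip())
    return out


# Sample input in the shape `du -b ... | sort -rn` produces (size, tab, name),
# using the sizes quoted earlier in the thread.
sample = [
    "130907738\tlinux-5.18-rc6.tar.zst\n",
    "130215881\tlinux-5.18-rc6.tar.bz3\n",
]
result = reln(sample)
print("\n".join(result))
```

To use it as a filter in a pipeline, one would pass `sys.stdin` to `reln` instead of the sample list.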

cout 1507 days ago

Have you ever tried lzma/lzma2 with the hc3 (hash chain) match finder instead of the default (bt3 or bt4) match finder? I've found this to be a really good middle ground between gz/deflate and lzma2 with default settings.
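For readers who want to try the hash-chain match finder without the xz command line, here is a rough sketch using Python's standard `lzma` module, which exposes the match finder through the `mf` key of a custom filter chain. The helper name `xz_compress`, the preset of 6 and the sample data are arbitrary choices for illustration, not settings anyone in this thread recommended, and the sizes it prints say nothing about real workloads.

```python
import lzma


def xz_compress(data, match_finder, preset=6):
    """Compress to .xz with LZMA2, overriding only the match finder."""
    filters = [
        # Start from a preset and replace its match finder with the given one.
        {"id": lzma.FILTER_LZMA2, "preset": preset, "mf": match_finder},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


# Arbitrary, mildly repetitive sample data; real results depend on the input.
sample = b"".join(b"line %d: the quick brown fox\n" % (i % 500) for i in range(20000))

hc3 = xz_compress(sample, lzma.MF_HC3)  # hash chain match finder
bt4 = xz_compress(sample, lzma.MF_BT4)  # binary tree match finder

# Both streams are ordinary .xz data and decompress the same way.
roundtrip_hc3 = lzma.decompress(hc3)
roundtrip_bt4 = lzma.decompress(bt4)

print(f"original: {len(sample)} bytes, hc3: {len(hc3)} bytes, bt4: {len(bt4)} bytes")
```

Timing the two calls on your own data (for example with `time.perf_counter`) is the only way to see whether the trade-off described above holds for your case.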

proofrock 1508 days ago

Yes, because someone said the same when zstandard came out. This may not have the same strong points, but maybe the next will… compression is not a completed task.

trasz 1507 days ago

Not to mention the restrictive license which effectively prohibits its use in any Open Source project licensed under anything other than GPLv3.

palaiologos 1507 days ago

Frankly, same holds for gzip. I've been planning to relicense bzip3 with the more permissive LGPLv3.

chungy 1507 days ago

gzip has BSD-licensed compatible alternatives already. It's doubtful the same attention would be given to bzip3; chicken-and-egg scenario there. Plus the lingering question of "Why not zstd?"

trasz 1507 days ago

Gzip is just a frontend for zlib, which is BSD(ish).

baybal2 1507 days ago

1. zStandard is not a standard

2. Bzip2 is somewhat of a standard

3. zStandard is not a substitute for Bzip2

Beltalowda 1507 days ago

In what way is bzip2 more of a "standard" than zstd? bzip2 doesn't even seem to have any official reference description of its file format; just an "unofficial" one[1], whereas zstd is RFC 8478[2].

When I evaluated various compression algorithms a few years ago zstd came ahead of bzip2 in every metric.

[1]: https://github.com/dsnet/compress/blob/master/doc/bzip2-form...

[2]: https://datatracker.ietf.org/doc/html/rfc8478

chasil 1507 days ago

That is interesting.

The author of lzip has harsh criticism of xz, and admiration of bzip2 for error detection/correction and "rightsizing" the container format.

I use lzip in preference to xz unless I need portability.

https://www.nongnu.org/lzip/xz_inadequate.html

Beltalowda 1507 days ago

As far as I know xz and zstd are completely unrelated?