Hacker News
by tamsaraas 1408 days ago
zstd is not good for everything. Yes, it's fast at compression, and making a compressed archive is quick and easy. But if you need to use zstd on a daily basis, I have bad news: zstd is not suitable as a general-purpose archiver.

Depending on the data, the compressed archive can come out at a totally different size compared to WinRAR, for example.

Example: I do backups from time to time and archive important files into compressed archive containers (rar / 7z-zstd).

I've noticed that for my dev folder, with tons of repositories, images, and various work-related files, the results vary far too much.

/dev/ size = ~11GB

rar output (normal compression) = ~2.1 GB

zstd archive output (normal compression) = ~4.7 GB

Why? Linked files and identical files are not stored as one file plus links to it. Instead, every copy is compressed one by one, rather than keeping a single copy of the identical content and compressing that once. And many things like that.

Suggestion for 7z-zstd: add the ability to store links to files instead of treating them as separate files, and add an option like WinRAR's to search for identical files first, re-link all of them, and remove the duplicates instead of compressing each one.

1 comments

Your criticisms seem to be about 7z, not zstd. zstd doesn't have an archive concept; it just compresses a sequence of bytes.
I'm totally correct in what I'm writing. zstd is not a compression method suitable for a wide range of uses. And all current implementations that use it for anything other than single-file compression are awful.
No, your critiques are of how 7zip implements an archival format built on top of zstd. The zstd algorithm has no concept of files, only bytes.

Archival software then has to build a file format on top of the compression algorithm, and there are multiple ways to slice the problem. For example, a tar.gz will first tar everything into a big archive file, then feed it into gzip for compression. zip, on the other hand, feeds each file individually into the chosen algorithm (DEFLATE for most implementations).
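To make the difference concrete, here is a rough sketch of the two approaches on 20 identical small files. It uses Python's zlib (DEFLATE) as a stand-in for zstd so that it only needs the standard library; `make_payload`, `per_file_size`, and `solid_size` are names made up for this illustration, not part of any archiver. The files are kept small on purpose, since a compressor can only find repeats that fall inside its match window.

```python
import random
import zlib


def make_payload(size, seed=0):
    """Build deterministic pseudo-random bytes that barely compress on their own."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def per_file_size(files):
    """zip-style: every file is compressed independently of the others."""
    return sum(len(zlib.compress(data)) for data in files)


def solid_size(files):
    """tar.gz-style: concatenate everything, then compress one byte stream."""
    return len(zlib.compress(b"".join(files)))


# Twenty identical copies of one 2000-byte "file" (illustrative data only).
payload = make_payload(2000)
files = [payload] * 20

per_file = per_file_size(files)
solid = solid_size(files)
print("per-file total:", per_file, "bytes")
print("solid stream:  ", solid, "bytes")
```

In the per-file case the compressor never sees that the copies are identical, because each call starts from scratch. In the solid case the later copies become back-references to the first one, as long as they are close enough together in the stream.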

Your critique is that the 7zip archive format is not suitable for use with zstd in the case of many small yet identical files. zstd is doing its job, just the archival format is not playing along.

Zstandard doesn't accept multiple files at all, so it's the archiving format's job to convert files into byte sequences and compress them accordingly. It looks like 7z wasn't able to deduplicate entirely or partially identical files; in other words, zstd could never know that it was compressing almost identical files over and over.
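For what it's worth, deduplicating exact copies at the archive layer could look roughly like the sketch below. This is not how 7z or WinRAR actually do it; it is a minimal illustration that assumes files are hashed with SHA-256 and that zlib stands in for zstd. `dedup_and_compress` and `extract` are hypothetical helper names.

```python
import hashlib
import zlib
from typing import Dict, Tuple


def dedup_and_compress(files: Dict[str, bytes]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Compress each distinct content once and map every path to its content hash."""
    blobs: Dict[str, bytes] = {}  # content digest -> compressed bytes
    index: Dict[str, str] = {}    # file path -> content digest (the "link")
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in blobs:
            # First time this content is seen: compress and store it.
            blobs[digest] = zlib.compress(data)
        index[path] = digest
    return blobs, index


def extract(blobs: Dict[str, bytes], index: Dict[str, str], path: str) -> bytes:
    """Follow the link for a path and decompress the shared blob."""
    return zlib.decompress(blobs[index[path]])


# Illustrative input: two paths with the same content, one different.
example_files = {
    "repo_a/logo.png": b"same bytes" * 100,
    "repo_b/logo.png": b"same bytes" * 100,
    "repo_a/README": b"different bytes" * 100,
}
blobs, index = dedup_and_compress(example_files)
print(len(example_files), "files stored as", len(blobs), "compressed blobs")
```

This only catches byte-for-byte identical files. Partially identical files would still depend on the compressor seeing them in the same stream and within its window.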