by tropin 1575 days ago
It's not just "slightly better" in some cases. An obvious one is repeated files. If RAR encounters a file ten times, it will compress once and store it and nine pointers. lzma and zstd guis i've tested will store ten compressed copies. It happens all the time in backups.
3 comments

> If RAR encounters a file ten times, it will compress once and store it and nine pointers.

A bunch of compression formats do that. Even zip files can, which leads to interesting tricks like:

https://www.bamsoftware.com/hacks/zipbomb/

Handling multiple identical files is certainly desirable, but this is not the "core" compression algo. It's a cute optimization trick...
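
To make the "compress once, store pointers" idea concrete, here is a minimal sketch of whole-file deduplication layered on top of an ordinary compressor. It is not RAR's actual format or method; the `pack`/`unpack` names, the SHA-256 keying, and the use of zlib are all assumptions picked for illustration.

```python
import hashlib
import zlib


def pack(files):
    """Pack (name, data) pairs, compressing each distinct content only once.

    Hypothetical layout: `blobs` maps a content hash to its compressed bytes,
    and `index` holds one (name, hash) pointer per file.
    """
    blobs = {}
    index = []
    for name, data in files:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in blobs:
            # First time this content is seen: compress and store it.
            blobs[digest] = zlib.compress(data)
        # Every file, duplicate or not, only adds a pointer to the index.
        index.append((name, digest))
    return blobs, index


def unpack(blobs, index):
    """Rebuild the original (name, data) pairs from the packed form."""
    return [(name, zlib.decompress(blobs[digest])) for name, digest in index]
```

Packing ten files with identical contents this way yields one entry in `blobs` and ten entries in `index`, which is why it works regardless of which compressor sits underneath.
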

> "lzma and zstd guis i've tested will store ten compressed copies"

It depends on the dictionary size, word size, the actual data, etc.
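
A rough way to see the dictionary-size effect is to compress a block of random bytes once and twice in a row, then compare how much the second copy adds. This is only a sketch using Python's standard `zlib` and `lzma` modules at their usual settings (DEFLATE has a 32 KiB window; the default xz preset uses a much larger dictionary). `duplicate_cost` is a made-up helper name, and real archivers and GUIs may be configured differently.

```python
import lzma
import os
import zlib


def duplicate_cost(block_size):
    """Return the extra compressed bytes needed to store a second copy of a block.

    Random data is used so that the only redundancy is the repeated copy.
    """
    block = os.urandom(block_size)
    once = block
    twice = block + block
    return {
        # zlib can only reference data within its 32 KiB sliding window.
        "zlib": len(zlib.compress(twice, 9)) - len(zlib.compress(once, 9)),
        # lzma's default dictionary is large enough to still "see" the first copy.
        "lzma": len(lzma.compress(twice)) - len(lzma.compress(once)),
    }


if __name__ == "__main__":
    print(duplicate_cost(256 * 1024))
```

With a 256 KiB block, the second copy should cost zlib roughly a full extra block, while lzma adds only a small fraction of that. Once the duplicate sits farther back than the dictionary reaches, lzma loses the advantage too.
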