by st_goliath 1969 days ago
A while ago I did some simplistic SquashFS pack/unpack benchmarks[1][2]. I was primarily interested in looking at the behavior of my thread-pool based packer, but as a side effect I got a comparison of compressor speed & ratios over the various available compressors for my Debian test image.

I must say that LZ4 definitely stands out for both compression and decompression speed, while still being able to cut the data size in half, making it probably quite suitable for live filesystems and network protocols. Particularly interesting was also comparing Zstd and LZ4[3], the former being substantially slower, but at the same time achieving a compression ratio somewhere between zlib and xz, while beating both in time (in my benchmark at least).
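For anyone who wants to reproduce this kind of ratio/time comparison on their own data, here is a minimal sketch. It is not the squashfs-tools-ng benchmark: it only uses the codecs that ship with Python (zlib, bz2, xz via `lzma`), since LZ4 and Zstd would need third-party bindings, and the `bench` helper and the sample buffer are made up for illustration.

```python
import bz2
import lzma
import time
import zlib

# Stdlib codecs only (an assumption of this sketch); LZ4 and Zstd
# would need third-party bindings and are left out.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def bench(data: bytes) -> dict:
    """Return {codec: (ratio, pack_seconds, unpack_seconds)} for one buffer."""
    results = {}
    for name, (pack, unpack) in CODECS.items():
        t0 = time.perf_counter()
        packed = pack(data)
        t1 = time.perf_counter()
        restored = unpack(packed)
        t2 = time.perf_counter()
        assert restored == data  # the round trip must be lossless
        results[name] = (len(data) / len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    # Synthetic, highly repetitive sample; a real filesystem image
    # will give very different numbers.
    sample = b"squashfs block payload with some repetition\n" * 20000
    for name, (ratio, pack_s, unpack_s) in bench(sample).items():
        print(f"{name:5s} ratio {ratio:8.1f}  pack {pack_s:.4f}s  unpack {unpack_s:.4f}s")
```

Feeding it blocks read from an actual image instead of the synthetic sample is what makes the numbers meaningful; single-run wall-clock timings like these are noisy and only good for a rough ranking.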

[1] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...

[2] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...

[3] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...

2 comments

> lz4 (...) probably quite suitable for live filesystems and network protocols

Actually, no. lz4 is less suitable than zstd for filesystems.

BTW, lz4 is present in many Mozilla tools like Thunderbird, in the form of its bastard child lz4json, which diverges only in its headers, so its files don't work with regular lz4 tools.

> achieving a compression ratio somewhere between zlib and xz, while beating both in time (in my benchmark at least)

Your observation is correct: zstd is now standard and the default on openzfs 2.0, replacing lz4.

The 19 compression variants offer more flexibility than just lz4. Another strength is that the decode time is not a function of the compression factor, which is good for coldish storage that's rarely updated.
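To illustrate the "level is an encoder-side knob" idea, here is a small sketch. Note the swap: it uses zlib from the Python standard library as a stand-in, not zstd, so it says nothing about zstd's actual decode timings. It only shows that the level is chosen when compressing and that the same decompress call reads the output of every level. The `sizes_by_level` helper is hypothetical.

```python
import zlib


def sizes_by_level(data: bytes, levels=(1, 6, 9)) -> dict:
    """Compress data at each zlib level; return {level: compressed_size}."""
    out = {}
    for level in levels:
        packed = zlib.compress(data, level)
        # The decoder takes no level argument: one decompress call
        # handles the output of every level.
        assert zlib.decompress(packed) == data
        out[level] = len(packed)
    return out


if __name__ == "__main__":
    sample = b"rarely updated, coldish storage block\n" * 5000
    for level, size in sizes_by_level(sample).items():
        print(f"level {level}: {size} bytes (from {len(sample)})")
```

On write-once data it can make sense to pay for a high level once at write time, since readers are not asked to pick or know the level.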

> zstd is now standard and the default on openzfs 2.0, replacing lz4.

Are you sure? The default compression setting has always been "off", but when switched on, the default has been lz4 for about 5 years. Zstd support was added less than a year ago and there are still a lot of things that need to be fixed before one could even suggest that it might be a sane default. I like zstd, but I like my uncorrupted data more. I know that compatibility between compressor versions and pools is a concern, and there are also compression performance problems with the way zstd handles zfs block sizes. Thankfully lz4 works great for zfs and has for many years now.

https://github.com/openzfs/zfs/blob/master/include/sys/zio.h...

> Actually, no. lz4 is less suitable than zstd for filesystems.

Why's that? What benefit would I get from switching? Is it workload-dependent?

EDIT: To be clear, I'm not disagreeing; if zstd will work better, I want to know about it so that I can switch my pools to use it.

>> Actually, no. lz4 is less suitable than zstd for filesystems.

> Why's that? What benefit would I get from switching? Is it workload-dependent?

Presumably because Zstd has much better compression, while still being quite fast.

I don't see, however, how that invalidates any of my observations. Some filesystems, e.g. UBIFS, support LZ4 but now also support Zstd, because both are suitable for the task (and LZ4 was around earlier).

In the end it is a classic space vs. time trade-off and there is AFAIK no generic right or wrong answer (except that some algorithms are too slow to even be considered).

F2FS has supported compression with LZ4 since Linux 5.6.