by dmw_ng 2243 days ago
The bottom line is that they optimize installation time by amortizing its cost over the runtime life of the package. In other words, they turn a one-time 15-second process into a 14-second one, in return for turning a 1-second process that runs many times into a 30-second one. It makes absolutely no sense.
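To put rough numbers on that trade, here is a minimal sketch. The install and launch times are the ones quoted above; the launch count is purely an assumption, and `total_seconds` is a hypothetical helper, not anything from snapd.

```python
def total_seconds(install_s, launch_s, launches):
    """Total time spent on a package: one install plus every launch."""
    return install_s + launch_s * launches

# Install/launch figures come from the comment above; LAUNCHES is an assumption.
LAUNCHES = 100
unpacked = total_seconds(install_s=15, launch_s=1, launches=LAUNCHES)
compressed = total_seconds(install_s=14, launch_s=30, launches=LAUNCHES)
print(f"unpacked: {unpacked}s, compressed: {compressed}s")
```

With these figures the compressed variant is already behind after the very first launch, since it saves one second at install and costs 29 extra on every start.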

They do this using a filesystem originally designed for embedded devices, using a driver hacked to disable threading support because the sheer number of filesystems snapd mounts would otherwise consume a huge amount of memory in per-cpu buffers used for decompression. In other words, they broke squashfs for everyone in the process of trying to make it work for snap.

On-demand decompression like this has made very little sense on desktops since the mid 90s, and even if it did, snapd's manifestation of it is particularly terrible.


> On-demand decompression like this has made very little sense on desktops since the mid 90s

Ok, maybe not desktops? But ZFS on-disk compression is a sysadmin's frickin dream. Just one example: you can access logfiles with plaintext tools like grep while benefiting from the space savings at negligible cost. LZ4 has basically no overhead at all: https://www.servethehome.com/the-case-for-using-zfs-compress...

I really hope you will try on-disk compression, encryption, deduplication, and that sort of thing sometime. You will see it is so much better than gzip-compressed, gpg-encrypted files.

Filesystem compression is a completely different animal than this. It has to deal with your ability to modify the file at any time. It doesn't compress the whole file together, it does it in blocks. When you launch a binary (and the system mmaps it) it doesn't have to decompress the entire file before you can start using it, only the first compression block.
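A minimal sketch of the block-wise idea, using Python's `zlib` rather than any real filesystem code. The 128 KiB block size and the helper names `compress_blocks` and `read_range` are assumptions for illustration only.

```python
import zlib

BLOCK_SIZE = 128 * 1024  # assumed uncompressed block size, for illustration

def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block independently of the others."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def read_range(blocks, offset, length, block_size=BLOCK_SIZE):
    """Return (data[offset:offset+length], number of blocks decompressed).

    Only the blocks overlapping the requested range are decompressed.
    """
    if length <= 0:
        return b"", 0
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(blocks) - 1)
    chunks = [zlib.decompress(b) for b in blocks[first:last + 1]]
    joined = b"".join(chunks)
    start = offset - first * block_size
    return joined[start:start + length], len(chunks)

data = bytes(range(256)) * 4096          # 1 MiB of sample data
blocks = compress_blocks(data)
head, touched = read_range(blocks, 0, 4096)
print(f"{len(blocks)} blocks stored, {touched} decompressed for a 4 KiB read")
```

Reading the first 4 KiB touches a single block, which is the property that lets a mapped binary start running before the rest of the file has been decompressed.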

Compression also typically makes it faster to launch applications from spinning rust, because the bottleneck is the drive and reading 50MB and decompressing it is faster than reading 100MB uncompressed. This would be true of SSDs as well except that most of them already do this internally.
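A back-of-the-envelope version of that claim. All throughput numbers here are assumptions picked for illustration, not measurements, and `load_seconds` is a hypothetical helper.

```python
def load_seconds(file_mb, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Seconds until file_mb of usable data is in memory.

    ratio is compressed size / uncompressed size. Reading and decompressing
    are modeled as strictly sequential, which is pessimistic for compression.
    """
    read_s = file_mb * ratio / disk_mb_s
    if decompress_mb_s is None:
        return read_s
    return read_s + file_mb / decompress_mb_s

# Assumed: 100 MB/s spinning disk, 2:1 ratio, 500 MB/s decompressor output.
plain = load_seconds(100, disk_mb_s=100)
packed = load_seconds(100, disk_mb_s=100, ratio=0.5, decompress_mb_s=500)
print(f"uncompressed: {plain:.2f}s, compressed: {packed:.2f}s")
```

The conclusion depends entirely on the decompressor being much faster than the drive. Plug in a slow decompressor or a fast drive and the result flips.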

But snap isn't reading e.g. 64kB and then giving you 128kB on demand (and then prefetching the next block) like the filesystem does, it has to read and decompress the entire 100+MB package before you can even open it. That is very silly and adds a perceptible amount of latency.

> But snap isn't reading e.g. 64kB and then giving you 128kB on demand (and then prefetching the next block) like the filesystem does, it has to read and decompress the entire 100+MB package before you can even open it. That is very silly and adds a perceptible amount of latency.

Wait, I could be wrong about this. I was deducing it from other people saying that it has to decompress the package every time you open it plus the empirically long application load times, but it turns out it's using squashfs which at least in principle could be doing the compression the same way as zfs. I haven't checked whether it does or not.

They're doing something wrong though or it wouldn't be this slow. Possibly more than one thing. Unfortunately there are a lot of different ways to screw this up, like not caching the decompressed data so it has to be decompressed again on every read even if it's already in memory, or using too CPU intensive a compression algorithm or too large a block size, or double (or triple or quadruple) caching because it's loop-mounted and then forcing slow disk reads through inefficient cache utilization, or over-aggressive synchronous prefetching, or any combination of these. Or maybe it actually is doing whole-file-level compression.

Now I'm curious which one(s) it really is.

Can't you just alias cat to zcat and so on? There should be such tools available for just about everything that isn't a container format (zip, 7z, tar).

It’s a bad implementation. You can run inline compression on latency-sensitive workloads like VDI without issue.

Compression makes a lot of sense, as the cost of fast, high-capacity SSD storage is usually much higher than the cost of the extra CPU cycles required to decompress.

It's not squashfs's fault, it's the snap people that just have absolutely no clue. squashfs is designed for embedded systems with say 8 or 16 MiB of very slow NOR flash, so you maximize compression ratio at the expense of speed (because the flash is probably still slower).

And decompression is typically very fast. What I don’t understand is why they’re not using something like zstd if they care about speed. It’s a supported compression algo for squashfs, but still they insist on using a single-threaded compression algo (xz iirc?).

The kernel code for reading zstd squashfs images has been merged for some time. But zstd is only a recently supported algorithm in the upstream squashfs tools for creating the squashfs image.

In my testing with OS installs that depend on squashfs+xz, there is a significant lzma hit for decompression, resulting in significant latencies. And the higher the compression level used, the bigger the hit when decompressing. While the computational cost of compressing with zstd is in the same ballpark as xz for the same compression ratio, (a) the decompression cost is far lower with zstd, translating into faster reads, and (b) that cost is fairly consistent regardless of compression level.

Another factor for squashfs is the block size. The bigger it is, the better the compression ratio, but the greater the read amplification. I haven't looked at it, but it might be that they've over-optimized for space reduction with too little consideration for performance. Since this isn't a one-time-use image, like for an installation, but one intended to be read over and over again, erofs might be an alternative worth benchmarking.
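A rough way to see the block size trade-off, sketched with Python's `lzma` module on synthetic data. This is not squashfs itself; the sample data, the block sizes, and the helper names are assumptions, and real images will behave differently in detail.

```python
import lzma

def compressed_total(data, block_size):
    """Total size when each block_size chunk is xz-compressed on its own."""
    return sum(len(lzma.compress(data[i:i + block_size]))
               for i in range(0, len(data), block_size))

def read_amplification(block_size, read_size):
    """Bytes decompressed per byte requested, assuming a block-aligned read."""
    blocks_touched = -(-read_size // block_size)  # ceiling division
    return blocks_touched * block_size / read_size

# Synthetic, repetitive sample data (an assumption, standing in for real files).
sample = b"".join(b"entry %d: the quick brown fox jumps over the lazy dog\n" % (i % 997)
                  for i in range(4000))

for block_size in (4 * 1024, 32 * 1024, 128 * 1024):
    size = compressed_total(sample, block_size)
    amp = read_amplification(block_size, read_size=4096)
    print(f"block {block_size // 1024:>3} KiB: {size:>6} bytes compressed, "
          f"{amp:.0f}x amplification for a 4 KiB read")
```

Larger blocks shrink the total, but every small random read then pays for decompressing a whole block, which matters for an image that is read over and over.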

https://linuxreviews.org/images/d/d2/EROFS_file_system_OSS20...

https://www.usenix.org/system/files/atc19-gao.pdf

https://lkml.org/lkml/2018/5/31/306

I think there is a cultural bias in this type of application that favors saving disk space in order to reduce overhead on mirrors.