by zrm 2242 days ago
Filesystem compression is a completely different animal than this. It has to deal with your ability to modify the file at any time. It doesn't compress the whole file together; it does it in blocks. When you launch a binary (and the system mmaps it), it doesn't have to decompress the entire file before you can start using it, only the compression blocks that actually get touched, starting with the first.
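
To make the block idea concrete, here is a minimal sketch in Python. It is not how any particular filesystem implements it; the 128 kB block size, the use of zlib, and the names `compress_blocks` and `read_range` are all assumptions for illustration. The point is only that a read at an arbitrary offset decompresses just the blocks it overlaps.

```python
import zlib

BLOCK_SIZE = 128 * 1024  # uncompressed block size (assumed value for illustration)

def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block independently of the others."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def read_range(blocks, offset, length, block_size=BLOCK_SIZE):
    """Return (data[offset:offset+length], number of blocks decompressed).

    Only the blocks overlapping the requested range are decompressed.
    """
    if length <= 0:
        return b"", 0
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(blocks) - 1)
    touched = max(0, last - first + 1)
    out = b"".join(zlib.decompress(blocks[i]) for i in range(first, last + 1))
    start = offset - first * block_size
    return out[start:start + length], touched
```

With a 1 MiB input split into eight blocks, a 1000-byte read in the middle decompresses one block, and a read that straddles a block boundary decompresses two. Whole-file compression would have to decompress everything up to the requested offset instead.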

Compression also typically makes it faster to launch applications from spinning rust, because the bottleneck is the drive and reading 50MB and decompressing it is faster than reading 100MB uncompressed. This would be true of SSDs as well except that most of them already do this internally.
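
A rough back-of-the-envelope version of that arithmetic, as a sketch. The throughput figures (100 MB/s sequential read, 500 MB/s decompression output) are assumed round numbers, not measurements, and `load_seconds` is a hypothetical helper. It also pessimistically assumes reading and decompressing don't overlap.

```python
def load_seconds(uncompressed_mb, ratio, disk_mb_s, decompress_mb_s=None):
    """Estimate the time to get `uncompressed_mb` of data into memory.

    ratio           -- compressed size / uncompressed size (1.0 = not compressed)
    disk_mb_s       -- assumed sequential read throughput of the drive
    decompress_mb_s -- assumed decompressor output rate; None means no decompression
    """
    read_time = (uncompressed_mb * ratio) / disk_mb_s
    if decompress_mb_s is None:
        return read_time
    # Serial model: read everything, then decompress everything.
    return read_time + uncompressed_mb / decompress_mb_s

plain = load_seconds(100, 1.0, disk_mb_s=100)
packed = load_seconds(100, 0.5, disk_mb_s=100, decompress_mb_s=500)
```

Under those assumptions the uncompressed load takes about 1.0 s and the compressed one about 0.7 s (0.5 s reading plus 0.2 s decompressing). The result depends entirely on the numbers you plug in.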

But snap isn't reading e.g. 64kB and then giving you 128kB on demand (and then prefetching the next block) like the filesystem does, it has to read and decompress the entire 100+MB package before you can even open it. That is very silly and adds a perceptible amount of latency.

1 comment

> But snap isn't reading e.g. 64kB and then giving you 128kB on demand (and then prefetching the next block) like the filesystem does, it has to read and decompress the entire 100+MB package before you can even open it. That is very silly and adds a perceptible amount of latency.

Wait, I could be wrong about this. I was deducing it from other people saying that it has to decompress the package every time you open it plus the empirically long application load times, but it turns out it's using squashfs which at least in principle could be doing the compression the same way as zfs. I haven't checked whether it does or not.
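
One way to start checking would be to read the block size and compressor out of the image's superblock. The sketch below assumes the squashfs 4.0 on-disk layout as I understand it (little-endian, magic `hsqs`, block size at byte offset 12, compressor id at offset 20) and the usual compressor id numbering; verify both against the kernel's squashfs headers before trusting it. `parse_squashfs_superblock` is a hypothetical helper name.

```python
import struct

SQUASHFS_MAGIC = 0x73717368  # bytes "hsqs" on disk (little-endian)

# Compressor ids as I understand the 4.0 format; treat as an assumption.
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}

def parse_squashfs_superblock(buf):
    """Parse the first 32 bytes of a squashfs image (assumed 4.0 layout)."""
    (magic, inode_count, mod_time, block_size, frag_count,
     compressor, block_log, flags, id_count,
     version_major, version_minor) = struct.unpack_from("<IIIIIHHHHHH", buf, 0)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a squashfs image (bad magic)")
    return {
        "block_size": block_size,
        "compressor": COMPRESSORS.get(compressor, "unknown(%d)" % compressor),
        "version": (version_major, version_minor),
    }
```

To use it, read the first 32 bytes of the image file in binary mode and pass them in. This only reports how the image was built (block size, algorithm); it says nothing about caching or prefetch behaviour at runtime.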

They're doing something wrong, though, or it wouldn't be this slow. Possibly more than one thing. Unfortunately there are a lot of different ways to screw this up: not caching the decompressed data, so it has to be decompressed again on every read even if it's already in memory; using too CPU-intensive a compression algorithm or too large a block size; double (or triple or quadruple) caching because it's loop-mounted, which then forces slow disk reads through inefficient cache utilization; over-aggressive synchronous prefetching; or any combination of these. Or maybe it actually is doing whole-file-level compression.
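
To illustrate just the first failure mode (re-decompressing on every read), here is a small sketch of an LRU cache of decompressed blocks. It is not a claim about what snap or squashfs actually does; `BlockCache`, its capacity, and the use of zlib are assumptions for illustration.

```python
import zlib
from collections import OrderedDict

class BlockCache:
    """Keep recently used decompressed blocks so repeat reads skip decompression."""

    def __init__(self, blocks, capacity=64):
        self.blocks = blocks          # list of independently compressed blocks
        self.capacity = capacity      # max number of decompressed blocks kept
        self.cache = OrderedDict()    # block index -> decompressed bytes
        self.decompressions = 0       # counter, for observing cache behaviour

    def get(self, index):
        if index in self.cache:
            self.cache.move_to_end(index)   # mark as most recently used
            return self.cache[index]
        data = zlib.decompress(self.blocks[index])
        self.decompressions += 1
        self.cache[index] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data
```

Without the cache, every `get` costs a decompression. With it, the counter only goes up on a miss, so a hot working set that fits in `capacity` blocks is decompressed once.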

Now I'm curious which one(s) it really is.