by klauspost 1590 days ago

Nice tools!

When it is done server-side, reading a 50MB CFD is a small task. And once it has been read, we can store the zipindex for even faster access.

We made 'zipindex' to purposely be a sparse, compact, but still reasonably fast representation of the CFD - just enough to be able to serve the file. Typically it is around an 8:1 reduction on the CFD, though of course that depends a lot on your file names, as you say (the index is zstandard compressed).
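
To make the idea concrete, here is a rough sketch of a sparse index, not zipindex's actual format or API. It uses only the Python standard library: it keeps just the per-file fields you would need to locate and decode an entry, and packs them with zlib as a stand-in for zstandard. The JSON layout and the names `build_sparse_index` and `lookup` are made up for illustration.

```python
import io
import json
import zipfile
import zlib


def build_sparse_index(zip_bytes: bytes) -> bytes:
    """Build a compressed, minimal index from a zip's central directory.

    Hypothetical layout: name -> [header offset, compressed size,
    uncompressed size, compression method, CRC32]. Everything else in the
    central directory (timestamps, comments, extra fields) is dropped.
    """
    entries = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            entries[info.filename] = [
                info.header_offset,
                info.compress_size,
                info.file_size,
                info.compress_type,
                info.CRC,
            ]
    raw = json.dumps(entries, separators=(",", ":")).encode("utf-8")
    # zlib stands in for zstandard here to avoid a third-party dependency.
    return zlib.compress(raw, 9)


def lookup(index: bytes, name: str):
    """Decompress the whole index and return the entry for name, or None."""
    entries = json.loads(zlib.decompress(index))
    return entries.get(name)
```

Note that `lookup` decompresses and parses the whole index on every call, so its cost grows with the number of entries unless you keep the decoded index in memory.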

Access time from fully compressed data to a random file entry is around 100ms with 1M files. Obviously if you keep the index in memory, it is much less. This time is pretty much linear in the number of files, which is why we recommend aiming for 10K files per archive, which makes the impact pretty minimal.
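
As a back-of-the-envelope check on that recommendation, here is a tiny estimate assuming strictly linear scaling from the 100ms at 1M files figure. It ignores any fixed overhead, and `estimated_lookup_ms` is just an illustrative name, not a measurement.

```python
def estimated_lookup_ms(num_files: int,
                        ref_ms: float = 100.0,
                        ref_files: int = 1_000_000) -> float:
    """Scale the reference lookup time linearly with the number of files."""
    return ref_ms * num_files / ref_files


# Under this assumption, 10K files per archive comes out at about 1ms.
print(estimated_lookup_ms(10_000))
```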