Hacker News new | ask | show | jobs
by stavros 917 days ago
It occurs to me that you could stick a few bytes of header in the beginning of the ZIP file, to tell you the exact location of the header at the end of it, thus avoiding multiple lookups. It would even still be ZIP-compatible.
1 comments

Definitely. I take an alternative but similar approach: since I control the zip files, I can guarantee that the header is always within the last N kilobytes of the zip file (configurable value of N). I spend a HEAD request to get the length of the zip file and then walk backwards by N kilobytes. You would request the few bytes at the beginning instead of using that request to get the file length.
How can you guarantee that it's always within N kilobytes, when N depends on the number of files in the zip?
If you're creating the zips in the first place, you can just check and see how big the headers are when you create them. If you happen to get N wrong, you can request another chunk, but obviously it's nice to avoid multiple requests to get the header. For my use case, the number of files is small and relatively consistent between zips so a generous value of 64KB ended up working great.