by marginalia_nu 1578 days ago
Do zip files really offer performant random access with tens of millions of entries?
2 comments

I don't see why they wouldn't, as long as the TOC (the zip central directory, i.e. filenames + headers) fits in RAM. The offsets are all there for a simple seek.
Yeah. Tens of millions sounds fine. And you don’t have to keep navigating the raw zip file, you just do it once. “All file offsets available in a single immutable in-memory hash map” is basically the dream scenario. I imagine if you were desperate you could pack more in by representing your file names efficiently in memory, a bit of path compression or a trie or whatever, but if it already works it already works.
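For anyone curious, here is a rough sketch of the "read the directory once, then just seek" idea in Python. The helper names `build_index` and `read_entry` are made up for illustration, and it assumes unencrypted entries that are either stored or deflated. It uses the standard `zipfile` module only to parse the central directory, then reads entries by hand with a seek.

```python
import struct
import zipfile
import zlib

LOCAL_HEADER_SIZE = 30  # fixed-size part of a ZIP local file header


def build_index(f):
    """Parse the central directory once.

    Returns a dict: name -> (local header offset, compressed size, method).
    `f` may be a path or a seekable binary file object.
    """
    with zipfile.ZipFile(f) as zf:
        return {
            info.filename: (info.header_offset, info.compress_size, info.compress_type)
            for info in zf.infolist()
        }


def read_entry(f, index, name):
    """Fetch one entry using one dict lookup plus seeks on the open file `f`."""
    offset, compressed_size, method = index[name]

    # The local header repeats the name and extra field, and their lengths
    # can differ from the central directory copy, so read them here.
    f.seek(offset)
    header = f.read(LOCAL_HEADER_SIZE)
    name_len, extra_len = struct.unpack("<HH", header[26:30])

    f.seek(offset + LOCAL_HEADER_SIZE + name_len + extra_len)
    data = f.read(compressed_size)

    if method == zipfile.ZIP_STORED:
        return data
    if method == zipfile.ZIP_DEFLATED:
        # ZIP uses raw deflate streams, hence the negative window size.
        return zlib.decompressobj(-15).decompress(data)
    raise NotImplementedError(f"unsupported compression method {method}")
```

A plain dict of tuples is the lazy version; with tens of millions of names, the path compression or trie mentioned above would go where the dict keys are.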
Yes, that’s why for instance Java uses it to store large amounts of class files.