by Xamayon 1179 days ago
I do exactly that for storing the result thumbnails for some of the dbs in my reverse image search engine (SauceNAO). Uncompressed zip files allow quick and easy seeking to, and reading of, component files without extraction. A few tens to hundreds of thousands of files per zip works great. Millions would probably not be too different, but loading the zip file index would use more resources and take more time.
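
To illustrate the idea, here is a minimal Python sketch using the standard `zipfile` module. It is not SauceNAO's actual code; the function names `build_archive` and `read_thumbnail` and the member names are made up for the example. With `ZIP_STORED`, members are written without compression, so reading one means loading the central directory (the index) and then seeking straight to that member's bytes.

```python
import io
import zipfile


def build_archive(thumbs: dict[str, bytes]) -> bytes:
    """Pack thumbnails into an uncompressed (stored) zip, returned as bytes."""
    buf = io.BytesIO()
    # ZIP_STORED writes each member verbatim, with no compression applied.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in thumbs.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_thumbnail(archive, name: str) -> bytes:
    """Read a single member; `archive` is a path or a seekable file object."""
    # Opening the zip parses the central directory; the cost of this step
    # grows with the number of members in the archive.
    with zipfile.ZipFile(archive) as zf:
        # read() seeks to the member's offset and returns only its bytes.
        return zf.read(name)


archive_bytes = build_archive({"a.jpg": b"AAA", "b.jpg": b"BBBB"})
thumb = read_thumbnail(io.BytesIO(archive_bytes), "b.jpg")
```

In a real deployment `archive` would be a path on disk rather than an in-memory buffer; the in-memory version just keeps the example self-contained.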

Interesting. Have you ever considered SQLite file storage? I'm wondering how it would compare.

https://www.sqlite.org/sqlar.html

Haven't looked into it, but it sounds like it would work similarly (with some nice benefits, such as being able to easily store other metadata, etc.). Feasibility would depend on how quickly the indexes and such load, and on the resource consumption of opening and closing dozens of them at a time, 24/7. In my screwy case there are hundreds of thousands of zip files which are randomly accessed on the fly to grab one or two thumbnails at a time. The random access speed on unloaded files is critical, and for zip files it's extremely quick.