Yes, I have been doing the same thing, only with LMDB.

However, I do not think LMDB can load from an in-memory-only object, as it has to have a file to memory-map.
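To illustrate the file requirement in general terms, here is a minimal sketch using Python's standard `mmap` module. This is not LMDB itself, and the file name is made up for the example. A file-backed mapping is tied to a real file, and whatever is written through it is already there when the file is mapped again later.

```python
import mmap
import os
import tempfile

# Hypothetical file name, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "cache.bin")

# A file-backed mapping needs a real, non-empty file to map.
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# "Writer" process: modify the file contents through the mapping.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        m[0:5] = b"hello"
        m.flush()  # push the changes back to the file

# "Restarted" process: map the same file again, read-only.
# Nothing is parsed or loaded up front; pages are read on access.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        first_bytes = bytes(m[0:5])
```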

But the design reasons were the same. I wanted something that:

a) I can move across host architectures

b) can act as a key-value cache as soon as the processes using it are restarted (so no cache-hydration delay)

c) I can diff/archive/restore/modify in place

We tested SQLite for the above purposes at the time; on write speed and on (b), LMDB was significantly faster.

So we lost the flexibility of SQLite, but I felt it was a reasonable tradeoff, given our needs.
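For reference, the SQLite side of that tradeoff could look roughly like the sketch below. It is not the code we tested; the path, table name, and helper functions are hypothetical. It uses only Python's standard `sqlite3` module, shows the cache surviving a close/reopen, and shows the kind of ad hoc query you give up with a plain key-value store.

```python
import os
import sqlite3
import tempfile

# Hypothetical path and table name, for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "cache.sqlite")

def open_cache(path):
    """Open (or create) a single-table key-value cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key BLOB PRIMARY KEY, value BLOB)"
    )
    return conn

def put(conn, key, value):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, value),
        )

def get(conn, key):
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None

# First "process": fill the cache, then exit.
conn = open_cache(db_path)
put(conn, b"user:1", b"alice")
conn.close()

# "Restarted" process: the data is available right after opening the file.
conn = open_cache(db_path)
restored = get(conn, b"user:1")
# The flexibility part: arbitrary SQL over the same file.
count = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
conn.close()
```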

I also know that one of Intel's Python toolkits for image recognition/AI (optionally) uses LMDB to store images, so that processing routines do not have to incur the cost of directory lookups when touching millions of small images. (I forgot the name of the toolkit, though)…

Overall, this is a very valid practice/pattern in data processing pipelines, kudos to you for mentioning it.