I am very curious how you are going to get SQLite working with piping, especially on extract... It's pretty common to do stuff like "curl ... | tar xvf -", so that extraction can start the moment the first kilobyte of data arrives. This really saves a lot of time, as disk and network work in parallel.
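To make the streaming-extract case concrete, here is a minimal Python sketch using the standard library's tarfile in stream mode ("r|"), which only ever reads forward and never seeks. The ForwardOnly wrapper and the helper names are made up for illustration; the wrapper stands in for a pipe such as the output of curl.

```python
import io
import tarfile


class ForwardOnly:
    """Hypothetical stand-in for a pipe: exposes read() only, no seek/tell."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def build_tar(files):
    """Pack a {name: bytes} mapping into an uncompressed tar, in memory."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return out.getvalue()


def stream_extract(stream):
    """Read members one by one as their bytes arrive; never seeks backwards."""
    extracted = {}
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                # In stream mode each member must be read before moving on.
                extracted[member.name] = tar.extractfile(member).read()
    return extracted


archive = build_tar({"a.txt": b"hello", "b.txt": b"world"})
result = stream_extract(ForwardOnly(archive))
print(result)
```

This works because every tar header sits directly in front of its file's data, so a reader never needs anything that comes later in the stream.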

A less common tar feature is streaming on create: stuff like "ssh remote tar cvf - ... > local-file.tar", which skips the temporary file on the remote machine and also saves a lot of transfer time.

But for both of those, sqlite's "memory" mode won't help you: in memory or not, you still need the entire database file before you can read it. So if you just store file contents in the SQL database, you have to fetch everything up to the last byte before you can get any data out.

Maybe you can keep the index in sqlite and append the data as-is... but where would you put that index?

If you put it in front (like squashfs), you need to produce the entire metadata before writing the first data byte, and that has to include compressed sizes too (assuming you want to support random extraction), which means you cannot stream the file out until you have finished compressing all the data. Also, sometimes you will not be able to add files to the archive without rewriting the whole archive (if the index grows and you didn't leave enough padding). This might be OK, but it definitely should be mentioned.

If you put it at the end (like zip), you will be able to stream the file out during compression, but fast decompression would be impossible, because the reader has to receive the whole file before it can see the index. Also, you'll forgo any sqlite transactional guarantees, since the database will be created in memory and only written at the very end, once all the files are written.
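As a rough illustration of the index-at-the-end problem, here is a small Python sketch with the standard library's zipfile. It builds a zip in memory, then cuts it off just before the end-of-central-directory record (signature "PK\x05\x06") to mimic a download that has not finished. The helper names are hypothetical, and this only shows how Python's zipfile reader behaves, not how any particular sqlite-based format would.

```python
import io
import zipfile


def build_zip(files):
    """Pack a {name: bytes} mapping into a deflate-compressed zip, in memory."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return out.getvalue()


def can_list(data):
    """Return True if zipfile can open the bytes and list the members."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.namelist()
        return True
    except zipfile.BadZipFile:
        return False


archive = build_zip({"a.txt": b"hello" * 100, "b.txt": b"world" * 100})

# The end-of-central-directory record is the last thing in the file.
eocd_offset = archive.rfind(b"PK\x05\x06")

complete_ok = can_list(archive)               # whole file has arrived
partial_ok = can_list(archive[:eocd_offset])  # all file data, but no trailer

print(complete_ok, partial_ok)
```

All the compressed file data is present in the truncated copy, yet the reader refuses to open it, because it locates members through the trailer rather than by scanning forward.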

So frankly, I don't see how you can win on the streaming front, unless you really have a custom format and "sqlite3" is just a small part of it.

(Another problem is that there is not even a short spec: how is sqlite3 used, what is your schema, and so on. And I am sorry, but I am not going to read the source code just to figure this stuff out.)