This is a funny thread to me because my frustration is at the intersection of your comments: I keep wanting sqlite for writes (and lookups) and duckdb for reads. Are you aware of anything that works like this?
Aha! That makes so much sense. Thank you for this.
Edit: Ah, right, the downside is that this is not going to have good olap query performance when interacting directly with the sqlite tables. So still necessary to copy out to duckdb tables (probably in batches) if this matters. Still seems very useful to me though.
Analytics is done in "batches" (daily, weekly) anyways, right?
We know you can't get both, row and column orders at the same time, and that continuously maintaining both means duplication and ensuring you get the worst case from both worlds.
Local, row-wise writing is the way to go for write performance. Column-oriented reads are the way to do analytics at scale. It seems alright to have a sync process that does the order re-arrangement (maybe with extra precomputed statistics, and sharding to allow many workers if necessary) to let queries of now historical data run fast.
I think you could build an ETL-ish workflow where you use SQLite for OLTP and DuckDB for OLAP, but I suppose it's very workload dependent, there are several tradeoffs here.
Right. This is what I want, but transparently to the client. It seems fairly straightforward, but I keep looking for an existing implementation of it and haven't found one yet.
It's not bad if you need something quick. I haven't had a large need of ANN in duckdb since it's doing more analytical/exploratory needs, but it's definitely there if you need it.