by beingflo 643 days ago
I've been eyeing DuckDB for a metric collection hobby project. A quick benchmark showed promising query performance over SQLite (unsurprising considering DuckDB is column-oriented), but inserts were quite a bit slower. Does anyone have experience using it as an "online" backend DB as opposed to a data analytics engine for interactive use? From what I gather they are trying to position themselves more in the latter use case.
3 comments

Doing row-by-row inserts into DuckDB is really slow. Accumulating rows in an in-memory data structure and periodically batching them into something like an in-memory Arrow table, and then reading the Arrow table into DuckDB, is fast and has been tenable for my own use cases.
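A rough sketch of that buffering pattern, in case it helps. `RowBuffer` and `make_duckdb_flush` are made-up names for illustration, not part of DuckDB or Arrow. The flush helper assumes the `duckdb` and `pyarrow` packages are installed, that the target table already exists, and that its columns are in the same order as the keys of each row dict.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class RowBuffer:
    """Collects rows in memory and hands them to flush_fn in batches."""

    def __init__(self, flush_fn: Callable[[List[Row]], None], max_rows: int = 10_000):
        self.flush_fn = flush_fn
        self.max_rows = max_rows
        self.rows: List[Row] = []

    def add(self, row: Row) -> None:
        self.rows.append(row)
        # Flush automatically once the batch is full.
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self) -> None:
        if not self.rows:
            return
        # Swap the buffer out before flushing so new rows start a fresh batch.
        batch, self.rows = self.rows, []
        self.flush_fn(batch)


def make_duckdb_flush(con, table: str) -> Callable[[List[Row]], None]:
    """Build a flush function that bulk-inserts one batch via an Arrow table.

    Assumptions: con is a duckdb connection, table is a trusted name of an
    existing table, and column order matches the row dict key order.
    """

    def flush(rows: List[Row]) -> None:
        import pyarrow as pa  # third-party, imported only when flushing

        arrow_batch = pa.Table.from_pylist(rows)
        con.register("arrow_batch", arrow_batch)  # expose the table to SQL
        con.execute(f"INSERT INTO {table} SELECT * FROM arrow_batch")
        con.unregister("arrow_batch")

    return flush
```

Usage would look something like `buf = RowBuffer(make_duckdb_flush(con, "metrics"))`, then `buf.add({...})` per data point and a final `buf.flush()` on shutdown. Rows still in the buffer are lost if the process dies, so the batch size is a trade-off between insert speed and how much data you can afford to lose.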
You can always use SQLite as your primary data store, and then directly query the SQLite database from DuckDB whenever you need analytics.
Depends on the scale of users you expect for your project. Generally I like to keep OLTP and OLAP tools in their lanes, but if fewer than 100 people are going to be using it, it probably doesn't matter. I doubt DuckDB has any sort of ACID guarantees, so that's something to keep in mind.
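Something like the following is what that split could look like. It is only a sketch: the `metrics` table, its columns, and both function names are invented for the example. The read side assumes the `duckdb` package and its `sqlite` extension are available (the `INSTALL` step may need network access the first time), and that `db_path` is a trusted path.

```python
import sqlite3


def record_metric(db_path: str, ts: int, name: str, value: float) -> None:
    """Write path: plain SQLite, which handles small row-by-row inserts well."""
    con = sqlite3.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS metrics (ts INTEGER, name TEXT, value REAL)"
        )
        con.execute("INSERT INTO metrics VALUES (?, ?, ?)", (ts, name, value))
        con.commit()
    finally:
        con.close()


def average_per_metric(db_path: str):
    """Read path: attach the SQLite file in DuckDB and aggregate there."""
    import duckdb  # third-party, imported only for analytics

    con = duckdb.connect()  # in-memory DuckDB; the data stays in the SQLite file
    try:
        con.execute("INSTALL sqlite")
        con.execute("LOAD sqlite")
        con.execute(f"ATTACH '{db_path}' AS src (TYPE sqlite)")
        return con.execute(
            "SELECT name, avg(value) FROM src.metrics GROUP BY name ORDER BY name"
        ).fetchall()
    finally:
        con.close()
```

The collector would only ever call `record_metric`, and `average_per_metric` would run on demand, for example from a dashboard or a notebook.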
DuckDB does have ACID guarantees and transactions but I'd not be surprised if they are rarely used (if at all).

Ref: https://duckdb.org/docs/sql/statements/transactions

In the concurrency documentation they explicitly state that it's not designed for lots of small transactions.

Concurrency: https://duckdb.org/docs/connect/concurrency
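To make the "lots of small transactions" point concrete, here is a minimal sketch of grouping a batch of inserts into one explicit transaction rather than committing each row on its own. The function name and the three-column `metrics` table are hypothetical. The code only assumes `con` is a DB-API style connection with `execute` and `executemany` that accepts `?` placeholders, which a `duckdb` connection does.

```python
def insert_in_one_transaction(con, rows) -> None:
    """Insert all rows atomically: either every row is committed or none are."""
    con.execute("BEGIN TRANSACTION")
    try:
        con.executemany("INSERT INTO metrics VALUES (?, ?, ?)", rows)
        con.execute("COMMIT")
    except Exception:
        # Undo the partial batch so the table is left unchanged.
        con.execute("ROLLBACK")
        raise
```

With DuckDB this would be called as `insert_in_one_transaction(duckdb.connect("metrics.duckdb"), rows)`, where the file name is just an example. For large batches a bulk load is still likely to be much faster than `executemany`.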