by theLiminator 427 days ago
How does it compare to duckdb and/or polars?
3 comments

This is a very active space, so in-depth analyses have a short half-life, but one of the best write-ups, from about 1.5 years ago, is this one: https://bicortex.com/duckdb-vs-clickhouse-performance-compar...
As I understand it, DuckDB doesn't have its own optimised storage that can accept writes (in the sense that ClickHouse does, where its native storage format gives you the best performance), and instead relies on reading data from Parquet and other formats. That makes sense for an embedded analytics engine on top of existing files, but might be a problem if you wanted to use DuckDB for, e.g., real-time analytics, where inserted data needs to be queryable within a few seconds of being inserted. ClickHouse was designed for the latter use case, but at the cost of being a full-fledged standalone service by design. There are embedded versions of ClickHouse, but they are much bulkier and generally less ergonomic to use (although that's a personal preference).
Nah, DuckDB does have its own format (not sure if it's write-friendly, though I believe it is).
It cannot handle concurrent writes from multiple processes natively, and that is not a design goal of the DuckDB creators. See: https://duckdb.org/docs/stable/connect/concurrency.html#writ...
It does, but the performance isn't great apparently: https://github.com/duckdb/duckdb/discussions/10161
Yeah, I don't really know. By OLAP-space standards that discussion is really old, though, so there's a good chance performance is dramatically better now. YMMV.
ClickHouse is a network server; DuckDB and Polars run in-process. It's like Postgres vs SQLite.
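To make the in-process side of that comparison concrete, here's a minimal sketch using Python's built-in sqlite3 module. It only stands in for DuckDB/Polars in the sense that the engine runs inside your own process; the table name and rows are made up for illustration.

```python
import sqlite3

# In-process: the database engine runs inside this Python process.
# There is no server to start and no network connection to open.
con = sqlite3.connect(":memory:")  # ":memory:" keeps everything in RAM

# Hypothetical table and rows, made up for this example.
con.execute("CREATE TABLE events (name TEXT, amount REAL)")
con.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)],
)

# An analytics-style aggregate, executed by a plain library call.
totals = con.execute(
    "SELECT name, SUM(amount) FROM events GROUP BY name ORDER BY name"
).fetchall()
con.close()

print(totals)
```

With Postgres or ClickHouse, the equivalent would start by connecting over a socket to a server that's already running as a separate process.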
There's chDB, though.