by whalesalad
942 days ago
Can anyone comment on QuestDB vs ClickHouse vs TimescaleDB? Real-world experience around ergonomics, ops, etc. Currently using BigQuery for a lot of this (ingesting ~5-10TB monthly) but would like to begin exploring in-house tooling. On the flip side, we still use PSQL/RDS a lot and I enjoy it for the low operations burden - but we're doing some time series stuff with it now that is starting to fall over. TimescaleDB is nice because it is Postgres, but afaik it cannot run inside RDS. ClickHouse is next on my list for a test deployment, but QuestDB looks pretty neat too.
ClickHouse was deployed to replace a homegrown distributed storage system that, a long time ago, had been much cheaper and faster than what the team was getting with BQ.
We evaluated ClickHouse and Druid for a few data stores serving interactive queries on a fairly high-throughput clickstream data pipeline.
ClickHouse won in terms of performance. We did some chaos testing on it by simulating node outages and network partitions, and we were happy with the results. My only complaint is that there are other options that don't require me to run a database at all, and that's nice if I can get away with it.
One thing you might try is dumping your data into Parquet files on GCS. There are quite a few databases that can query those files directly, and you can get good results depending on your indexing and partitioning needs. It's tough to get a lower operational burden than "stick it in cloud storage and sometimes spin up some stateless compute to query it".
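To make the partitioning point concrete, here is a rough sketch of a Hive-style, date-partitioned layout, which is one common convention for Parquet files in object storage. The bucket name, table name, and `dt=` key are made up for illustration, and the code only computes paths; it does not touch GCS. The idea is that a query over a date range only needs the prefixes for those days, so whatever engine you point at the bucket can skip everything else.

```python
from datetime import date, timedelta

BUCKET = "gs://example-bucket"  # hypothetical bucket name
TABLE = "clickstream"           # hypothetical table name


def partition_prefix(day: date) -> str:
    """Return the Hive-style object prefix holding one day's Parquet files."""
    return f"{BUCKET}/{TABLE}/dt={day.isoformat()}/"


def prefixes_for_range(start: date, end: date) -> list[str]:
    """List only the prefixes a query over [start, end] (inclusive) has to read."""
    days = (end - start).days
    # An end date before the start date yields an empty list.
    return [partition_prefix(start + timedelta(days=i)) for i in range(days + 1)]


if __name__ == "__main__":
    # Three days of data out of however many years sit in the bucket.
    for prefix in prefixes_for_range(date(2024, 1, 30), date(2024, 2, 1)):
        print(prefix)
```

Whether daily partitions are the right grain depends on your query patterns and file sizes; at 5-10TB a month you would probably want to check that each partition ends up with reasonably large files rather than thousands of tiny ones.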
I think DuckDB is super cool, but for me at the moment it's a solution I'm still looking for a problem to use on.