by 1egg0myegg0 1506 days ago
Persistent indexes are being actively worked on! Stay tuned. As for the crashes - DuckDB is very well tested and used in production in many places. The core functionality is very mature! Let us know if you test it out! Happy to help if I can.

(disclaimer - on the DuckDB team)


Hi, good to hear that you guys care about testing. One thing apart from the GitHub issues that led me to believe it might not be super stable yet was the benchmark results on https://h2oai.github.io/db-benchmark/ which make it look like it couldn't handle the 50GB case due to an out-of-memory error. I see that the benchmark and the versions used are about a year old, so maybe things have changed a lot since then. Can you chime in on the current story of running bigger DBs, like 1TB, on a machine with just 32GB or so of RAM? Especially regarding data mutations and DDL queries. Thanks!
Yes, that benchmark result is quite old in Duck years! :-)

We actually run that benchmark as a part of our test suite now, so I am certain that there is improvement from that version.

The biggest DuckDB I've used so far was about 400 GB on a machine with ~250 GB of RAM.

Handling larger-than-memory intermediate results within a query is ongoing work that we are treating as a high priority. We can already handle larger-than-RAM data in many cases, but today we sometimes run into issues if you join two larger-than-RAM tables together (depending on the join), or if you aggregate a larger-than-RAM table with really high cardinality in one of the columns you are grouping on.

Would you be open to testing out your use case and letting us know how it goes? We always appreciate more test cases!
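
On the high-cardinality aggregation point above, here is a toy sketch of why the number of distinct groups matters more than the number of rows. It is a generic hash aggregation in plain Python, not DuckDB's actual implementation; the function name `hash_aggregate` and the row counts are made up for illustration.

```python
from collections import defaultdict

def hash_aggregate(rows):
    """Toy GROUP BY key -> SUM(value).

    Hypothetical helper: keeps one dict entry per distinct key,
    so memory use follows the number of groups, not the number of rows.
    """
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)

n_rows = 100_000

# Low cardinality: many rows but only 10 groups, so the state stays tiny.
low = hash_aggregate((i % 10, 1) for i in range(n_rows))

# High cardinality: every row is its own group, so the state grows with the input.
high = hash_aggregate((i, 1) for i in range(n_rows))

print(len(low), len(high))
```

In the first case the input is streamed and only 10 running totals are held at once. In the second, the aggregation state is as large as the input itself, which is roughly the situation where an in-memory aggregate over a larger-than-RAM table can run out of room.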