by legg0myegg0 2101 days ago
Before the latest optimization, and using only 1 core, we were seeing 133x SQLite's performance on a basic group by or join, and about 4x on a pretty complex query. It was roughly on par with Pandas in performance, but it can scale to larger-than-memory data, and now it can use multiple cores! As an example, I could build views from 2 Pandas DataFrames with 2 columns and 1 million rows each, join them, and return the 1-million-row dataset back to Pandas in 2 seconds vs. 40 seconds with SQLite/SQLAlchemy... Pretty sweet. I bet DuckDB is going to be even faster now!
2 comments

Have you compared this to Dask from a performance standpoint? That is the larger-than-memory solution we're currently using for analytics.
How about the other way? When will SQLite perform better than DuckDB?
If your workload is lots of single-row INSERTs or UPDATEs.
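To make "lots of single-row INSERTs or UPDATEs" concrete, here is a rough sketch of that kind of workload using Python's built-in `sqlite3` module. The `events` table, its columns, and the `run_point_writes` helper are hypothetical, and the sketch only shows the access pattern; it does not measure anything.

```python
import sqlite3


def run_point_writes(n_rows: int = 1000) -> int:
    """Issue many one-row writes and return the sum of the hits column."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    with con:  # commit the batch of statements as one transaction
        # One INSERT statement per row.
        for i in range(n_rows):
            con.execute("INSERT INTO events (id, hits) VALUES (?, 0)", (i,))
        # One UPDATE per even-numbered row, looked up by primary key.
        for i in range(0, n_rows, 2):
            con.execute("UPDATE events SET hits = hits + 1 WHERE id = ?", (i,))
    total = con.execute("SELECT COALESCE(SUM(hits), 0) FROM events").fetchone()[0]
    con.close()
    return total
```

Each statement here touches exactly one row by key, which is the transactional pattern the reply is pointing at, as opposed to the scans, joins, and aggregations discussed earlier in the thread.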