Hacker News new | ask | show | jobs
by tconbeer 995 days ago
It's great as: 1. An ephemeral processing engine. For example, I have a machine learning pipeline where I load data into a DataFrame, and then I can use DuckDB to execute SQL on my DataFrame (I prefer both the syntax and performance to Pandas). 2. A data lake processing engine. DuckDB makes it very easy to interact with partitioned files. 3. A lightweight datastore. I have one ETL pipeline where I need to cache the data if an API is unavailable. I just write the DataFrame to a DuckDB database that is on a mounted network filesystem, and read it back when I need it.