|
|
|
|
|
by tconbeer
995 days ago
|
|
It's great as:
1. An ephemeral processing engine. For example, I have a machine learning pipeline where I load data into a DataFrame, and then I can use DuckDB to execute SQL on my DataFrame (I prefer both the syntax and performance to Pandas).
2. A data lake processing engine. DuckDB makes it very easy to interact with partitioned files.
3. A lightweight datastore. I have one ETL pipeline where I need to cache the data if an API is unavailable. I just write the DataFrame to a DuckDB database that is on a mounted network filesystem, and read it back when I need it. |
|