| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by tconbeer 1043 days ago
	It's great as: 1. An ephemeral processing engine. For example, I have a machine learning pipeline where I load data into a DataFrame, and then I can use DuckDB to execute SQL on my DataFrame (I prefer both the syntax and performance to Pandas). 2. A data lake processing engine. DuckDB makes it very easy to interact with partitioned files. 3. A lightweight datastore. I have one ETL pipeline where I need to cache the data if an API is unavailable. I just write the DataFrame to a DuckDB database that is on a mounted network filesystem, and read it back when I need it.