Hacker News
by wmwmwm 1517 days ago
I really like that you can use DuckDB to run SQL queries directly against an on-disk directory structure of Parquet files, with no preloading into RAM or conversion to another DB format. Super useful/quick, and it only takes one line of code! Instead of selecting from a table, you just pass a glob pattern into the SQL, and since it's a one-liner I can use it in ad hoc notebooks too. The query result even has a to_df() method, so you can get it into pandas for further manipulation.