by ankoh 1695 days ago
It depends.

Querying CSV files over the network is particularly painful, since a full scan still has to read the entire file.

With Parquet, you would at least only have to read the columns used as GROUP BY keys and aggregate arguments.
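
To make the difference concrete, here is a rough, stdlib-only sketch. It is not the real Parquet format: the "columnar" layout is just one JSON-encoded chunk per column, standing in for Parquet's column chunks. The table, the column names, and the `bytes_to_fetch` helper are all made up for illustration. It compares how many bytes a reader would have to transfer for `SELECT country, SUM(amount) ... GROUP BY country` in each layout.

```python
import csv
import io
import json

# Hypothetical sample table. The query only needs "country" and "amount":
#   SELECT country, SUM(amount) FROM t GROUP BY country
columns = ["id", "country", "amount", "comment"]
rows = [
    {"id": i, "country": ["DE", "US", "FR"][i % 3], "amount": i * 10, "comment": "x" * 50}
    for i in range(1000)
]

# Row-oriented layout (CSV): the values of all columns are interleaved per row,
# so a full scan has to transfer every byte of the file.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)
csv_bytes = buf.getvalue().encode("utf-8")

# Toy column-oriented layout: one contiguous chunk per column.
# NOT the Parquet encoding, only a stand-in for its per-column chunks.
column_chunks = {
    name: json.dumps([row[name] for row in rows]).encode("utf-8")
    for name in columns
}


def bytes_to_fetch(needed):
    """Bytes a reader must transfer to scan only the needed columns."""
    return sum(len(column_chunks[name]) for name in needed)


needed_columns = ["country", "amount"]  # group-by key and aggregate argument
csv_cost = len(csv_bytes)
columnar_cost = bytes_to_fetch(needed_columns)

# Run the aggregation using only the two fetched column chunks.
countries = json.loads(column_chunks["country"])
amounts = json.loads(column_chunks["amount"])
totals = {}
for country, amount in zip(countries, amounts):
    totals[country] = totals.get(country, 0) + amount

print(f"CSV full scan:        {csv_cost} bytes")
print(f"Columnar (2 columns): {columnar_cost} bytes")
print(totals)
```

The wide `comment` column dominates the CSV size but is never touched in the columnar case. A real Parquet reader additionally has to fetch the file footer to locate the column chunks, and over HTTP it would typically use range requests to pull just those byte ranges.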

Try it out and share your experiences with us!