Hacker News new | ask | show | jobs
by Lyngbakr 1261 days ago
We saw major improvements when we simply wrote full tables from a transactional database to parquet, but also, as you say, modelling the data appropriately produced significant improvements, too.
1 comments

A column oriented database is probably the bigger performance increase. Parquet and a good data warehouse (something like Clickhouse, Druid or Snowflake) will both use metadata and efficient scans to power through aggregation queries.