Hacker News
by hodgesrm 168 days ago
If it's under a few terabytes, copy it into ClickHouse and store it in the native format, make your change, and dump it back out to Parquet. ClickHouse reads and writes Parquet efficiently and can distribute the work across a cluster. It's a full rewrite, but it's fast.
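
For what it's worth, here is a rough sketch of that round trip as three ClickHouse SQL statements, built in Python. The table name, file paths, and the example `ALTER` are hypothetical placeholders, and I'm assuming "native format" means a MergeTree table. Note that the `file()` table function resolves paths differently in `clickhouse-local` and on a server (where it reads from the `user_files` directory), so adjust to your setup.

```python
import subprocess


def build_roundtrip_sql(src_parquet, dst_parquet, table, change_sql):
    """Return the statements for: load Parquet -> change -> export Parquet.

    Paths and the table name are interpolated as-is, so this sketch assumes
    they contain no single quotes or other characters that need escaping.
    """
    load = (
        f"CREATE TABLE {table} ENGINE = MergeTree ORDER BY tuple() AS "
        f"SELECT * FROM file('{src_parquet}', Parquet)"
    )
    export = f"SELECT * FROM {table} INTO OUTFILE '{dst_parquet}' FORMAT Parquet"
    return [load, change_sql, export]


def run_with_client(statements):
    """Run each statement through clickhouse-client (assumes it is on PATH)."""
    for stmt in statements:
        subprocess.run(["clickhouse-client", "--query", stmt], check=True)


if __name__ == "__main__":
    # Hypothetical change: the column names below are made up for illustration.
    change = (
        "ALTER TABLE staging UPDATE status = 'archived' WHERE year < 2020 "
        "SETTINGS mutations_sync = 1"  # wait for the mutation before exporting
    )
    for s in build_roundtrip_sql("in.parquet", "out.parquet", "staging", change):
        print(s + ";")
```

The script only prints the statements; pass the list to `run_with_client` once the paths and the change look right. If the change can be written as a plain `SELECT` expression, you could skip the `ALTER` and apply it during the export instead.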