Hacker News
by
kwillets
612 days ago
One more: do you prefer the CDC technique over using the row groups as chunks (i.e., using knowledge of the file structure)? Is it worth building a Parquet-specific diff?
ylow
612 days ago
I think both are necessary. The CDC technique is file-format independent. The row group method makes Parquet robust to it.
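For readers unfamiliar with the term, here is a minimal sketch of what content-defined chunking (CDC) generally looks like. It is not the implementation discussed in this thread: the gear table, the `mask`, and the `min_size`/`max_size` limits are illustrative assumptions, and `cdc_chunks` is a hypothetical name.

```python
import hashlib

# Hypothetical 256-entry "gear" table: one fixed pseudo-random 32-bit value
# per possible byte. Any fixed table works; this one is an assumption.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big")
        for b in range(256)]


def cdc_chunks(data: bytes, mask: int = 0x3FF,
               min_size: int = 64, max_size: int = 8192) -> list[bytes]:
    """Split data at positions chosen by a rolling hash of the content.

    A boundary is declared where the low bits of the hash are all zero,
    so boundaries depend on nearby bytes rather than on absolute offsets.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style rolling hash: older bytes shift out of the 32-bit state.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        # Cut on a hash hit (once min_size is reached) or at the hard cap.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because cut points are picked from the bytes themselves, inserting data near the start of a file usually shifts only the nearby chunks, and later chunks tend to come out byte-identical, so they can be deduplicated without any knowledge of the file format. Using row groups as chunks instead relies on the Parquet layout to supply the boundaries.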