Hacker News
by liuliu 1703 days ago
Right. That's why I am more interested in Arrow than Parquet. Retrofitting computation onto a purely compressed storage format would be harder than extending a memory-mapped, computation-oriented format to long-term storage. Arrow already made some good choices for data exchange over the wire, and these translate to data exchange over time.

Of course, I am only dealing with a few hundred GiB of data, so I'm not sure whether Arrow breaks down at larger scale.
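
To make the memory-mapped versus compressed-storage distinction concrete, here is a toy sketch using only the Python standard library. It is not Arrow's or Parquet's actual on-disk layout: the raw file is just native-endian int64 values, the compressed file is a single zlib blob, and all function names are made up for illustration. The point is only that the uncompressed layout can be mapped and sliced in place, while the compressed one has to be decoded before any computation.

```python
import mmap
import os
import tempfile
import zlib
from array import array


def write_raw_column(path, values):
    """Store int64 values uncompressed, in the same layout they have in memory."""
    with open(path, "wb") as f:
        f.write(array("q", values).tobytes())


def write_compressed_column(path, values):
    """Store the same bytes as one zlib blob (a stand-in for a storage format)."""
    with open(path, "wb") as f:
        f.write(zlib.compress(array("q", values).tobytes()))


def sum_slice_mmap(path, start, stop):
    """Map the file and sum values[start:stop]; only the touched pages are read."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Reinterpret the mapped bytes as int64 without copying them.
            with memoryview(m) as raw, raw.cast("q") as view:
                return sum(view[start:stop])


def sum_slice_compressed(path, start, stop):
    """Decompress the whole column first, then sum values[start:stop]."""
    with open(path, "rb") as f:
        data = zlib.decompress(f.read())
    values = array("q")
    values.frombytes(data)
    return sum(values[start:stop])


with tempfile.TemporaryDirectory() as tmp:
    column = range(1_000_000)
    raw_path = os.path.join(tmp, "col.raw")
    zip_path = os.path.join(tmp, "col.z")
    write_raw_column(raw_path, column)
    write_compressed_column(zip_path, column)

    a = sum_slice_mmap(raw_path, 10, 20)
    b = sum_slice_compressed(zip_path, 10, 20)
    raw_size = os.path.getsize(raw_path)
    zip_size = os.path.getsize(zip_path)

print(a, b, raw_size, zip_size)
```

Both functions return the same sum, but the compressed file is smaller on disk while the raw file supports random access without a decode step. That is the trade-off in miniature; real formats add schemas, chunking, and per-column encodings on top.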