|
|
|
|
|
by azimafroozeh
326 days ago
|
|
Arrow is primarily an in-memory format, while Parquet is commonly used for on-disk storage. Typically, data is stored in Parquet and then read into memory as Arrow.
FastLanes is a new on-disk file format, comparable to Parquet, but offers around 40% better compression and faster decoding thanks to its data-parallel encoding design. Disclaimer: I'm the first author of the paper. |
|
This is referenced in the link above. https://arrow.apache.org/docs/python/ipc.html
Unfortunately I'm stuck with CSV at work for now.