by TechBro8615
1875 days ago
> unless "disk" is super fast and thus more likely memory, and your data is ephemeral, you probably shouldn't

Can you elaborate why Arrow is not a good format for storing to disk? If you’re using it for in-memory querying, why would you not want to also serialize it directly to disk instead of using some intermediary format?

Performance: Arrow does not do significant compression. Feather started adding it, but that adds even more change risk. Parquet, ORC, and Arrow are all fairly similar, so until Arrow catches up and stabilizes, I'd stick with Parquet/ORC. We do GPU stuff and already get in-GPU decompression, so that's been a win/win.
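
To give a rough feel for the size gap being described, here is a minimal stdlib-only sketch. It does not use Arrow, Feather, Parquet, or ORC at all: the `array` buffer is just a stand-in for an uncompressed fixed-width column, `zlib` is a stand-in for whatever codec a compressed on-disk format would apply, and the column contents are made up for illustration.

```python
import zlib
from array import array

# Hypothetical low-cardinality column: 100,000 int64 values cycling through 0..9.
column = array("q", (i % 10 for i in range(100_000)))

# Raw fixed-width buffer, analogous to writing the in-memory layout as-is.
raw = column.tobytes()

# General-purpose compression as a stand-in for a compressed on-disk format.
packed = zlib.compress(raw, 6)

raw_size = len(raw)
packed_size = len(packed)

# Round-trip to confirm the compression is lossless.
restored = array("q")
restored.frombytes(zlib.decompress(packed))

print(f"raw bytes:        {raw_size}")
print(f"compressed bytes: {packed_size}")
print(f"round-trip ok:    {restored == column}")
```

Real data will compress far less dramatically than this toy column, and the trade-off is the CPU (or GPU) time spent decompressing on read, which is why having decompression close to where the data is consumed matters.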