by hintymad 1261 days ago
Thanks for sharing your insights. Any comments on Feather vs Parquet? If we don't need to support tools that can only interact with Parquet, how will Feather pan out as a Parquet alternative (or can Feather not be such an alternative at all)?

I recently looked into this as well, specifically how the two formats differ. As it stands right now, the “Feather” file format seems to be a synonym for the Arrow IPC file format, or “Arrow files” [0]. There should be basically no overhead when reading it into the Arrow memory format [1]. Parquet files, on the other hand, are stored in a different format and therefore incur some overhead when read into memory, but they offer more advanced mechanisms for on-disk encoding and compression [1].

As far as I can tell, the main trade-off seems to be deserialization overhead vs on-disk file size. If anyone has more information or experience with the topic, I'd love to hear it!

[0] https://arrow.apache.org/faq/#what-about-the-feather-file-fo...
[1] https://arrow.apache.org/faq/#what-is-the-difference-between...

EDIT:

More information: https://news.ycombinator.com/item?id=34324649

This is also my understanding - see https://news.ycombinator.com/item?id=34324649
Thanks! Just stumbled across your comment as well.