by pcmoritz 3170 days ago
Author here. From our perspective, CapnProto has similar characteristics to FlatBuffers, and the reasons to prefer Arrow over it are the same: we would need to develop a mapping from Python types to CapnProto from scratch, whereas Arrow already has many facilities built that are useful for us (Tensor serialization, code to deal with some Python types like datetimes, zero-copy DataFrames, a larger ecosystem for interfacing with other formats like Parquet, reading from HDFS, etc.). It is also designed for Big Data. So Arrow was a very natural choice (and it also supports Windows). Wes is doing some amazing work here!
1 comment

It looks like Arrow uses FlatBuffers internally for its metadata [1]. Seems like the Arrow project provides a lot of scaffolding that would otherwise need to be built for this particular use case.

1: https://arrow.apache.org/docs/metadata.html