by sandGorgon
3210 days ago
Pretty cool, and thanks for that reply! Did you look at something like Arrow/Feather, which looks set to be adopted as the interoperable format for R/Pandas ... and maybe even Spark?
There's been quite a bit of momentum behind optimizing it for huge use cases - https://thenewstack.io/apache-arrow-designed-accelerate-hado... It is based on Google Flatbuffers, but is getting a lot of engineering specifically from a big data/machine learning perspective. Instead of building directly on Protobuf, it might be interesting to build on top of Arrow (in exactly the same way that Feather is built on top of Arrow https://github.com/wesm/feather).
|
We chose protobuf mainly because of its good adoption story in Caffe and its track record of working across many platforms (mobile, server, embedded, etc). We actually looked at Thrift, which is Facebook-owned and equally nice, but our final decision came down to minimizing the switching overhead for existing users of frameworks such as Caffe and TensorFlow.
To be honest, protobuf is indeed a little bit hard to install (especially if you have Python and C++ version differences). We would definitely be interested in taking a look at possible solutions. The serialization format and the model standard are more or less orthogonal, so one may see a world where we can convert between different serialization formats (JSON <-> protobuf as an overly simplified example).
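To make the orthogonality point concrete, here is a minimal sketch. It is not the actual model standard or protobuf itself: the `Layer` record, its fields, and the toy length-prefixed binary layout are all hypothetical stand-ins, and it only uses the Python standard library. The idea is that the model description lives in one place and each wire format is just an encoder/decoder pair around it.

```python
import json
import struct
from dataclasses import dataclass, asdict


# Hypothetical, deliberately tiny "model standard": a list of layers.
@dataclass(frozen=True)
class Layer:
    name: str
    op: str
    units: int


# --- Format 1: JSON (text) ---
def to_json(layers):
    return json.dumps([asdict(layer) for layer in layers])


def from_json(text):
    return [Layer(**fields) for fields in json.loads(text)]


# --- Format 2: toy length-prefixed binary (a stand-in for protobuf) ---
def to_binary(layers):
    out = bytearray(struct.pack("<I", len(layers)))
    for layer in layers:
        for text in (layer.name, layer.op):
            raw = text.encode("utf-8")
            out += struct.pack("<I", len(raw)) + raw
        out += struct.pack("<I", layer.units)
    return bytes(out)


def from_binary(data):
    offset = 0

    def read_u32():
        nonlocal offset
        (value,) = struct.unpack_from("<I", data, offset)
        offset += 4
        return value

    def read_str():
        nonlocal offset
        size = read_u32()
        text = data[offset:offset + size].decode("utf-8")
        offset += size
        return text

    layers = []
    for _ in range(read_u32()):
        name = read_str()
        op = read_str()
        units = read_u32()
        layers.append(Layer(name, op, units))
    return layers


# Converting between formats goes through the shared model description.
def json_to_binary(text):
    return to_binary(from_json(text))


def binary_to_json(data):
    return to_json(from_binary(data))
```

Swapping the binary codec for Arrow, Thrift, or real protobuf would leave `Layer` untouched in this sketch. For actual protobuf messages, the `google.protobuf.json_format` module covers the JSON <-> protobuf direction.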