by quest88 3356 days ago
I'd like to know this too. As a passerby, those seem to have solved serialization, so I'm curious why you need rowfiles instead of e.g. protobuf.

One reason off the top of my head: using such a communication protocol would require changes to the other services consuming it.
So did switching to their homebrew serialization format -- in fact, most of the article is about how they managed the changes (which touched codebases at multiple sites in a fairly large organization).
Those switches all occurred at the pipeline level, leaving the map-reduce platform untouched. Switching our base logs to something like Parquet, Thrift or Protobuf would be a much larger project. We do support writing and reading Parquet to allow us to interface with other big data systems.