| Story time: the whole development of protobuf was... a mess. It was developed and used internally at Google long before it was ever open sourced. Protobuf was designed first and foremost for C++. This makes sense. All of Google's core services are in C++. Yes there's Java (and now Go and to some extent Python). I know. But protobuf was and is a C++-first framework. It's why you have features like arena allocation [1]. Internally there was protobuf v1. I don't know a lot about this because it was mostly gone by the time I started at Google. protobuf v2 was (and, I imagine, still is) the dominant form of. Now, this isn't to be confused with the API version, which is a completely different thing. You would specify this in BUILD files and it was a complete nightmare because it largely wasn't interchangeable. The big difference is with java_api_version = 1 or 2. Java API v1 was built like the java.util.Date class. Mutable objects with setters and getters. v2 changed this to the builder pattern. At the time (this may have changed) you couldn't build the artifacts for both API versions and you'd often want to reuse key protobuf definitions that other people owned so you ended up having to use v1 API because some deep protobuf hadn't been migrated (and probably never would be). It got worse because sometimes you'd have one dependency on v1 and another on v2 so you ended up just using bytes fields because that's all you could do. This part was a total mess. What you know as gRPC was really protobuf v3 and it was designed largely for Cloud (IIRC). It's been some years so again, this may have changed, but there was never any intent to migrate protobuf v2 to v3. There was no clear path to do that. So any protobuf v3 usage in Google was really just for external use. I explain this because gRPC fails the dogfood test. It's lacking things because Google internally doesn't use it. So why was this done? I don't know the specifics but I believe it came down to licensing. While protobuf v2 was open sourced the RPC component (internally called "Stubby") never was. I believe it was a licensing issue with some dependency but it's been awhile and honestly I never looked into the specifics. I just remember hearing that it couldn't be done. So when you read about things like poor JSON support (per this article), it starts to make sense. Google doesn't internally use JSON as a transport format. Protobuf is, first and foremost, a wire format for C++-cetnric APIs (in Stubby). Yes, it was used in storage too (eg Bigtable). Protobuf in Javascriipt was a particularly horrendous Frankenstein. Obviously Javascript doesn't support binary formats like protobuf. You have to use JSON. And the JSON bridges to protobuf were all uniquely awful for different reasons. My "favorite" was pblite, which used a JSON array indexed by the protobuf tag number. With large protobufs with a lot of optional fields you ended up with messages like: [null,null,null,null,...(700 more nulls)...,null,{/*my message*/}]
GWT (for Java) couldn't compile Java API protobufs for various reasons so had to use a variant as well. It was just a mess. All for "consistency" of using the same message format everywhere.[1]: https://protobuf.dev/reference/cpp/arenas/ |
So, protobuf1 which was perfectly serviceable wasn't open sourced, it was rewritten into proto2. In that case migration did happen, and some fundamental improvements were made (e.g. proto1 didn't differentiate between byte arrays and strings), but as you say, migration was extremely tough and many aspects were arguably not improvements at all. Java codebases drastically over-use the builder/immutable object pattern IMO.
And then Stubby wasn't open sourced, it was rewritten as gRPC which is "Stubby inspired" but without the really good parts that made Stubby awesome, IMO. gRPC is a shadow of its parent so no surprise no migration ever happened.
And then Borg wasn't open sourced, it was rewritten as Kubernetes which is "Borg inspired" but without the really good part that make Borg awesome, IMO. Etc.
There's definitely a theme there. I think only Blaze/Bazel is core infrastructure in which the open source version is actually genuinely the same codebase. I guess there must be others, just not coming to mind right now.
Using the same format everywhere was definitely a good idea though. Maybe the JS implementations weren't great, but the consistency of the infrastructure and feature set of Stubby was a huge help to me back in the days when I was an SRE being on-call for a wide range of services. Stubby servers/clients are still the most insanely debuggable and runnable system I ever came across, by far, and my experience is now a decade out of date so goodness knows what it must be like these days. At one point I was able to end a multi-day logs service outage, just using the built-in diagnostics and introspection tools that every Google service came with by default.