Hacker News
by est 130 days ago
> The contract-first philosophy

gRPC/protobuf is largely a Google cult. I've seen too many projects with complex business logic simply give up and embed JSON strings inside pb. Like WTF...?

Everything was good in the beginning, as long as everyone submitted their .proto to a centralized repo. Once one team starts to host their own, things break quickly.

It occurred to me that gRPC could optionally just serve those .proto files in the initial h2 handshake on the wire. It adds just a few kilobytes but solves a big problem.

4 comments

I personally really like gRPC and protobufs. I think they strike a good balance between a number of indirectly competing objectives. However I completely agree with your observation that as soon as you move beyond a single source of truth for the .proto files it all goes to shit. I've seen some horrible things--generated code being committed to version control and copied between repos, .proto files duplicated and manually kept up to date (or not). Both had hilarious failure modes. There is no viable synchronization mechanism except to ensure that each .proto file is defined in exactly one place, that each time someone touches a .proto file all the downstream dependencies on that file are updated--everyone who consumes any code generated from that .proto--and that for every such change clients are deployed before servers. Usually these invariants are maintained by meatspace protocols which invariably fail.
I don't see why any of that would be necessary. There are simple rules for protobuf compatibility and people only need to follow them. Never re-use a field number to mean something else. Never change the type of a field. That's it. Those are the only rules. If you follow them you don't have to think about any of that stuff that you mentioned.
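To make those two rules concrete, here is a minimal sketch of a check you could run when a .proto changes. The dict-based schema representation and the `compat_violations` name are made up for illustration and are not part of any protobuf tooling. Note that a rename is wire-compatible on its own, so the check only flags it as something to review, since it may hide a reused field number.

```python
# Hypothetical schema representation: {field_number: (field_name, field_type)}.
# This is an illustration of the two rules, not a real protobuf API.

def compat_violations(old, new):
    """Return a list of rule violations when a schema moves from old to new."""
    problems = []
    for number, (old_name, old_type) in old.items():
        if number not in new:
            # Dropping a field is fine as long as its number is never reused.
            continue
        new_name, new_type = new[number]
        if new_type != old_type:
            problems.append(
                f"field {number}: type changed from {old_type} to {new_type}"
            )
        elif new_name != old_name:
            # Same number, different name: possibly a reused field number.
            problems.append(
                f"field {number}: renamed from {old_name} to {new_name}, "
                "check that the meaning is unchanged"
            )
    return problems


v1 = {1: ("user_id", "int64"), 2: ("email", "string")}
# Adding a brand-new field number is always allowed.
v2_ok = {1: ("user_id", "int64"), 2: ("email", "string"), 3: ("remember_me", "bool")}
# Changes the type of field 1 and repurposes field 2.
v2_bad = {1: ("user_id", "string"), 2: ("nickname", "string")}

for problem in compat_violations(v1, v2_bad):
    print(problem)
```

Adding fields with fresh numbers never trips the check, which is the whole point of the rules.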
Absolutely! Forward and backward compatibility is one of the wonderful things about protobufs. And that all goes wrong when you try to define the interface in more than one place.

EDIT: also, although the wire protocol may tolerate unknown or missing data, almost always the application doesn't.

EDIT AGAIN: I'm not saying this is how it should be, just that this is the low energy state the socio-technical system seems to arrive at over time. So ideally it should be simple, but due to imperfect decisions it gets horribly complicated over time.

I fail to see how the application will even be aware of unknown data. Explain what practical problem could possibly arise if you think a message has 4 fields and I send you a fifth one.

Edited to reply to your edits: People who are just bozos with computers will never be kept from bozotry by any interchange format. If they lack any semblance of foresight then maybe they simply should get a different line of work. Postel's law is in force here. If you start sending me emails with extra headers my email program is never going to care. Protobufs are the same way.
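For anyone who wants to see why the fifth field is harmless at the wire level, here is a rough sketch of a decoder for the protobuf wire format. It is a simplified illustration and not the real library: it ignores groups and repeated fields, and `parse` and `known_fields` are names invented for this example. Every field carries its own number and wire type, so a reader can step over a field it has never heard of.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return result, pos
        shift += 7


def parse(buf, known_fields):
    """Split a message into fields the reader knows and fields it does not."""
    known, unknown = {}, {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: strings, bytes, nested messages
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        # Unknown fields are set aside instead of causing an error.
        (known if number in known_fields else unknown)[number] = value
    return known, unknown


# Field 1 = 150 (varint), field 2 = 1 (varint), field 5 = "hi" (string).
msg = bytes([0x08, 0x96, 0x01, 0x10, 0x01, 0x2A, 0x02]) + b"hi"

# The reader only knows fields 1 through 4.
known, unknown = parse(msg, known_fields={1, 2, 3, 4})
print(known)    # {1: 150, 2: 1}
print(unknown)  # {5: b'hi'}
```

The application only ever looks at `known`, so it never notices field 5 unless it goes looking for it.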

Apologies for the delay, this site appears to be rate limiting me. Yeah used correctly they're great. But they're almost never used correctly in practice. I agree this is bozotry in the extreme, but it's widespread. To avoid it all they'd need to do is read like 4 pages of well-written accessible documentation, but sadly that bar is too high. I don't blame protobufs! It's just that, somehow, what should be an elegant, simple system turns into a nightmare in practice. Every. Goddamn. Time. Not unlike when people try to use Kafka. That isn't to say the tool shouldn't be used, just that maybe we need a better way to organize/educate/hire engineers so they don't ruin things so badly. Or at least some way to impose an upper bound on the damage they can do. Maybe there's some kind of regularization effect if you force everyone to work with Map<Object, Object> JSON. Or maybe it's just the state everything devolves to eventually.
> Everything was good in the beginning, as long as everyone submits their .proto to a centralized repo. Once one team starts to host their own, things get broken quickly.

Is this an issue with protobufs per se though? It's a data schema. How are people supposed to develop to a shared schema if a team doesn't - you know - share their schema? That could happen with any other particular choice for how schemas are defined.

It's a problem with PB because it requires everything to be typed (unless you use Any), which requires all middleware to eagerly type check all data passing through. With JSON, validation is typically done only by the endpoints, which allows for much faster development.

There was a blog a few years ago, where an engineer working on the Google Cloud console was complaining that simply adding a checkbox to one of the pages required modifying ~20 internal protos and 6 months of rollout. That's an obvious downside that I wish I knew how to fix.

My guess is there's more to that story than just "protobufs don't forward unknown fields" because that's not how they work by default. Take a look at https://protobuf.dev/programming-guides/proto3/#unknowns.

https://kmcd.dev/posts/protobuf-unknown-fields/ discusses the scenario you're hinting at.

It's possible in the story you mention that each of those ~20 internal protos was a different message, and each hop between backends was translating data between nearly identical schemas. In that case, they'd all need to be updated to transport that data. But that's different and the result of those engineers' choice for how to structure their service definitions.

The problem is different. Protobuf's unknown field support is useful if you want to forward a message in its entirety, and it allows you to copy an input message even though it has fields unknown to the middleware. The problem arises because at Google, in order to minimize payloads and storage sizes, they almost always create "intermediate" protobufs that are only used by middleware to talk to other middleware.

Example:

The service that manages the web frontend knows that the new checkbox is auth-related and therefore it has to go into the WebServiceAuthRequest PB message, but it doesn't have the new schema of the WebServiceAuthRequest message with the checkbox field, so it can't create a WebServiceAuthRequest message because it doesn't know which numeric ID to use for the value.

The "common wisdom" at Google was that you have to add a new field starting at the leaves (the storage backends) and work your way up to the middleware, then the web frontends and finally the JS code. And yes, in the worst case it can take two quarters and modifying 20 intermediate services (each with its own ServiceFooRequest protobuf) just to add a new checkbox in the UI.
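A toy sketch of that failure mode, using plain Python dataclasses as stand-ins for the generated message classes. `WebServiceAuthRequest` is the name from the example above; its fields, `FrontendRequest`, `to_auth_request` and `forward_json` are all hypothetical. The point is only that field-by-field translation into an intermediate message drops anything the middleware's schema doesn't list, while a pass-through dict doesn't.

```python
from dataclasses import dataclass


@dataclass
class FrontendRequest:
    """Hypothetical message the web frontend sends, already with the checkbox."""
    user: str
    remember_me: bool  # the new checkbox


@dataclass
class WebServiceAuthRequest:
    """Intermediate message as the middleware currently knows it (old schema)."""
    user: str


def to_auth_request(req):
    # Field-by-field translation: only fields the middleware's schema
    # lists are copied, so the checkbox is silently dropped at this hop.
    return WebServiceAuthRequest(user=req.user)


def forward_json(payload):
    # Pass-through style: the middleware copies everything it was given
    # and leaves validation to the endpoints.
    return dict(payload)


typed = to_auth_request(FrontendRequest(user="alice", remember_me=True))
print(hasattr(typed, "remember_me"))   # False

untyped = forward_json({"user": "alice", "remember_me": True})
print(untyped["remember_me"])          # True
```

With 20 hops like `to_auth_request`, every one of them has to learn the new field before the checkbox value reaches storage.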

And in writing this I came up with a way to avoid the problem, but it would require an incompatible change to the PB wire format. Hmmm...

> As it occurred to me, gRPC could optionally just serve those .proto files in the initial h2 handshake on the wire

Do you mean the reflection protocol, or some other .proto files?

It does have discovery built in. Is that what you want?
You mean grpc.reflection.v1alpha.ServerReflection? Close enough, but sadly it's not generally enabled.