Hacker News new | ask | show | jobs
by ljm 658 days ago
I used protobuf extensively with Kafka and I remember having to be quite particular about how the proto files/packages were arranged so as to avoid naming and versioning conflicts.

We never generated go code from it, but it took a bit of fine tuning to get generated code that felt at least somewhat ergonomic for Ruby and Typescript. It usually involved using some language specific alternative to protoc for that language because the code generated by protoc itself was practically unreadable. IIRC in the case of Typescript I had to write a script that messed around with the directory structure so you could use sensible import paths and aliases, because TS itself wasn't discovering them automatically without it.

That's stuff you can work with and solve technically. Initial faff but it's one and done. The worst problem I had with it was protobuf3 stating every field is optional by default, and the company I worked at basically developed a custom schema registry setup that declared every field as required by default, with a custom type to mark a field as optional. It turned literally every modification to a protobuf definition into a breaking change and, what's worse, it wasn't done end to end and the failures were silent, so you'd end up with missing data everywhere without knowing it for weeks.

1 comments

>the failures were silent, so you'd end up with missing data everywhere without knowing it for weeks.

This is the main reason I think protobuf's "zero values are simply not communicated" is fundamentally wrong. Missing data is one of the easiest flaws to miss, and it tends to cause problems far away from the source of the flaw, in both time and space, which makes it extremely hard to notice and fix.

I get the arguments in its favor. I get the arguments in favor of "everything is optional by default". But presence is utterly critical in detecting flaws like this, and it can't always be addressed in a backwards-compatible way by application code. E.g. in proto's case, it's not possible because that data does not exist, and adding it would change the binary data. Even binary-compatible workarounds like "add a field with a presence fieldset" aren't usable because that unrecognized field will be silently ignored by older consumers, so you're right back to where you started.

It needs to exist from day 1 or you're shooting your users in the feet.