by ghostwriter 1658 days ago
> The "required fields considered harmful" opinion was a hard lesson learned through real experience -- the experience of repeated outages of large, complex systems like Google Search, GMail, etc. Certainly, prior to this experience, everybody assumed required fields were a good idea.

The systems you mentioned share a common trait: they generally allow a permissive representation of domain data, where many fields can be omitted or replaced by zero-values / defaults, because most of those fields, by their nature, describe things that are optional and tolerant of noise and accidental mistakes (percentile precision). How much A/B test data and user-tracking stats do Gmail / Google Search encode and process as protobuf?

If you compare that to a simulation engine's data stream or a collaborative BIM / CAD model, you will find that almost everything that travels over the network in these systems is required to be unambiguous and strictly consistent at the sending and receiving sites. The binary representations of physical relations in these models are not just scalar values that can tolerate a default assigned by a protocol parser when a field is missing. Scalar values appear at UI rendering / output formatting. Most of the time you deal with relations and equations, and you need to be able to differentiate between missing-by-intent and missing-by-mistake cases. Zero-values will not help either, because a zero value itself can be represented in multiple ways. Depending on the model being evaluated and the context it is evaluated in, values can legitimately come in different precisions, units, ratios (discrete vs dense) and so on, and those are not distinct fields; their combinations are often mutually exclusive. This is not the kind of validation you want to delegate to call sites implemented in different languages and maintained by different teams with different levels of technical capacity. The invariants and constraints have to be encoded into the protocol, and required fields are a low-level "must have" part of it.
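
To make the missing-by-intent vs missing-by-mistake point concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `Length` and `Unconstrained` types, the `"constraint"` key, and the `parse_lenient` / `parse_strict` helpers are assumptions, not protobuf's API or any real system's schema.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Length:
    """A length whose unit travels with the value (hypothetical type)."""
    value: float
    unit: str  # e.g. "mm" or "m"


@dataclass(frozen=True)
class Unconstrained:
    """The sender says on purpose that nothing applies (missing-by-intent)."""


Constraint = Union[Length, Unconstrained]


class MissingRequiredField(ValueError):
    """A required field is absent from the payload (missing-by-mistake)."""


def parse_lenient(payload: dict) -> Length:
    # Permissive parsing: absent fields silently become zero-values / defaults.
    return Length(float(payload.get("value", 0.0)), payload.get("unit", "m"))


def parse_strict(payload: dict) -> Constraint:
    # Strict parsing: the field must be present, and "nothing applies"
    # has to be spelled out explicitly by the sender.
    if "constraint" not in payload:
        raise MissingRequiredField("constraint")
    body = payload["constraint"]
    if body == "unconstrained":
        return Unconstrained()
    return Length(float(body["value"]), body["unit"])
```

With the lenient parser, an empty payload and an explicit `{"value": 0.0, "unit": "m"}` produce the same `Length`, so the receiver cannot tell a dropped field from a real zero. The strict parser rejects the empty payload at the protocol boundary instead of leaving that check to each call site.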