It should be pointed out that while this was once accepted as gospel, it has been coming under a lot of fire lately. HTML, once arguably the flagship of this principle and its greatest success (I say "arguably" because you can also argue TCP), no longer works this way. HTML5 specifies how bad input should be handled, and if you accept "how to process nominally bad input" as the "real" standard, HTML is now strict in what it accepts. It's just that what it is strictly accepting appears quite flexible.
I'm not a big believer in it myself; "liberal in what you accept" and "comprehensible for security audits" are not quite directly opposed, but certainly work against each other fairly hard. There's a time and a place for Postel's principle, but I consider it more an exception for exceptional circumstances rather than the first thing you reach for.
> HTML5 specifies how bad input should be handled, and if you accept "how to process nominally bad input" as the "real" standard, HTML is now strict in what it accepts.
HTML5 is a shining example of "be liberal in what you accept", and its improved documentation of how to handle bad input (note that bad input is still permitted!) greatly expands HTML's "be conservative in what you send". I think HTML5 is a perfect example of the Robustness Principle.
The "bad input" is, arguably, no longer bad input. The standard has been redefined to strictly specify what to do with that "bad" input, and if you don't handle it exactly as the standard specifies, it won't do what you "want" it to do.
That's not "being liberal in what you accept". Being liberal in what you accept is what we had before HTML5, where the standard specified the "happy case" and the browsers were all "liberal in what they accept", in different ways. I am not stretching any definitions here or making anything up, because "liberal in what you accept" behaviors in the real world demonstrably work this way; everybody is liberal in different ways. It can hardly be otherwise; it isn't "being liberal in what you accept" if you accept exactly what the standard permits, after all. When liberality is permitted, what happens in practice is that out-of-spec input is handled in whatever way is most convenient for the local handler, in the absence of any other considerations (such as deliberately trying to be compatible with the quirky internal details of the competition). Browsers leaked a lot about their internal differences if you observed how they tended to handle out-of-spec input. Thus a standard like HTML5, which clearly specifies how to handle all cases, is fundamentally not "liberal in what it accepts" anymore.
Instead, it is a rare, if not unique, example of a standard that has been rigidly specified after a couple of decades of seeing exactly how humans messed up the original standard. It is, nevertheless, now quite precise about what to do about the HTML you encounter. You aren't allowed to be "liberal", you're told exactly what to do.
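To make "everybody is liberal in different ways" concrete, here is a toy sketch. It has nothing to do with real browser code: `lenient_a` and `lenient_b` are hypothetical handlers for a made-up "integer field", each recovering from out-of-spec input in whatever way was easiest to write.

```python
import re

def lenient_a(text: str) -> int:
    """Hypothetical handler A: read the leading digits and ignore the rest."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else 0

def lenient_b(text: str) -> int:
    """Hypothetical handler B: throw away every non-digit, then parse."""
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else 0

# Both agree on in-spec input and silently disagree on everything else.
for sample in ["42", "1,5", "px10"]:
    print(repr(sample), lenient_a(sample), lenient_b(sample))
```

Both handlers are "liberal", and a sender who tested only against one of them would never notice the difference. A spec that picks one recovery rule and requires it of everyone removes the disagreement, which is roughly the move the comment above attributes to HTML5.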
> The "bad input" is, arguably, no longer bad input.
What? Yes it is! Defined behavior for invalid markup doesn't make that markup valid.
HTML5 doesn't refuse to accept anything that HTML 4 accepted. Defining behavior for invalid markup does not even impact "be liberal in what you accept"; the scope of what is accepted hasn't changed. It affects "be conservative in what you send"; in particular, it more closely matches that half of the principle.
> HTML5 doesn't refuse to accept anything that HTML 4 accepted.
It does. It doesn't accept NET syntax, i.e., `<p/This is the contents of a p element/`. (No browser ever supported this, but because HTML 4 is defined to be an SGML application and its DTD allows NET syntax to be used, it is theoretically conforming HTML 4.)
I think web browsers are a better example of it. The HTML parsing/DOM tree system usually is pretty forgiving about missing/malformed tags, but still always returns a result rendered as if the HTML had been written to spec.
No, we decided that what was important was interoperable implementations: it doesn't matter how you achieve that goal. What's needed are specs that define how to handle all input (it doesn't matter what the spec says: it can define how to handle every single last possible case as HTML5 does, or it can define a subset of inputs to trigger some fatal error handling as XML 1.0 does) and sufficient test suites that implementers catch bugs in their code before it ships (and the web potentially starts relying on their quirks).
The problem with IE6 was the fact that it wasn't interoperable (in many cases, every implementation was conforming according to the spec, and there were frequently differences in behaviour in valid input that the spec didn't fully define) and the fact that it had lots of proprietary extensions (and being strict and disallowing any extensions makes it hard to extend formats in general in a non-proprietary way; one option is strict versioning but then you end up with a million if statements all over the implementation to alter behaviour depending on the version of the content).
Some of the worst issues with IE, the ones that took the longest for other browsers to match, were things like table layout. IE quite closely matched NN4, having invested a lot in reverse-engineering it because the web depended on the NN4 behaviour in places; Gecko had rewritten all the table layout code from NN and didn't match its behaviour, having been written according to the specs, which scarcely define how to lay out any table even today.
No, be strict, fail fast, and report the errors. Robustness is not achieved by muddling through on a misinterpretation, it is achieved by working toward correctness.
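As a rough illustration of "be strict, fail fast, and report the errors", here is a minimal sketch. The `name=integer` line format and the `parse_settings` function are invented for this example; they are not taken from any real protocol or library.

```python
def parse_settings(text: str) -> dict[str, int]:
    """Parse 'name=integer' lines; raise on the first line that does not conform."""
    settings: dict[str, int] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        name, sep, value = line.partition("=")
        # No trimming, no guessing: anything outside the format is an error.
        well_formed = (
            bool(sep)
            and name.isidentifier()
            and value.isascii()
            and value.isdecimal()
        )
        if not well_formed:
            raise ValueError(f"line {lineno}: expected name=integer, got {line!r}")
        if name in settings:
            raise ValueError(f"line {lineno}: duplicate setting {name!r}")
        settings[name] = int(value)
    return settings
```

Feeding it `"retries=3\ntimeout = 30"` raises a `ValueError` that names line 2, so the sender learns about the stray spaces immediately instead of getting a silently different configuration.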
I think "fail fast" can work well in a closed and controlled system but when accepting input from many other parties it's not as practical or desirable.
My hopefully-better-expressed point is that it's easy to interpret the robustness principle in different ways, some of which lead to better code, and some of which... don't.
No worries. I think the philosophy is rooted in the fact that you can't control what other parties will send you; you can only control what you send in response. So that's the main thing to keep in mind.
It's sort of like the good life advice you hear occasionally: you can't control other people's actions; only your own. Emotional maturity, etc.
Isn't that missing the point of the robustness principle, which is more related to, say, networking: accepting things that aren't strictly to RFC spec, but matching the spec to the letter when you send things?
Be liberal in a well-defined way in what you accept. Accepting a variety of input is fine, as long as it is formally defined and recognized. The robustness principle is not an excuse to be sloppy with the input.
(Why? See [2] in my other post, "The Science of Insecurity".)
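One way to read "liberal in a well-defined way" is sketched below. The header-line grammar, `read_header`, and `write_header` are all assumptions made up for this example, not any real protocol's rules. The reader tolerates exactly the variations the grammar spells out (optional spaces or tabs, any letter case in the name) and rejects everything else, while the writer emits a single canonical form.

```python
import re

# The accepted language, written down in full: a name, a colon, a value,
# with optional spaces or tabs around the colon and at either end of the line.
HEADER = re.compile(
    r"[ \t]*([A-Za-z][A-Za-z0-9-]*)[ \t]*:[ \t]*(\S(?:.*\S)?)[ \t]*"
)

def read_header(line: str) -> tuple[str, str]:
    """Accept every line in the grammar above, and nothing outside it."""
    match = HEADER.fullmatch(line)
    if match is None:
        raise ValueError(f"not in the accepted grammar: {line!r}")
    return match.group(1).lower(), match.group(2)

def write_header(name: str, value: str) -> str:
    """Always send the one canonical spelling: 'Name: value'."""
    return f"{name.title()}: {value}"
```

With these definitions, `read_header("  content-TYPE :  text/html  ")` gives `("content-type", "text/html")`, and `write_header` turns that back into `Content-Type: text/html`. The flexibility lives entirely in the one regular expression, so it can be audited in one place.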