by SigmundA 2306 days ago
>Except those are not the alternatives. The alternatives are consistently rendered websites or inconsistently rendered websites. If browsers had strictly enforced HTML syntax from the beginning, no one would ever have built websites with "little issues in the markup".

That's not reality. If everyone got perfectly formed input we wouldn't be having this debate; the reality is that malformed input occurs, so what do you do: reject it, or accept it and try to do something with it? XHTML simply rejects malformed markup and you get a blank page; HTML tries to make sense of it and render something.

>IP stacks do not accept randomly misformatted IP packets. The result is obviously not that you constantly encounter internet services that you can not access because your IP stack is picky about broken IP packets, the result is that no one ever sends you broken IP packets.

So you never heard of ECN? Setting the ECN bits was technically incorrect, depending on how pedantic you were in the interpretation, and some stacks rejected packets if the bits weren't set to zero. Due to the robustness principle most stacks ignored these bits, allowing others to use them for ECN and allowing a graceful update to the spec. The stacks that took your stance and rejected such packets, however, were simply roadblocks in the adoption.

>No, it just isn't. You are just looking at a very small part of the consequences of this implementation strategy that indeed happens to be positive, but completely ignoring the big picture of all the externalities and other indirect damage that result from it.

I'm not ignoring anything, I am just pointing out reality: the real world is messy and the stacks that try to keep working under messy conditions seem to be prevailing. It's not pretty and I don't deny the issues that arise, but here we are communicating on the largest, most successful computer network ever built, using a protocol and a markup language built with Postel's law in mind.

>Erm ... no? The reason why XHTML was abandoned was because people are incompetent at writing software, and there existed an alternative that allowed them to keep their idiotic practices, including all the vulnerabilities and interoperability problems that result from those, so that's what people did.

I think most who know the history there would disagree with this opinion [1]. It was obvious to me at the time why XHTML would fail: even though I thought it a cleaner solution, I realized that's what was holding it back. It was much better to see your page come up with maybe a weird rendering artifact than to have the browser render nothing and throw an error because some small part was malformed.

>How does that follow? And what does that have to do with anything?

Because complaining about security vulnerabilities found in some of the most used software in the world while comparing to something that no one uses doesn't help your point.

>Relevant ... for what?

Uh, gee, I don't know, maybe Postel's law is kinda relevant when discussing TCP because Postel wrote the spec, you know, like what you asked in the post before? What kind of game are you playing here?

1. https://thehistoryoftheweb.com/when-standards-divide/


> That's not reality, if everyone got perfectly formed input we wouldn't be having this debate,

Erm ... you do understand that, you know, there is feedback involved in this? That I am obviously not saying that no one would ever have typed broken HTML into a file if browsers had rejected broken HTML from the start?

I mean, it's even the norm for implementations of other computer languages to be rather strict about syntax, and it doesn't hinder their popularity with the same audience. The exact same people who produce garbage HTML do so using Perl or PHP or Ruby or ... whatever. And whatever you otherwise think about those languages, none of them will just make shit up when there is a syntax error in your program, they will simply reject it. And no, that does not mean that I am claiming that no one has ever made a syntactical mistake when writing code in those languages. But, you know, people are actually capable of fixing those mistakes when they are pointed out to them.

> So you never heard of ECN? The ECN bits being set were technically incorrect depending on how pedantic you were in the interpretation and some stacks rejected packets if the bits weren't set to zero. Due to the robustness principle most stacks ignored these bits allowing others to use them for ECN, allowing a graceful update to the spec. The stacks that took your stance however and rejected were simply roadblocks in the adoption.

Erm ... what? That's almost fractally wrong!?

None of the ECN problem was one of pedantry, it was simply one of a broken specification, namely the TCP specification. "Reserved for future use. Must be zero." is simply a bad specification. If you specify an extension mechanism, you have to always specify how the extension mechanism is supposed to work. What you call the pedantic interpretation is a perfectly valid interpretation of what the text says. You are just looking at it in hindsight, with the idea that it's supposed to support the operation of ECN, and then it's obviously a problem--but people who implemented TCP stuff before there was ECN could not possibly know that that is how people would expect to use this if the TCP specification doesn't specify that. There is nothing wrong with extension mechanisms that work by having the recipient discard messages with flags it doesn't know. That's just not what ECN chose to do, but that is kinda ECN's fault. You might just as well have ended up with a situation where someone would have tried to build an extension that assumes that recipients discard segments with unknown flags, and everyone would have been pointing fingers at those who chose to ignore the flags instead, and how they were pedantic to ignore the flags just because the specification does not explicitly say that such segments are invalid. It's just an accident of history that most implementations chose to ignore unknown flags, and therefore people now point to the exception, without any basis other than them being the majority.
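To make the two readings concrete, here is a minimal sketch of two receivers looking at the same 16-bit offset/flags word of a TCP header. The layout (4 bits data offset, 6 reserved bits, 6 control bits) is the one from RFC 793, and ECE/CWR are the two formerly reserved bits that RFC 3168 later assigned to ECN. The function names are hypothetical and not taken from any real stack; the point is only that both behaviours are defensible readings of "Must be zero".

```python
# Hypothetical sketch, not code from any real TCP stack.
# 16-bit word per RFC 793: 4 bits data offset, 6 bits reserved, 6 control bits.
RESERVED_MASK = 0x0FC0  # the six "Reserved for future use. Must be zero." bits
FLAGS_MASK = 0x003F     # URG, ACK, PSH, RST, SYN, FIN

SYN = 0x02
ECE = 0x40  # assigned by RFC 3168, inside the old reserved field
CWR = 0x80  # assigned by RFC 3168, inside the old reserved field


def strict_receiver(word):
    """Reading 1: a segment with non-zero reserved bits is invalid, drop it."""
    if word & RESERVED_MASK:
        return None
    return word & FLAGS_MASK


def lenient_receiver(word):
    """Reading 2: reserved bits are simply not looked at."""
    return word & FLAGS_MASK


plain_syn = (5 << 12) | SYN
ecn_syn = (5 << 12) | ECE | CWR | SYN  # ECN-setup SYN as defined by RFC 3168

print(strict_receiver(plain_syn), lenient_receiver(plain_syn))
print(strict_receiver(ecn_syn), lenient_receiver(ecn_syn))
```

Both receivers agree on the plain SYN; they only diverge on the ECN-setup SYN, which is exactly the case the original text did not specify.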

Also, obviously, the "robustness principle" did not allow for a graceful update to the spec. The fact that a graceful update was not possible is the whole reason why you mentioned ECN at all. And that is not necessarily a result of failing to follow the robustness principle, as the robustness principle really doesn't tell you anything useful. All you can do with it is to point at things in hindsight and say "if everyone had built this the same way, then things would be compatible now!" But the robustness principle is useless for actually achieving that. For any format specification, there is an almost infinite number of ways you can deviate from the specification where humans could look at any individual one of those deviations and come to an agreement as to how that deviating message could reasonably be interpreted. And any one of those deviations could in principle be implemented as part of the corresponding parser. But implementing a parser that "correctly" interprets all of those possible deviations is at the very least a major undertaking, and usually even impossible due to contradictions between various deviations when they appear in combination.

And that is why hindsight is misleading: In hindsight, you only see one particular (small set of) deviation(s) causing interoperability problems, and it would almost always have been possible to make every parser coherently interpret those deviations just fine, and if everyone had done that, then you would not have any interoperability problems. But that isn't the perspective of someone who initially builds the implementation. They can only either strictly follow the spec (which works perfectly if everyone does so and the spec isn't broken) or they can increase complexity of and effort required for their implementation an order of magnitude or more to accept close to anything that could happen (which no one does for obvious reasons) or they can implement a random selection of deviations they like (which then leads to interoperability problems and the view in hindsight that everyone else could easily have done the same, which, of course, they couldn't, because they couldn't know what others were doing). Of course, there is a simple solution to that last approach: If you want to implement deviations from the agreed-upon spec but you don't want to run the risk of creating interoperability problems, you could get together with all the other implementers and talk about which deviations everyone is going to implement. But obviously, that's just the first approach in disguise: After you have agreed on the deviations, they aren't deviations anymore, you have simply created a new spec, and everyone then strictly follows that new spec.
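A toy illustration of the third option, under made-up assumptions: suppose the spec says a message is a comma-separated list of decimal integers. The format and all three function names are invented for this sketch. Two implementers each pick a "reasonable" way to be liberal about an empty item, and they pick different ones.

```python
import re


def parse_strict(text):
    """Accept exactly what the (hypothetical) spec allows: digits separated by commas."""
    if not re.fullmatch(r"[0-9]+(,[0-9]+)*", text):
        raise ValueError(f"malformed list: {text!r}")
    return [int(item) for item in text.split(",")]


def parse_lenient_skip(text):
    """Liberal guess A: an empty item was probably a typo, skip it."""
    return [int(item) for item in text.split(",") if item.strip()]


def parse_lenient_zero(text):
    """Liberal guess B: an empty item probably means zero."""
    return [int(item) if item.strip() else 0 for item in text.split(",")]


broken = "1,,3"
print(parse_lenient_skip(broken))  # [1, 3]
print(parse_lenient_zero(broken))  # [1, 0, 3]
try:
    parse_strict(broken)
except ValueError as err:
    print("rejected:", err)
```

Each lenient parser looks sensible in isolation, yet the same bytes now mean two different things depending on who receives them, and the sender never gets told that anything is wrong.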

Essentially, what is happening here is that you see one interpretation of something that the spec doesn't actually specify as obvious. And then you claim that the solution to interoperability problems is that everyone does the obvious thing. But you fail to recognize that the whole problem we are trying to solve with specifications in the first place is that what seems obvious is different for different people. Which is why this (a) can not work and (b) obviously in practice does not work. You can not solve the problem of people having different approaches to problems by simply saying "they should just all have the same approach" while at the same time saying that methods to create agreement (i.e., specifications) should not be taken too seriously.

> I'm not ignoring anything, I am just pointing out reality, the real world is messy and the stacks that try to keep working under messy conditions seem to be prevailing. It's not pretty and I don't deny the issues that arise, but here we are communicating on the largest most successful computer network ever built using a protocol and a markup language built with Postel's law in mind.

Then your points are just irrelevant? I never said that broken systems can not be successful, did I? Yes, there clearly are evolutionary advantages to externalizing costs, and taking risks can pay off. But there are also other parties who have to pay those externalized costs, and taking risks can also end in a catastrophe. Externalizing costs is still an asshole move (and is generally frowned upon by society when people understand that that is what is happening) and whether the risks taken by the web, for example, have actually paid off is far from obvious.

Also, possibly all of this was built with Postel's law in mind. But what I would be interested in is whether that was to our benefit. Just because something was a factor in creating a certain overall positive situation does not mean that therefore that factor made that situation better than if it hadn't been there. In particular, evolutionary success does not mean that a different approach would not have produced a better result.

> I think most who know the history there would disagree with this opinion [1], it was obvious to me at the time why XHTML would fail even though I thought it a cleaner solution, I realized that's what was holding it back. It was much better to see your page come up with maybe a weird rendering artifact than just have the browser render nothing and throw an error if some small part was malformed.

How does that contradict what I said? Yes, it was obvious that XHTML would fail due to the massive incompetence of developers ... your point being?!

> Uh gee I don't know maybe Postel's law is kinda relevant when discussing TCP because Postel wrote the spec you know like what you asked in the post before? What kind of game are you playing here?

I am not sure what kind of game you are playing, but I had the impression that you were trying to make a point and not just state the historical fact that that's where Postel formulated the "robustness principle". Yeah, I agree, that's what he did. And it was a bad idea.

>And whatever you otherwise think about those languages, none of them will just make shit up when there is a syntax error in your program, they will simply reject it. And no, that does not mean that I am claiming that no one has ever made a syntactical mistake when writing code in those languages. But, you know, people are actually capable of fixing those mistakes when they are pointed out to them.

Except it's pretty common now for programming languages to add quality-of-life changes that loosen some of the strict parsing rules, such as trailing commas or optional semicolons. Same with whitespace: many languages don't pay much attention to it, and then you have a formatter that is strict about it (gofmt). This is Postel's law in action: liberal acceptance, strict output. The alternative is strict adherence to whitespace, with no need for a formatter: just have the compiler reject it and put the burden on the programmer.

>Also, obviously, the "robustness principle" did not allow for a graceful update to the spec.

Again, your opinion is not the historically accepted one: ECN is held up as an example of the robustness principle having been followed in most stacks, with a few problem stacks that did not follow it causing some issues [1].

>Then your points are just irrelevant?

Then your points are just irrelevant? We can play this game forever. Just the fact that you are using HTML and not XHTML, and TCP under that, to write these should make some relevant point that you can't seem to see.

>Yes, it was obvious that XHTML would fail due to the massive incompetence of developers ... your point being?!

Or, more likely, all these developers, myself included, weren't incompetent; when given the choice, the strictness of XHTML simply lost to the liberalness of HTML, proving Postel's law again. Messy and robust won over clean and fragile again, that's the point, get it?

>I am not sure what kind of game you are playing, but I had the impression that you were trying to make a point and not just state the historical fact that that's where Postel formulated the "robustness principle". Yeah, I agree, that's what he did. And it was a bad idea.

>As for TCP ... how is it relevant that Postel wrote the spec? Does that mean that the vulnerabilities in TCP never happened? Or are you saying that modern TCP implementations try to accept any crap whatsoever? (No, they don't, of course they don't, people have actually learned that that's a bad idea.)

Going back to your original question, since you're having a hard time connecting the dots: Postel wrote the spec for TCP and put his law in the spec as guidance. ECN was developed taking advantage of that principle, and most stacks accepted the malformed packets because of it. There are other examples of this [2]; TCP is complicated, and if stacks didn't follow Postel's law they would never get anything done on the internet.

1. https://tools.ietf.org/html/draft-ietf-tcpm-generalized-ecn-...
2. https://www.snellman.net/blog/archive/2016-02-01-tcp-rst/

> Except it's pretty common now for programming languages to add quality of life changes that loosen some of the strict parsing rules, such as trailing commas or optional semicolons. Same with whitespace, many languages don't pay much attention to it then you have a formatter that is strict about it (gofmt). This is Postel's law in action, liberal acceptance strict output.

Erm ... no, it's obviously not? Or at least not in a way that is relevant to this discussion. I am obviously not objecting to specifying languages that give you a lot of freedom in how you format things, so what is the point of bringing up that you could interpret the robustness principle to mean just that? I am obviously objecting to accepting input that does not conform to the respective relevant specification, and the fact that making languages more flexible in their formatting is often useful has no relevance to that whatsoever.

You interpret some term to mean a broad range of things, I point out that one of those things is a bad idea, and your defense is that one of the other things is good ... how is that even an argument? How does that change that what I pointed out is a bad idea?

> The alternative is strict adherence to whitespace then no need for a formatter, just have the compiler reject it and put the burden on the programmer.

No, the alternative is strict adherence to the language specification. Or, really, it's not an alternative at all, because there is zero contradiction between specifying a language with flexible whitespace grammar (or separator grammar or whatever) and then strictly enforcing that grammar (and thus obviously avoiding interoperability problems).
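As a small illustration, using Python's own literal syntax purely as a convenient example (the helper name `accepts` is made up for this sketch): the grammar itself permits optional whitespace and a trailing comma, and the parser still rejects anything outside that grammar instead of guessing.

```python
import ast


def accepts(source):
    """Return True if `source` is a valid Python literal, False otherwise."""
    try:
        ast.literal_eval(source)
    except (SyntaxError, ValueError):
        return False
    return True


# Freedom that the grammar itself specifies: all of these are valid.
print(accepts("[1, 2, 3]"))          # True
print(accepts("[1,2,3,]"))           # True, trailing comma is part of the grammar
print(accepts("[ 1,\n  2,\n  3 ]"))  # True, whitespace inside brackets is free

# Input outside the grammar: rejected, not "repaired".
print(accepts("[1, 2 3]"))           # False, missing comma
```

The flexibility lives in the specification, so every conforming parser agrees on it; there is no need for any parser to accept the last input.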

> Again your opinion is not shared historically, ECN is held up as an example of the robustness principle having been followed in most stacks, with some problem ones that did not causing some issues [1].

In other words: Your position is unfalsifiable? If there are no interoperability problems due to everyone interpreting messages identically, then that is obviously due to the robustness principle, and if there are interoperability problems because implementations deviate in how they interpret messages, then that is also obviously a success of the robustness principle? Is there any scenario where that robustness principle would not count as successful?

> Then your points are just irrelevant? We can play this game forever. Just the fact that you are using HTML and not XHTML and TCP under that to write these should make some relevant point that you can't seem to see.

How is the fact that I am using something in any way relevant to the question of whether an alternative would have avoided interoperability problems and vulnerabilities?

> Or more likely all these developers weren't incompetent including myself, just when given the choice the strictness of XHTML lost to the liberalness of HTML proving Postel's law again. Messy and robust won over clean and fragile again, that's the point, get it?

How is it relevant that HTML won? How do you connect from "technology X won over technology Y" to "therefore, technology Y would not have had fewer interoperability problems and vulnerabilities than technology X"?

Why do you answer every question as to technical properties of a technology with "it lost" or "it won" while completely failing to say anything at all about the technical property being discussed?

NO ONE DENIES THAT HTML WON OVER XHTML.

Also, it seems you almost completely ignored the central explanation of my previous post, simply to repeat your previous points as if I had never said anything. I am happy to read your explanation as to where my analysis is wrong, but I am completely uninterested in reading, over and over, points that I have repeatedly explained my disagreement with, without any insight at all into how my reasoning is wrong.