| > This is not true because you are imagining a world with strict parsing but where people are still acting as though they have lax parsing. In reality, strict parsing changes the incentives and thus people’s behaviour. Dude, I lived in that world. A fair amount of developers explicitly opted into strict parsing rules by choosing to serve XHTML. And yet, those developers who opted into strict parsing messed up their XML generation frequently enough that I, as an end user, was presented with that "XML Parse Error" page on occasion. I don't understand why you'd think all developers would stop messing up if only strict parsing was foisted upon everyone rather than only those who explicitly opt in. > In your hypothetical world, they are making that syntax error… and just deploying it anyway. No, they're not. In my (non-hypothetical, actually experienced in real life) world of somewhat widespread XHTML, I'm assuming that developers would make sites which appeared to work with their test content, but would produce invalid XML in certain situations with some combination of dynamic content or other conditions. Forgetting to escape user content is the obvious case, but there are many ways to screw up HTML/XHTML generation in ways which appear to work during testing. > We have strict syntax almost everywhere. How often do you see a Python syntax error in the backend code? Never, but people don't dynamically generate their Python back-end code based on user content. > How often do you run across an SVG that fails to load because of a syntax error? Never, but people don't typically dynamically generate their SVGs based on user content. Almost all SVGs out there are served as static assets. |
No, they didn’t, unless you and I have wildly different definitions of “a fair amount”. The developers who did that were an extreme minority, because Internet Explorer, which had >90% market share, didn’t support application/xhtml+xml. It was a curiosity, not something people actually did in non-negligible numbers.
And you’re repeating the mistake I explicitly called out. Opting into XHTML parsing does not put you in a world where everything else behaves as if parsing were strict. If you are writing, say, PHP, then that language was still designed for a world with lax HTML parsing no matter how you serve your XHTML. There is far more to the stack than your code and the browser, and a world designed for strict parsing from top to bottom would look very different to one designed for lax parsing.
> I'm assuming that developers would make sites which appeared to work with their test content, but would produce invalid XML in certain situations with some combination of dynamic content or other conditions. Forgetting to escape user content is the obvious case, but there are many ways to screw up HTML/XHTML generation in ways which appear to work during testing.
Again, you are still making the same mistake of forgetting to consider the second-order effects.
In a world where parsing is strict, a toolchain that produces malformed syntax has a show-stopping bug and would not be considered reliable enough to use. The only reason those kinds of bugs are tolerated is that parsing is lax. Where is all the JSON-generating code that fails to escape values properly? It is super rare, because JSON parsing is strict and those kinds of problems aren’t tolerated.
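To make the JSON point concrete, here is a minimal Python sketch. The helper names (`naive_json`, `safe_json`, `parses`) and the sample values are made up for illustration; only the standard `json` module is real. It shows a hand-rolled generator that passes a test with tame content and then blows up on the first value containing a quote, which is roughly why nobody ships one for long.

```python
import json


def naive_json(name):
    # Hypothetical buggy generator: string concatenation, no escaping.
    return '{"name": "' + name + '"}'


def safe_json(name):
    # A real serializer escapes quotes, backslashes and control characters.
    return json.dumps({"name": name})


def parses(text):
    # json.loads is strict: malformed input raises instead of being guessed at.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


tame = "Alice"
hostile = 'Bob "the builder"'

print(parses(naive_json(tame)))     # works with the test content
print(parses(naive_json(hostile)))  # rejected outright by the strict parser
print(parses(safe_json(hostile)))   # the serializer handles escaping
```

Because the failure is loud and total rather than a slightly odd rendering, the broken generator gets caught and replaced with a serializer, which is the second-order effect I am talking about.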