	by singpolyma3 40 days ago
	To be fair, HTML5 also has a defined parsing algorithm. It just happens to always work on any input to produce a webpage.

3 comments

jerf 40 days ago

Yes, this is what you'd want. It doesn't have to be as complicated as the HTML5 algorithm either. That's complicated because it was a harmonization of at least three browsers' multi-decade heuristics and untold terabytes of existing HTML practice. An algorithm unconcerned with backwards compatibility could be much simpler, but still clearly define error behavior that is much easier to use than "scream and die".

And it's still unambiguous. You can cringe at what some people do, but it would be strictly a taste issue rather than a technical one, as the parse would still be unambiguous. And if you think you can fix taste issues with technical specification, well, you've already lost anyhow.
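
A minimal sketch of the contrast, not anything from the HTML5 spec: `strict_parse` uses Python's standard XML parser, which rejects malformed input outright, while `recover` is a hypothetical toy with three made-up recovery rules (a mismatched close tag implicitly closes everything opened after its match, an unmatched close tag is dropped, and anything still open at end of input is closed). The rules and names are assumptions for illustration only. Every input still maps to exactly one output, so the parse stays unambiguous.

```python
import re
import xml.etree.ElementTree as ET

# Toy tokenizer: a bare tag like <b> or </b>, a run of text, or a stray "<".
# Attributes are deliberately unsupported to keep the sketch short.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)|(<)")


def strict_parse(markup):
    """'Scream and die': any well-formedness error rejects the document."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError:
        return "rejected"


def recover(markup):
    """Hypothetical defined-error-behavior parse; returns repaired markup."""
    out = []
    stack = []  # names of currently open elements
    for match in TOKEN.finditer(markup):
        closing, name, text, stray = match.groups()
        if text is not None:
            out.append(text)
        elif stray is not None:
            # Rule: a "<" that does not start a tag is literal text.
            out.append("&lt;")
        elif not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Rule: close everything opened after the matching element.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        # Rule: a close tag with no matching open element is dropped.
    # Rule: anything still open at end of input is closed.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


broken = "<p>Hello <b>world</p>"  # <b> is never closed
print(strict_parse(broken))  # rejected
print(recover(broken))       # <p>Hello <b>world</b></p>
```

Whether those particular repairs are tasteful is a separate question. The point is only that they are specified, so two implementations following the same rules would agree on the result.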

stavros 40 days ago

I think the GP has an issue not with the specification part, but with the part where it's forbidden for clients to render a noncompliant page.

tardedmeme 40 days ago

It's not forbidden. They just don't render certain noncompliant pages. Namely the ones with gross syntax errors.

Why are we okay with formats like PDF that have similarly catastrophic error handling?

zbentley 40 days ago

I mean, we aren’t ok with that for PDF. That’s why PDF renderers have incredibly baroque rules for parsing weirdly or brokenly formatted documents, and why many PDF documents fall back to embedding images or absolute-positioned pixel-like layouts for compatibility purposes.

stavros 40 days ago

I mean, the linked page and the comment above say it is:

> It is explicitly forbidden for clients to accept any page that doesn't conform with the specification. This prevents the standardized diabolic rules that one must implement in order to correct a

masklinn 40 days ago

I don't get this reply. GP didn't say anything about parsing algorithms, they said (correct) things about hard errors on the web.

112233 40 days ago

Why for? The reply is about factual historical experience with webpage hard errors.

Would you like to have a law that forbids you, under penalty of fine, to read any book you buy or borrow that is lacking or has damaged pages?

jazzypants 40 days ago

I thought they were just bolstering the refutation of TFA's assertion that XHTML is strictly better because of its parsing algorithm.
