Hacker News
by still_grokking 1986 days ago
Most of the "plenty of good HTML5 parsers out there" are broken. No wonder, as the spec is nuts. (It took years before there was even a correctly working validator.)

Also I was explicitly talking about XML compatible HTML. It's called that because it's XML compatible.

Btw, have you ever seen HTML in the web browser dev tools? Guess why it always shows the "optional" tags. ;-)

1 comment

> Most of the "plenty of good HTML5 parsers out there" are broken.

Can you name a few popular and widely used HTML5 parsers that are broken and tell us what the bugs are in those parsers? I would be surprised if you can find or name even two such parsers that are popular but cannot handle optional tags correctly as required by the spec.

> Also I was explicitly talking about XML compatible HTML.

There is no such thing as "XML compatible HTML" (unless you mean XHTML, which we are not discussing here). Maybe you mean XML-serialized HTML5; I can only guess, since the terminology you are using is vague. In any case, HTML5 syntax by itself is incompatible with XML, as I mentioned in my previous comment: void elements such as <br> are written without a closing slash, and many end tags may be omitted, so a typical HTML5 document is not well-formed XML. XML-serialized HTML5, however, is compatible with XML by definition, and in that case one would use an XML parser, not an HTML5 parser. More importantly, you can safely omit the optional tags and still convert your HTML5 document into an XML-serialized HTML5 document without any issues whatsoever. This was explained to you by anjbe at https://news.ycombinator.com/item?id=25706163. He is absolutely right.
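
To make the incompatibility concrete, here is a rough sketch using only Python's standard-library XML parser. The helper name is_well_formed_xml and the sample fragments are my own, made up for illustration; they do not come from the spec. It only shows that ordinary HTML syntax is rejected by an XML parser while the XML-style equivalent is accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The XML parser rejects anything that is not well-formed XML.
        return False


# Illustrative fragments (assumptions, chosen for this example only).
html_void_element = "<p>line one<br>line two</p>"   # HTML syntax: <br> never closed
html_omitted_end = "<ul><li>one<li>two</ul>"        # HTML syntax: </li> omitted
xml_syntax = "<p>line one<br/>line two</p>"         # XML syntax: <br/> self-closed

print(is_well_formed_xml(html_void_element))  # False
print(is_well_formed_xml(html_omitted_end))   # False
print(is_well_formed_xml(xml_syntax))         # True
```

Both of the first two fragments are perfectly valid HTML5 syntax, which is exactly why an HTML5 parser, not an XML parser, is the right tool for them.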

> Btw, have you ever seen HTML in the web browser dev tools? Guess why it always shows the "optional" tags. ;-)

You see all the tags there because it shows the entire DOM. The browser automatically creates the elements when optional tags are not explicitly present in the HTML. This is all spelled out in the spec very clearly. Any HTML5 parser worth its name follows the spec. I am not sure what your point is here.

See https://html.spec.whatwg.org/multipage/syntax.html#optional-... for details, especially:

"Omitting an element's start tag in the situations described below does not mean the element is not present; it is implied, but it is still there. For example, an HTML document always has a root html element, even if the string <html> doesn't appear anywhere in the markup."

I hope that explains why you always see the elements for the optional tags in a web browser's developer tools.