Hacker News
by bentley 237 days ago
Implicit elements and end tags have been a part of HTML since the very beginning. They introduce zero ambiguity to the language, they’re very widely used, and any parser incapable of handling them violates the spec and would be incapable of handling piles of real‐world strict, standards‐compliant HTML.

> I wish all these tolerances wouldn't exist in HTML5 and browsers simply showed an error, instead of being lenient.

They (W3C) tried that with XHTML. It was soundly rejected by webpage authors and by browser vendors. Nobody wants the Yellow Screen of Death. https://en.wikipedia.org/wiki/File:Yellow_screen_of_death.pn...

3 comments

> They introduce zero ambiguity to the language

Well, for machines parsing it, yes, but for humans writing and reading it, explicit closing tags are helpful. For example, if you have

    <p> foo
    <p> bar
and change it to

    <div> foo
    <div> bar
suddenly you've got a syntax error (or, in practice, a page that silently renders with nested divs).
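To make the difference concrete, here is a minimal sketch in Python using the standard library's `html.parser`. That module only tokenizes and does not implement implicit end tags, so the `CLOSES_OPEN_P` set and the `DepthTracker` class are hypothetical helpers that model a single rule (a new `<p>` or `<div>` start tag closes an open `<p>`), which is a heavy simplification of the real HTML parsing algorithm.

```python
from html.parser import HTMLParser

# Assumption: only these start tags implicitly close an open <p>.
# The real HTML parsing algorithm has a longer list and more conditions.
CLOSES_OPEN_P = {"p", "div"}


class DepthTracker(HTMLParser):
    """Tracks how deeply elements nest under the simplified rule above."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        # An open <p> ends when another block-level start tag arrives.
        if tag in CLOSES_OPEN_P and self.stack and self.stack[-1] == "p":
            self.stack.pop()
        self.stack.append(tag)
        self.max_depth = max(self.max_depth, len(self.stack))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def max_nesting(markup):
    tracker = DepthTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.max_depth


print(max_nesting("<p> foo\n<p> bar"))      # 1: the second <p> closes the first
print(max_nesting("<div> foo\n<div> bar"))  # 2: the second <div> nests inside the first
```

Under this simplified model the two paragraphs end up as siblings, while the two divs nest, which is the silent change in structure described above.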

The "redundancy" of closing the tags acts basically like a checksum protecting against the "background radiation" of human editing. And if you're writing raw HTML without an editor that can autocomplete the closing tags then you're doing it wrong anyway. Yes that used to be common before and yes it's a useful backwards compatibility / newbie friendly feature for the language, but that doesn't mean you should use it if you know what you're doing.

It sounds like you're headed towards XHTML. The rise and fall of XHTML is well documented and you can binge the whole thing if you're so inclined.

But my summarization is that the reason it doesn't work is that strict document specs are too strict for humans. And at a time when there was legitimate browser competition, the one that made a "best effort" to render invalid content was the winner.

The merits and drawbacks of XHTML have already been discussed elsewhere in the thread and I am well aware of them.

> And at a time when there was legitimate browser competition, the one that made a "best effort" to render invalid content was the winner.

Yes, my point is that there is no reason to still write "invalid" code just because it's supported for backwards compatibility reasons. It sounds like you ignored 90% of my comment, or perhaps you replied to the wrong guy?

I'm a stickling pedant for HTML validity, but close tags on <p> and <li> are optional by spec. Close tags for <br>, <img>, and <hr> are prohibited. XML-like self-closing trailing slashes explicitly have no meaning in HTML.

Close tags for <script> are required. But if people start treating it like XML, they write <script src="…" />. That fails, because the script element requires a close tag, and that slash has no meaning in HTML.

I think validity matters, but you have to measure validity according to the actual spec, not what you wish it was, or should have been. There's no substitute for actually knowing the real rules.
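As a small illustration, the rules mentioned above can be written down as a lookup table. This is only a sketch in Python covering the six elements named in this comment, not the full list from the spec, and `END_TAG_RULE`, `may_write_end_tag`, and `must_write_end_tag` are hypothetical names chosen for the example.

```python
# End-tag rules for the elements named in this comment only (not exhaustive).
END_TAG_RULE = {
    "p": "optional",
    "li": "optional",
    "br": "prohibited",
    "img": "prohibited",
    "hr": "prohibited",
    "script": "required",
}


def may_write_end_tag(tag):
    """True if writing </tag> is allowed for this element."""
    return END_TAG_RULE[tag.lower()] != "prohibited"


def must_write_end_tag(tag):
    """True if omitting </tag> is not allowed for this element."""
    return END_TAG_RULE[tag.lower()] == "required"


for name in ("p", "br", "script"):
    print(name, may_write_end_tag(name), must_write_end_tag(name))
```

The table makes the point that "always close" and "never close" are both wrong as blanket rules: the answer depends on the element.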

Are you misunderstanding on purpose? I am aware they are optional. I am arguing that there is no reason to omit them from your HTML. Whitespace is (mostly) optional in C; does that mean it's a good idea to omit it from your programs? Of course a br tag needs no closing tag because there is no content inside it. How exactly is that an argument for omitting the closing p tag? The XML standard has no relevance to the current discussion because I'm not arguing for "starting to treat it like XML".

I'm beginning to think I'm misunderstanding, but it's not on purpose.

Including closing tags as a general rule might make readers think that they can rely on their presence. Also, in some cases they are prohibited. So you can't achieve a simple evenly applied rule anyway.

IMO, all of those make logical sense. If you’re inserting a line break or literal line, it can be thought of as a 1-dimensional object, which cannot enclose anything. If you want another one, insert another one.

In contrast, paragraphs and lists do enclose content, so IMO they should have clear delineations, if nothing else to make the code easier to understand visually.

I’m also sure that someone will now reference another HTML element I didn’t think about that breaks my analogy.

I didn't have a problem with XHTML back in the day; it took a while to unlearn it; I would instinctively close those tags: <br/>, etc.

It was actually the XHTML 2.0 specification [1], which discarded backwards compatibility with HTML 4, that was the straw that broke the camel's back. No more forms as we knew them, for example; we were supposed to use XForms.

That's when WHATWG was formed and broke with the W3C and created HTML5.

Thank goodness.

[1]: https://en.wikipedia.org/wiki/XHTML#XHTML_2.0

XHTML 2.0 had a bunch of good ideas and a lot of them got "backported" into HTML 5 over the years.

XHTML 2.0 didn't even really discard backwards-compatibility that much: it had its compatibility story baked in with XML Namespaces. You could embed XHTML 1.0 in an XHTML 2.0 document just as you can still embed SVG or MathML in HTML 5. XForms was expected to take a few more years and people were expecting to still embed XHTML 1.0 forms for a while into XHTML 2.0's life.

At least from my outside observer perspective, the formation of WHATWG was more a proxy war between the view of the web as a document platform versus the view of the web as an app platform. XHTML 2.0 wanted a stronger document-oriented web.

(Also, XForms had some good ideas, too. Some of what people want in "forms helpers" when they ask for something like HTMX to be standardized in browsers was part of XForms, such as JS-less fetch/XHR with in-place refresh for form submits. Some of what HTML 5 slowly added in terms of INPUT tag validation is also sort of a "backport" from XForms, albeit with no dependency on XSD.)

XHTML in practice was too strict and tended to break a few other things (by design) for better or worse, so nobody used it...

That said, actually writing HTML that can be parsed via an XML parser is generally a good, neighborly thing to do, as it allows for easier scraping and parsing through browsers and non-browser applications alike. For that matter, I will also add additional data-* attributes to elements just to make testing (and scraping) easier.
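A rough sketch of what that buys a scraper, in Python with the standard library's `xml.etree.ElementTree`. The markup, the `data-testid` attribute name, and the `scrape` helper are all made up for illustration; the point is only that a strict XML parser reads the well-formed version directly and rejects the version with omitted end tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element is explicitly closed.
WELL_FORMED = """<ul>
  <li data-testid="price">42</li>
  <li data-testid="stock">7</li>
</ul>"""

# Same content, valid HTML, but with the optional </li> tags omitted.
OMITTED_END_TAGS = """<ul>
  <li data-testid="price">42
  <li data-testid="stock">7
</ul>"""


def scrape(markup):
    """Map each data-testid value to the element's text, using an XML parser."""
    root = ET.fromstring(markup)
    return {
        el.get("data-testid"): (el.text or "").strip()
        for el in root.iter()
        if el.get("data-testid")
    }


print(scrape(WELL_FORMED))  # {'price': '42', 'stock': '7'}

try:
    scrape(OMITTED_END_TAGS)
except ET.ParseError as exc:
    # The XML parser stops at the first well-formedness error.
    print("XML parser rejected it:", exc)
```

A real HTML parser would accept both inputs; the XML-compatible form just widens the set of tools that can read the page.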