Hacker News
by naniwaduni 2236 days ago
This actually produces the DOM equivalent to <div> <p> </p><h1> </h1><h2> </h2></div>.

Many of the rules for unclosed tags exist primarily so that browsers can agree on what to do with garbage, and only incidentally for you to rely on! They defer to historical practice before common sense!

In order to predict this reliably, you essentially need to have the list of content categories[1] memorized (or look them up). Not all of them are ... necessarily intuitive.

[1]: https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/Cont...

1 comment

Is there a way to get warnings for HTML that looks valid, with matching start and end tags, but doesn't actually parse the way it is written? I get the impression that we end up needing to memorize those content categories even if we plan to only generate HTML with all the start and end tags.

For example, <p>A<p>B</p>C</p> looks like two nested <p> elements, but it is parsed as three <p> elements next to each other, with C left outside all of them: <p>A</p><p>B</p>C<p></p>.
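
To make that concrete, here is a rough sketch in Python using the standard library's html.parser. It is a toy model, not a real HTML tree builder: it only tracks whether a <p> is open, and the CLOSES_P set is my own partial list of start tags that imply a closing </p>. The names PSketch and reparse are made up for illustration.

```python
from html.parser import HTMLParser

# Assumption: a partial list of start tags that implicitly close an open <p>.
CLOSES_P = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
            "ul", "ol", "pre", "table", "blockquote"}


class PSketch(HTMLParser):
    """Toy model of how a browser treats <p>; ignores every other rule."""

    def __init__(self):
        super().__init__()
        self.out = []        # serialized pieces of the resulting tree
        self.p_open = False  # whether a <p> is currently open

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and self.p_open:
            self.out.append("</p>")  # implied end tag for the open <p>
            self.p_open = False
        self.out.append(f"<{tag}>")
        if tag == "p":
            self.p_open = True

    def handle_endtag(self, tag):
        if tag == "p" and not self.p_open:
            # A stray </p> is a parse error that yields an empty <p> element.
            self.out.append("<p></p>")
            return
        if tag == "p":
            self.p_open = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def reparse(html):
    parser = PSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(reparse("<p>A<p>B</p>C</p>"))  # prints <p>A</p><p>B</p>C<p></p>
```

The second <p> closes the first one, so by the time the final </p> arrives there is nothing left for it to close.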

At the margins, yes. But in practice, if you have seemingly balanced opening and closing tags with invalid nesting, the outer close tag generally ends up unmatched, which makes the HTML invalid, and there's plenty of tooling to check for that.
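
As a rough illustration of the kind of check such tooling performs, here is a minimal sketch, again with Python's html.parser. It is not how any particular validator works: it keeps a stack of open elements, applies only the "a block start tag closes an open <p>" rule, and reports end tags that match nothing. CLOSES_P and VOID are partial lists I picked for the example, and StrayEndTagLinter and lint are hypothetical names.

```python
from html.parser import HTMLParser

# Assumptions: partial lists, just enough for the <p> example.
CLOSES_P = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
            "ul", "ol", "pre", "table", "blockquote"}
VOID = {"br", "hr", "img", "input", "meta", "link"}


class StrayEndTagLinter(HTMLParser):
    """Reports end tags that have no matching open element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # names of currently open elements
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and "p" in self.stack:
            # Simplification: pop everything down to and including the <p>.
            while self.stack.pop() != "p":
                pass
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # void elements never sit on the stack
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass
        else:
            line, col = self.getpos()
            self.warnings.append(f"line {line}, col {col}: stray </{tag}>")


def lint(html):
    linter = StrayEndTagLinter()
    linter.feed(html)
    linter.close()
    return linter.warnings


for warning in lint("<p>A<p>B</p>C</p>"):
    print(warning)
```

For the nested-<p> example this flags the final </p>, which is exactly the "outer close tag" that gives the problem away. A real validator covers far more rules than this.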