by thiht
512 days ago
Interesting perspective, it makes me miss XHTML wayyy less. I was under the impression that XHTML (XML) was better specified and had less weirdness. I know HTML is now better specified, but some of the things inherited from HTML 4 and earlier make no sense to me (closing tags that are optional only SOMETIMES, optional stuff everywhere).
This, IMO, is a bigger reason to avoid regex and XML parsers for HTML documents. The rules aren’t apparent when thinking linearly about what strings appear after or before each other; they become clearer when thinking of HTML as a shorthand syntax for certain kinds of push and pop operations.
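To make the push/pop framing concrete, here is a rough Python sketch. It uses the standard library's `html.parser.HTMLParser` purely as a tokenizer and keeps its own stack of open elements. The `IMPLICITLY_CLOSES` table and the `trace` helper are made up for illustration and cover only a tiny slice of the real rules (no void elements, no scoping), so treat it as a toy, not as the HTML parsing algorithm.

```python
from html.parser import HTMLParser

# Hypothetical, heavily simplified table: opening the key tag first pops
# an open element of one of the listed kinds if it is on top of the stack.
IMPLICITLY_CLOSES = {"p": {"p"}, "li": {"li"}}


class StackTracer(HTMLParser):
    """Record the push/pop operations implied by a stream of tags."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements
        self.ops = []    # log of ("push" | "pop", tag)

    def handle_starttag(self, tag, attrs):
        # An omitted end tag shows up here as an implicit pop.
        if self.stack and self.stack[-1] in IMPLICITLY_CLOSES.get(tag, ()):
            self.ops.append(("pop", self.stack.pop()))
        self.stack.append(tag)
        self.ops.append(("push", tag))

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignored in this toy model
        # Pop everything above the matching element, then the element itself.
        while True:
            top = self.stack.pop()
            self.ops.append(("pop", top))
            if top == tag:
                break


def trace(markup):
    tracer = StackTracer()
    tracer.feed(markup)
    tracer.close()
    return tracer.ops


for op in trace("<ul><li>a<li>b</ul>"):
    print(op)
```

Neither `<li>` has a closing tag in the input, yet both get popped: the first when the second `<li>` opens, the second when `</ul>` unwinds the stack. That is hard to see when you only look at which strings follow which.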
XHTML is easier to parse, but for documents that are well-formed yet invalid it pushes the complexity into the rendering side. For example, a button inside a button is well-formed XML, so XHTML browsers render exactly that, but it makes no sense from a UI perspective, and strange things happen when invalid markup arrives as well-formed XML.
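A quick way to see the well-formed-but-invalid case is to feed the nested buttons to a plain XML parser. The sketch below uses Python's `xml.etree.ElementTree`; `nesting_path` is a helper name invented for this example. It only shows what a generic XML parser accepts, not how any particular browser renders the result.

```python
import xml.etree.ElementTree as ET


def nesting_path(xml_text):
    """Return the tag names from the root down through each first child."""
    node = ET.fromstring(xml_text)
    path = [node.tag]
    while len(node):  # len(element) is its number of child elements
        node = node[0]
        path.append(node.tag)
    return path


# Well-formed XML, so the parser builds the nested tree without complaint.
print(nesting_path("<button><button>Click</button></button>"))
# → ['button', 'button']
```

As I understand the HTML parsing rules, an HTML parser that sees a second `<button>` start tag while one is still open closes the first one, so the two buttons end up as siblings rather than nested.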