| I understood what you meant. > HTML specifies to ignore any extraneous whitespace and simply collapse it into a single space[...] Outside of that and a few special cases, the default behavior is to ignore whitespace No it doesn't, and it's not. What you're describing is how the browser displays the content. (And a few other things—like interactions when you select text to drag and drop or copy it to the clipboard.) > building a website in PHP[...] you end up adding a lot of whitespace from indenting your code, etc., and it would be a nightmare if HTML didn’t treat whitespace as it does You keep saying "HTML" when you mean something else. In almost every instance if you just said "the browser" (broadly) instead, then you'd be good, but you keep saying "HTML". There are absolutely parts of the browser that don't care whether they're seeing one space or a thousand varied whitespace characters (tabs, carriage returns, linefeeds, etc), because based on what style properties are in effect at that place the browser will be presenting that content to the user as if there's one space character when laying it out and putting it on screen. But the only whitespace that gets ignored in HTML, really, is the whitespace inside angle brackets around attributes and element names. Your string metaphor is a good one. Content marked up with HTML is like one big string, and as you say, no one would expect whitespace in a string to be insignificant. It's not insignificant in HTML, either; it does, by default, get painted as if sequences of multiple whitespace characters were a single space, in most contexts. But again, that's a separate thing entirely. |
Also, this is an example of whitespace that is ignored:
<p>[whitespace here]I’m a text node[more whitespace here]</p>
I don’t believe that is what you referred to when you said “inside angle brackets around attributes and element names”.
Here the whitespace (any run of whitespace characters) is not collapsed into a single space. It is simply ignored, and the text node (string) begins at the first non-whitespace character.
That is actually what I was referring to when I said that you end up adding a lot of extra whitespace when building a website in, say, PHP: that is where the indentation typically ends up in the generated output.
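To make the displayed result concrete, here is a rough sketch in Python. It is only a simplified model of what ends up on screen when the default `white-space: normal` style is in effect. It is not a browser's actual algorithm, and it says nothing about what the parser stores. The helper name `displayed_text` and the sample markup are made up for illustration.

```python
import re

# Spaces, tabs and line breaks are treated alike in this simplified model.
_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def displayed_text(raw: str) -> str:
    """Hypothetical helper: approximate how a block of plain text is shown
    under the default `white-space: normal` (an assumption, not browser code)."""
    # Every run of whitespace is shown as a single space...
    collapsed = _WHITESPACE_RUN.sub(" ", raw)
    # ...and a space at the very start or end of the line is not shown at all.
    return collapsed.strip(" ")


# Markup as a template might emit it, indentation included (made-up sample).
generated = "<p>\n        I'm a   text node\n    </p>"

# Content between the tags, exactly as it appears in the generated output.
inner = generated[len("<p>"):-len("</p>")]

print(repr(inner))                  # the raw content, whitespace and all
print(repr(displayed_text(inner)))  # what the model says is displayed
```

The leading and trailing runs vanish from the displayed result, while the run in the middle shrinks to one space.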