ok well I forgot the PRE tag as CaptainNegative pointed out but when you say
>No. It works the way I described.
what you described made reference to white-space: normal which is a CSS property that I don't believe is available as part of the HTML standard itself (although I don't really keep up anymore so I could be wrong) but certainly wasn't part of older versions of the spec.
You are putting undue focus on a parenthetical (that I only even put in as a hedge[1] in the first place).
Copy and paste my comment somewhere, delete the parenthetical, and then read the result to yourself.
"HTML [...] treats all whitespace as insignificant" is simply inaccurate, no matter how you want to constrain it (e.g. "in older versions" or not). Whitespace is not insignificant.
Let me be clear about what I meant by whitespace insignificance.
When you put plain text into an element, that is equivalent to a string in typical programming terms. No, whitespace is not entirely insignificant within a text node. But almost. If we leave out <pre> and other special cases here, HTML specifies to ignore any extraneous whitespace and simply collapse it into a single space. So it is “extraneous whitespace insignificant” in a sense. It doesn’t ignore whitespace entirely, but no one would expect that in the context of a string in any language, even a whitespace insignificant one.
In a text node HTML goes out of its way to minimize the meaning of whitespace, but it does do the minimum of respecting that words have spaces between them. You can put spaces some places and have it break or change stuff, like in the middle of an attribute name or value, in the middle of an element name, etc. But you would expect that to happen in any whitespace insignificant language. Outside of that and a few special cases, the default behavior is to ignore whitespace (for example whitespace between the beginning or ending tag of an element and the text node it contains), and as such HTML is very much whitespace insignificant in my opinion.
The reason why I commented that this design was absolutely the right call is basically cases like building a website in PHP, where you mix the two languages together. Here you end up adding a lot of whitespace from indenting your code, etc., and it would be a nightmare if HTML didn’t treat whitespace as it does.
> HTML specifies to ignore any extraneous whitespace and simply collapse it into a single space[...] Outside of that and a few special cases, the default behavior is to ignore whitespace
No it doesn't, and it's not. What you're describing is how the browser displays the content. (And a few other things—like interactions when you select text to drag and drop or copy it to the clipboard.)
> building a website in PHP[...] you end up adding a lot of whitespace from indenting your code, etc., and it would be a nightmare if HTML didn’t treat whitespace as it does
You keep saying "HTML" when you mean something else. In almost every instance if you just said "the browser" (broadly) instead, then you'd be good, but you keep saying "HTML".
There are absolutely parts of the browser that don't care whether they're seeing one space or a thousand varied whitespace characters (tabs, carriage returns, linefeeds, etc), because based on what style properties are in effect at that place the browser will be presenting that content to the user as if there's one space character when laying it out and putting it on screen. But the only whitespace that gets ignored in HTML, really, is the whitespace inside angle brackets around attributes and element names.
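A rough way to see the "inside angle brackets" case for yourself, sketched with Python's standard `html.parser` module. That module is not a spec-conformant HTML parser, and the `TagRecorder` class and `start_tags` helper are names made up for this illustration, so treat it as a demonstration under those assumptions rather than a statement of what the spec mandates:

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Hypothetical helper: records every start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


def start_tags(markup):
    """Return the (tag, attrs) pairs seen while parsing `markup`."""
    parser = TagRecorder()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Same element, but the second version has a newline, a tab, and runs of
# spaces between the element name and the attributes.
tight = '<p class="note" id="x">hi</p>'
loose = '<p\n   class="note"\t id="x"   >hi</p>'

print(start_tags(tight))
print(start_tags(tight) == start_tags(loose))
```

Both inputs should produce the same tag name and attribute list, since whitespace separating attributes inside a tag carries no information of its own.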
Your string metaphor is a good one. Content marked up with HTML is like one big string, and as you say, no one would expect whitespace in a string to be insignificant. It's not insignificant in HTML, either; it does, by default, get painted as if sequences of multiple whitespace characters were a single space, in most contexts. But again, that's a separate thing entirely.
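To make the "separate thing" concrete, here is a minimal sketch that keeps the stored text and the presented text apart. The `collapse_for_display` function is a made-up name and only a crude stand-in for what `white-space: normal` does; the real CSS processing has more rules (for example, it also drops spaces at the start and end of a line):

```python
import re


def collapse_for_display(text: str) -> str:
    """Crude approximation of default whitespace collapsing at display time.

    Assumption: any run of spaces, tabs, and line breaks is presented as a
    single space. This is NOT the full CSS algorithm.
    """
    return re.sub(r"[ \t\n\r\f]+", " ", text)


stored = "one \t\n   two"            # what the text node actually holds
shown = collapse_for_display(stored)  # roughly what gets painted

print(repr(stored))
print(repr(shown))
```

The stored value is untouched by the call; collapsing is something done to a copy on the way to the screen, which is the distinction being drawn here.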
I don’t understand your distinction between “the browser” and “HTML” in this context. The browser is merely the interpreter of the language, but the HTML specification lays out how the language should be interpreted.
Also, this is an example of whitespace that is ignored:
<p>[whitespace here]I’m a text node[more whitespace here]</p>
I don’t believe that is what you referred to when you said “inside angle brackets around attributes and element names”.
Here the whitespace or sequence of spacelike characters is not collapsed into a single space. It is simply ignored, and the text node (string) begins at the first non-whitespace character.
That is actually what I referred to when I said that you end up adding a lot of extra whitespace when building a website in, say, PHP. Because that is where it typically ends up in the generated output.
$ dump ./scratch/p.html
3c 70 3e 20 20 0a 20 20 49 27 6d 20 61 20 74 65
<  p  >        .        I  '  m     a     t  e
78 74 20 6e 6f 64 65 20 5b 20 20 20 20 5d 20 20
x  t     n  o  d  e     [              ]
20 20 3c 2f 70 3e 0a
      <  /  p  >  .
(I replaced your first square bracket sequence with two spaces followed by a newline (U+000A) followed by two more spaces, and I replaced the second square bracket sequence with a space followed by a literal left square bracket, followed by four space characters, followed by a literal right square bracket, followed by four more spaces.)
The text node's value is exactly the sequence of characters between the closing angle bracket in `<p>` and the opening angle bracket in `</p>`:
"  \n  I'm a text node [    ]    "
> The browser is merely the interpreter of the language, but the HTML specification lays out how the language should be interpreted.
You're right about the second half, but you're wrong in thinking that it says extra whitespace should be ignored. It doesn't. The bigger problem, though, is in the first half.
I think you have an oversimplified understanding of what's going on in a browser, and of the relationship between HTML and what you see when the browser paints the content on the screen and lets you interact with it. You seem to be missing the pipeline that sits between the markup and what you actually get when you open the page in a browser. There's a lot more to it than the browser being "merely the interpreter" for HTML.