| HN Mirror

$ dump ./scratch/p.html
3c 70 3e 20 20 0a 20 20 49 27 6d 20 61 20 74 65
<  p  >        .        I  '  m     a     t  e
78 74 20 6e 6f 64 65 20 5b 20 20 20 20 5d 20 20
x  t     n  o  d  e     [              ]
20 20 3c 2f 70 3e 0a
      <  /  p  >  .

I see. Don’t know if you’re still checking for replies on this thread. Livin’ up to my name. Thanks for taking the time to explain, though.

I’m going to have to look further into this to get a better understanding, but I suppose the rules for collapsing whitespace in a text node do exist somewhere in the HTML specification, just not at the “interpretation” stage as I had assumed.

To be clear, what I imagined was that at the interpretation stage a text node would be marked to begin at the first non-whitespace character and end at the last non-whitespace character. Then, within the text node, there might be additional whitespace that would need to be collapsed into a single space.

Since the first type (leading and trailing whitespace) is not rendered at all and the second type (whitespace inside the text) is collapsed to a single space, I assumed the rules could exist at two different points in the process/pipeline.
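To make that mental model concrete, here is a minimal Python sketch of the two steps as I imagined them. The function names `trim_edges` and `collapse_inner` are made up for illustration, and the input string is the text node content read off the hex dump above. This models my assumption only, not what a browser actually does.

```python
import re

# Text node contents from the hex dump: the bytes between <p> and </p>.
RAW = "  \n  I'm a text node [    ]    "

def trim_edges(text: str) -> str:
    """Imagined step 1: drop leading and trailing whitespace entirely."""
    return text.strip(" \t\n\r\f")

def collapse_inner(text: str) -> str:
    """Imagined step 2: squeeze each remaining whitespace run into one space."""
    return re.sub(r"[ \t\n\r\f]+", " ", text)

trimmed = trim_edges(RAW)
rendered = collapse_inner(trimmed)
print(repr(trimmed))
print(repr(rendered))
```

The first step leaves the four spaces between the brackets alone; only the second step reduces them to one.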

So what I gather here is that both types are handled at a later stage than “interpretation” (the result of which is basically what you see when you open Developer Tools and inspect individual nodes).
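A rough way to check this outside a browser, assuming Python's standard `html.parser` is a fair stand-in for the “interpretation” stage (it is a simple tokenizer, not a full browser parser), is to feed it the same bytes as the dump and look at the character data it reports. The class name `TextNodeCollector` is my own.

```python
from html.parser import HTMLParser

# Same bytes as the hex dump, written as a Python string.
SOURCE = "<p>  \n  I'm a text node [    ]    </p>\n"

class TextNodeCollector(HTMLParser):
    """Records the character data seen inside <p>, exactly as delivered."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        # Only keep text that sits between <p> and </p>.
        if self.in_p:
            self.chunks.append(data)

collector = TextNodeCollector()
collector.feed(SOURCE)
collector.close()
text_node = "".join(collector.chunks)
print(repr(text_node))
```

The collected text still has the leading newline and every run of spaces in it, which matches the idea that nothing is trimmed or collapsed at this stage.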

But I guess the subtlety here is that at whichever stage the whitespace collapsing/removal happens, the rules for it would still have to be defined by the HTML specification somehow.

And another subtlety to counteract that is that HTML is a markup language, not a programming language: one is rendered, the other is executed. So any comparison between, say, Python and HTML needs to take that into account.

So even though some whitespace gets ignored somewhere between:

<p>[whitespace]This textnode has extraneous whitespace[whitespace]</p>

and the point where [whitespace] is not rendered in the viewport, the fact that the ignoring does not happen at the “interpretation” stage is important, because that’s as far as the comparison between, say, Python and HTML can go before the two veer off in different directions.
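For the Python side of the comparison, and assuming the point is that Python drops insignificant whitespace while parsing, a small sketch with the standard `ast` module shows two differently spaced statements producing the same tree. This only covers whitespace between tokens; leading indentation is significant in Python and is a separate matter.

```python
import ast

# Same statement, with and without extra spaces between tokens.
tight = ast.dump(ast.parse("x=1+2"))
loose = ast.dump(ast.parse("x   =   1   +   2"))

# ast.dump omits line and column attributes by default, so only the
# structure of the parsed program is compared here.
same_tree = tight == loose
print(same_tree)
```

So in Python the extra spaces are already gone once parsing is done, whereas in the HTML case the parsed text node still carries them.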

I’m mainly typing this out for my own understanding, but again, I will have to look into it myself to validate or correct my current framework for thinking about this. Thanks for an interesting discussion.