by Sesse__ 1281 days ago
If nothing else, it would strongly increase maintenance and testing costs. You also have the usual problem of cut-and-paste between examples and stylesheets not necessarily working as you'd expect. Then there are subtler issues, like how you'd serialize such things from CSSOM (JavaScript): which parser mode would you use? How would you serialize something “unsupported” in one mode that was created by CSSOM? Which mode would inline style attributes use?

Understood. But what if the idea were a 'strict' syntax? I.e. in order to make use of newer syntax rules, CSS would need to declare itself as such and be formatted per spec (and would fail to parse otherwise, same as now), negating the need for "infinite lookaheads" per the explainer. So it's more like a switch statement than multiple engines. CSSOM is straightforward this way: you're dealing with strict by default. Inline styles the same.
It's not clear at all what that strict syntax would look like that would disambiguate with less lookahead. The fundamental problem is that colon is used both for declarations and in (pseudo-)selectors, not that people are somehow inventing corner-case CSS that nobody actually writes and we can just outlaw in a spec change.

The only change I can really think of would be requiring space after the colon for declarations (i.e. “color:red” is disallowed, it must be “color: red”), but that's much more than a strict mode, that's something that invalidates millions and millions of perfectly valid web pages and introduces a much larger whitespace sensitivity than today.
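
To make the whitespace rule concrete, here is a minimal sketch in Python. It is not how any real engine parses CSS; the function name `classify_with_space_rule` and the regex are made up for illustration, and the rule itself is only the hypothetical one described above.

```python
import re

# Hypothetical strict rule: an identifier, a colon, then mandatory whitespace
# means "declaration". Anything else is treated as the start of a selector.
DECL_START = re.compile(r"[A-Za-z-][\w-]*:\s")

def classify_with_space_rule(text: str) -> str:
    """Classify the start of an item inside a block under the assumed rule."""
    return "declaration" if DECL_START.match(text.lstrip()) else "rule"

print(classify_with_space_rule("color: red;"))   # declaration
print(classify_with_space_rule("a:hover { }"))   # rule
print(classify_with_space_rule("color:red;"))    # rule (valid today, misread here)
```

The last call is the problem: `color:red` is valid today, and under this rule it would stop being read as a declaration.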

The difference isn't the separator, it is the suffix. A strict designation would require that properties be signed off with a semicolon on the same line (or just a newline). Alternatively you could go the other way and enforce selector signoff with a comma or a bracket on the same line. No strict, no nesting or other newfangled wizardry.

This allows for graceful degradation.

My point about corner cases is that there is very, very limited use of pseudo selectors, relatively speaking. Let alone pseudo selection where the selector is based on an element and not a class, or ID, or something else easily differentiable from a property. Which is to say, they are the corner case.

Once you start looking at the suffix to disambiguate what the first token means, you're already in the more-than-one-token lookahead land, which is what we're trying to avoid in the first place.

CSS property declarations already need to be signed off with a semicolon on the same line. If not, the entire declaration is ignored (this is specified in the CSS standard, and if you don't implement it correctly, you will break real web pages).

I'm sorry, but I thought the challenge as described was one of "infinite lookahead"? Similarly, the CSSWG proffers "graceful degradation" as the reason why a declaration isn't workable. But this solution clearly doesn't require infinite lookahead. It also degrades gracefully.

In fact lookahead isn't needed at all, except in (exceptionally) rare cases. Is the problem that the parsers are incapable of using any smarts beyond what is already provided?

Am I missing something?

Aside: Good point on the semicolon! I think in the previous discussion someone was making the point that parsers are exceptionally flexible/forgiving re. weird and wonderful line break and spacing combos. I wasn't sure about the status of semicolon usage. Idea of strict would just be to put an end to that.

Edit: and hey, apologies for labouring the point on this. But I am genuinely interested. I feel like these conversations just always end up in "you wouldn't understand" territory.

> But this solution clearly doesn't require infinite lookahead.

It clearly does? There can be an infinite number of tokens before you see the semicolon and know what you're parsing. The page contains examples of this, or you can dig into those bug threads.
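
A rough sketch of why the count is unbounded, in Python. The tokenizer is deliberately crude (identifiers, or any single non-space character) and `tokens_until_decided` is a name invented for this example; real CSS tokenization is far more involved.

```python
import re

# Crude tokenizer, for illustration only.
TOKEN_RE = re.compile(r"[A-Za-z_][\w-]*|\S")

def tokens_until_decided(text: str) -> tuple[str, int]:
    """Return (kind, n): what the item turned out to be, and how many
    tokens had to be read before the deciding token showed up."""
    tokens = TOKEN_RE.findall(text)
    for i, tok in enumerate(tokens):
        if tok == "{":
            return "rule", i
        if tok in (";", "}"):
            return "declaration", i
    return "declaration", len(tokens)  # end of input also ends a declaration

print(tokens_until_decided("color:red;"))
print(tokens_until_decided("a:hover { }"))
print(tokens_until_decided("a:hover span:first-child em { }"))
```

`color:red;` and `a:hover { }` both start with ident, colon, ident, and the selector can be made as long as you like before the `{` appears, so there is no fixed number of tokens that settles it.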

> In fact lookahead isn't needed at all, except in (exceptionally) rare cases.

“You don't need to support lookahead, except sometimes” really means “you need to support lookahead”. And that changes how your parser and tokenizer have to work (in particular, you need to be capable of saving a potentially unbounded number of tokens in case you need to rewind). You don't get around that by saying it's rare.
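
For a feel of what "saving tokens in case you need to rewind" costs, here is a small Python sketch. The class `RewindableTokens` and its methods are hypothetical and not taken from any browser engine; the point is only that the buffer grows with however many tokens are read between the checkpoint and the decision.

```python
class RewindableTokens:
    """Token stream with a single checkpoint (illustrative sketch only)."""

    def __init__(self, source):
        self._source = iter(source)
        self._buffer = []     # tokens saved since the last mark()
        self._pos = 0         # next buffered token to hand out
        self._marked = False

    def mark(self):
        # Forget tokens already consumed and start saving from here.
        del self._buffer[:self._pos]
        self._pos = 0
        self._marked = True

    def rewind(self):
        # Go back to the checkpoint; saved tokens get replayed by next().
        self._pos = 0
        self._marked = False

    def next(self):
        if self._pos == len(self._buffer):
            tok = next(self._source, None)
            if tok is None:
                return None
            if not self._marked:
                return tok    # no checkpoint, so nothing needs saving
            self._buffer.append(tok)
        tok = self._buffer[self._pos]
        self._pos += 1
        return tok

    def buffered(self):
        return len(self._buffer)


stream = RewindableTokens(["a", ":", "hover", "span", "{"])
stream.mark()
while stream.next() != "{":   # speculate until the deciding token
    pass
print(stream.buffered())      # every token read so far had to be kept
stream.rewind()
print(stream.next())          # parsing restarts from the checkpoint
```

Without the checkpoint the stream keeps nothing; with it, memory is proportional to the length of the speculation, and the code path has to exist even if it is rarely hit.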

This is the correct answer. I hope this is where they land, because all of the options they’ve presented are pretty awkward.