by cocok
677 days ago
I stopped at 1.1 Notation. It's full of arbitrary-looking decisions about which characters can be used for what. It's 2024, and we still don't have a string notation that uses different characters for the opening and closing delimiters. If I start parsing from an arbitrary offset in the code, I can't tell whether a double quote I read is the beginning of a string or the end of one. I have to either resort to heuristics or parse from the beginning of the file (at least once, and then cache offsets known to be outside a string). Something like "() would be nice. Still the familiar double quote, but the grouping is defined in a grammatically superior way. Also, still no identifiers that can start with a digit. Most of the mainstream languages have such complex grammars, probably requiring hand-coded parsers, but I can't have a "52cards" identifier. Is this really that hard compared to everything else? Now, I'm self-taught and all. Maybe I'm missing something and the professors are right.
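To make the offset problem concrete, here is a minimal Python sketch. It assumes a toy language (not the one in the spec) where strings are delimited by a double quote, a backslash escapes the next character, and there are no comments. The function name inside_string is made up for illustration. The only reliable way to classify a quote at a given offset is to scan from the start and track the state:

```python
def inside_string(source: str, offset: int) -> bool:
    """Return True if position `offset` falls inside a double-quoted string.

    Toy assumptions: '"' both opens and closes a string, a backslash
    escapes the next character, and the language has no comments.
    """
    inside = False
    i = 0
    while i < offset:
        ch = source[i]
        if inside and ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == '"':
            inside = not inside  # same character toggles the state
        i += 1
    return inside


source = 'a = "x" + "y"'
# Offsets 6 and 10 both hold a double quote, but one closes a string and
# the other opens one. Only the scan from offset 0 tells them apart.
print(inside_string(source, 6))   # state just before the quote at offset 6
print(inside_string(source, 10))  # state just before the quote at offset 10
```

With distinct opening and closing delimiters, a parser could classify a delimiter from the character alone instead of from the history before it.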
> Is this really that hard compared to everything else?
In the languages I've seen that don't allow identifiers to start with a digit, it's because allowing it makes other expressions ambiguous.
E.g. in Python (and others), tokens that start with 0x are treated as hex literals. 0x9 would be ambiguous because it could be either an identifier named 0x9 or the hex literal for 9.
It also makes integer literals ambiguous, because 54 would be both a valid identifier and a valid literal.
You could disambiguate that with more rules (identifiers can start with a digit but can't start with 0x, identifiers must include at least one non-numeric character), but the gain from doing so is so low it feels a little quixotic.
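A quick Python sketch of the ambiguity, under stated assumptions: RELAXED_IDENT is a hypothetical identifier rule that allows a leading digit (it is not Python's real rule), and int(token, 0) stands in for "Python would read this as an integer literal".

```python
import re

# Hypothetical relaxed rule: identifiers may start with a digit.
RELAXED_IDENT = re.compile(r"[A-Za-z0-9_]+")


def is_int_literal(token: str) -> bool:
    """True if Python's int() accepts the token as an integer literal."""
    try:
        int(token, 0)  # base 0 honours prefixes such as 0x
        return True
    except ValueError:
        return False


def classify(token: str) -> list[str]:
    """List every reading of the token under the relaxed rule."""
    kinds = []
    if RELAXED_IDENT.fullmatch(token):
        kinds.append("identifier")
    if is_int_literal(token):
        kinds.append("literal")
    return kinds


for token in ["0x9", "54", "52cards"]:
    print(token, classify(token))
```

Under these assumptions, 0x9 and 54 each get two readings, while 52cards gets only one, which is why the extra rules above would be needed to keep the grammar unambiguous.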