Hacker News
by Groxx 5653 days ago
I don't see why parsing would be a problem; identifiers (and their pieces) always start with a letter (so no "var 1"), are alphanumeric, and cannot be a reserved word.

Meaning, parse word by word until you hit a keyword or a significant character (,:". etc). You can't have "varb function(arg)" or its equivalent in any language I know, because it doesn't make sense - there's no operation on the varb, it's just "there". Similarly, "x y z = q r t" is unambiguous, because nothing interrupts the parse of "x y z" before the "=" or of "q r t" after it.
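A minimal Python sketch of that word-by-word idea, for illustration only. The keyword set, the token names, and the `tokenize` function are all hypothetical and not taken from any real language; the sketch assumes keywords are never allowed inside an identifier.

```python
import re

# Hypothetical keyword set and token pattern, chosen only for illustration.
KEYWORDS = {"if", "then", "else", "end", "function", "return"}
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z][A-Za-z0-9]*)|(\d+)|(\S))")


def tokenize(line):
    """Merge runs of adjacent words into one identifier token.

    A run ends at a keyword, a number, or any significant character.
    """
    tokens = []
    words = []  # pieces of the identifier currently being built

    def flush():
        # Emit the pending multi-word identifier, if there is one.
        if words:
            tokens.append(("IDENT", " ".join(words)))
            words.clear()

    pos = 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if not match:
            break  # only trailing whitespace remains
        pos = match.end()
        word, number, symbol = match.groups()
        if word is not None and word not in KEYWORDS:
            words.append(word)  # still inside an identifier
            continue
        flush()
        if word is not None:
            tokens.append(("KEYWORD", word))
        elif number is not None:
            tokens.append(("NUMBER", number))
        else:
            tokens.append(("SYMBOL", symbol))
    flush()
    return tokens
```

Under these assumptions, `tokenize("x y z = q r t")` yields two identifiers separated by the `=` symbol, and a bare number such as the `1` in "var 1" ends the identifier instead of joining it.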

I think I'd like it. Hitting shift all the time, or reaching for "_", is a PITA and significantly slows my typing. It's especially annoying when you realize that identifiers with blanks could be added to most languages with almost zero change to the parser, as long as the language requires an end-of-statement terminator or ends statements on newlines.


> Meaning, parse word by word until you hit a key word or a significant character (,:". etc).

If keywords are allowable in identifiers (such as "end of file"), then your algorithm is not sophisticated enough. When you encounter a token that is the same token as a keyword, you need to use context to determine if it is actually a keyword or part of an identifier.

This may be a serious problem if the grammar has "<identifier> <keyword>" in it. That is, "X keyword" could be the identifier "X keyword" or it could be the identifier "X" followed by "keyword." There's a reason that most programming languages require that identifiers are a single token.

> When you encounter a token that is the same token as a keyword, you need to use context to determine if it is actually a keyword or part of an identifier.

You're presuming here that a space delimits tokens. In this language, that may not be the case. The lexer may create a single token from "a b c".

> If keywords are allowable in identifiers

Big "if" (why shouldn't it disallow them?), and completely resolved by modifying your naming scheme in those situations: EndOfFile is unambiguous, as is end_of_file, ifSuccess, etc.

Disallowing them would be unusual, as most programming languages allow keywords to appear in identifiers (for example, new_thing is a legal C++ identifier). Further, if I understand the language correctly, the literal "end_of_file" becomes the same identifier as "end of file". And the stated purpose of allowing white space in identifiers is to avoid camel case and underscores.

I don't think that's the case. I think the example was just to show how you can write with spaces instead of underscores. I could be wrong though; I haven't tried the language.

The documentation doesn't state one way or the other, but it does include underscores as part of identifiers, and doesn't mention any stripping. It says only that spaces are ignored entirely.
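If that reading of the documentation is right, identifier comparison would behave roughly like the Python sketch below. The `canonical` function is hypothetical and rests entirely on the assumption that whitespace is dropped and underscores are kept; it is not taken from the language's implementation.

```python
def canonical(name):
    """Return the form two spellings must share to be the same identifier.

    Assumption: all whitespace is ignored, underscores are significant.
    """
    return "".join(name.split())


# Spaces vanish, so these two spellings would collide...
same = canonical("end of file") == canonical("endoffile")
# ...while the underscore version would stay a distinct identifier.
different = canonical("end of file") != canonical("end_of_file")
```

Under that assumption, "end of file" and "end_of_file" would name two different things.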