by nine_k
159 days ago
Indent-based syntax is relatively simple to parse. You basically need two pieces of state: whether you are in indent-sensitive mode (not inside a literal, not inside a parenthesized expression), and what indentation the previous line had. Then you can easily emit INDENT and DEDENT tokens, which work exactly like "{" and "}". The actual Python tokenizer does emit these tokens. Haskell, for its part, has both indent-based and curly-brace syntax, and braces can freely replace indentation and vice versa (but only as matched pairs).
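For anyone who wants to see those tokens, here is a small sketch using the standard library's `tokenize` module. The sample source and the helper name `block_tokens` are made up for illustration. Note that the continuation line inside the parentheses is indented less than its neighbours and still produces no token, which is the "not inside a parenthesized expression" rule in action.

```python
import io
import tokenize

# Illustrative input: line 3 sits inside parentheses, so its odd
# indentation should not produce INDENT or DEDENT.
SOURCE = (
    "if x:\n"
    "    y = (1,\n"
    "  2)\n"
    "    if y:\n"
    "        z = 3\n"
    "w = 4\n"
)


def block_tokens(source):
    """Return the names of the INDENT/DEDENT tokens, in order (hypothetical helper)."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names


print(block_tokens(SOURCE))
# ['INDENT', 'INDENT', 'DEDENT', 'DEDENT']
```

The final `w = 4` line closes two blocks at once, so it yields two DEDENT tokens in a row.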
|
That’s enough for INDENT, but for DEDENT you also need a stack of previous indentation levels. That’s how, when the amount of indentation decreases, you know how many DEDENTs to emit.
The requirement for a stack means that Python’s lexical grammar is not regular.
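A minimal sketch of the stack-based approach, under loud simplifying assumptions: spaces only (no tabs), no string literals or brackets, blank lines skipped. The function name `layout_tokens` and the `(kind, text)` token shape are invented here, and this is not how CPython's tokenizer is actually written.

```python
def layout_tokens(lines):
    """Turn source lines into (kind, text) tokens: INDENT, DEDENT, or LINE.

    Assumptions: indentation uses spaces only, and there are no
    multi-line literals or bracketed expressions to suspend the rule.
    """
    stack = [0]  # indentation widths of all currently open blocks
    tokens = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the innermost block: open exactly one new block.
            stack.append(width)
            tokens.append(("INDENT", ""))
        else:
            # Shallower: pop one level per closed block, one DEDENT each.
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", ""))
            if width != stack[-1]:
                # The new width matches no enclosing block.
                raise IndentationError(f"inconsistent dedent: {line!r}")
        tokens.append(("LINE", text))
    # Close whatever is still open at end of input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens


for token in layout_tokens(["a:", "  b:", "      c", "d"]):
    print(token)
```

With only the previous line's width, the jump from `c` (6 spaces) back to `d` (0 spaces) would look like a single decrease. The stack `[0, 2, 6]` is what tells you it closes two blocks, and it is also what lets you reject a dedent to a width that was never opened.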