by WorldMaker
696 days ago
Most parsers don't actually work with "lines" as a unit; lines are for user formatting. Generally the building blocks you are looking for are more along the lines of "until end of expression" or "until end of statement". What defines an "expression" or a "statement" can be very complex depending on the parser and the language you are trying to parse. In JS, because it is a fun example, "end of statement" is defined in large part by Automatic Semicolon Insertion (ASI), whether or not semicolons even exist in the source input. (Even if you use semicolons regularly in JS, JS will still insert its own semicolons. Semicolons don't protect you from ASI.)

ASI is also a useful example because it is an ancient example of a language design intentionally trying to be resilient. Some older JS parsers would even ignore bad statements and continue at the next statement, based on the ASI-determined statement break. We generally like our JS to be much more strict than that today, but early JS was built to be a resilient language in some interesting ways. One place to dive into that directly (in the middle of a deeper context of JS parser theory): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
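To make the ASI point concrete, here is a small sketch of how line breaks and statement boundaries differ in JS. The names `broken`, `fixed`, `id`, and `value` are made up for illustration; the behavior shown is standard ASI, not anything specific to one engine.

```javascript
// A line break right after `return` triggers ASI, so this is parsed as
// `return;` followed by a block containing the labeled statement `ok: true`.
// The explicit semicolon at the end does not change that.
function broken() {
  return
  {
    ok: true
  };
}

// Keeping the opening brace on the same line as `return` makes the
// object literal part of the return statement.
function fixed() {
  return {
    ok: true
  };
}

// ASI only applies where the next token could not continue the statement.
// Here `(41)` can continue `id`, so the two lines form one statement:
// `id(41) + 1`.
const id = (x) => x;
const value = id
(41) + 1;

console.log(broken()); // undefined
console.log(fixed());  // { ok: true }
console.log(value);    // 42
```

In both directions the statement boundary is decided by the grammar, not by where the line happens to end, which is why "line" is rarely the unit a parser works with.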
The example snippet I added is designed to violate the rules I could come up with. I'd specifically like to know: what are better rules to solve this specific case?