by Timon3
700 days ago
I don't know a lot about parser theory, and would love to learn more about ways to make parsing resilient in cases like this one. Simple cases like "ignore rest of line" make sense to me, but I'm unsure about "adversarial" examples (in the sense that they are meant to beat simple heuristics). Would you mind explaining how e.g. your `as` stripping could work for one specific adversarial example?

  function foo<T>() {
    return bar(
      null as unknown as T extends boolean
        ? true /* ): */
        : (T extends string
          ? "string"
          : false
        )
    )
  }
  function bar(value: any): void {}

Any solution I can come up with suffers from at least one of these issues:
- "ignore rest of line" will either fail or lead to incorrect results
- "find matching parenthesis" would have to parse comments inside types (probably doable, but could break with future TS additions)
- "try finding end of non-JS code" will inevitably trip up in some situations, and can get very expensive

I'd love a rough outline or links/pointers, if you can find the time!

[0] TS Playground link: https://www.typescriptlang.org/play/?#code/AQ4MwVwOwYwFwJYHs...
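
To make the "find matching parenthesis" option concrete, here is a minimal sketch of such a scanner, run against the adversarial example above. The helper name `findMatchingParen` is hypothetical, not part of any existing tool, and the sketch assumes only block comments, line comments, and single- or double-quoted strings need skipping. Template literals and regex literals are deliberately left out, which is exactly the kind of gap that could break with future syntax.

```typescript
// Hypothetical helper: returns the index of the ")" that closes the "(" at
// `open`, or -1 if there is none. Parentheses inside comments and quoted
// strings are ignored. Template and regex literals are NOT handled.
function findMatchingParen(src: string, open: number): number {
  let depth = 0;
  let i = open;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === "/" && next === "*") {
      // Block comment: jump past the closing "*/".
      const end = src.indexOf("*/", i + 2);
      if (end === -1) return -1;
      i = end + 2;
      continue;
    }
    if (ch === "/" && next === "/") {
      // Line comment: jump to the start of the next line.
      const end = src.indexOf("\n", i + 2);
      if (end === -1) return -1;
      i = end + 1;
      continue;
    }
    if (ch === '"' || ch === "'") {
      // Quoted string: skip to the closing quote, honoring backslash escapes.
      i++;
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++;
        i++;
      }
      i++;
      continue;
    }
    if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) return i;
    }
    i++;
  }
  return -1;
}

// The adversarial example from this comment, as a string.
const example = [
  "function foo<T>() {",
  "  return bar(",
  "    null as unknown as T extends boolean",
  "      ? true /* ): */",
  "      : (T extends string",
  '        ? "string"',
  "        : false",
  "      )",
  "  )",
  "}",
].join("\n");

const open = example.indexOf("bar(") + 3; // index of the "(" after "bar"
const close = findMatchingParen(example, open);
console.log(example.slice(open, close + 1));
```

A naive search for the next ")" after `bar(` would stop inside the `/* ): */` comment. The scanner instead lands on the final ")" of the call, at the cost of having to know the comment and string syntax of the language it is scanning.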
In JS, because it is a fun example, "end of statement" is defined in large part by Automatic Semicolon Insertion (ASI), whether or not semicolons even exist in the source input. (Even if you use semicolons regularly in JS, JS will still insert its own semicolons. Semicolons don't protect you from ASI.) ASI is also a useful example because it is an ancient case of a language design intentionally trying to be resilient. Some older JS parsers would even ignore bad statements and continue at the next statement, based on ASI-determined statement breaks. We generally like our JS to be much more strict than that today, but early JS was built to be a resilient language in some interesting ways.
One place to dive into that directly (in the middle of a deeper context of JS parser theory): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
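
As a small illustration of the "semicolons don't protect you from ASI" point, here is a sketch that compiles three tiny function bodies at runtime with `new Function`, so the line breaks inside the strings are exactly what the JS parser sees. The variable names are made up for this example.

```typescript
// A line break directly after `return` triggers ASI, so the body is parsed
// as `return; 42;`. The explicit semicolon after 42 does not help.
const withBreak = new Function("return\n42;");

// The same statement on one line returns the value.
const withoutBreak = new Function("return 42;");

// No semicolon is inserted after `1`, because `+ 2` is a valid continuation
// of the expression. One IS inserted before `return`, which cannot continue it.
const continued = new Function("const x = 1\n+ 2\nreturn x;");

const asiResults = {
  withBreak: withBreak(),       // undefined
  withoutBreak: withoutBreak(), // 42
  continued: continued(),       // 3
};
console.log(asiResults);
```

The statement boundaries here come from the grammar plus the line breaks, not from where the author happened to type semicolons, which is what makes ASI a usable recovery point for a resilient parser.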