Hello, I work on the C# compiler and we use a handwritten recursive-descent parser. Here are a few of the more important reasons for doing so:

* Incremental re-parsing. If a user in the IDE changes the document, we need to reparse the file, but we want to do this while using as little memory as possible. To this end, we re-use AST nodes from previous parses.

* Better error reporting. Parser generators are known for producing terrible errors. While you can hack around this, with recursive descent you can get information from further "up" the tree to make your errors more relevant to the context in which the error occurred.

* Resilient parsing. This is the big one! If you give our parser a string that is illegal according to the grammar, our parser will still give you a syntax tree! (We'll also spit errors out.) Getting a syntax tree regardless of the actual validity of the program being passed in means that the IDE can still give autocomplete and report type-checking error messages. As an example, the code "var x = velocity." is invalid C#. However, in order to give autocomplete on "velocity.", that code needs to be parsed into an AST and then typechecked, and then we can extract the members of the type in order to provide a good user experience.

My personal opinion is that everyone should just use s-expressions. Get rid of this whole debate :P
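To make the resilient-parsing point concrete, here is a toy sketch in Python. It is not how the C# compiler does it. The grammar covers only `var NAME = EXPR;`, and every name in it (`Parser`, `Missing`, `TYPES`, `MEMBERS`, `completions`) is made up for illustration. The idea it tries to show: when a required token is absent, the parser records an error and inserts a placeholder node instead of giving up, so a later pass can still walk the tree.

```python
import re
from dataclasses import dataclass

@dataclass
class Name:
    text: str

@dataclass
class Missing:
    """Placeholder inserted where the grammar required a token that was absent."""
    expected: str

@dataclass
class MemberAccess:
    target: object
    member: object  # Name, or Missing when the user has only typed "target."

@dataclass
class VarDecl:
    name: object
    value: object

class Parser:
    def __init__(self, src):
        # Toy tokenizer: identifiers and the three punctuation marks we care about.
        self.tokens = re.findall(r"[A-Za-z_]\w*|[=.;]", src)
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_ident(self):
        tok = self.peek()
        if tok is not None and (tok[0].isalpha() or tok[0] == "_"):
            self.pos += 1
            return Name(tok)
        # Do not raise: report the error and keep building the tree.
        self.errors.append(f"expected identifier at token {self.pos}")
        return Missing("identifier")

    def expect(self, text):
        if self.peek() == text:
            self.pos += 1
        else:
            self.errors.append(f"expected '{text}' at token {self.pos}")

    def parse_expr(self):
        expr = self.expect_ident()
        while self.peek() == ".":
            self.pos += 1
            expr = MemberAccess(expr, self.expect_ident())
        return expr

    def parse_var_decl(self):
        self.expect("var")
        name = self.expect_ident()
        self.expect("=")
        value = self.parse_expr()
        self.expect(";")
        return VarDecl(name, value)

# Hypothetical symbol tables standing in for a real type checker.
TYPES = {"velocity": "Vector"}
MEMBERS = {"Vector": ["X", "Y", "Length"]}

def completions(tree):
    """Offer members when the tree ends in 'name.' followed by a Missing node."""
    value = tree.value
    if (isinstance(value, MemberAccess)
            and isinstance(value.member, Missing)
            and isinstance(value.target, Name)):
        return MEMBERS.get(TYPES.get(value.target.text), [])
    return []

parser = Parser("var x = velocity.")
tree = parser.parse_var_decl()
print(parser.errors)      # two errors: missing identifier and missing ';'
print(completions(tree))  # ['X', 'Y', 'Length']
```

The input is invalid, yet `tree` is still a `VarDecl` whose value is a `MemberAccess` on `velocity` with a `Missing` member, and that is enough for the fake "type checker" to look up what to suggest.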
It supports both incremental parsing and an API to inspect and recover incomplete ASTs (which powers Merlin, the IDE-like thing for OCaml). It provides stellar debugging features for ambiguous grammars and a way to produce good error messages (which is used in CompCert's C parser and Facebook's Reason).
So, it's not impossible. Most parser generators are not that good, though.