Hacker News new | ask | show | jobs
by dkjaudyeqooe 846 days ago
This is entirely the case. Given a sensible grammar stated in a sensible way, it's very easy to write a nice recursive decent parser. They are fast and easy to maintain. It doesn't limit the expressiveness of your grammar unduly.

Both GCC and LLVM implement recursive decent parsers for their C compilers.

Parser generators are an abomination inflicted upon us by academia, solving a non problem, and poorly.

1 comments

Agree completely. Having used a bunch of parser generators (Antlr and bison most extensively) and written a parser combinator library, I came to the conclusion that they're a complete waste of time for practical applications.

A hand-written recursive descent parser (with an embedded Pratt parser to handle expressions/operators) solves all the problems that parser generators struggle with. The big/tricky "issue" mentioned in the article - composing or embedding one parser in another - is a complete non-issue with recursive-descent - it's just a function call. Other basic features of parsing: informative/useful error messages, recovery (i.e. don't just blow up with the first error), seamless integration with the rest of the host language, remain issues with all parser generators but are simply not issues with recursive descent.

And that's before you consider non-functional advantages of recursive-descent: debuggability, no change/additions to the build system, fewer/zero dependencies, no requirement to learn a complex DSL, etc.

The main advantage of recursive descent is that since parser rules correspond to functions in the host language, you can put a breakpoint on them in the debugger and understand how you got there. You can't do that with a table-driven LALR(1) parser like Bison.

I think you cover that with "debuggability" remark.

Here is something else. Yacc/Bison parsing uses its own stack for the symbols. Whereas recursive descent uses the regular run-time stack.

Pop quiz: which stack is visible to your garbage collector, and which isn't?

In TXR Lisp, which uses a Bison/Byacc generated parser, GC is suspended while yyparse() executes. Otherwise it would prematurely collect anything that is only stored in the Yacc stack, and so there would have to be a GC hook to traverse that stack (which would depend on undocumented features of the generated code, and likely have to have some separate coding for Bison versus Byacc.)

> handle expressions/operators

Precedence climbing is a natural and efficient way to parse expressions in a recursive decent parser. It's used by Clang in LLVM.

https://eli.thegreenplace.net/2012/08/02/parsing-expressions...

Make them all s-expr with prefix notation solves even that generally, correct?
> A hand-written recursive descent parser (with an embedded Pratt parser to handle expressions/operators)

Exactly the recipe I swear by as well!