| First, you should absolutely write a couple of parsers by hand, and then repeat the exercise now and then. I understand why the author does not use parser generators. However, if you are writing a parser for serious production use, I urge you to seriously consider a parser generator instead of going the manual route. Here is why. Parser generators are akin to compilers. They require certain constraints to be met, but in return they generate extremely efficient parsing code. For classes of languages for which parser generators exist, you cannot beat a generator with handwritten code, either in parsing efficiency or in maintainability. Citing shift-reduce conflicts as a reason to write a parser by hand is akin to resorting to assembly out of frustration with C compiler errors. Yes, there are cases where handwritten parsers are preferred. gcc switched from flex/bison to a handwritten parser for C/C++ during 3.x, and clang also has a handwritten parser. But this is because C and C++ have context-dependent grammars, and C++ syntax has become increasingly arcane over the years. You constantly have to resort to tricks when parsing C++. For example, to properly parse a C++ class definition, you need to pass over it twice, first reading only the declarations and then both the declarations and the method bodies. You also need tricks and heuristics to parse '>>' as the end of a nested template rather than as the right shift operator, and so on. That kind of complicated, context-dependent grammar almost always makes it possible (and in the case of Perl, even very easy) to write WTF code. |
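To make the '>>' trick mentioned above concrete, here is a minimal Python sketch of how a hand-written parser can handle it. Everything in it is hypothetical (the toy grammar, `tokenize`, `TypeParser`, `expect_close_angle`); it is not how gcc or clang actually do it. It only illustrates the idea that the lexer emits '>>' as one token and the parser, knowing it is inside a template argument list, splits it.

```python
import re

# Toy lexer: maximal munch, so '>>' always comes out as a single token.
TOKEN_RE = re.compile(r"\s*(>>|[A-Za-z_]\w*|\d+|[<>,])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class TypeParser:
    """Toy grammar (an assumption): type := NAME [ '<' type (',' type)* '>' ]"""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_close_angle(self):
        tok = self.peek()
        if tok == ">>":
            # The context-dependent trick: we know we are closing a template
            # argument list, so split '>>' into two '>' tokens in place.
            self.tokens[self.pos:self.pos + 1] = [">", ">"]
            tok = ">"
        if tok != ">":
            raise SyntaxError(f"expected '>' but found {tok!r}")
        self.pos += 1

    def parse_type(self):
        name = self.peek()
        if name is None or not (name[0].isalpha() or name[0] == "_"):
            raise SyntaxError(f"expected a type name but found {name!r}")
        self.pos += 1
        args = []
        if self.peek() == "<":
            self.pos += 1
            args.append(self.parse_type())
            while self.peek() == ",":
                self.pos += 1
                args.append(self.parse_type())
            self.expect_close_angle()
        return (name, args)


tokens = tokenize("map<string, vector<int>>")
tree = TypeParser(tokens).parse_type()
```

The lexer stays context-free; only the one parser method that closes an argument list knows about the split. An expression parser built on the same lexer would keep treating '>>' as a shift operator.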
Maintainability is a moot point. The more complex your language, the bigger the maintenance benefit you get from a parser generator, provided it's expressive enough. For parsing C++ outside of a commercial compiler, I'd look at a GLR parser, for which the tables would most likely be tool-created. (In a commercial compiler, I'd be back to hand-written again.)
The value of being able to change your grammar and have your parser follow suit instantaneously isn't high past the prototyping stage. Other things will consume the parse tree, and depending on the tool, the parse tree's shape may be driven by the parse rules (ANTLR), or the parser actions may be more or less deeply embedded in the grammar and require refactoring themselves (most other tools). The downstream consumers of those structures almost certainly need modification too, since it's not likely you're just changing syntactic sugar. Whereas if you have a hand-written parser, you can minimize the work needed to adjust downstream. You have more latitude for engineering.
It's great to use tools to validate a grammar, to prototype parsing it, and perhaps even for lightweight work like analysis. But the more essential it is to have 100% accurate semantic analysis, great error messages, excellent performance, and deep tooling integration (e.g. IDE code completion), the more control you need over the parsing process. Parsing is then closer to the critical path of success in your target market, and generators are too generic.
For me, parser generators work well for a certain range of applications. Given a range of complexity, with 1 being a date format parser and 10 being a commercial compiler with IDE integration, parser generators work well somewhere around 3 to 7. At the lower end, their costs in terms of integration, third-party dependencies etc. outweigh the complexity of the problem they're solving. At the higher end, you need a lot more out of the tool than it is designed to give you, and working around it causes more pain than anything you're saving.
I was a front-end engineer on the Delphi compiler for 6 years. I don't know of any major commercial compiler that uses a parser generator. Almost all use hybrid recursive descent.
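For anyone unsure what "hybrid recursive descent" might look like in practice, here is a minimal Python sketch. It assumes that "hybrid" means recursive descent for statement-level rules combined with precedence climbing for binary expressions; that reading is an assumption, as are the toy grammar, the precedence table and every name in the code. It is not taken from the Delphi compiler or any other real one.

```python
import re

TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()=;])")

# Hypothetical precedence table for the toy language: higher binds tighter.
BINARY_PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, expected):
        tok = self.peek()
        if tok != expected:
            # Hand-written code decides exactly what the error says.
            raise SyntaxError(
                f"expected {expected!r} but found {tok!r} at token {self.pos}")
        self.pos += 1

    # Recursive descent part: one method per statement-level rule.
    def parse_program(self):
        statements = []
        while self.peek() is not None:
            statements.append(self.parse_statement())
        return statements

    def parse_statement(self):
        # statement := 'let' NAME '=' expr ';' | expr ';'
        if self.peek() == "let":
            self.advance()
            name = self.advance()
            if name is None or not name.isidentifier():
                raise SyntaxError(f"expected a name after 'let', found {name!r}")
            self.expect("=")
            value = self.parse_expression()
            self.expect(";")
            return ("let", name, value)
        value = self.parse_expression()
        self.expect(";")
        return ("expr", value)

    # Precedence climbing part: one loop instead of one method per level.
    def parse_expression(self, min_prec=1):
        left = self.parse_primary()
        while True:
            op = self.peek()
            prec = BINARY_PRECEDENCE.get(op)
            if prec is None or prec < min_prec:
                return left
            self.advance()
            # prec + 1 makes every operator left-associative.
            right = self.parse_expression(prec + 1)
            left = (op, left, right)

    def parse_primary(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return int(tok)
        if tok == "(":
            inner = self.parse_expression()
            self.expect(")")
            return inner
        if tok.isidentifier():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


program = Parser("let x = 1 + 2 * 3; x - 1 - 2;").parse_program()
```

The statement methods read like the grammar, while the expression loop is driven by a table, so adding an operator means adding one dictionary entry. Error messages, recovery and tree shape all stay under the author's control, which is the kind of latitude argued for above.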