Hacker News new | ask | show | jobs
by tgv 658 days ago
The original comment says that using yacc/bison is "fundamentally misguided." But parser generators make it easy to add a correct parser to your project. It's obviously not the only way. Hand-rolling has a bunch of pitfalls, and easily leads to apparently correct behavior that does weird things on untested input. Your comment then is a bit like: I've never had memory corruption in C, so Rust/Java/etc. is for toy projects only.
1 comments

> Hand-rolling has a bunch of pitfalls

I'm arguing that this is not the case in reality, and asked for concrete examples... So again I ask for a concrete example... For memory corruption, there are plenty of examples.

For parsing, I know one example that lead to problems. Interestingly, it was about using a state machine that was then modified (manually) and the result was broken. Here I argue that using a handwritten parser, instead of a state machine that is then manually modified, would not have resulted in this problem. Also, there was no randomized testing / fuzz testing, which is also a problem. This issue is still open: https://issues.apache.org/jira/browse/OAK-5367

There's no reason for concrete examples, because the point was about the fundamental misguidedness of parser generators, not about problems with individual parser generators or the nice things you can do in a hand-rolled one, but to accommodate you, ANTLR gives one on its home page: "... At Twitter, we use it exclusively for query parsing in Twitter search... Samuel Luckenbill, Senior Manager of Search Infrastructure, Twitter, inc."

Also, regexps are used very often in production, and that's definitely a parser-generator of sorts.

The memory corruption example was an analog, but to spell it out: it's easier and faster to write a correct parser using flex/bison than by hand, especially for more complex languages. Parser-generators have their use, and are not fundamentally misguided. That you might want to write your own parser in some cases does not diminish that (nor vice versa).