by abecedarius 4665 days ago
For years I mostly coded parsers by hand, for many of the same reasons (this article gives a nice rundown), but the repetitive code really does get annoying, and I've found something else that works for me. Going over the listed problems:

Lexing and context: PEGs don't need a separate scanner; the option of calling an arbitrary function to parse a part usually suffices for unusual context needs.
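To make the "arbitrary function" point concrete, here is a rough sketch in plain Python. It is not peglet's API; `rx`, `seq`, and `counted_string` are names made up for illustration. Whitespace is handled inside the grammar instead of by a scanner, and a hand-written function parses a length-prefixed string, which a context-free rule can't express directly.

```python
import re

def rx(pattern):
    """Parser that matches a regex at position i and yields its captured groups."""
    compiled = re.compile(pattern)
    def parse(text, i):
        m = compiled.match(text, i)
        return None if m is None else (m.end(), list(m.groups()))
    return parse

def seq(*parsers):
    """Run parsers one after another, concatenating their values."""
    def parse(text, i):
        values = []
        for p in parsers:
            r = p(text, i)
            if r is None:
                return None
            i, vs = r
            values += vs
        return i, values
    return parse

def counted_string(text, i):
    """Hand-written parser for '<n>:<n chars>', e.g. '5:hello' (context-sensitive)."""
    m = re.compile(r"(\d+):").match(text, i)
    if m is None:
        return None
    start, n = m.end(), int(m.group(1))
    if start + n > len(text):
        return None
    return start + n, [text[start:start + n]]

ws = rx(r"\s*")  # whitespace is just another grammar element, no scanner needed
pair = seq(rx(r"(\w+)"), ws, rx(r"="), ws, counted_string)
```

Because every parser here is just a function of `(text, i)`, the custom one slots in like any other; `pair("name = 5:a b=c", 0)` yields the key and the five-character value, even though the value contains `=` and a space.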

Shift/reduce and grammar conflicts: PEG again sidesteps the problem, at the cost of sometimes resolving ambiguity in an unexpected way.
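A small illustration of the "unexpected way" part, again as a plain-Python sketch with made-up helper names (`lit`, `choice`, `seq`, `matches`), not any particular library's API. PEG choice is ordered: once the first alternative matches, the later ones are never tried, even if that makes the rest of the parse fail.

```python
def lit(s):
    """Match the literal string s at position i; return the new position or None."""
    return lambda text, i: i + len(s) if text.startswith(s, i) else None

def choice(*parsers):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, i):
        for p in parsers:
            j = p(text, i)
            if j is not None:
                return j
        return None
    return parse

def seq(*parsers):
    """Match each parser in turn, threading the position through."""
    def parse(text, i):
        for p in parsers:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse

def matches(p, text):
    """True if p consumes the entire input."""
    return p(text, 0) == len(text)

# ('a' / 'ab') 'c'  versus  ('ab' / 'a') 'c'
short_first = seq(choice(lit("a"), lit("ab")), lit("c"))
long_first = seq(choice(lit("ab"), lit("a")), lit("c"))
```

With these definitions `matches(short_first, "abc")` is false while `matches(long_first, "abc")` is true: in the first grammar the choice commits to `'a'` and never retries with `'ab'`. There is no conflict report, so the fix is to order alternatives longest-first by hand.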

Syntax tree: Call semantic actions instead. For example: https://github.com/darius/peglet/blob/master/examples/regex....

Mixed code: Semantic actions are denoted by function names instead of inline code. I've used the same grammar in different languages.
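Roughly what that looks like, as a sketch: the grammar below is plain data that refers to actions only by name, and the caller supplies the functions. The data format and the `parse` helper are invented for this example and are not peglet's actual notation.

```python
import re

# Each rule is a list of alternatives; each alternative is a list of items.
# An item is a rule name, a ("re", pattern) pair, or an ("act", name) pair
# that names a semantic action without saying what it does.
GRAMMAR = {
    "sum": [["num", ("re", r"\+"), "sum", ("act", "add")],
            ["num"]],
    "num": [[("re", r"(\d+)"), ("act", "int")]],
}

def parse(grammar, actions, rule, text):
    def run(rule, i):
        for alt in grammar[rule]:            # ordered choice over alternatives
            result = run_alt(alt, i)
            if result is not None:
                return result
        return None

    def run_alt(alt, i):
        values = []
        for item in alt:
            if isinstance(item, str):        # reference to another rule
                r = run(item, i)
                if r is None:
                    return None
                i, vs = r
                values.extend(vs)
            elif item[0] == "re":            # regex; captured groups become values
                m = re.compile(item[1]).match(text, i)
                if m is None:
                    return None
                i = m.end()
                values.extend(m.groups())
            else:                            # ("act", name): look the action up by name
                values = [actions[item[1]](*values)]
        return i, values

    r = run(rule, 0)
    if r is None or r[0] != len(text):
        raise ValueError("parse failed")
    return r[1]

evaluate = {"int": int, "add": lambda a, b: a + b}
build_tree = {"int": int, "add": lambda a, b: ("+", a, b)}
```

`parse(GRAMMAR, evaluate, "sum", "1+2+3")` gives `[6]`, and the same grammar with `build_tree` gives `[("+", 1, ("+", 2, 3))]`. Since the grammar contains no host-language code, only the action table has to be rewritten when porting it.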

Other limitations: With a small parsing library (one or two pages of non-golfed code), it's much more feasible to hack it to handle whatever particular problem comes up.

So most often these days I use https://github.com/darius/peglet when I have a parsing problem. It's definitely not for coding gcc with.

1 comment

I use LPeg myself. What I like about LPeg is that you can compose it. Once I have an LPeg expression that parses, say, an IPv6 address, I can reuse that expression in a larger grammar.
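The same idea sketched in Python, not Lua: this is not LPeg's API, just an illustration with made-up helpers (`rx`, `seq`, `full`), and it uses IPv4 instead of IPv6 to stay short. A sub-parser is an ordinary value, so once it works it can be dropped into a bigger expression unchanged.

```python
import re

def rx(pattern):
    """Parser that matches a regex at position i; returns the new position or None."""
    compiled = re.compile(pattern)
    def parse(text, i):
        m = compiled.match(text, i)
        return None if m is None else m.end()
    return parse

def seq(*parsers):
    """Match each parser in turn, threading the position through."""
    def parse(text, i):
        for p in parsers:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse

def full(p, text):
    """True if p consumes the entire input."""
    return p(text, 0) == len(text)

# Written and tested once...
octet = rx(r"25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d")
dot = rx(r"\.")
ipv4 = seq(octet, dot, octet, dot, octet, dot, octet)

# ...then reused as a building block in a larger grammar.
endpoint = seq(ipv4, rx(r":"), rx(r"\d+"))
```

Here `full(endpoint, "10.0.0.1:8080")` is true and `full(ipv4, "256.1.1.1")` is false; `endpoint` did not need to know anything about how `ipv4` is built.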