Hacker News new | ask | show | jobs
by ogogmad 187 days ago
I think most PL parsers use recursive descent, which doesn't require any formal language theory. That said, it's possible that formal language theory has enabled regular expressions to enter widespread use - first via theory; then to make tokenisers; then included in text editors; and then to make GREP.
1 comments

You're going to have a very hard time parsing a language using recursive descent if you don't understand some formal language theory. Recursive descent algorithms have a lot of trouble with left-recursion, and can result in arbitrary backtracking if you're not careful, which can blow up memory and time. To fix left-recursion you need to rewrite your grammar, which means familiarity with formal language theory. To avoid backtracking you typically use an LL(N) or variations thereof, which once again require having a pretty good understanding of formal language theory.
Nah. Source: have done it without formal language theory. You can do most things by printf debugging and kludging.

(Left-recursion? Use a while loop! Mutable state is a pathway to many abilities functional programmers consider unnatural.)

You can definitely write parsers and lexers knowing just enough to be dangerous.

I remember being in the PHP culture where people wrote regex-based hacks instead of the character-at-a-time or token-at-a-time state machines or state machine + a stack kind of things which would be too slow in a language like that.

The less you know the faster you get in trouble when things get tough.

My statement was specific to recursive descent parsing.
But recursive descent parsing is implemented with functions operating on parser state. The whole point of it is that it's very easy to write a parser because the abstraction exactly matches to function calling in programming languages. So you can do the whole range of pl primitives too.