| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by eru 77 days ago

I wrote a bunch of parser combinator libraries myself, and used plenty more.

They are basically the same idea as regular expressions, but more flexible: you have a bunch of combining operations for your regular languages to build bigger regular languages. But that doesn't tell you how it's implemented in the backend. Could be Recursive, could be automata, could be backtracking, could be anything.

If you want to write your first compiler, I'd even go so far as to suggest to use something with a Lisp syntax as your source language, explicitly so you can minimise the parsing aspect.

Parsing is a lot of fun by itself, but it doesn't actually have much to do with the core of what makes compilers interesting and challenging. It's almost an independent pursuit, and very useful outside of writing compilers, too.

> As far as I can tell, parser combinators are just one way that promises to let "write a compiler without understanding abstract languages" but all these methods actually wind-up being libraries that are far complicated than gp's "recursive descent + pratt parsing", which is easy once you understand the idea of an abstract language.

Where do you suspect the complexity here? You can write a toy library for parser combinators that's really simple, if your implementation language is at least as capable as Rust or even Python (with Haskell and OCaml being probably the easiest). If you are using an off-the-shelf industrial-stength, production-grade parser combinator library: of course, that's complicated under the hood. That's the price the authors wilingly pay for great error handling and performance etc.

For most people writing their first compilers, they would better off starting the project from when they already have some tree or DAG representation. Plenty of (more!) interesting challenges left. Going from stream of bits to the parsed syntax tree is something they can learn about later (or not at all), without missing much.