by seanmcdirmid 4465 days ago
Most real production compilers don't use lexer or parser generators.
3 comments

A "micro compiler" doesn't qualify as a "real production compiler". I think pointing out that a lexer/parser generator could have been used is relevant in this case.
Lexer/parser generators aren't used much in practice, even by hobbyists, so pointing out that standard tools could have been used is weird when those tools are obviously not that popular.
In my experience, lexer and parser generators are used extensively by hobbyists. Tools such as Antlr, Boost.Spirit and Xtext come to mind.
Xtext is very marginal compared to the far wider world of custom languages. Antlr not so much, but still, tons of people write recursive descent parsers themselves.
Is that true? At least in the OCaml ecosystem (where there are many, many compilers), ocamllex and menhir seem to be used quite often.
I know a lot of hobbyists and professionals who just roll recursive descent parsers, given the reduced complexity and better error recovery. These tools don't necessarily help unless you have icky syntax to deal with or really care about that last ounce of performance.
The main compilers for PHP, Ruby, and Go all do, although gcc uses a handwritten parser now: http://en.m.wikipedia.org/wiki/GNU_bison

Do those not count? What are the tradeoffs to doing it from scratch?

Scala and C# are written by hand. I imagine C++ is too, since there is no other way to do it. Java uses JavaCC; Lua is by hand.

Using tools is often more complicated than just writing code. It isn't that hard to write a recursive descent parser, and you have the flexibility of loading it with all the error recovery you want. What tools give you is potentially better parser performance and more theoretical assurances (useful for tricky grammars).
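
To make the point about hand-written recursive descent and error recovery concrete, here is a minimal sketch in Python. It is not taken from any of the compilers mentioned in this thread. The grammar, the `Parser` class, the `parse` helper, and the choice of synchronizing tokens are all assumptions made up for illustration. It evaluates `+` and `*` over integers, and on a bad token it records an error, skips ahead to a token where parsing can plausibly resume, and keeps going.

```python
import re


class Parser:
    """Recursive descent parser and evaluator for + and * over integers.

    Grammar (hypothetical, chosen for this sketch):
        expr   := term ('+' term)*
        term   := factor ('*' factor)*
        factor := NUMBER | '(' expr ')'
    """

    # Tokens at which parsing can resume after an error (None = end of input).
    SYNC = (None, "+", "*", ")")

    def __init__(self, text):
        # Tokens are runs of ASCII digits or single non-space characters.
        self.tokens = re.findall(r"[0-9]+|\S", text)
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        value = self.term()
        while self.peek() == "+":
            self.pos += 1
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.pos += 1
            value *= self.factor()
        return value

    def factor(self):
        tok = self.peek()
        if tok is not None and re.fullmatch(r"[0-9]+", tok):
            self.pos += 1
            return int(tok)
        if tok == "(":
            self.pos += 1
            value = self.expr()
            if self.peek() == ")":
                self.pos += 1
            else:
                # Recover by reporting and carrying on as if ')' were present.
                self.errors.append(f"expected ')' at token {self.pos}")
            return value
        # Recover: report, skip to a synchronizing token, return a placeholder.
        self.errors.append(f"unexpected {tok!r} at token {self.pos}")
        while self.peek() not in self.SYNC:
            self.pos += 1
        return 0


def parse(text):
    """Return (value, errors); parsing continues past errors."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        parser.errors.append(
            f"unexpected trailing {parser.peek()!r} at token {parser.pos}"
        )
    return value, parser.errors
```

With this sketch, `parse("2 + 3 * (4 + 1)")` returns `(17, [])`, while `parse("2 + @ + (3")` reports two separate errors in one pass (the stray `@` and the missing `)`) and still returns a value. Each grammar rule is one method, and the recovery policy is ordinary code you can tune per rule, which is roughly the flexibility being described above.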

After the comments I've seen from paulp, I'd hardly believe that the way Scala did this was optimal, although I'm not sure there were any issues with lexing & parsing.
It's not optimal, and it doesn't matter, considering parsing overhead is just noise compared to type system computations.