Hacker News new | ask | show | jobs
by munificent 190 days ago
It looks like, overall, this design gets the parser about twice as fast as a simple one that creates tree-like ASTs.

That's not nothing. But a parser is rarely the most time-intensive part of a production compiler. And the parser does get iterated on a lot in languages that are evolving and adding new syntax.

Given that, I'd be inclined to take the performance hit and stick with a simpler AST representation if that yields a more hackable, maintainable compiler front end.

3 comments

That's a good caution. However, traversing a flat AST (iterating a "struct of arrays" rather than a pointer-based tree) is also going to be faster. So the next steps of the compiler, say type checking and code emitting, will also be faster. But how much, or whether it's worth it even then, I'm not sure.
True, but that does also depend on where you store semantic information. Zipping past a nicely packed AST won't buy you much if for every node you have to look up its type or other semantic information somewhere else in memory through some slow process.
Another small trick is to use a "reversed" bump allocator, that starts handing out the memory from the end with the larger addresses. Since AST structures are almost always created bottom-up, children before the parents, the root node at the very end on the return from parseProgram/parseModule/etc. function, you will end up with the AST that has most its pointers pointing forward. This means that during AST walks, you'll be going from lower addresses to the higher ones which I think is actually somewhat faster than the reverse order.
Usually yes, but it's still a neat trick to be aware of. For interpreted scripting languages, parsing can actually be a significant slowdown. Even more so when we start going into text-based network protocols, which also need a parser (is CSS a programming language or a network protocol? :) )
(author here) I agree that it's a lot of complexity, and I acknowledge this in the article: You can get quite far with just a bump allocator.

I didn't go into this at all, but the main benefit of this design is how well it interacts with CPU cache. This has almost no effect on the parser, because you're typically just writing the AST, not reading it. I believe that subsequent stages benefit much more from faster traversal.

(By the way, I am a huge fan of your work. Crafting interpreters was my introduction to programming languages!)