by dcreager 861 days ago

This is exactly right. You either end up with something very low-level (on the level of LLVM IR, for instance)—which means you aren't constructing and analyzing high-level language constructs anymore—or with something high-level but with many language-specific special cases grafted on.

Where we've found success is in stepping back and creating formalisms that are language-agnostic to begin with, and then using tree-sitter to manage the language-specific translations into that formalism. A good example is how we use stack graphs for precise code navigation [1]: they use a graph structure to encode a language's name binding semantics, and we build them for several languages from tree-sitter parse trees [2].

[1] https://dcreager.net/talks/stack-graphs/

[2] https://github.com/github/stack-graphs/tree/main/languages
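
To make the "graph structure encodes name binding" idea concrete, here is a toy sketch in Python. It is not the stack-graphs library or its API: `Node`, `resolve`, and the hand-built graph are all hypothetical names made up for illustration, and it ignores things a real implementation has to deal with (shadowing, partial paths, incremental analysis). The one idea it tries to capture is that references push symbols, definitions pop them, and a reference resolves to any definition you can reach with an empty symbol stack.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str
    kind: str         # "push" (reference), "pop" (definition), or "scope"
    symbol: str = ""  # unused for scope nodes


def resolve(edges, reference):
    """Return every definition reachable from `reference` with an empty symbol stack."""
    found = []
    seen = set()
    queue = deque([(reference, ())])
    while queue:
        node, stack = queue.popleft()
        if node.kind == "push":
            stack = stack + (node.symbol,)
        elif node.kind == "pop":
            if not stack or stack[-1] != node.symbol:
                continue  # this node cannot consume the symbol on top of the stack
            stack = stack[:-1]
        if (node, stack) in seen:
            continue
        seen.add((node, stack))
        if node.kind == "pop" and not stack:
            found.append(node)  # every pushed symbol has been matched
            continue
        queue.extend((nxt, stack) for nxt in edges.get(node, ()))
    return found


# Hand-built graph for a made-up two-file program (an assumption for this sketch):
#   a.py:     x = 1
#   main.py:  import a; a.x
ref_x = Node("ref x in a.x", "push", "x")
ref_dot = Node("ref . in a.x", "push", ".")
ref_a = Node("ref a in a.x", "push", "a")
root = Node("root", "scope")
def_a = Node("def a", "pop", "a")
def_b = Node("def b", "pop", "b")
member = Node("members of a", "pop", ".")
a_scope = Node("scope of a.py", "scope")
def_a_x = Node("def a.x", "pop", "x")

edges = {
    ref_x: [ref_dot],
    ref_dot: [ref_a],
    ref_a: [root],
    root: [def_b, def_a],
    def_a: [member],
    member: [a_scope],
    a_scope: [def_a_x],
}

for definition in resolve(edges, ref_x):
    print(definition.label)
```

The language-specific part is only in how the graph gets built (in the real system, from tree-sitter parse trees); the `resolve` loop never needs to know which language produced the nodes and edges.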