by zelphirkalt 820 days ago
There is probably no such thing, because it would be hard to map programming language concepts onto each other perfectly. OK, you could have a union of those concepts in the IR, but then the benefit of an IR would disappear, because you would have to deal with all of those concepts on the next layer.

This is exactly right. You either end up with something very low-level (on the level of LLVM IR, for instance)—which means you aren't constructing and analyzing high-level language constructs anymore—or with something high-level but with many language-specific special cases grafted on.

Where we've found success is in stepping back and creating formalisms that are language-agnostic to begin with, and then using tree-sitter to manage the language-specific translations into that formalism. A good example is how we use stack graphs for precise code navigation [1], which uses a graph structure to encode a language's name binding semantics, and which we build up for several languages from tree-sitter parse trees [2].

[1] https://dcreager.net/talks/stack-graphs/

[2] https://github.com/github/stack-graphs/tree/main/languages
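
To make the "graph that encodes name binding" idea concrete, here is a toy Python sketch. It is not the stack-graphs library's API and it leaves out most of the real design; the class, the node kinds ("push", "pop", "scope") and the node names are all made up for illustration. The assumption is only this: a reference pushes a symbol, a definition pops it, and a reference resolves to any definition reachable by a path that ends with an empty symbol stack. A per-language front end (for example, one driven by a tree-sitter parse tree) would emit the nodes and edges; here they are written by hand.

```python
from collections import defaultdict


class ToyBindingGraph:
    """Hypothetical, heavily simplified name-binding graph (illustration only)."""

    def __init__(self):
        self.kind = {}                  # node -> (kind, symbol or None)
        self.edges = defaultdict(list)  # node -> list of successor nodes

    def add(self, node, kind, symbol=None):
        self.kind[node] = (kind, symbol)

    def edge(self, src, dst):
        self.edges[src].append(dst)

    def resolve(self, ref):
        """Return definitions reachable from `ref` with an empty symbol stack."""
        kind, symbol = self.kind[ref]
        if kind != "push":
            raise ValueError("resolve() expects a reference (push) node")
        results = []
        # Each work item: (current node, symbol stack, nodes already on this path)
        todo = [(ref, (symbol,), frozenset([ref]))]
        while todo:
            node, stack, seen = todo.pop()
            for nxt in self.edges[node]:
                if nxt in seen:
                    continue  # do not loop around cycles
                nxt_kind, nxt_symbol = self.kind[nxt]
                new_stack = stack
                if nxt_kind == "push":
                    new_stack = stack + (nxt_symbol,)
                elif nxt_kind == "pop":
                    if not stack or stack[-1] != nxt_symbol:
                        continue  # this definition does not match the wanted name
                    new_stack = stack[:-1]
                    if not new_stack:
                        results.append(nxt)  # fully resolved
                        continue
                todo.append((nxt, new_stack, seen | {nxt}))
        return results


# Hand-built graph for something like: module b refers to `foo`,
# and the enclosing root scope defines both `foo` and `bar`.
g = ToyBindingGraph()
g.add("ref:foo", "push", "foo")
g.add("ref:baz", "push", "baz")
g.add("scope:b", "scope")
g.add("scope:root", "scope")
g.add("def:foo", "pop", "foo")
g.add("def:bar", "pop", "bar")
g.edge("ref:foo", "scope:b")
g.edge("ref:baz", "scope:b")
g.edge("scope:b", "scope:root")
g.edge("scope:root", "def:foo")
g.edge("scope:root", "def:bar")

print(g.resolve("ref:foo"))  # the reference to `foo` finds its definition
print(g.resolve("ref:baz"))  # nothing defines `baz`, so the result is empty
```

The resolver never looks at source syntax. Anything language-specific lives in whatever code builds the nodes and edges, which is the separation described above.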