| I contributed to a similar system used within Google (partially open source at kythe.io), which took the very different approach of integrating with the language-native toolchain for each language. As this article describes, doing this requires per-language integrations and also effectively being able to "run the build" for any given code (because e.g. the C++ header search path can vary on a per-source-file basis), which is untenable for a codebase as large and varied as GitHub's. However, if you can make it work, you get the benefit of the compiler's own understanding of the code's semantics, which are especially finicky in complex languages like C++ or, say, Rust. For example, this[1] method call refers to a symbol generated by a chain of macros, yet the browser is still able to point you at its definition. It's an interesting tradeoff to make: the GitHub approach likely doesn't handle corner cases like the above, but it makes up for it in broad applicability and performance. I recall an IDE developer once telling me they made a similar tradeoff in code completion, in that it's better DX to pop up completions quickly even if they're "only" 99% correct. (To be clear, I absolutely think the approach taken in the article was the right one for the domain they're working in; I was just contrasting it against my experience with a similar problem where we took a very different approach.) [1] https://source.chromium.org/chromium/chromium/src/+/main:v8/... |
The build-based approach that you describe is also used by the Language Server Protocol (LSP) ecosystem. You've summarized the tradeoffs quite well! I've described a bit more about why we decided against a build-based/LSP approach here [4]. One of the biggest deciding factors is that at our scale, incremental processing is an absolute necessity, not a nice-to-have.
[1] https://github.blog/2021-12-09-introducing-stack-graphs/
[2] https://dcreager.net/talks/2021-strange-loop/
[3] https://news.ycombinator.com/item?id=29500602
[4] https://news.ycombinator.com/item?id=29501824
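
To make the incremental-processing point concrete, here is a minimal sketch of the general idea, not GitHub's actual implementation. It assumes each file can be analyzed in isolation, so a result can be cached by the hash of the file's contents and reused across commits. `analyze` and `IncrementalIndex` are hypothetical names, and the "analysis" is a toy that only collects `def` names.

```python
import hashlib


def analyze(source: str) -> list[str]:
    """Toy stand-in for per-file analysis: collect names introduced by `def`."""
    names = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("def ") and "(" in stripped:
            names.append(stripped[4:stripped.index("(")].strip())
    return names


class IncrementalIndex:
    """Hypothetical index that re-analyzes only file contents it has not seen."""

    def __init__(self) -> None:
        self._cache: dict[str, list[str]] = {}  # content hash -> analysis result
        self.analyzed = 0  # how many times analyze() actually ran

    def index_commit(self, files: dict[str, str]) -> dict[str, list[str]]:
        """Index one commit, given as a mapping of path -> file contents."""
        result = {}
        for path, source in files.items():
            key = hashlib.sha256(source.encode("utf-8")).hexdigest()
            if key not in self._cache:
                # New content: this is the only case where real work happens.
                self._cache[key] = analyze(source)
                self.analyzed += 1
            result[path] = self._cache[key]
        return result


index = IncrementalIndex()
commit_1 = {"a.py": "def foo():\n    pass\n", "b.py": "def bar():\n    pass\n"}
commit_2 = {"a.py": "def foo():\n    pass\n", "b.py": "def baz():\n    pass\n"}
index.index_commit(commit_1)  # analyzes both files
index.index_commit(commit_2)  # analyzes only the changed b.py
```

The cache key works only because the result depends on the file's contents and nothing else. In a build-based approach the result can also depend on build configuration (such as the per-file header search path mentioned above), so a content hash alone would not be a safe key.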