by zaksingh 648 days ago
On the efficiency angle, I think a big difficulty here that isn’t often discussed is that many optimization strategies relevant to incremental compilation slow down batch compilation, and vice versa.

For example, arena allocation strategies (e.g. interning identifiers and strings, or allocating AST nodes) are a very effective optimization in batch compilers, as the arenas can live until the end of execution and therefore don’t need “hands on” memory management.
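
To make that concrete, here is a rough sketch of what a batch-style string interner can look like. The names (`Interner`, `Symbol`, `count`) are made up for illustration and this is not rustc's actual code; the only point is that interned strings are leaked on purpose and live until the process exits.

```rust
use std::collections::HashMap;

/// A cheap, copyable handle to an interned string (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Hypothetical batch-style interner: every string is leaked and never freed.
#[derive(Default)]
pub struct Interner {
    map: HashMap<&'static str, Symbol>,
    strings: Vec<&'static str>,
}

impl Interner {
    /// Return the existing symbol for `s`, or leak a copy and create a new one.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.map.get(s) {
            return sym;
        }
        // Deliberate leak: the string lives until the process exits,
        // so no lifetime tracking or freeing logic is needed.
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        let sym = Symbol(self.strings.len() as u32);
        self.strings.push(leaked);
        self.map.insert(leaked, sym);
        sym
    }

    /// Look up the text of a symbol created by this interner.
    pub fn resolve(&self, sym: Symbol) -> &'static str {
        self.strings[sym.0 as usize]
    }

    /// Number of distinct strings interned so far; it only ever grows.
    pub fn count(&self) -> usize {
        self.strings.len()
    }
}
```

Comparing two `Symbol`s is a single integer comparison, and there is no cleanup path at all, which is exactly why this is fast and simple for a one-shot compile.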

However, this doesn’t work in an incremental environment, as you would quickly fill up the arenas with intermediate data and never delete anything from them. This is one reason rust-analyzer reimplements such a vast amount of the Rust compiler: rustc makes heavy use of arenas throughout.

As essentially every programming language developer writes their batch compiler first without worrying about incremental compilation, they can wind up stuck in a situation where there’s simply no way to reuse their existing compiler code for an IDE efficiently. This effect tends to scale with how clever/well-optimized the batch compiler implementation is.

I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.

2 comments

That's a great point about allocation/memory management. As an example, rust-analyzer needs to free memory, but rustc's `free` is simply `std::process::exit`.

If I remember correctly, the new trait solver's interner is a trait (https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_...) that should allow rust-analyzer's implementation of it to free memory over time and not OOM people's machines.
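
As a toy illustration of why putting the interner behind a trait helps (the names below are hypothetical and this is not the actual rustc trait, which I'm not reproducing here): code written against the trait doesn't care whether the implementation leaks forever or can drop entries that are no longer live.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical symbol handle.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Sym(u32);

/// Hypothetical interner interface; NOT rustc's real trait.
pub trait StrInterner {
    fn intern(&mut self, s: &str) -> Sym;
    fn resolve(&self, sym: Sym) -> Option<&str>;
}

/// IDE-style implementation: owns its strings, so it can free them later.
#[derive(Default)]
pub struct CollectableInterner {
    ids: HashMap<String, Sym>,
    strings: HashMap<Sym, String>,
    next: u32,
}

impl StrInterner for CollectableInterner {
    fn intern(&mut self, s: &str) -> Sym {
        if let Some(&sym) = self.ids.get(s) {
            return sym;
        }
        let sym = Sym(self.next);
        self.next += 1;
        self.ids.insert(s.to_owned(), sym);
        self.strings.insert(sym, s.to_owned());
        sym
    }

    fn resolve(&self, sym: Sym) -> Option<&str> {
        self.strings.get(&sym).map(String::as_str)
    }
}

impl CollectableInterner {
    /// Drop every entry whose symbol is not in `live`, freeing its memory.
    pub fn retain_live(&mut self, live: &HashSet<Sym>) {
        self.strings.retain(|sym, _| live.contains(sym));
        self.ids.retain(|_, sym| live.contains(&*sym));
    }

    /// Number of strings currently held.
    pub fn count(&self) -> usize {
        self.strings.len()
    }
}

/// Shared code is generic over the trait, so it works with any implementation.
pub fn intern_all<I: StrInterner>(interner: &mut I, words: &[&str]) -> Vec<Sym> {
    words.iter().map(|w| interner.intern(w)).collect()
}
```

A batch compiler could implement the same trait with a leak-everything interner, while an IDE calls something like `retain_live` between edits. How the live set is computed is the hard part and is left out of this sketch.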

> I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.

I'm in strong agreement with you, but I will say: I've really grown to love query-based approaches to compiler-shaped problems. Makes some really tricky cache/state issues go away.
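
For anyone who hasn't seen the idea, here is a deliberately tiny sketch of a query with revision-based caching. Everything in it (`Db`, `set_text`, `token_count`) is invented for illustration and is not the API of any real query framework; real systems also track dependencies between queries, which this skips.

```rust
use std::collections::HashMap;

/// Hypothetical mini "query database" with one input and one derived query.
#[derive(Default)]
pub struct Db {
    /// Input: file name -> (revision at which it was last set, text).
    inputs: HashMap<String, (u64, String)>,
    /// Global counter, bumped on every input change.
    revision: u64,
    /// Cache: file name -> (input revision it was computed from, result).
    cache: HashMap<String, (u64, usize)>,
    /// How many times the derived query actually ran (for demonstration).
    pub recomputations: usize,
}

impl Db {
    /// Set an input; this is the only way state changes.
    pub fn set_text(&mut self, file: &str, text: &str) {
        self.revision += 1;
        self.inputs
            .insert(file.to_owned(), (self.revision, text.to_owned()));
    }

    /// Derived query: number of whitespace-separated tokens in a file.
    /// Recomputed only if that file's input changed since the cached run.
    pub fn token_count(&mut self, file: &str) -> Option<usize> {
        let (rev, text) = self.inputs.get(file)?;
        let rev = *rev;
        if let Some(&(cached_rev, value)) = self.cache.get(file) {
            if cached_rev == rev {
                return Some(value); // cache hit: input unchanged
            }
        }
        let value = text.split_whitespace().count();
        self.recomputations += 1;
        self.cache.insert(file.to_owned(), (rev, value));
        Some(value)
    }
}
```

The caller never invalidates anything by hand: it sets inputs and asks questions, and the staleness check lives in one place. That is the part that makes the cache/state issues go away for me.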

I thought that Rust's compiler was indeed written to be incremental first. Check a sibling comment of mine for reasons why I thought so.