by readittwice 3557 days ago
I think I read that too, but I am not sure that this is actually true. Please correct me if I am wrong, but according to the Rust developers, code generation seems to be the bottleneck. That's why they are working on incremental compilation to improve compilation times. They detect changes in the input files by comparing the AST to the old version, so parsing is always necessary but later stages can be cached. That seems to confirm that parsing is not the bottleneck, at least not for Rust.
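
A rough sketch of that caching idea in Python, to make it concrete. This is not how rustc does it: the `IncrementalCompiler` class and its `codegen_runs` counter are made up for illustration, and Python's own `ast.parse` and `compile` stand in for the parser and the code generator.

```python
import ast
import hashlib


class IncrementalCompiler:
    """Toy model: always parse, but redo code generation only if the AST changed."""

    def __init__(self):
        self._cache = {}       # filename -> (AST fingerprint, code object)
        self.codegen_runs = 0  # hypothetical counter, only here to show cache hits

    def compile(self, filename, source):
        # Parsing is never skipped: we need the new AST to detect changes.
        tree = ast.parse(source, filename=filename)
        # ast.dump() omits line/column info by default, so edits that only
        # touch comments or whitespace produce the same fingerprint.
        fingerprint = hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()

        cached = self._cache.get(filename)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]   # later stage served from the cache

        # AST differs from the old version: run the expensive later stage.
        self.codegen_runs += 1
        code = compile(tree, filename, "exec")
        self._cache[filename] = (fingerprint, code)
        return code
```

Compiling the same file twice, with only a comment added the second time, should go through the parser twice but through the code generator once.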

OTOH, that probably also depends on your use case. JS, for example, needs to get parsed on every page load. Some JS VMs only parse the source code lazily: at first the parser only detects function boundaries, and a function gets parsed completely only if it is actually executed.
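
Roughly what lazy parsing looks like, as a toy Python sketch. Everything here is an assumption for illustration, not any real VM's design: `find_function_bodies` and `LazyScript` are invented names, the "full parse" is just tokenising the body, and the boundary scan ignores braces inside strings and comments.

```python
import re

# Matches a JS-like header such as "function foo(a, b) {".
FUNC_HEADER = re.compile(r"function\s+(\w+)\s*\([^)]*\)\s*\{")


def find_function_bodies(source):
    """Pre-parse: record where each top-level function body starts and ends."""
    bodies = {}
    pos = 0
    while True:
        m = FUNC_HEADER.search(source, pos)
        if m is None:
            return bodies
        depth = 1
        i = m.end()
        # Skip ahead to the matching closing brace without inspecting the body.
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        bodies[m.group(1)] = (m.end(), i - 1)  # span between the outer braces
        pos = i


class LazyScript:
    def __init__(self, source):
        self.source = source
        self.spans = find_function_bodies(source)  # cheap, done at load time
        self.parsed = {}                           # filled on first call only

    def call(self, name):
        if name not in self.parsed:
            start, end = self.spans[name]
            # Stand-in for the real, expensive parse of the function body.
            self.parsed[name] = re.findall(r"\w+|[^\w\s]", self.source[start:end])
        return self.parsed[name]
```

Loading a script with two functions and calling only one of them leaves the other as an unparsed span.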

4 comments

> I think I read that too, but I am not sure that this is actually true. Please correct me if I am wrong, but according to the Rust developers, code generation seems to be the bottleneck.

You're both right. The problem with C++ is how '#include' works: it just pastes in the content of every included file, so there's more overhead on the lexing side.

Rust's "equivalent" of '#include' is 'use', which doesn't have this problem because it doesn't concatenate files.
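
A small Python sketch of why textual inclusion costs more lexing. It is a toy model, not a real preprocessor: `expand_includes` and the two counting helpers are invented for illustration, there are no include guards, and the include graph is assumed to be acyclic.

```python
def expand_includes(name, files):
    """Textual inclusion: paste each included file's content in place, recursively."""
    out = []
    for line in files[name].splitlines():
        if line.startswith('#include "'):
            included = line.split('"')[1]
            out.append(expand_includes(included, files))
        else:
            out.append(line)
    return "\n".join(out)


def chars_lexed_textual(units, files):
    """Each translation unit is lexed after expansion, shared headers included."""
    return sum(len(expand_includes(unit, files)) for unit in units)


def chars_lexed_once_per_file(files):
    """Idealised module system: every file is lexed exactly once."""
    return sum(len(text) for text in files.values())
```

With one large header included by two source files, the textual model lexes the header twice while the once-per-file model lexes it once, and the gap grows with every additional file that includes it.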

You can make codegen faster by doing less work and producing less sophisticated code. You can't not look at every byte. Reading the input is the limiting factor for the fastest possible compilation, not what takes the most time in most production compilers.
I suspect that modern languages try to fix some of this by making the syntax unambiguous. That means you only need to tokenize each file exactly once. Compare that with older languages where, if you change a header/module/etc., you need to reparse the whole shebang.

Possibly with Rust that moves a bunch of the grunt work into the code generator. That is, whereas in C/C++ all your type information is set in stone by the time you're generating code, in Rust a bunch of stuff possibly isn't resolved yet.

Rust uses LLVM, a heavyweight backend with fearsomely thorough optimisation.

It might be that for non-optimised output the parser becomes the bottleneck again. Or it might be that LLVM just imposes overhead even in the unoptimised case, overhead that pays for itself in the optimised case.