by HeroicKatora 650 days ago
I couldn't quite replicate those numbers with a recent toolchain (rustc 1.78, gcc 14, g++ 14). On my machine (Ryzen 9 7900X, LVM on NVMe) it's rustc 60-80ms, gcc 20-30ms and tcc 2ms. Interestingly, g++ is still 200ms on that machine. Running under time and enabling the builtin time-passes in rustc gives another interesting observation: rustc spends 47ms of its time in sys and 23ms in user, compared to <3ms for both C variants. It counts its own time as 50ms instead for some reason, not sure what it is subtracting here. Also, looking at individual passes of the compiler (rustc +nightly -C opt-level=1 -Z time-passes gcd.rs) reveals it spends 33ms linking, 16ms in LLVM and only a negligible time in what you'd consider compiling.
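For anyone who wants to reproduce the user/sys split without the shell's time builtin, here is a rough sketch using POSIX getrusage. The names CpuTimes and measure are made up for illustration, and it assumes a POSIX system where std::system waits for the child, so the child's CPU time shows up in RUSAGE_CHILDREN. It measures CPU time only, not wall-clock time.

```cpp
#include <sys/resource.h>
#include <sys/time.h>
#include <cassert>
#include <cstdlib>

// Hypothetical helper type: CPU time of one finished command, in milliseconds.
struct CpuTimes {
    double user_ms;
    double sys_ms;
};

static double to_ms(const timeval& tv) {
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

// Runs `cmd` through the shell and returns the user and system CPU time it
// added to RUSAGE_CHILDREN, which accumulates over all waited-for children.
CpuTimes measure(const char* cmd) {
    assert(cmd != nullptr);
    struct rusage before {};
    struct rusage after {};
    getrusage(RUSAGE_CHILDREN, &before);
    int status = std::system(cmd);  // blocks until the command has exited
    (void)status;                   // exit status is ignored in this sketch
    getrusage(RUSAGE_CHILDREN, &after);
    return {to_ms(after.ru_utime) - to_ms(before.ru_utime),
            to_ms(after.ru_stime) - to_ms(before.ru_stime)};
}
```

Something like measure("rustc -C opt-level=1 gcd.rs") versus measure("tcc -c gcd.c") would then give the two splits to compare. The numbers include the shell that std::system spawns, so very small values are noisy.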

I think the test is ultimately nonsensical for the question being posed here. It doesn't reveal anything insightful about scaling to real-world program sizes, either. The time of rustc is dominated by the platform linker anyway. Sure, one might argue that this points to Rust relying too much on the linker and creating too many unused symbols. But the question of whether this is caused by the language, and in particular its syntactical choices, should at that point be answered with: probably not. It's not a benchmark you want to compare by percentage speedups anyway, since it's probably dominated by constant time costs for any of the batteries-included standard library languages compared to C.

1 comments

thank you very much for the failed replication!

it's interesting, my machine is fairly similar—ryzen 5 3500u, rustc 1.63.0, luks on nvme. is it possible that rustc has gotten much faster since 1.63?

while i agree that it's not the most important test for day-to-day use, i don't agree that it falls to the level of nonsensical. how fast things are determines how you can use them. tcc and old versions of gcc are fast enough that you could very reasonably generate a c file, compile it into a new shared object, dlopen it, and call it, every screen frame. there are some languages, like gforth, that actually implement their ffi in such a way, and sitkack and i have both done some experiments with inline c and jit compilation by this mechanism
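a minimal sketch of that generate / compile / dlopen / call loop, for concreteness. assumptions: a posix system, a c compiler reachable as cc on the path (swap in tcc to get the fast case), and a writable /tmp. the names compile_and_load, BinaryFn and kGcdSource are made up for the example, and the temp-file handling is deliberately naive

```cpp
#include <dlfcn.h>
#include <unistd.h>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

using BinaryFn = int (*)(int, int);

// Example of "generated" C source; in real use this would be produced at run time.
static const char kGcdSource[] =
    "int gcd(int a, int b) {\n"
    "    while (b) { int t = a % b; a = b; b = t; }\n"
    "    return a;\n"
    "}\n";

// Writes `source` to a temporary .c file, compiles it into a shared object
// with `cc` (assumed to be on PATH), loads it and looks up `symbol`.
// Returns nullptr on any failure. The dlopen handle is intentionally never
// closed, so the returned function pointer stays valid.
BinaryFn compile_and_load(const std::string& source, const char* symbol) {
    assert(symbol != nullptr);

    char base[] = "/tmp/jit_sketch_XXXXXX";
    int fd = mkstemp(base);  // reserves a unique base name
    if (fd < 0) return nullptr;
    close(fd);

    const std::string c_path = std::string(base) + ".c";
    const std::string so_path = std::string(base) + ".so";

    BinaryFn fn = nullptr;
    if (std::FILE* f = std::fopen(c_path.c_str(), "w")) {
        std::fputs(source.c_str(), f);
        std::fclose(f);
        const std::string cmd = "cc -shared -fPIC -o " + so_path + " " + c_path;
        if (std::system(cmd.c_str()) == 0) {
            if (void* handle = dlopen(so_path.c_str(), RTLD_NOW)) {
                fn = reinterpret_cast<BinaryFn>(dlsym(handle, symbol));
            }
        }
    }
    // The mapping survives unlinking, so the files can be removed right away.
    std::remove(c_path.c_str());
    std::remove(so_path.c_str());
    std::remove(base);
    return fn;
}
```

calling compile_and_load(kGcdSource, "gcd") and then invoking the result is the whole round trip. on older glibc the program needs to be linked with -ldl. whether this fits into a frame budget depends almost entirely on how long the cc step takes, which is the point of the timing comparison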

i do agree that the syntactical choices of the language have relatively little to do with it, and your rustc measurements provide strong evidence of that—though perhaps it is somewhat unfavorable for c++ that it commonly has to tokenize many megabytes of header files and do the moral equivalent of text replacement to implement parametric polymorphism

Thank you for re-validating the numbers on your end. It's indeed very possible; there have been quite a few improvements between those versions. Though the effect size does not quite fit with most of the optimizations I can recall, so maybe it's much more related to optimizations to the standard library's size and linking behavior.

With regards to standard use, for many users the scenario is definitely not common. I'd rather rustc be an effective screwdriver and a separate hammer be built than try to mangle both into the same tool. By that I mean, it's very clear which portion of the compiler must be repurposed here. The hard question is whether the architecture is amenable to alternative linker backends that serve your use case. I'm afraid I can't answer that conclusively. I can only say this much: the conceptual conflict of Rust is that linking is a very memory-safety critical part of the process. And with its compilation module model it relinks everything into the resulting binary / library, which includes a large std and dependency tree, even if much of this is removed again by that step. Maybe that can be changed; and relying on a tool whose interface was ultimately designed with C in mind is also far from optimal to compute those outputs and inputs. It's hard to say how much of it stems from compatibility concerns and compatibility overheads and how much is fundamental to the language's design, which could be shed for a pure build process.

With regards to C++, I suspect it's rooted in the fact that parsing it requires, in principle, the implementation of a complete consteval engine. The language has a dependency loop between parsing and codegen. This, of course, is not how data should be laid out for executing fast programs on it. It's quite concerning given that the specification still contains the bold-faced lie that "The disambiguation is purely syntactic" (6.8; 1) for typenames vs non-typenames, to parse constructors from declarations, which at present can require arbitrary template specialization. It might be interesting to see if those two headers in your example already execute some of these dependency loops, but it's hard for me to think of an experiment to validate any of this. Maybe you have ideas: is there something like time-passes?
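A small illustration of that loop, as a sketch and not necessarily the exact case the standard's wording is about: the same token sequence is a declaration or an expression depending on the result of a constexpr call that selects a template specialization. The names S, is_prime and p are arbitrary, and it assumes C++14 or later.

```cpp
#include <cassert>
#include <type_traits>

// Any constexpr predicate would do; primality is an arbitrary choice.
constexpr bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Primary template: `x` names a value.
template <bool B> struct S { static constexpr int x = 1; };
// Specialization for true: `x` names a type.
template <> struct S<true> { using x = int; };

int p = 0;  // global that the statement below may or may not shadow

bool prime_case_declares_pointer() {
    S<is_prime(7)>::x * p;  // x is a type here: declares a local `int* p`
    static_assert(std::is_pointer<decltype(p)>::value, "parsed as a declaration");
    return std::is_pointer<decltype(p)>::value;
}

bool composite_case_declares_pointer() {
    S<is_prime(8)>::x * p;  // x is a value here: the expression `1 * ::p`, discarded
    static_assert(!std::is_pointer<decltype(p)>::value, "parsed as an expression");
    return std::is_pointer<decltype(p)>::value;
}
```

To decide how to parse the first statement in each function, the compiler has to evaluate is_prime and pick the specialization before it even knows what kind of statement it is looking at. Expect unused-variable and unused-value warnings with -Wall; they are a side effect of the demonstration.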

dunno. with respect to c++, you could probably hack together a c++ compiler setup that was more aggressive about using precompiled-header-like things. and if you're trying to abuse g++ as a jit, you could maybe write a small, self-contained header that the compiler can handle quickly, and not call any standard library functions from the generated code