by KerrAvon 507 days ago
I'm curious: what's the theory behind why this would be faster than mold in the non-incremental case? "Because Rust" is a fine explanation for a bunch of things, but doesn't explain expected performance benefits.

"Because there's low hanging concurrent fruit that Rust can help us get?" would be interesting but that's not explicitly stated or even implied.


I'm not actually sure, mostly because I'm not really familiar with the Mold codebase. One clue is that I've heard that Mold gets about a 10% speedup by using a faster allocator (mimalloc). I've tried using mimalloc with Wild and didn't get any measurable speedup. This suggests to me that Mold is probably making heavier use of the allocator than Wild is. With Wild, I've certainly tried to optimise the number of heap allocations.
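One rough way to compare how hard two programs lean on the allocator is to count allocations directly. The following is a minimal sketch using only the Rust standard library; `CountingAlloc` and `count_allocations` are hypothetical names for illustration and are not taken from Wild or Mold.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards to the system allocator and counts calls.
struct CountingAlloc;

/// Total number of `alloc` calls seen so far, across all threads.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Route every heap allocation in the program through the counter.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `work` and returns its result plus the number of allocations
/// observed while it ran (other threads, if any, are counted too).
fn count_allocations<R>(work: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let result = work();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (result, after - before)
}
```

Wrapping a phase of the program in `count_allocations(|| ...)` gives a number that can be compared before and after an optimisation. A program that allocates rarely has little to gain from a faster allocator, which would be consistent with the observation above.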

But in general, I'd guess just different design decisions. As for how this might be related to Rust - I'm certain that were Wild ported from Rust to C or C++, it would perform very similarly. However, code patterns that are fine in Rust due to the borrow checker, would be footguns in languages like C or C++, so maintaining that code could be tricky. Certainly when I've coded in C++ in the past, I've found myself coding more defensively, even at a small performance cost, whereas with Rust, I'm able to be a lot bolder because I know the compiler has got my back.
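To illustrate the kind of pattern meant here, below is a minimal sketch of a table that borrows its keys straight from an input buffer instead of copying them. It is not code from Wild; `index_symbols` and the one-name-per-line input format are assumptions made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical symbol index: maps each name to the byte offset of the
/// line where it first appears. Keys are slices of `input`, not copies.
fn index_symbols(input: &str) -> HashMap<&str, usize> {
    let mut map = HashMap::new();
    let mut offset = 0;
    for line in input.split('\n') {
        let name = line.trim();
        if !name.is_empty() {
            // Keep the first occurrence only.
            map.entry(name).or_insert(offset);
        }
        // +1 accounts for the '\n' that `split` removed.
        offset += line.len() + 1;
    }
    map
}
```

The returned map cannot outlive `input`: dropping or mutating the buffer while the map is still in use is a compile error rather than a dangling pointer. The C++ equivalent with `std::string_view` keys compiles either way, which is where the temptation to copy defensively comes from.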

> Mold gets about a 10% speedup by using a faster allocator (mimalloc). I've tried using mimalloc with Wild and didn't get any measurable speedup

Perhaps it is worth repeating the experiment with heavy MLoC codebases, using jemalloc or mimalloc.

Rust is a perfectly fine language, and there's no reason you should not be able to implement fast incremental linking using Rust, so - I wish you success in doing that.

... however...

> code patterns that are fine in Rust due to the borrow checker, would be footguns in languages like C or C++,

That "dig" is probably not true. Or rather, your very conflation of C and C++ suggests that you are talking about the kind of code which would not be used in modern C++ of the past decade-or-more. While one _can_ write footguns in C++ easily, one can also very easily choose not to do so - especially when writing a new project.

Tell me you don't have rust experience without telling me you don't have rust experience.
I mean, sorry for the snark but really, there's so many of these things that it's just ridiculous to even attempt to compare. e.g. I wouldn't ever use something like string_view or span unless the code is absolutely performance critical. There's a lot of defensive copying in C(++), because all the risks of losing track of pointers are just not worth it. In Rust, you can go really wild with this, there's no comparison.

> because all the risks of losing track of pointers are just not worth it.

These risks are mostly, and often entirely, gone when you write modern C++. You don't lose track of them, because you don't track them, and you only use them when you don't need to track them. (Except for inside the implementations of a few data structures, which one can think of as the equivalent of unsafe code in Rust). Of course I'm generalizing here, but again, you just don't write C-style code, and you don't have those problems.

(You may have some other problems of course, C++ has many warts.)

I don't see how modern C++ solves any of those problems, and especially without performance implications.

Like, how do you make sure that you don't hold any dangling references to a vector that reallocated? How do you make sure that code that needs synchronization is synchronized? How do you make sure that non-thread safe code is never used from multiple threads? How do you make sure that you don't ever invalidate an iterator? How do you make sure that you don't hold a reference to data owned by a unique pointer that went out of scope? How do you make sure you don't hold a string view for a string that went out of scope?

As far as I know (and how I experienced it), the answer to all of those questions is to either use some special API that you have to know about, or do something non-optimal, like creating a defensive copy, using a shared pointer or adding a "just in case" mutex, or "just remember you might cause problem A and be careful."

In Rust all of those problems are a compile error and you have to make an extra effort to trigger them at runtime with unsafe code. That's a very big difference and I don't understand how modern C++ can come even close to it.
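Two of the cases above, sketched in safe Rust using only the standard library. The function names are made up for the example, and the rejected variants are left as comments because they do not compile.

```rust
use std::sync::Mutex;
use std::thread;

/// Sums all chunks on scoped threads that borrow both the input and the
/// shared accumulator; no copies, no reference counting.
fn parallel_sum(chunks: &[Vec<u64>]) -> u64 {
    let total = Mutex::new(0u64);
    thread::scope(|s| {
        for chunk in chunks {
            let total = &total;
            s.spawn(move || {
                let partial: u64 = chunk.iter().sum();
                // Shared state is only reachable through the lock.
                *total.lock().unwrap() += partial;
            });
        }
    });
    // Handing the same `&mut u64` to every thread instead would not
    // compile: only one mutable borrow may exist at a time.
    total.into_inner().unwrap()
}

/// Reads the first element, then grows the vector.
fn first_after_push() -> i32 {
    let mut v = vec![1, 2, 3];
    let first = v[0]; // copy the value out before the vector may reallocate
    // let first_ref = &v[0];
    // v.push(4);
    // println!("{first_ref}"); // error[E0502]: cannot borrow `v` as mutable
    //                          // because it is also borrowed as immutable
    v.push(4);
    first
}
```

For instance, `parallel_sum(&[vec![1, 2, 3], vec![4, 5]])` returns 15. The scope guarantees every thread has finished before `chunks` and `total` go away, so the borrows are checked at compile time rather than by convention.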

That you subject yourself to FUD is not an argument for anything.

No, it's just business. Memory corruption bugs are crazy expensive. One of those N cases goes wrong at some point and somebody will have to spend a week in gdb with corrupt stacktraces from production on some issue that's non-deterministic and doesn't reproduce on a dev machine.