by omeid2 2502 days ago
Curious: what is the fundamental difference that lets Go do M:N threading efficiently? Considering that its compiler has far less information about the program than Rust's does.
3 comments

Go is more efficient at M:N than Rust can be, mostly for two reasons:

1. Go can start stacks small and relocate them, because the runtime needed to implement garbage collection allows relocation of pointers into the stack by rewriting them. Rust has no such runtime infrastructure: it cannot tell which stack values correspond to pointers and which correspond to other values. Additionally, Rust allows pointers from the heap into the stack in certain circumstances, which Go does not (I don't think, anyway). So what Rust must do is reserve a relatively large amount of address space for each thread's stack, because those stacks cannot move in memory. (Note that in the case of Rust the kernel does not have to, and typically does not, allocate that much physical memory for each thread until it is actually needed; the reservation is virtual address space only.) In contrast, Go can start thread stacks small, which makes them faster to allocate, and copy them around as they grow bigger. Note that async/await in Rust has the potential to be more efficient than even Go's stack growth, as the runtime can allocate all the needed space up front in some cases and avoid the copies; this is a consequence of async/await compiling to a static state machine instead of the dynamic control flow that threads have.

2. Rust cares more about fast interoperability with C code that may not be compiled with knowledge of async I/O and stack growth. Go chooses fast M:N threading over this, sacrificing fast FFI in the process as every cgo call needs to switch to a big stack. This is just a tradeoff. Given that 1:1 threading is quite fast and scalable on Linux, it's the right tradeoff for Rust's domain, as losing M:N threading isn't losing that much anyway.
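
To make point 1 a bit more concrete, here is a minimal Go sketch, not taken from the comment above. A goroutine keeps a pointer to one of its own stack variables, recurses deep enough that its stack has to grow far beyond the small initial size, and then uses the pointer again. The names deepSum and growWithLivePointer, the 128-byte padding and the depth of 100,000 are all made up for illustration, and the compiler is free to keep x in a register, so treat this as a demonstration that the pattern is legal and cheap rather than as proof of what the runtime does internally.

```go
package main

import "fmt"

// deepSum recurses n levels. Each frame carries a small padding array so the
// goroutine stack has to grow well past its small initial size.
// It returns n + 1.
//
//go:noinline
func deepSum(n int) int {
	var pad [128]byte
	pad[n%len(pad)] = 1
	if n == 0 {
		return int(pad[0])
	}
	return deepSum(n-1) + int(pad[n%len(pad)])
}

// growWithLivePointer holds a pointer to a local variable while the stack
// grows underneath it, then writes through that pointer afterwards.
// (Assumption: x stays on the stack; nothing here forces it to the heap.)
func growWithLivePointer(depth int) (sum, viaPointer int) {
	x := 41
	p := &x
	sum = deepSum(depth) // deep recursion, stack grows while p is live
	*p++                 // p must still refer to x after any relocation
	return sum, x
}

func main() {
	done := make(chan [2]int)
	go func() {
		s, v := growWithLivePointer(100_000)
		done <- [2]int{s, v}
	}()
	r := <-done
	fmt.Println(r[0], r[1]) // prints "100001 42"
}
```

In Rust the equivalent thread would need its full stack reserved as address space up front, since nothing could fix up p if the stack moved.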

Go needs to allocate a growing stack on the heap, needs to move it around, etc. It's not as efficient as Rust's async.
But is it as efficient as, or more efficient than, Linux threads?
I don't have the full answer (and I would love it if someone more knowledgeable could jump into this thread), but I'd say it depends, since there are a few antagonistic effects:

- goroutines are (unless it changed since the last time I used Go) cooperatively scheduled. That's cheaper than preemptive scheduling, but it can lead to big inefficiencies on some workloads if you're not careful enough (a tight loop can hold a (Linux) thread for a long time and prevent any other goroutine from running on that thread).

- goroutines start with a really small stack (a few kB) which has to be copied in order to grow. If you end up with a stack as big as a native one, you will have done a lot of copies along the way that wouldn't have been necessary had the stack been allocated up front.
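
A rough Go sketch of both effects, under assumptions that are mine and not the commenter's. runYielding shows what an explicit cooperative yield point looks like, using runtime.Gosched. copyCost is a toy model of the second bullet: it assumes the stack starts at 2 KiB, doubles on every growth, copies the whole old stack each time, and stops at 8 MiB (a common default for native thread stacks on Linux). Both function names are hypothetical, and the real runtime's growth policy may differ from this model. Also worth knowing: since Go 1.14 the runtime can preempt tight loops asynchronously, so the explicit yield is here for illustration only.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runYielding starts `workers` goroutines that each perform `steps` units of
// work and call runtime.Gosched() after every unit, voluntarily handing the
// thread back to the scheduler. It returns the steps each worker completed.
func runYielding(workers, steps int) []int {
	counts := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < steps; i++ {
				counts[id]++      // each goroutine writes only its own slot
				runtime.Gosched() // cooperative yield point
			}
		}(w)
	}
	wg.Wait()
	return counts
}

// copyCost models a stack that starts at `initial` bytes and doubles until it
// reaches `target` bytes. It returns the number of copies and the total bytes
// copied, assuming every growth copies the entire old stack.
func copyCost(initial, target int) (copies, bytes int) {
	for size := initial; size < target; size *= 2 {
		copies++
		bytes += size
	}
	return copies, bytes
}

func main() {
	fmt.Println(runYielding(4, 1000)) // prints "[1000 1000 1000 1000]"

	copies, bytes := copyCost(2<<10, 8<<20)
	fmt.Println(copies, bytes) // prints "12 8386560"
}
```

Under this doubling model, growing from 2 KiB to 8 MiB takes 12 copies and moves a little under 8 MiB in total, so the extra work is bounded by roughly the final stack size. It is still work that an up-front allocation avoids entirely.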

I don't think the implication was that Go does it efficiently.