|
|
|
|
|
by 0x8BADF00D
2771 days ago
|
|
This seems like a familiar issue I've run into with workstations I've used in the past running Xeons. Not sure how NTOSKRNL handles scheduling of parallel tasks. I'd venture a guess and say it's hybrid (M:N threads), where multiple userland application threads are mapped to some "virtual processor" in kernelmode. That leads to priority inversion between the userland and kernelmode threads, which could explain why Windows benchmarks are terrible when dealing with multiple physical cores. |
|
Not even sure how it would work or even make any sense to have N:M handled by the kernel. N:M is usually a mainly a userspace thing. And Windows is even less likely to use that kind of convolution, because IIRC it can call back from kernel to userspace (that design I would not recommend, btw, but oh well). You have fibers, of course, but that's a different thing.
Windows does not scale probably simply because the kernel is full of "big" locks (at least not small enough...) everywhere, and they have far less fancy structures and algo than Linux (is there any equivalent of RCU that is widely used in there? - not sure). Cf the classic posts of the builder of Chrome who every now and then encounter a ridiculous slowdown of his builds on moderately big computers, sometimes because of mutexes badly placed.