Hacker News
by MuffinFlavored 1251 days ago
> In a typical Windows 10 installation with many background processes and services, the CPU context switching rate can vary greatly depending on the specific system's hardware configuration, running processes, and workload. However, on a typical system, the context switching rate can be anywhere from a few hundred to a few thousand times per second.

Would you agree with this statement from ChatGPT? Is the Windows kernel really handling thousands of context switches and time-slicing processes the way you described, i.e. with pushad/pushfd + popad/popfd?

Yeah, more or less; mostly more. pusha/pushad is one of those instructions that sounded good but isn't used much (it became invalid in amd64); windows will push the registers one at a time, and maybe the FPU, MMX, SSE, etc. registers. Of course, that's a lot of extra pushing, so there are strategies to avoid it if the thread doesn't use them. If you switch to a different process, you're going to need to load its page tables, and these days you've gotta flush a bunch of caches to avoid Spectre (although you shouldn't avoid the Spectre game from the 90s, that was nifty).

If you're good at Windows, you can probably get a count of context switches per second on your system, under your load. That count generally includes interrupts as well as calls into the kernel from userspace. A server workload can go up to hundreds of thousands, maybe millions, per second, again depending on your load.
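
A rough sketch of measuring this yourself, with caveats: it's Linux-flavored rather than Windows, and it assumes `/proc/stat` exposes a cumulative `ctxt` line. On Windows, the closest equivalent I know of is the `\System\Context Switches/sec` performance counter. The function names here are made up for illustration.

```python
import time


def read_ctxt(stat_text: str) -> int:
    """Return the cumulative context-switch count from /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before: int, after: int, elapsed_s: float) -> float:
    """Turn two cumulative samples into a rate."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return (after - before) / elapsed_s


def sample_rate(interval_s: float = 1.0, path: str = "/proc/stat") -> float:
    """Sample the counter twice, interval_s apart (assumes a Linux-style /proc)."""
    with open(path) as f:
        first = read_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        second = read_ctxt(f.read())
    return switches_per_second(first, second, time.monotonic() - start)


if __name__ == "__main__":
    print(f"{sample_rate():.0f} context switches/sec")
```

Run it once idle and once under your real load; the gap between the two numbers is the interesting part.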

This seems wildly inefficient. Can’t we have multiple sets of registers? Not millions, but…

Are registers expensive in hardware? Why not have loads of them?

Some processors do have this and expose it to programmers; it’s called “register windows” (though it’s usually used for procedure calls). Or you can have banked registers, which often serve different privilege levels. Once you start looking at the microarchitectural level, you’ll find that modern processors have large physical register files and rename the architecturally visible registers onto them.
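
To make the renaming part concrete, here is a toy model, not how any particular CPU does it. The class name, the register names, and the free-list policy are all assumptions for illustration; real hardware tracks far more state.

```python
class RenameTable:
    """Toy map from architectural register names to physical register numbers."""

    def __init__(self, arch_regs, num_physical):
        if num_physical < len(arch_regs):
            raise ValueError("need at least one physical register per architectural one")
        self.free = list(range(num_physical))  # physical registers not in use
        self.mapping = {}
        for name in arch_regs:
            self.mapping[name] = self.free.pop(0)

    def read(self, name):
        """Physical register currently holding the value of `name`."""
        return self.mapping[name]

    def write(self, name):
        """Give `name` a fresh physical register; return (new, old)."""
        if not self.free:
            raise RuntimeError("no free physical registers (a real core would stall)")
        old = self.mapping[name]
        new = self.free.pop(0)
        self.mapping[name] = new
        return new, old

    def retire(self, old):
        """Return a physical register to the free list once nothing needs it."""
        self.free.append(old)


table = RenameTable(["rax", "rbx"], num_physical=8)
new, old = table.write("rax")  # a second write to rax gets its own physical register
table.retire(old)
```

The point is that two back-to-back writes to the same architectural register don't have to wait on each other, because each one lands in a different physical register.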

Registers are expensive, yeah, but pushing them onto the stack isn't the most expensive part of a context switch on a modern CPU anyway. Switching the page table and blowing away the TLB is. Pushing all the registers is some nice sequential memory activity, and the stack area is frequently accessed and unlikely to have contention, so it's easy to cache (other than that either the push or the pop has to go backwards, so you'd better have predictive access in both directions).
> windows will push the registers one at a time

Wouldn't Windows (and Linux) use the FXSAVE instruction instead?

Probably FXSAVE or XSAVE for the mathy registers, yes. But that doesn't cover the general purpose registers, and (F)XSAVE can be skipped if the process in question doesn't use fancy math. That's easy to detect: disable the FPU, and when the process uses it, the kernel will catch the fault, then enable it and set a flag on the process so that it saves and restores that state as well.
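
Here's that trick as a toy simulation, just to show the control flow. Everything in it (`Thread`, `ToyCPU`, the eight-slot register list) is hypothetical; on x86 the "disable" step is the CR0.TS bit and the fault is the device-not-available (#NM) exception, neither of which is modeled in any detail.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    uses_fpu: bool = False  # set by the "kernel" on the first FPU fault
    fpu_state: list = field(default_factory=lambda: [0.0] * 8)  # saved copy


class ToyCPU:
    def __init__(self):
        self.fpu_enabled = False
        self.fpu_regs = [0.0] * 8
        self.current = None
        self.fpu_saves = 0  # how many times FPU state was actually saved

    def switch_to(self, thread):
        prev = self.current
        # Only pay for the save if the outgoing thread has ever used the FPU.
        if prev is not None and prev.uses_fpu:
            prev.fpu_state = list(self.fpu_regs)
            self.fpu_saves += 1
        self.current = thread
        if thread.uses_fpu:
            self.fpu_regs = list(thread.fpu_state)
            self.fpu_enabled = True
        else:
            self.fpu_enabled = False  # first FPU use will fault

    def fpu_write(self, index, value):
        if not self.fpu_enabled:
            self._fault()
        self.fpu_regs[index] = value

    def _fault(self):
        # "Kernel" fault handler: flag the thread, load its state, enable the FPU.
        self.current.uses_fpu = True
        self.fpu_regs = list(self.current.fpu_state)
        self.fpu_enabled = True


cpu = ToyCPU()
a, b = Thread("a"), Thread("b")
cpu.switch_to(a)
cpu.switch_to(b)        # no FPU save: a never touched the FPU
cpu.fpu_write(0, 1.5)   # b faults once and gets flagged
cpu.switch_to(a)        # now b's FPU state is saved
```

Threads that never touch the FPU never pay for the save or restore, which is the whole point.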

Please stop using ChatGPT to write your comments. Nobody here is here to have a conversation with ChatGPT, and anyone who wants to talk with ChatGPT instead of actual human beings can do that privately without polluting the conversations of real human beings.

I think it would only be problematic if he were pretending that what ChatGPT said is what he said. Instead, he asked a question to verify whether what he found is true (the same way you would with other online sources).