
Just to expand on the timing and cost argument:

With 10,000 tasks and about 8 cores (give or take a few), the number of context switches is very large. In the kernel, switches happen mostly at the system call boundary of blocking I/O, and each one requires the scheduler to decide which thread to wake up next and then change the running process.

This can be seen in the function context_switch in https://github.com/torvalds/linux/blob/master/kernel/sched/c... Even leaving out the arch-dependent components, it can hardly be compared in complexity and effort to swapping somewhere between 4 and 8 registers in user space.

The above still doesn't include any changes to the TLB and memory protection tables, as I assume the OS optimizes those away when it switches between two threads of the same program. I'm not sure that optimization normally happens.
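
To make the user-space side of the comparison a bit more concrete, here is a rough sketch of cooperative switching between 10,000 tasks. It is an illustration only: Python generators stand in for user-space threads, each yield stands in for a point where a task would block on I/O, and the names task and run_round_robin are made up for this example. It does not measure anything about the kernel path; it only shows that the "scheduler decision" in user space can be as small as popping the next task off a queue.

```python
from collections import deque


def task(task_id, steps):
    """Hypothetical cooperative task; each yield marks a would-be blocking I/O point."""
    for step in range(steps):
        yield (task_id, step)  # hand control back to the scheduler


def run_round_robin(num_tasks, steps_per_task):
    """Run every task to completion and return how many times a task yielded."""
    ready = deque(task(i, steps_per_task) for i in range(num_tasks))
    switches = 0
    while ready:
        current = ready.popleft()  # the whole scheduling decision: take the next task
        try:
            next(current)          # resume the task until its next yield
        except StopIteration:
            continue               # task finished, so it is not requeued
        switches += 1
        ready.append(current)
    return switches


if __name__ == "__main__":
    # 10,000 tasks, each yielding 3 times, all switched without entering the kernel
    print(run_round_robin(10_000, 3))
```

Every switch here stays inside one OS thread, so none of them crosses a system call boundary. A real green-thread runtime would save and restore registers instead of relying on generator frames, but the shape of the loop is similar.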