by lmm
1799 days ago
Isn't the point of preemption to degrade gracefully when the cores are oversubscribed? E.g. the first system I worked on ran potentially CPU-heavy work from various clients, and used per-client threads to isolate them; every so often clients would find ways to get their thread stuck doing a large amount of CPU work (e.g. regex backtracking) and although these were in some sense bugs (and we did fix them), it was very useful that even if one or two clients blocked all their threads (which was often more than our number of physical cores), this wouldn't completely block other clients' threads from running.
|
Suppose you have 100K threads, and only 1% of them become CPU-bound for 100ms. That could take down your 32-core server for 3 seconds, which is bad. But suppose we had 10ms time-slices. Then, those busy threads' latency might go from 100ms to as high as a few minutes, which means effectively taking them down. The scale has a qualitative effect here. So, rather than time-sharing, it might be better to optionally install some other preemption policy -- maybe something that indefinitely suspends threads that behave badly too often and puts them in some collection.
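To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch in Java. The class and method names are made up for illustration, and the time-slicing model is only one pessimistic reading of the scenario: it assumes plain round-robin in which all 100K threads are runnable and each takes a full 10ms slice per round. If only the 1,000 busy threads were competing for the cores, time-slicing would cost each of them roughly the same 3 seconds instead.

```java
// Hypothetical helper, not part of any JDK or library API.
class OversubscriptionEstimate {

    // Seconds the cores stay saturated if the busy threads run to completion
    // without being preempted.
    static double runToCompletionSeconds(int busyThreads, double burstMs, int cores) {
        return busyThreads * burstMs / cores / 1000.0;
    }

    // Pessimistic wall-clock seconds for one busy thread to finish its burst
    // under round-robin time-slicing, ASSUMING every thread is runnable and
    // consumes a full slice in each round.
    static double timeSlicedWorstCaseSeconds(int totalThreads, double burstMs,
                                             double sliceMs, int cores) {
        double roundSeconds = totalThreads * sliceMs / cores / 1000.0;
        double slicesNeeded = Math.ceil(burstMs / sliceMs);
        return roundSeconds * slicesNeeded;
    }

    public static void main(String[] args) {
        int totalThreads = 100_000;
        int busyThreads = totalThreads / 100; // 1% become CPU-bound
        double burstMs = 100.0;
        double sliceMs = 10.0;
        int cores = 32;

        System.out.printf("run to completion: %.1f s%n",
                runToCompletionSeconds(busyThreads, burstMs, cores));
        System.out.printf("time-sliced worst case: %.1f s%n",
                timeSlicedWorstCaseSeconds(totalThreads, burstMs, sliceMs, cores));
    }
}
```

With the numbers from the comment, the first estimate comes out at about 3.1 seconds and the second at about 312 seconds, i.e. around five minutes, which is where the "few minutes" order of magnitude comes from under these assumptions.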
The point is that time-slicing will probably not be helpful in sufficiently many cases, and we don't yet know what will. We'd like to gather more data before offering something. In some other languages/runtimes it might be worthwhile to just expose a capability and see what people do with it, but with Java, within five minutes you'll have twenty libraries doing time-sharing, and thousands of people using them blindly whether it's good or bad for them (just because they say they do time-sharing, and that's good, no?), and now there's just noise and bad habits everywhere. This is nanny-state governance, but we've learned our lesson, and you can't be too careful with an ecosystem this big.