by xmcqdpt2 885 days ago
The change in semantics is that, while an OS thread will in principle always get a turn at making progress (assuming no super heavy spin locks etc.), that isn't true for virtual threads. The classic situation, and the one they hit in the article, is something like this.

You've got some virtual threads that encounter this code:

    synchronized (foo) {
      foo.wait();
    }
And some other virtual threads that are in charge of waking the waiters:

    synchronized (foo) {
      operation();
      foo.notify();
    }
This is a classic approach to the producer/consumer pattern in Java.
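
For anyone who wants to run the pattern, here is a minimal sketch on plain platform threads, so it works on any JDK. `Mailbox`, its `message` field and `roundTrip` are made-up names, and the `while` guard around `wait()` is the usual defence against spurious wakeups rather than something taken from the article.

```java
// Illustrative only: all names here are hypothetical.
final class Mailbox {
    private final Object lock = new Object();
    private String message; // null means "nothing produced yet"

    /** Consumer side: block until a message is available, then take it. */
    String take() throws InterruptedException {
        synchronized (lock) {
            while (message == null) { // re-check the condition after every wakeup
                lock.wait();          // releases the monitor while waiting
            }
            String taken = message;
            message = null;
            return taken;
        }
    }

    /** Producer side: publish a message and wake any waiters. */
    void put(String newMessage) {
        synchronized (lock) {
            message = newMessage;
            lock.notifyAll();
        }
    }

    /** Hands one message from the calling thread to a consumer thread. */
    static String roundTrip(String payload) {
        Mailbox box = new Mailbox();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = box.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        box.put(payload);
        try {
            consumer.join(); // join makes the consumer's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The handoff works whichever side gets to the monitor first, because the consumer only waits while the condition is actually false.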

If operation() can suspend the virtual thread, the producer may be suspended and relinquish its platform thread. The scheduler then reuses that platform thread for a consumer, which blocks on Object.wait. If this happens often enough, you end up with all the platform threads blocked and none available to make progress on the producer.

The problem is that Object.wait doesn't release the carrier (platform) thread; the virtual thread stays pinned to it. That's a pretty major footgun, one I think the JDK team would have liked to avoid, but it was too hard to implement correctly in the current JDK codebase.


The only way I can see this being a problem is if the virtual threads can't be stolen from their (now pinned) carrier thread. Otherwise that's all true of real threads too: blocking them is the whole point of Object.wait.

If there's no work-stealing from pinned carriers (or they're low-finite and normal threads are effectively infinite): yes that'd be a HUGE issue. I would be shocked if they released anything with that limitation though, that would violate some of the core expectations of mutexes and threads - independent ones need to make progress or nearly all patterns can't guarantee progress.

From Java docs for `jdk.virtualThreadScheduler.maxPoolSize`: the default is 256.

So yeah, I can see that starving rather quickly, particularly with benchmarking-like workloads. `synchronized` is very, very common, and 256 concurrent calls really doesn't seem all that abnormal.

If that were raised to something like max-int32, would things be fine, semantically? That'd mimic real threads' limits (no JVM limit at all, afaict).
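
For reference, a small sketch of how one might inspect the settings being discussed. The two property names come from the `java.lang.Thread` docs; the `SchedulerSettings` class and its methods are made up, and it only mirrors the documented defaults instead of asking the scheduler what it is actually using.

```java
// Hypothetical helper: reports the documented defaults or any -D override.
final class SchedulerSettings {
    static final String PARALLELISM = "jdk.virtualThreadScheduler.parallelism";
    static final String MAX_POOL_SIZE = "jdk.virtualThreadScheduler.maxPoolSize";

    /** Documented default: the number of available processors. */
    static int parallelism() {
        return Integer.getInteger(PARALLELISM, Runtime.getRuntime().availableProcessors());
    }

    /** Documented default: 256. */
    static int maxPoolSize() {
        return Integer.getInteger(MAX_POOL_SIZE, 256);
    }

    public static void main(String[] args) {
        System.out.println(PARALLELISM + " = " + parallelism());
        System.out.println(MAX_POOL_SIZE + " = " + maxPoolSize());
    }
}
```

To actually change the limit, the property would need to be set when the JVM starts, for example `java -Djdk.virtualThreadScheduler.maxPoolSize=1024 SchedulerSettings`; as far as I know, setting it from code after the scheduler has been created has no effect on the scheduler.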

> If there's no work-stealing from pinned carriers (or they're low-finite and normal threads are effectively infinite): yes that'd be a HUGE issue. I would be shocked if they released anything with that limitation though, that would violate some of the core expectations of mutexes and threads - independent ones need to make progress or nearly all patterns can't guarantee progress.

Correct, you can't steal the carrier thread from a virtual thread waiting in Object.wait(). This is apparently in the pipeline, but it is a pretty major limitation.

Most cases of synchronized/notify/wait should probably use concurrent collections instead (as message queues) so in greenfield code it's not that big of a deal. Virtual threads make writing consumers/producers using collections way easier too.
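
A rough sketch of what the queue-based version could look like, using `java.util.concurrent.BlockingQueue` in place of synchronized/wait/notify. `QueueHandoff`, `run` and the `POISON` end-of-stream marker are made-up names. The executor is passed in so the same code runs on a plain thread pool or, on JDK 21+, on `Executors.newVirtualThreadPerTaskExecutor()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: all names here are hypothetical.
final class QueueHandoff {
    private static final int POISON = -1; // tells the consumer to stop

    /** Produces 0..count-1 on one task and collects them on another. */
    static List<Integer> run(ExecutorService executor, int count) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        // Consumer: take() blocks until an item arrives, no explicit monitor needed.
        Future<List<Integer>> consumer = executor.submit(() -> {
            List<Integer> seen = new ArrayList<>();
            for (int item = queue.take(); item != POISON; item = queue.take()) {
                seen.add(item);
            }
            return seen;
        });

        // Producer: put() hands items over, then signals end of stream.
        executor.submit(() -> {
            for (int i = 0; i < count; i++) {
                queue.put(i);
            }
            queue.put(POISON);
            return null;
        });

        try {
            return consumer.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        // Two threads so producer and consumer can run at the same time.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(run(executor, 5));
        } finally {
            executor.shutdown();
        }
    }
}
```

Note that the executor has to be able to run both tasks concurrently: with a single-thread pool the consumer would sit in `take()` forever and the producer would never start, which is a small-scale version of the starvation discussed in this thread.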

Sadly, most Java projects are not greenfield projects.

> Correct, you can't steal the carrier thread from an Object.wait() waiting virtual thread. This is apparently in the pipeline but it is a pretty major limitation.

I mean stealing the other virtual threads from the pinned carrier thread (all except the one pinning it) so they can make progress. Normal work-stealing stuff: the queue (thread) is blocked (pinned), so process that task (virtual thread) in a different queue (thread).

It makes sense that a pinned thread remains pinned with the virtual thread that pinned it.

The 256 default carrier thread limit is going to frequently be a problem though, yeah. That's more than enough to cause all this, and it's a pretty crazy default imo.