by synthetigram 885 days ago
This problem is not going to go away so easily. Numerous core Java classes (like BufferedInputStream) use synchronized. I count 1600+ usages in java.base. The blocking issue means it's _much_ easier to accidentally run into this, rather than waving it away as an unlikely edge case.

I personally ran into this using the built-in com.sun webserver with a virtual thread executor. My VPS only has two CPUs, which means the FJP that virtual threads run on only has 2 active threads at a time. I ran into this hang when some of the connections hung, blocking any further requests from being processed.

4 comments

As the JEP states, pinning due to synchronized is a temporary issue. We didn't want to hold off releasing virtual threads until that matter is resolved (because users can resolve it themselves with additional work), but a fix already exists in the Loom repository, EA builds will be offered shortly for testing, and it will be delivered in a GA release soon.

Those who run into this issue and are unable or unwilling to do the work to avoid it (replacing synchronized with j.u.c locks) as explained in the adoption guide [1] may want to wait until the issue is resolved in the JDK.
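
For anyone wondering what that replacement looks like in practice, here is a minimal sketch, not taken from the adoption guide itself. LockedCounter is a made-up class; the only real API used is java.util.concurrent.locks.ReentrantLock.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class, for illustration only.
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long count;

    // Before: synchronized long increment() { return ++count; }
    long increment() {
        lock.lock();          // acquire before entering the try block
        try {
            return ++count;
        } finally {
            lock.unlock();    // released even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
```

The shape is mechanical, which is why it is feasible for your own code and much harder for third-party libraries you cannot edit.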

I would strongly recommend that anyone adopting virtual threads read the adoption guide.

[1]: https://docs.oracle.com/en/java/javase/21/core/virtual-threa...

> unable or unwilling to do the work to avoid it

The problem is that it's rare to write code which uses no third-party libraries, and these third-party libraries (most written before Java virtual threads ever existed) have a good chance of using "synchronized" instead of other kinds of locks; and "synchronized" can be more robust than other kinds of locks (no risk of forgetting to release the lock, and on older JVMs, no risk of an out-of-memory while within the lock implementation breaking things), so people can prefer to use it whenever possible.

To me, this is a deal breaker; it makes it too risky to use virtual threads in most cases. It's better to wait for a newer Java LTS which can unmount virtual threads on "synchronized" blocks before starting to use it.

> have a good chance of using "synchronized" instead of other kinds of locks; and "synchronized" can be more robust than other kinds of locks (no risk of forgetting to release the lock, and on older JVMs, no risk of an out-of-memory while within the lock implementation breaking things),

I haven't professionally written Java in years, but from what I remember synchronized was considered evil from day one. You can't forget to release it, but you'd better go out of your way to allocate an internal object just for locking, because you have no control over who else might synchronize on your object, and at that point you are only a bit of syntactic sugar away from a lock.lock(); try { ... } finally { lock.unlock(); }.
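
A rough sketch of the "internal object just for locking" idiom, with a made-up Inventory class. Note this is still a monitor under the hood, so as far as the pinning discussion in this thread goes it behaves like any other synchronized block.

```java
// Hypothetical class, for illustration only.
class Inventory {
    // Private monitor object: code outside this class cannot synchronize on it.
    private final Object lock = new Object();
    private int stock;

    void add(int n) {
        synchronized (lock) {
            stock += n;
        }
    }

    int stock() {
        synchronized (lock) {
            return stock;
        }
    }
}
```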

The fact that the monitor is public rarely causes issues, and in those cases where it's used on internal objects, it's not really public anyhow.

There's an additional benefit to using the built in monitors, and that has to do with heap allocation. The data structure for managing it is allocated lazily, only when contention is actually encountered. This means that "synchronized" can be used as a relatively low cost defensive coding practice in case an object which isn't intended to be used by multiple threads actually is.

Is there a similarly low-level synchronization mechanism that doesn't work this way? .NET's does the same thing.

I guess I might have preferred if both Java and .NET had chosen to use a dedicated mutex object instead of hanging the whole thing off of just any old instance of Object. But that would have its own downsides, and the designers might have good reason to decide that they were worse. Not being able to just reuse an existing object, for example, would increase heap allocations and the number of pointers to juggle, which might seriously limit the performance of multithreaded code that uses a very fine-grained locking scheme.

In .NET, async won in the places where lock and Mutex do not work (lock is like synchronized, though not exactly the same). That's why most libraries use SemaphoreSlim, which would work with green threads. But that's more because of the ecosystem. I've barely stumbled upon locks, and Mutex is mostly used in the main method since it acquires a real OS mutex; not really a cheap thing, but for GUIs it's a clever way to check if the app is already running. Most libs that use System.Threading.Tasks use SemaphoreSlim, though.

Yeah, definitely. But for a fair comparison I think you have to look at how .NET did things before async/await hit the scene. And, for that, the aspect of the design in question is quite similar between the two.

Hi Ron. Thanks a lot for the amazing work you are doing on Loom and the whole JVM platform. Can the EA builds and GA release you mentioned make it into 22, or did you mean an EA build for 23?

Wow, I would love to be in the meeting where this decision was made.

Let's ship this with a foot gun, but let's not mention in the JEP that it may hang - let them figure it out.

I don't know, man.

We make scalable graphics rendering servers to stream things like videogames across the web. When we started the project to switch to virtual threads we had that as number one on the big board. "Rewrite for reentrant locks."

Maybe we have more fastidious engineers than a normal company would since we are in the medical space? But even the juniors were reading and familiarizing themselves on how to properly lock in loom's infancy.

All that only to point out that, yes, they had communicated the proper use of reentrant locks long ago.

I do understand what you're saying from an engineering management perspective though. That effort cost a fortune. Especially when you have the FDA to deal with.

It was more than worth it though! In the world of cloud providers, efficiency is money.

Wait, are you writing medical videogames?

We use the same technologies to deliver, say, remote CT review capability, that you would use to stream a videogame. It's just far more likely that the audience I'm communicating with, HN, is familiar with the requirements of videogame streaming, than it is that they are familiar with remote medical dataset viewing. Obviously the requirements of our use case are far more stringent, but no need to go into all that to illustrate the point made.

1 - Use virtual threads with reentrant locks if you need to do "true heavy" scaling.

2 - Kind of implied, but since you gave the opportunity to make it explicit with your comment =D, there is no need to waste your life on earning no money in videogames when the medical industry is right there willing to pay you 10x as much for the same skills. (Provided your skill is in the hard backend engine and physics work. They pay more for the ML too, if I'm being honest.)

I understand the frustration, but why not read a doc?

https://docs.oracle.com/en/java/javase/21/core/virtual-threa...

In Virtual Threads: An Adoption Guide part there is:

When using virtual threads, if you want to limit the concurrency of accessing some service, you should use a construct designed specifically for that purpose: the Semaphore class.
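
To make that sentence concrete, a small sketch of limiting concurrency with java.util.concurrent.Semaphore. LimitedService and runDemo are made-up names, the sleep stands in for a slow backend call, and the demo uses plain platform threads so it runs on older JDKs too; the same call() would be used unchanged from virtual threads.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper, for illustration only.
class LimitedService {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxObserved = new AtomicInteger();

    LimitedService(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    String call(String request) throws InterruptedException {
        permits.acquire();                 // blocks while the limit is reached
        try {
            int now = inFlight.incrementAndGet();
            maxObserved.accumulateAndGet(now, Math::max);
            Thread.sleep(5);               // stand-in for the slow service
            return "handled " + request;
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    int maxObserved() {
        return maxObserved.get();
    }

    // Starts `callers` threads against a service limited to `limit`
    // and returns the highest concurrency observed inside call().
    static int runDemo(int callers, int limit) {
        LimitedService service = new LimitedService(limit);
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            String request = "req-" + i;
            threads[i] = new Thread(() -> {
                try {
                    service.call(request);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return service.maxObserved();
    }
}
```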

That language only obliquely mentions the issue. It is nowhere near clear and direct enough for someone who is just, for example, using a third-party library that is affected. And then it's stuck inside detailed documentation that anyone who wasn't personally planning on adopting virtual threads is unlikely to read.

This seems like it's at least vaguely headed in the direction of that famous scene from early in The Hitchhiker's Guide to the Galaxy:

“But the plans were on display…”

“On display? I eventually had to go down to the cellar to find them.”

“That’s the display department.”

“With a flashlight.”

“Ah, well, the lights had probably gone.”

“So had the stairs.”

“But look, you found the notice, didn’t you?”

“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”

Maybe you should stick to reading Adams and not programming?
You might accidentally write an infinite loop as well - should we not use Turing-complete languages or what?

It’s not like multithreaded computing wasn’t full of footguns anyway.

I would like to take this opportunity to thank pron and the amazing JDK developers for working on a state-of-the-art runtime and language ecosystem and providing it for free. Please ignore the entitled; there are many, many happy devs who can't thank you all enough.

People always forget that things that only happen every few million times can happen fairly frequently on a busy server. This has bitten me numerous times. The nature of a lot of these types of issues is that they are hard to detect and hard to reproduce.

Virtual threads are nice for unblocking legacy code but they aren't without issues. There are better options for new code with less trade offs on the jvm as well. I've recently been experimenting with jasync-postgresql (there's a mysql variant as well) as an alternative to JDBC in Kotlin. It's a nice library. It does have some limitations and is a bit on the primitive side. But it appears to be somewhat widely used in various database frameworks for Scala, Java, and Kotlin.

Databases and database frameworks are an area on the JVM where there just is a huge amount of legacy code built on threads and blocking IO. It's probably one of the reasons Oracle worked on virtual threads, as migrating away from these frameworks is unlikely to ever happen in a lot of code bases. So, waving a magic wand and making all that code non-blocking is very attractive. But of course that magic has some hard limitations, and synchronized blocks are one of those. I imagine they are working on improving that further.

> Virtual threads are nice for unblocking legacy code but they aren't without issues. There are better options for new code with less trade offs on the jvm as well.

The designers of Project Loom would say the exact opposite. The whole push behind Project Loom and similar models (Go's oft-praised "goroutines" runtimes being another one) is motivated by Threads being a much better fit for async behavior in a fundamentally procedural language like Java or Go than promise-based frameworks like async/await.

The whole motivation of Project Loom is to make the simple thing (spawning threads to handle blocking IO) the fast thing as well (by actually replacing the blocking IO with efficient async IO OS calls and managing the threads internally). Project Loom will be considered a full success if the next generation Java web server does something akin to "new Thread(() -> { executeHandlerFunc(conn); }).start();" for each incoming connection, just like the Go built-in web server.
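
For illustration, roughly what that thread-per-connection shape looks like with virtual threads. This is a sketch assuming JDK 21 or later; executeHandlerFunc here is a made-up stand-in that just sleeps, and the "connections" are plain integers rather than real sockets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; requires JDK 21+ for Thread.startVirtualThread.
class ThreadPerConnection {
    static final AtomicInteger handled = new AtomicInteger();

    // Stand-in for a real request handler doing blocking IO.
    static void executeHandlerFunc(int conn) {
        try {
            Thread.sleep(10); // simulated blocking call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        handled.incrementAndGet();
    }

    // Spawns one virtual thread per "connection" and waits for all of them.
    static int serve(int connections) {
        handled.set(0);
        List<Thread> threads = new ArrayList<>();
        for (int conn = 0; conn < connections; conn++) {
            final int c = conn;
            threads.add(Thread.startVirtualThread(() -> executeHandlerFunc(c)));
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handled.get();
    }
}
```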

I think it's not that black and white. Clearly they made a choice to be backwards compatible. Not because Java Threads have a nice API (not even close) but because a lot of legacy code that will never be changed uses it. Including all the ugly bits that you shouldn't be using. Like a lot of the low level synchronization primitives that date back to the early days of Java. It's an impressive bit of work but they made some compromises to make things work. A new API would have been easier, would have had less overhead, and be nicer to use. But backwards compatibility with legacy code was a big goal.

It mostly works fine and it's an impressive bit of engineering. But it has some really ugly failure modes in combination with hacky legacy code designed for real threads. So, you can't blindly assume things to just work. Hence the deadlocks.

Many Java servers already work the way you outline. It's just that they are a bit tedious to use with the traditional Java frameworks, which is one reason I like using Spring's WebFlux with Kotlin instead. Just way nicer when it's all exposed via coroutines.

There are two separate choices. One is the choice of whether to implement green threads in the JVM at all, or whether to use async/await, or some other type of concurrency primitive. The other is whether to expose the new concurrency primitive using a new API or an existing one.

You could say the second choice, the specific API, was done, at least to some extent, for backwards compatibility reasons. I wouldn't agree, but I think there is at least some argument to be made. Here is the explanation from one of the designers [0]:

> We also realized that implementing the existing thread API, so turning it into an abstraction with two different implementations won't add any runtime overhead. I also found that when talking about Java's new user mode threads back when this feature was in development, and back when we still called them fibers, every time I talked about them at conferences, I kept repeating myself and explaining that fibers are just like threads. After trying a few early access releases of the JDK with a fiber API, and then a thread API, we decided to go with the thread API.

However, the choice of adding a new concurrency primitive to Java in the form of green threads instead of others was very very clearly not done for backwards compatibility's sake. Ron Pressler (who is active here as 'pron') has several talks on the advantages of green threads over async/await that you can look at [0][1]. The designers of Go also had the same belief, and also chose to add green threads as the fundamental built-in concurrency primitive in Go, obviously not for backwards compatibility reasons in their case.

[0] https://www.infoq.com/presentations/virtual-threads-lightwei...

[1] https://www.youtube.com/watch?v=EO9oMiL1fFo

>The designers of Project Loom would say the exact opposite.

Sure, but then again the designers of circa 2000-2010 J2EE also thought the verbosity and over-engineering was a good idea.

There might be some justification for comparing any one particular thing to the worst possible particular thing if those things have something in common. The only feature the two things you picked have in common is the word 'java'.

They also have in common the "appeal to authority": the designers as arbiters of good judgement.

Appeal to expertise. Appeal to authority is a fallacy when the authority is not an expert in the requisite domain, e.g. we don't care what a policeman thinks about astrophysics, we do care what the astrophysicist says.

J2EE started as an Objective-C framework, before being rewritten in Java.

I don't know.

My understanding is that the highest-performance webserver is nginx. And it uses async internally.

IMO, virtual threads are a better general-purpose language feature because they avoid function coloring and are generally easier to reason about, but they may not result in the highest-performance Java webserver.

NGINX is a native C implementation, so it has to be carefully written to use the OS's native high-performance IO and native OS threads.

The purpose of project Loom is to abstract that away from Java application code. The runtime can use the most efficient IO for the given platform (ideally io_uring on Linux or IOCP on Windows, for example) even if the application code calls the old blocking File.Write(). The application can then use simple APIs and code patterns, but still get massive performance.

With Loom, you can easily have 20,000 virtual threads servicing 20,000 concurrent HTTP requests and each "blocked" in IO, while only using, say, 100 OS threads that are polling an IOCP. A normal Linux box can typically only handle around maybe 1000 threads across all running processes.
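
A minimal sketch of that scale, assuming JDK 21 or later. ManyBlockedThreads is a made-up name and the sleep stands in for blocking IO; how many OS threads actually carry the work is decided by the JDK's scheduler and is not measured here.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; requires JDK 21+.
class ManyBlockedThreads {
    // Submits `tasks` blocking tasks, one virtual thread each,
    // and returns how many completed.
    static int run(int tasks) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(100)); // simulated blocking IO
                    return done.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }
}
```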

Servicing 20,000 concurrent requests on a single box where somehow threads are the bottleneck, is that not a problem that approximately no one has?
Most application webservers (by default) handle one request per thread. For mostly IO bound stuff (which many projects are), it makes sense to me that threads become a bottleneck in relatively ordinary scenarios.
The lack of support for synchronized isn't a fundamental or hard limit, it's just that the HotSpot implementation is complicated for performance reasons and they put off rewriting that code until later. They're indeed working on that now and in some future version I guess wait/notify and synchronized blocks will start to work. After all, you can easily transform such code into an equivalent that does work.
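
As a sketch of what such a transformation can look like for wait/notify, here is a tiny blocking mailbox written with ReentrantLock and Condition. Mailbox and demo are made-up names, and the "Before" comments show the monitor-based version this would replace.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class, for illustration only.
class Mailbox<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Queue<T> items = new ArrayDeque<>();

    // Before: synchronized void put(T item) { items.add(item); notifyAll(); }
    void put(T item) {
        lock.lock();
        try {
            items.add(item);
            notEmpty.signalAll();   // replaces notifyAll()
        } finally {
            lock.unlock();
        }
    }

    // Before: synchronized T take() { while (items.isEmpty()) wait(); return items.remove(); }
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();   // replaces wait(); releases the lock while waiting
            }
            return items.remove();
        } finally {
            lock.unlock();
        }
    }

    // Small demo: a producer thread puts one message, the caller takes it.
    static String demo() {
        Mailbox<String> box = new Mailbox<>();
        Thread producer = new Thread(() -> box.put("hello"));
        producer.start();
        try {
            return box.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```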
There are ways to find problem sections without having to trigger a full deadlock: https://openjdk.org/jeps/444

  The system property jdk.tracePinnedThreads triggers a stack trace when a thread blocks while pinned. Running with -Djdk.tracePinnedThreads=full prints a complete stack trace when a thread blocks while pinned, highlighting native frames and frames holding monitors. Running with -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.
I was curious what "jasync" is. And man, it hurts me to see documentation like this (when compared to classic javadocs):

https://github.com/jasync-sql/jasync-sql/wiki/API-Overview

From project WIKI (https://github.com/jasync-sql/jasync-sql/wiki)

Synchronized blocks are not a problem. Synchronized blocks that later don’t unblock the thread may sometimes be.

BufferedInputStream has been rewritten and only uses synchronized if subclassed. In fact there has been a lot of work removing the synchronized keyword.

I've written an open source library to easily replace synchronized with something more virtual thread friendly: https://github.com/japplis/Virtually