Hacker News
by taspeotis 698 days ago
My rough understanding is that this is similar to async/await in .NET?

It’s a shame this article paints a neutral (or even negative) experience with virtual threads.

We rewrote a boring CRUD app that spent 99% of its time waiting for the database to respond so that it was async/await from top to bottom. CPU and memory usage went way down on the web server because so many requests could be handled by far fewer threads.

7 comments

> My rough understanding is that this is similar to async/await in .NET?

Well, somewhat, but also not really. They are green threads, like async/await, but their use is more transparent than async/await.

So there are no special "async methods". You just create a virtual thread (for example via Thread.ofVirtual()) where you would normally instantiate a (kernel) "Thread" and then use it like any other (kernel) thread. This works because, for example, all blocking IO APIs are automatically converted to non-blocking IO under the hood.
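To make that concrete, here is a minimal sketch, assuming JDK 21 or later. The class and method names (`VirtualThreadHello`, `runOnVirtualThread`) are made up for illustration, and `Thread.sleep` stands in for any blocking IO call.

```java
import java.util.concurrent.atomic.AtomicReference;

class VirtualThreadHello {
    // Runs a blocking task on a virtual thread and reports which kind of thread ran it.
    static String runOnVirtualThread() {
        AtomicReference<String> kind = new AtomicReference<>();
        // Thread.ofVirtual() returns a builder; start() creates and schedules the thread.
        Thread t = Thread.ofVirtual().name("demo-virtual").start(() -> {
            try {
                Thread.sleep(50); // ordinary blocking call, stand-in for blocking IO
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            kind.set(Thread.currentThread().isVirtual() ? "virtual" : "platform");
        });
        try {
            t.join(); // same API as for any other Thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return kind.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnVirtualThread());
    }
}
```

The body of the lambda has no async-specific syntax; it is the same code you would hand to a platform thread.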

It's a different model. Microsoft did work on green threads a while ago and decided against continuing.

Links:

https://github.com/dotnet/runtimelab/issues/2398

https://github.com/dotnet/runtimelab/blob/feature/green-thre...

It should be pointed out that the main reason they didn't go further was the added complexity in .NET, when async/await already exists.

> Green threads introduce a completely new async programming model. The interaction between green threads and the existing async model is quite complex for .NET developers. For example, invoking async methods from green thread code requires a sync-over-async code pattern that is a very poor choice if the code is executed on a regular thread.

Also note that even the current model is complex enough to warrant a FAQ:

https://devblogs.microsoft.com/dotnet/configureawait-faq

https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/b...

This FAQ is a bit outdated in places, and is not something most users should worry about in practice.

JVM Green Threads here serve predominantly back-end scenarios, where most of the items on the list are not of concern. This list also exists to address bad habits that carried over from before the tasks were introduced, many years ago.

In general, the perceived want of green threads is in part caused by misunderstanding of that one bad article about function coloring. And that one bad article about function coloring also does not talk about the way you do async in C#.

Async/await in C# on the back-end is a very easy model to work with: it is explicit about whether a method returns an operation that promises to complete in the future or not, and composing tasks[0] for easy (massive) concurrency is significantly more idiomatic than doing so with green threads or the completable futures that existed in Java before these. And as evidenced by the adoption of green threads by large-scale Java projects, it turns out the failure modes share similarities, except that green threads end up violating way more expectations, and the code author may not have any indication or explicit mechanism to address this, like using AsyncLocal.

Also, one change to look out for is the "Runtime Handled Tasks" project in .NET, which will replace Roslyn-generated state machine code with a runtime-provided suspension mechanism that only ever suspends at true suspension points, where the task's execution actually yields asynchronously. So far the numbers show at least a 5x decrease in overhead, which is massive and will bring the performance of computation-heavy async paths in line with sync ones:

https://github.com/dotnet/runtimelab/blob/feature/async2-exp...

Note that you were trivially able to have millions of scheduled tasks even before that as they are very lightweight.

[0]: e.g. sending requests in parallel is just this

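    // One shared client; the relative URIs below resolve against BaseAddress.
    using var http = new HttpClient() {
        BaseAddress = new("https://news.ycombinator.com/news")
    };

    // Describe four requests for ?p=1..4 (lazy; they start when enumerated).
    var requests = Enumerable
        .Range(1, 4)
        .Select(n => $"?p={n}")
        .Select(http.GetStringAsync);

    // Task.WhenAll enumerates the sequence, so all requests run concurrently.
    var pages = await Task.WhenAll(requests);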
I take your point about the aforementioned article[0][1] being a popular reference when discussing async / await (and to a lesser extent, async programming in modern languages more generally). I think its popularity highlights the fact that it is a pain point for folks.

Take Go, for instance. It is well liked in part because it's so easy to do concurrency with goroutines: they're easy to reason about, easy to call, easy to write, and, for how much heavy lifting they do, relatively simple to understand.

The reason Java is getting a lot of kudos here for its implementation of green threads is exactly the same reason people talk about Go being an easy language to use for concurrency: it doesn't gate code behind specialized idioms / syntax / features that are specific to asynchronous work. Rather, it largely uses the same idioms / syntax as synchronous code, and is therefore easier to reason about, to adopt, and ultimately, as I think history is starting to show, to use.

Java is taking an approach paved by Go, and ultimately I think it's the right choice. Having worked extensively with C# and other languages that use async / await, I find there are simply fewer footguns for the average developer to hit when you reduce the surface area of having to understand async / sync boundaries.

[0]: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

[1]: HN discussion: https://news.ycombinator.com/item?id=8984648

Green Threads increase the footgun count, as methods which return tasks are rather explicit about their nature. The domain of async/await is well-studied, and it enables crucial patterns, like the one in my previous example, whose UX Green Threads do nothing to improve. This also applies to the Go approach, which expects you to use Channels, which have their own plethora of footguns, even for things trivially solved by firing off a couple of tasks and awaiting their results. In Go, you are also expected to use explicit synchronization primitives for trivial concurrent code that requires no cognitive effort in C# whatsoever. C# does have channels that work well, but it turns out you rarely need them when you can just write simple task-based code instead.

I'm tired of this. That one article is bad and incorrect, promotes straight-up harmful intuition, and has probably set the industry back by 10 years in terms of concurrent and asynchronous programming, in the same way misinterpreting Donald Knuth's quote did in terms of performance.

That’s a very simplistic view, especially since Java does/will provide “structured concurrency” as something analogous to structured control flow, vs gotos.

Also, nothing prevents you from building your own, more limited but safer (the two always come together!) abstraction on top, but you couldn’t express Loom on async as the primitive.

I don't think that this would be a good showcase for Virtual Threads. The "async" API for Java is CompletableFuture, right? That's been stable for something like 10 years, so no real change since Java 8.

Before, you'd just have to define a ThreadPool with n threads, where each request would've blocked one pending thread. Now it just keeps going.

So your equivalent Java example should've been something like this, but again: the CompletableFuture API is pretty old at this point.

    // Declarative HTTP client (Spring); each call returns a future immediately.
    @HttpExchange(value = "https://news.ycombinator.com")
    interface HnClient {
        @GetExchange("news?p={page}")
        CompletableFuture<String> getNews(@PathVariable("page") Integer page);
    }

    @RequiredArgsConstructor // Lombok: generates the constructor for hnClient
    @Service
    class HnService {
        private final HnClient hnClient;
        List<String> getNews() {
            // Start all four requests first...
            var requests = IntStream.rangeClosed(1, 4)
                                    .boxed().map(hnClient::getNews).toList();
            // ...then block until each one has completed.
            return requests.stream().map(CompletableFuture::join).toList();
        }
    }
Structured concurrency is still being developed: https://openjdk.org/jeps/453

Also, I wouldn't consider that the equivalent Java code. That is all Spring and Lombok magic. Just write the code and just use java.net.HttpClient.
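For comparison, a rough sketch of the same "fetch four pages in parallel" idea with plain JDK classes and one virtual thread per request, assuming JDK 21 or later. `ParallelPages`, `fetchAll`, and the `fetch` parameter are hypothetical names; `fetch` stands in for a blocking call such as a wrapper around the JDK HTTP client's `send`, and is left pluggable so the sketch runs without network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

class ParallelPages {
    // Fetches pages 1..count concurrently, one virtual thread per request.
    // `fetch` is a hypothetical blocking call supplied by the caller.
    static List<String> fetchAll(int count, IntFunction<String> fetch) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Submit everything first so the requests overlap.
            List<Future<String>> futures = new ArrayList<>();
            for (int page = 1; page <= count; page++) {
                final int p = page;
                futures.add(executor.submit(() -> fetch.apply(p)));
            }
            // Then collect the results in submission order.
            List<String> pages = new ArrayList<>();
            for (Future<String> f : futures) {
                try {
                    pages.add(f.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return pages;
        }
    }

    public static void main(String[] args) {
        // Fake fetch so the example needs no network.
        System.out.println(fetchAll(4, p -> "page " + p));
    }
}
```

The blocking `fetch` call is written synchronously; the concurrency comes only from submitting each call to its own virtual thread.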

> and just use java.net.HttpClient.

No.

It would break a lot of the native interop and UI code devx of the language. Java was never as nice in those categories so it had less to lose going this path.
> My rough understanding is that this is similar to async/await in .NET?

Not really. What C# does is sort of similar but it has the disadvantages of splitting your code ecosystem into non-blocking/blocking code. This means you can “accidentally” start your non-blocking code. Something which may cause your relatively simple API to consume a ridiculous amount of resources. It also makes it much more complicated to update and maintain your code as it grows over the years. What is perhaps worse is that C# lacks an interruption model.

Java’s approach is much more modern but then it kind of had to be because the JVM already supported structured concurrency from Kotlin. Which means that Java’s “async/await” had to work in a way which wouldn’t break what was already there. Because Java is like that.

I think you can sort of view it as another example of how Java has overtaken C# (for now), but I imagine C# will get an improved async/await model in the next couple of years. Neither approach is something you would actually choose if concurrency is important to what you build and you don’t have a legacy reason to continue to build on Java/C#. This is because Go or Erlang would be the obvious choice, but it’s nice that you at least have the option if your organisation is married to a specific language.

I would not argue that golang is the obvious choice for concurrency. Java's approach is actually superior to golang's. It takes it a step further by offering structured concurrency[1].

Kotlin's design had no bearing on Java's or the JVM's implementation.

C# has an interruption model through CancellationToken as far as I'm aware.

[1] https://openjdk.org/jeps/453

It's foolish to say that green threads are strictly better and ignore async/await as something outdated. It can do a lot that green threads can't.

For example, you can actually share a thread with another runtime.

Cooperative threading allows for implicit critical sections that can be cumbersome in preemptive threading.

Async/await and virtual threads are solving different problems.

> What is perhaps worse is that C# lacks an interruption model

Btw, you'd just use OS threads if you really needed pre-emptively scheduled threads. Async tasks run on top of OS threads, so you get both cooperative scheduling within threads and pre-emptive scheduling of threads onto cores.

> It's foolish to say that green threads are strictly better and ignore async/await as something outdated

I’m not sure I said outdated, but I can see what you mean by how I called Java’s approach “more modern”. What I should have called Java’s approach was “correctly designed”.

C#’s async/await isn’t all terrible, as you point out, but it’s designed wrong from the bottom up because computation should always be blocking by default. The fact that you can accidentally start running your code asynchronously is just… Aside from trapping developers with simple mistakes, it’s also part of what has led to the ecosystem being irrecoverably split into two.

I was actually a little surprised to see Microsoft make the whole move from .NET to .NET Core without addressing some of the glaring issues with it, when that massive disruption process uprooted everything anyway.

What do you think about the Structured Concurrency library Java is working on, with things like fork() and join()? Is that incorrectly designed? Why do you think there's a call for that if virtual threads serve every use case?
Erlang, not Go, should be the obvious choice for concurrency, but it's impossible to retrofit Erlang's concurrency onto existing systems.
As an Erlang person, from reading about Java's Virtual Threads, it feels like it should get a significant portion of the Erlang concurrency story.

With virtual threads, it seems like if you don't hit gotchas, you can spawn a thread, run straight through blocking code, and not worry about too many threads, etc. So you could do thread-per-connection/user chat servers and HTTP servers and whatnot.

Yes, it's still shared memory, so you can miss out on the simplifying effect of explicit communication instead of shared memory communication and how that makes it easy to work with remote and local communication partners. But you can build a mailbox system if you want (it's not going to be as nice as built in one, of course). I'm not sure if Java virtual threads can kill each other effectively, either.
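A bare-bones version of the "build a mailbox system" idea might look like the sketch below, assuming JDK 21 or later. Everything here (`MailboxDemo`, `Actor`, the `"stop"` message convention) is invented for illustration and is nowhere near what Erlang gives you: no links, no monitors, no selective receive.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MailboxDemo {
    // A tiny "process": a virtual thread that owns a mailbox and handles one message at a time.
    static final class Actor {
        private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        private final StringBuilder log = new StringBuilder(); // touched only by the actor thread
        private final Thread thread;

        Actor() {
            thread = Thread.ofVirtual().start(this::loop);
        }

        private void loop() {
            try {
                while (true) {
                    String msg = mailbox.take(); // blocks this virtual thread until a message arrives
                    if (msg.equals("stop")) {
                        return;
                    }
                    log.append(msg).append(';');
                }
            } catch (InterruptedException e) {
                // Thread.interrupt() is cooperative: the loop ends only because take() notices it.
            }
        }

        void send(String msg) {
            mailbox.add(msg);
        }

        // Asks the actor to stop, waits for it, and returns what it processed.
        String stopAndGetLog() {
            send("stop");
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return log.toString();
        }
    }

    static String demo() {
        Actor actor = new Actor();
        actor.send("hello");
        actor.send("world");
        return actor.stopAndGetLog();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Other threads only ever talk to the actor through `send`, so its state needs no locking; nothing in the language enforces that discipline, though, which is the shared-memory caveat above.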

Erlang's concurrency story isn't green threads.

It's (with caveats, of course):

- a thread crashing will not bring the system down

- a thread cannot hog all processing time as the system ensures all threads get to run. The entire system is re-entrant and execution of each thread can be suspended to let other threads continue

- all CPU cores can and will be utilized transparently to the user

- you can monitor a thread and if it crashes you're guaranteed to receive info on why and how it crashed

- immutable data structures play a huge part of it, of course, but the above is probably more important

That's why Go's concurrency is not that good, actually. Goroutines are not even half-way there: an error in a goroutine can panic-kill your entire program, there are no good ways to monitor them etc.

Neither an error nor a recovered-from panic will cause a Go program to crash; only an unrecovered panic does that.

The bigger problem with Go in this regard is how easy it is to cause a panic thanks to nil.

In Erlang even a nil will not lead to an unrecovered panic (if it happens in the process aka green thread).

Go made half a step in the right direction with goroutines, but never committed fully

Isn't that Akka?
Akka is heavily inspired by Erlang, but the underlying system/VM has to provide certain guarantees for actual Erlang-style concurrency to work: https://news.ycombinator.com/item?id=40989995
Maybe C# is going to get a new async/await model, but the fragmentation of libs and code probably cannot be undone.

Java's strength is that its maintainers make relatively more decisions about the language and the libs that they don’t have to fix later. That’s of great value if you’re not building throw-away software but SaaS or something that has to live long.

> This is because Go or Erlang would be the obvious choice

Why Go? It has quite an anemic standard library for concurrent data structures compared to Java, and it is a less expressive and arguably worse language on any count, verbosity included.

From what I recall, and this is a while ago so bear with me, Java Virtual Threads still have a lot of pitfalls where the promise of concurrency isn't really fulfilled.

I seem to remember that it was some pretty basic operations (like maybe read or something) that caused the virtual thread not to unmount, and therefore just block the underlying OS thread. At that point you've just invented the world's most complicated thread pool.

Reading from sockets definitely works. It'd be pretty useless if it didn't.

Some operations that don't cause a task switch to another virtual thread are:

- If you've called into a native library and back into Java that then blocks. In practice this never happens because Java code doesn't rely on native libraries or frameworks that much and when it does happen it's nearly always in-and-out quickly without callbacks. This can't be fixed by the JVM, however.

- File IO. No fundamental problem here, it can be fixed, it's just that not so many programs need tens of thousands of threads doing async file IO.

- If you're holding a lock using 'synchronized'. No fundamental problem here, it's just annoying because of how HotSpot is implemented. They're fixing this at the moment.

In practice it's mostly the last one that causes issues in real apps. It's not hard to work around, and eventually those workarounds won't be needed anymore.
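The usual workaround for the `synchronized` case is to guard the blocking section with a `java.util.concurrent` lock instead. A hedged sketch, assuming JDK 21 or later; `PinningWorkaround`, `slowIncrement`, and `run` are hypothetical names, and `Thread.sleep` stands in for whatever blocking call sits inside the critical section.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

class PinningWorkaround {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter = 0; // guarded by lock

    // Previously this might have been `synchronized void slowIncrement()`.
    // Blocking while holding a ReentrantLock lets the virtual thread unmount.
    void slowIncrement() {
        lock.lock();
        try {
            Thread.sleep(1); // stand-in for a blocking call made while holding the lock
            counter++;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    // Runs `threads` virtual threads that all contend for the same lock.
    static int run(int threads) {
        PinningWorkaround shared = new PinningWorkaround();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < threads; i++) {
                executor.execute(shared::slowIncrement);
            }
        } // close() waits for all submitted tasks to finish
        return shared.counter;
    }

    public static void main(String[] args) {
        System.out.println(run(50));
    }
}
```

The behaviour is the same as with `synchronized`; the only difference is that waiting threads and the sleeping lock holder do not tie up a carrier thread.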

You're referring to thread pinning, and this is being addressed.
It's more like Erlang threads - they appear to be blocking, so existing code will work with zero changes. But you can create a gazillion of them.
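A quick way to see the "gazillion of them" part, as a sketch assuming JDK 21 or later (`ManyWaiters` and `runWaiters` are made-up names, and the sleep stands in for waiting on a socket or database):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ManyWaiters {
    // Starts n virtual threads that each block for `millis`, and counts how many finished.
    static int runWaiters(int n, long millis) {
        AtomicInteger done = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                executor.execute(() -> {
                    try {
                        Thread.sleep(millis); // plain blocking call, no async syntax
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    done.incrementAndGet();
                });
            }
        } // close() waits for every task to complete
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runWaiters(10_000, 100));
    }
}
```

All the waits overlap, so the whole run should take roughly one sleep interval plus scheduling overhead rather than the sum of the sleeps.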
> My rough understanding is that this is similar to async/await in .NET?

The biggest difference is that C# async/await code is rewritten by the compiler to be able to be async. This means that you see artifacts in the stack that weren’t there when you wrote the code.

There are no rewrites with virtual threads and the code is presented on the stack just as you write it.

They solve the same problem but in very different ways.

> They solve the same problem but in very different ways.

Yes. Async/await is stackless, which leads to the “coloured functions” problem (because it can only suspend function calls one-by-one). Threads are stackful (the whole stack can be suspended at once), which avoids the issue.

There is overlap but they really don't solve the same problem. Cooperative threading has its own advantages and patterns that won't be served by virtual threads.
What patterns does async/await solve which virtual threads don’t?
If you need to be explicit about thread contexts because you're using a thread that's bound to some other runtime (say, a GL Context) or you simply want to use a single thread for synchronization like is common in UI programming with a Main/UI Thread, async/await does quite well. The async/await sugar ends up being a better devx than thread locking and implicit threading just doesn't cut it.

In Java they're working on a structured concurrency library to bridge this gap, but IMO, it'll end up looking like async/await with all its ups and downs but with less sugar.

What’s stopping you from using a single thread for synchronization?
You can use virtual threads running on a single OS thread and that will work but then everything will be on that one thread. You'll have synchronization but you'll also always be blocking on that one thread as well.

Async/await is able to achieve good UX around explicitly defining what goes on your Main thread and what goes elsewhere. It's trivial to mix UI thread and background thread code by bouncing between synchronization contexts as needed.

When the threading model is implicit, it's impossible to have this control.

"Green Threads" as implemented in Java is a solution that solves only a single problem - blocking/multiplexing.

It does not enable easy concurrency and task/future composition the way C#/JS/Rust do, which offer a strictly better and more comprehensive model.

Structured concurrency[1] offers task composition and more.

[1] https://openjdk.org/jeps/453

What do you mean? It implements the Future/Task interface and you can definitely use that. In fact, you can’t tell the difference between a virtual thread and a platform one, and it’s available everywhere. I for one think it’s much easier to use than the async/await pattern, as I don’t need any special syntax to use it.
Can you expand on how the benefit in your rewrite came about? Threads don't consume CPU when they're waiting for the DB, after all. And threads share memory with each other.

(I guess when scaling to ridiculous levels you could be approaching trouble if you have O(100k) outstanding DB queries per application server; hope you have a DB that can handle millions of outstanding DB queries then!)

In large numbers the cost of switching between threads does consume CPU while they're waiting for the database. This is why green threads exist, to have large numbers of in flight work executing over a smaller number of OS threads.
When using OS threads, there's no switching when they are waiting for a socket (db connection). The OS knows to wake the thread up only when there's something new to see on the connection.
Both sides of a sleep/awake transition with conventional blocking system calls involve heavyweight context switches: the CPU protection level changes and the thread registers get saved out or loaded back in.
Yes, but these don't happen while waiting.
>My rough understanding is that this is similar to async/await in .NET?

No, the I/O is still blocking with respect to the application code.