| HN Mirror


by rstuart4133 58 days ago

> I suppose all your green threads / fibers could run on a single CPU core.

yes.

> But what's the point? How would that be an improvement over what we have now?

> But its basically async/await but without declaring functions as async.

You answered your own question: yes, you get what you have now, without all the overhead of async, await, promises and futures.

> But if you do that, any function call you make could yield before returning.

A green thread could be an instance of a particular type, so `input = self.yield()` would fail if you aren't a green thread. So no, not "any function" - just ones that are methods of a green thread instance, or are passed a reference to one.

> Does it yield to other threads before returning?

It could if you pass it a reference to a green thread; otherwise it can't.

> This would lead to an avalanche of bugs.

It doesn't. Cooperative multitasking is at least half a century old at this point. The bugs you're imagining mostly aren't an issue. To the extent they do happen, it's because someone hasn't thought about two control flows modifying the same data structure. Yes, that happens, but it happens in all single threaded code - async included. It's why we hate side effects. It's what Rust famously prevents with its borrow checker even in the face of side effects. It's not avoided by async. The explicit colouring does not help to prevent it - it's just overhead.

FWIW the one issue cooperative multitasking does often introduce is that a task can take a long time to execute before it yields, so other cooperative tasks don't run in a timely fashion. Exactly the same thing can happen with async of course. It's not usually a problem in browsers, but in embedded systems, where cooperative multitasking is commonly used, it's a real issue because they are often real time. Ask me how I know.
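To make the starvation point concrete, here is a minimal sketch of a round-robin cooperative scheduler. It is an illustration under assumptions, not a green thread implementation: tasks are modelled as closures rather than threads with their own stacks, and the names `Step`, `Task`, `make_task` and `run_all` are made up for the example. Nothing preempts a task, so every other task waits for as long as the current step takes.

```rust
use std::collections::VecDeque;

/// What a task reports each time it hands control back to the scheduler.
enum Step {
    Yielded,
    Done,
}

/// A cooperative task: a closure that runs until it chooses to return.
type Task = Box<dyn FnMut(&mut Vec<String>) -> Step>;

/// Build a hypothetical task that does `steps` units of work, logging each one.
fn make_task(name: &'static str, steps: u32) -> Task {
    let mut done = 0;
    Box::new(move |log: &mut Vec<String>| {
        // One unit of work. If this took a long time, nobody else would run.
        done += 1;
        log.push(format!("{name}:{done}"));
        if done >= steps { Step::Done } else { Step::Yielded }
    })
}

/// Round-robin scheduler: a task that yields goes to the back of the queue.
fn run_all(tasks: Vec<Task>) -> Vec<String> {
    let mut log = Vec::new();
    let mut queue: VecDeque<Task> = tasks.into();
    while let Some(mut task) = queue.pop_front() {
        if let Step::Yielded = task(&mut log) {
            queue.push_back(task);
        }
    }
    log
}

fn main() {
    let log = run_all(vec![make_task("a", 2), make_task("b", 3)]);
    println!("{log:?}");
}
```

The log shows the two tasks interleaving only at the points where each one returns to the scheduler.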

> Javascript guarantees that while my (non async) function runs, no other code gets executed.

This remains true. You are getting confused by your mental model of threads as a form of concurrency. There is no concurrency going on there. Semantically it is near identical to async / await. The principal difference is that in async / await, the program is explicitly creating each stack frame on the heap using manually allocated objects. In addition to the mental overhead that creates, it is slower than using a real stack like green threads do. But now for the truly bizarre twist. Can you guess how modern javascript engines get around that speed issue? Wait for it ... they create an explicit stack ... that looks like what green threads would use anyway! And as a wonderful side effect - you get real stack back traces again. The irony is almost palpable. https://v8.dev/blog/fast-async

> Threads (cooperative or preemptive) would be a massive change to JS. It would cause an endless parade of bugs, and frozen websites. To say nothing of your notion we could casually reinvent DOM events. That ship sailed a long time ago.

I agree the ship has sailed at this point. The rest of the assertions you make there are wrong.

This assertion stands out: frozen websites. Can you tell me how they are going to block? There are no blocking calls in javascript now. The things you would await on now would be passed a green thread handle. But the javascript event handlers called from the DOM have no green-thread handle, so they can't block.

> Personally, I much prefer this information to be explicit. I need to know as a programmer whether or not execution will be interleaved.

You don't. You've just been conditioned to think that because you've never done it any other way. But the reality is people have been using cooperative multitasking for a long, long time. It pre-dates threads and async. The issues and bugs you are proclaiming would happen don't arise.


josephg 58 days ago

If we invented a new language, sure. Cooperative multitasking might be a fun approach. The avalanche of bugs I’m imagining would come from existing JavaScript code being run in a different context than that in which it was written and tested. If you pass me a callback right now, and I call `a(); callback(); b();`, I can guarantee that the program doesn’t yield to the event loop or other executions between a() and b(). As I understand it, this guarantee no longer holds with coop. multitasking because your callback can yield to another thread.

Good on the V8 team. Sounds like they’ve figured out a way to get the performance of green threads with the better ergonomics of effects systems (async await). Great!

You sound like an expert in cooperative multithreading. If async await can use real stacks, what actual benefits are there to cooperative multithreading? Why prefer them over what JS has now? Pitch them to me.


rstuart4133 58 days ago

> The avalanche of bugs I’m imagining would come from existing JavaScript code being run in a different context than that in which it was written and tested.

Oh, right. As you said, the ship has sailed. I think you could bolt green threads onto javascript now without ill effects - apart from bloating the language. I can't see anything that could go badly (certainly no avalanche of bugs). But in javascript green threads are only mildly more ergonomic than async. I wouldn't be bloating the language for such a small return.

Rust is in a different position. The current async implementation has two big black hairs. Firstly, they had to come up with a type-safe way of saving the function's current state. By state, I mean what a function normally stores on its stack. What they came up with is a work of art in some ways, but it doesn't work well with the borrow checker. The borrow checker insists you prove that you have exclusive use of a variable while it exists. Things on the stack have a limited lifetime (the function call), so the compiler knows they don't exist for very long. Even with that small lifetime it's a battle, but it's workable. Async persists that state, usually to the heap, where it can effectively live forever. That wreaks havoc with the borrow checker, causing comments like this: https://news.ycombinator.com/item?id=37436274, quote: "Yes, async is effectively a much harder version of Rust ...".

The second issue is colouring. In the current Rust async implementation, large chunks are left to libraries like tokio. Each of these libraries has to provide its own I/O, and they aren't compatible with each other. So if you want to use a cute new HTTP server, you are out of luck unless they provided a version that talks to the async library you are using.

The library writers do their best to accommodate by providing interfaces to the popular async libraries. That forces them to do extra work. Whereas before they could just call `read()` on a `std::fs::File`, now they have to abstract all the I/O they do to a different module, and provide an implementation of that module for each async library they want to support.

The outcome can only be described as a mess, and that's putting it politely. It's harming uptake of the language. It wasn't like they didn't know it was coming either - there were comments pleading for a better implementation. And it wasn't as if better solutions weren't already apparent - they had green threads before, and they made some wrong turns with the implementation that needed to be fixed. And it's not like these solutions were harder to do than the async implementation they came up with. Async needed new standard library features to stabilise (like `Pin<>`) and introduced new keywords - none of which was needed for green threads. (Although some would be useful for an efficient green thread implementation - like knowing the maximum amount of stack a function could use.)

In the face of all that, they persisted with async. You'd need a sociologist to explain how that happened - to my engineering brain it's inexplicable. Unlike Javascript, it isn't just a mildly less ergonomic implementation of the same thing, it's a serious mistake - well worth the effort of throwing out and replacing.


josephg 58 days ago

On all that, we have near total agreement. I've been complaining about how broken and half-baked rust's async story is for years - for more or less the same reasons you list above:

- You can't name the type of an `impl Future`.

- They play terribly with the borrow checker because the borrow checker can't handle self referential types.

- There's no future executor in the standard library. You need 3rd party libraries. And the most common library is tokio, which is a whale.

- Despite all the work, there's still no async streams in the language.

- Pin. !Unpin. pin_project. Unsafe pin_project. What are we even doing.

But async works really well in javascript. Maybe where we disagree is that I don't think any of these issues are because async itself is a bad idea. But, async has become the place dreams go to die in rust. Look at the issues above. They're all problems with rust's type system, borrow checker and standard library.

What I think rust needs is:

- A way to have self-borrows in a struct. Types with self borrows would be implicitly pinned.

- A way to name the return type of a function. E.g. `let x: ReturnType<some_func>`. People have been saying this is right around the corner since 2019.

- Generators. Futures are built on top of generators inside the compiler. But generators have - for some reason - never been exposed in stable rust. I think generators should have been stabilised first - since all the problems you need to solve to make generators work well (self referential types, return values you can name, etc) are things futures need too.

Unfortunately I think that ship has sailed too. I try to avoid async rust whenever I can. It's such a pity. I'm hoping someone makes a rust 2.0 language at some point which fixes this situation.


rstuart4133 57 days ago

> I think generators should have been stabilised first - since all the problems you need to solve to make generators work well (self referential types, return values you can name, etc) are things futures need too.

Generators are an interesting case. For example, if you implemented a Vec iterator as a generator, it becomes:

    // Hypothetical syntax: `yield` is not available in stable Rust.
    fn vec_iter(&self) {
       for index in 0..self.len() {
           yield &self[index];
       }
    }

Which is arguably easier to understand than the current event driven formulation, which requires you to declare a new type to hold your state, and the code looks like:

    // `self` must be mutable because `index` advances on every call.
    fn next(&mut self) -> Option<&'a T> {
       if self.index >= self.vec.len() {
            None
       } else {
            self.index += 1;
            Some(&self.vec[self.index - 1])
       }
    }

Effectively the stack frame has become your type, and sequential code is always so much more compact and clearer than the event driven model. The generator could be implemented as a green thread, but you would never entertain the overhead of creating the new stack needed by the green thread implementation.
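For completeness, here is a minimal runnable sketch of the event driven version, including the type that holds the state. The names `VecIter` and the free function `vec_iter` are assumptions made for the example; the standard library's own slice iterator is implemented differently.

```rust
/// Hand-written iterator state: what a generator would keep in its stack frame.
struct VecIter<'a, T> {
    vec: &'a Vec<T>,
    index: usize,
}

impl<'a, T> Iterator for VecIter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if self.index >= self.vec.len() {
            None
        } else {
            self.index += 1;
            Some(&self.vec[self.index - 1])
        }
    }
}

/// Hypothetical helper playing the role of the generator's `vec_iter`.
fn vec_iter<T>(vec: &Vec<T>) -> VecIter<'_, T> {
    VecIter { vec, index: 0 }
}

fn main() {
    let v = vec![10, 20, 30];
    let collected: Vec<&i32> = vec_iter(&v).collect();
    println!("{collected:?}");
}
```

The only field besides the borrowed `Vec` is `index`, which is exactly the loop variable of the generator form.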

However ... the async implementation introduced all the mechanics needed to get rid of that green thread stack allocation when the size of the stack is known, as it is in this case. The state saving machinery they created for async could be used to translate that stack to a type. It would, surprise, surprise, contain just `index` - analogous to the iterator type we have to manually create for event driven code. So the compiler could translate the green thread to the same implementation as the event driven code, but you get to use the compact (and very familiar) syntax of a stack machine.

I found it interesting to see what happens for a more complex generator - like something that returns every node in a tree. You can do it recursively, which is simple clear code, but you don't know the size of the stack so the trick used for the vec iterator (translating it to a type) can't be used. Or you can manually store the state you stored in the stack with a recursive implementation in a Vec<> instead. Both require a memory allocation, but they are different. One is just normal malloc that must be reallocated and moved as the allocation grows. The other can use the OS's stack implementation, that doesn't move as it grows. If you re-used stacks, the OS's stack implementation would be faster in a long running program.
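A rough sketch of the two tree traversals, assuming a simple binary tree (the `Node` shape and all function names are made up for the example). The recursive walk keeps its state on the call stack; the iterator keeps the same state by hand in a `Vec` that is reallocated as it grows.

```rust
/// A minimal binary tree; the shape is an assumption for illustration.
struct Node {
    value: i32,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

/// Recursive in-order walk: the traversal state lives on the call stack.
fn walk_recursive(node: &Option<Box<Node>>, out: &mut Vec<i32>) {
    if let Some(n) = node {
        walk_recursive(&n.left, out);
        out.push(n.value);
        walk_recursive(&n.right, out);
    }
}

/// Iterator form: the same state, stored by hand in a heap-allocated Vec.
struct InOrder<'a> {
    stack: Vec<&'a Node>,
}

impl<'a> InOrder<'a> {
    fn new(root: &'a Option<Box<Node>>) -> Self {
        let mut it = InOrder { stack: Vec::new() };
        it.push_left_spine(root.as_deref());
        it
    }

    /// Push a node and all of its left descendants, as the recursion would.
    fn push_left_spine(&mut self, mut node: Option<&'a Node>) {
        while let Some(n) = node {
            self.stack.push(n);
            node = n.left.as_deref();
        }
    }
}

impl<'a> Iterator for InOrder<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        let n = self.stack.pop()?;
        self.push_left_spine(n.right.as_deref());
        Some(n.value)
    }
}

/// Hypothetical constructor helpers for building a small test tree.
fn node(value: i32, left: Option<Box<Node>>, right: Option<Box<Node>>) -> Option<Box<Node>> {
    Some(Box::new(Node { value, left, right }))
}

fn sample_tree() -> Option<Box<Node>> {
    node(2, node(1, None, None), node(3, None, None))
}

fn main() {
    let tree = sample_tree();
    let mut recursive = Vec::new();
    walk_recursive(&tree, &mut recursive);
    let iterative: Vec<i32> = InOrder::new(&tree).collect();
    assert_eq!(recursive, iterative);
}
```

The recursive version is the shorter and clearer of the two, but only the second can be handed out as an iterator in stable Rust today.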

Notice that the transformation from a generator to async implementation is arguably more complex than the same transformation for green threads, especially for the tree traversal.

That observation is one of the reasons I'm such a strong proponent of green threads. The other is a simpler mental model. Unlike async, you don't have to expose the inner mechanisms it depends on, like futures.


josephg 57 days ago

> However ... the async implemented all the mechanics needed ...

As I understand it, the implementation of async in the rust compiler grew out of the implementation for generators in nightly. It's the same continuation-passing transformation that lets you implement both await and yield in your fictional example.

> Notice that the transformation from a generator to async implementation is arguably more complex than the same transformation for green threads, especially for the tree traversal.

Yeah for sure. Another nice thing about green threads is that the compiler doesn't need to invert the call stack. I suspect you'd get smaller binaries in many cases. A lot of the complexity of async in rust comes from moving stack variables into a hidden struct as part of this transformation. For example, this function:

    async fn foo() {
        let x: usize = 5;
        let y = &x;
        someexpr().await;
        // ...
    }

Emits something like this:

    enum FooFuture {
        AwaitPoint1 { x: usize, y: &'a usize }
    }

But y is a reference to x - which makes this type impossible to actually write using the rust programming language. Hence pin and all that. This is a very common pattern, but the rust lifetime syntax has no way to express a field that borrows from a sibling field.
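For contrast, here is a sketch of what you can write by hand in stable rust if you dodge the self-reference: store only `x` and re-create the borrow after resuming. This is an illustration under assumptions, not what the compiler emits. The `Start` and `Done` states, the `x + 1` result and the busy-polling `block_on` are all invented for the example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for the hidden state machine behind `foo`.
/// Only `x` is saved; `y = &x` is re-created inside `poll`.
enum FooFuture {
    Start,
    AwaitPoint1 { x: usize },
    Done,
}

impl Future for FooFuture {
    type Output = usize;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<usize> {
        // No self-references are stored, so the type is Unpin and get_mut is allowed.
        let this = self.get_mut();
        match *this {
            FooFuture::Start => {
                // Code before the await: create x, save it, then suspend once.
                *this = FooFuture::AwaitPoint1 { x: 5 };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            FooFuture::AwaitPoint1 { x } => {
                // Code after the await: x comes back out of the saved state.
                let y = &x;
                let result = *y + 1; // assumed stand-in for the elided body
                *this = FooFuture::Done;
                Poll::Ready(result)
            }
            FooFuture::Done => panic!("polled after completion"),
        }
    }
}

/// A waker that does nothing; enough for a loop that polls unconditionally.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    println!("{}", block_on(FooFuture::Start));
}
```

The trick only works because `y` can be rebuilt cheaply. When a borrow genuinely has to survive across the await point, you are back to needing the self-referential type.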

> That observation is one of the reasons I'm such a strong proponent of green threads. The other is a simpler mental model. Unlike async, you don't have to expose the inner mechanisms it depends on, like futures.

Fair. But as I said earlier in this thread, I like that the mechanism (futures) is exposed. I like that "async" is part of a function signature. I like that you need to be explicit about which functions yield, when, and where. I want programming languages to have more effect systems - for example, it would be great to have a nopanic effect. I just ... find it much easier to enjoy async in javascript.
