Hacker News new | ask | show | jobs
by zackmorris 2271 days ago
I've shied away from async/await because I haven't seen a good writeup on how to make it deterministic. Come to think of it, all of the times I've encountered it, there was no way to analyze the codepaths and prove that every exceptional situation and failure mode was handled. Maybe I missed something along the way?

So my feeling about it is that it may turn out to be an evolutionary dead end, or anti-pattern at the least. My gut feeling is that async/await is functionally equivalent to the the synchronous/blocking/coroutine/channel system of other languages like Go. Could we write a thin wrapper that converts async/await to that or vice versa?

This is the primary reason why I've stuck to synchronous/blocking PHP with all of its flaws instead of Node.js. I think this is a fundamental thing that shouldn't be glossed over and accepted so readily into other languages.

7 comments

What do you mean by "deterministic" here?

> Come to think of it, all of the times I've encountered it, there was no way to analyze the codepaths and prove that every exceptional situation and failure mode was handled.

There is nothing special about async/await with regards to this in Rust, at least if I'm understanding you correctly. Async functions can return Results like any other function for recoverable errors, and can panic like any other function for un-recoverable errors.

> My gut feeling is that async/await is functionally equivalent to the the synchronous/blocking/coroutine/channel system of other languages like Go.

It depends on what level of abstraction you're talking about. For Rust, which cares a lot about details, they're not. I gave a talk comparing and contrasting all this stuff here: https://www.infoq.com/presentations/rust-2019/

Well, I'm coming from the Javascript side, where people use promises a lot for async, but it's almost impossible to trace execution in the debugger. It's usually immediately obvious to me that many exceptions and failure modes have been missed, but I find it difficult to reason about large promise chains and I usually have no idea where I would add more error handling. It tends towards a big ball of spaghetti.

Then if something does fail (wifi blips out without a retry, who knows) the page just hangs with no indication of what went wrong.

Contrast this with Unix-style synchronous blocking code piping data between threads with no shared state. Since every step of execution happens in a single thread, blocking until a pipe returns data, it's trivial to step through the code.

Async/await really becomes a problem in Javascript though because they made it a language keyword in Node.js. So you never know if you are dealing with a sync function or async function. Eventually the entire program has async in front of everything and ends up being written just exactly like sync blocking (but with superfluous syntactic sugar everywhere). So I question what was really gained there. IMHO it just doubles the workload in the mind since now we have to juggle both types of functions and explore those permutations. That's the last thing we need when we're trying to brainstorm a solution from a large possible search space.

I'm also looking at this from a one-shot functional programming perspective. I realize that sync blocking code blocks the main thread, which is usually the GUI rendering loop. There are ways around that though. The best solution I've seen so far is Apple's Grand Central Dispatch:

https://en.wikipedia.org/wiki/Grand_Central_Dispatch

https://www.raywenderlich.com/5370-grand-central-dispatch-tu...

https://www.swiftbysundell.com/articles/a-deep-dive-into-gra...

Basically it works by being sync blocking and providing simple ways to run closures on other threads. I find it much easier to debug than async/await though. Rust probably already has all of that functionality, so I don't understand the advantage of copying Javascript model with all of its caveats. I think what's happening is that people just want to run with the context they're already familiar with.

Ah, so, async/await, while conceptually similar, isn't the same across languages. Rust's semantics here are very different than JS's. I think Rust's implementation would address all of your issues, at least conceptually.

> I don't understand the advantage of copying Javascript model with all of its caveats. I think what's happening is that people just want to run with the context they're already familiar with.

With respect, I think it's because you don't understand it in Rust yet. It was not directly copied from JS, and was certainly not due to familiarity. Read the post! It's very good :)

There are two important things that are often conflated with async/await: the programming model, and the executor model.

The programming model of async/await is essentially syntactic sugar for a state machine, where each of the states are determined implicitly by the site of the call to await, and the ancillary storage of the state machine is managed implicitly by the compiler. It's basically no different from a generator pattern (and is usually implemented on top of the same infrastructure) [1].

Since the result of calling an async function is a task object that doesn't do anything until you call methods on it (including implicitly via await), most languages integrate these task objects with their event model. This model is completely orthogonal to the feature, and often exists (at least in the library, if not the language) before async/await is added to the language. JS has a built-in event loop that predates async/await by several years, whether programming in client-side browsers or using something like Node.js. Rust is unusual in that it prescribes absolutely nothing for the executor model; all it defines in its standard library is the interface to drive the tasks manually.

I doubt async/await is an evolutionary dead end. It's the third kind of coroutine to hit it big in mainstream languages, the first two being the zero-cost exception handling mechanism (i.e., try/catch) and the generator (i.e. yield). The biggest criticism of zero-cost exceptions essentially amounts to their types not being made explicit, but the design of async/await makes their effects on the type of coroutine quite apparent, so it's not going to run into the same problems.

[1] There is a difference in the pattern, which is that you're probably going to get back different kinds of objects implementing different interfaces, which is a bigger deal if you manually drive the interfaces. Structurally--that is, in terms of the actual machine code running--there is likely to be no difference.

Thank you, that was very insightful. I've felt that way about generators, but it never occurred to me that try/catch could be viewed similarly to a coroutine (because it's basically using goto to jump to alternate states too).

Edit: I forgot to mention that I came to your same conclusion a few years ago about async/await being equivalent to a state machine. It happened a different way for me, when I realized that a coroutine performs the same computation as a state machine, but is much easier to trace. I was working on a game, and made a rather large state machine with perhaps 20-50 states and some side effects where those states interacted with game state. But it simplified down to just a few lines of ordinary flow control code when I tried it as coroutines with message passing. So to me, async/await carries the burden of state machine complexity but loses the simplicity of coroutine (and sync/blocking obviously) traceability. So I just don't see a compelling reason to use it.

async/await-based code is deterministic until you deliberately introduce the ability for tasks to race. Look at Noether for a language design that explicitly makes those steps. Whereas if you have channels (and don't have some kind of single ownership) then you have nondeterministic races right from the start. So I rather doubt that any such thin wrapper could be formed.

Certainly as a human reader, Future-based code (which async/await is sugar for) is the easiest form of concurrency/parallelism to reason about: yes you have small tasks that might execute in parallel, but all your functions still return values, function composition still works the way you'd expect, there's no way for parallel tasks to silently interfere with each other in some hidden way. The control flow is still what it looks like (unlike with general coroutines). Of course if you can afford to avoid parallelism entirely you probably should, but async/await is the least intrusive/most understandable way to manage it if you do need it.

A state machine's control flow is even easier to follow, although more verbose.
I disagree, I find state machines almost impossible to reason about. Has it already been through state X? Who knows. Will it come back to this state in the future? Maybe. Is there a route from state Y to state Z? Shrug. State machines are effectively goto writ large, and there's a reason we switched to structured programming.
Well-designed deterministic state machines are incredibly easy to reason about. You simply render them as a graph and at a glance you can understand every possible transition and state.

> Will it come back to this state in the future?

If there's a path from the current state back to itself and it gets the correct inputs, yes. Otherwise no.

> Is there a route from state Y to state Z?

A simple glance at the graph will suffice to answer this.

Of course poorly written code, regardless of the constructs, will be hard to reason about.

> You simply render them as a graph and at a glance you can understand every possible transition and state.

But programming by editing graphs never caught on. So the graph is always going to be a secondary representation of your program; it won't be the natural view in your debugger or profiler (if it's even visible at all).

And more importantly, no-one's found a really compositional approach to programming with graphs yet. In conventional programming you can define an interface that exposes one function, and maybe you have an incredibly complex implementation behind that interface, but you can reason about the rest of your program without having to look inside that black box, even as the implementation changes drastically. I've never seen the same approach applied to state machines: maybe you know that this cluster of states is independent now, but anyone can add a new state transition that changes the global control flow willy-nilly.

I wish I could upvote you 100. State machines are fine up to some number of states, on the order of 10. Beyond that, they become incredibly difficult to reason about in real programs because each state may have side effects with program state.

What ends up happening is that exceptional situations happen and the state machine becomes muddled with special cases. So for example, say you're writing a networked game of Tetris or Pacman, something with a relatively small state machine. You get all done but realize that you forgot to handle people joining or leaving the game, or the game closing or jumping ahead when a remote player beats the level and the game needs to go back to the title screen. The state machine starts getting all of these checks interspersed with the game logic.

Typically this doesn't happen with sync/blocking code though. You write an ordinary main loop, adding checks for these exceptional situations just like any other event you watch for. In my experience, the sync/blocking code is roughly an order of magnitude smaller and easier to reason about than a state machine. But more importantly, it can be scaled infinitely, because you can always add tracing or step through it in a debugger as you normally would. But a large state machine is just exactly what you described: a large switch command of gotos. You can't just look at it and know where the code came from or what conditions got it there. You have to derive that history by hand in your head.

And since async/await is more conceptually equivalent to a state machine than a coroutine or sync/blocking code, that severely limits how well we can reason about large promise chains. See my comment to jcranmer on this thread for a bit more insight about the internals of this:

https://news.ycombinator.com/item?id=22731043

> And since async/await is more conceptually equivalent to a state machine than a coroutine or sync/blocking code, that severely limits how well we can reason about large promise chains.

I have to disagree with that. You can implement async/await using state machines, just as you can implement a loop using goto, but it's a more constrained model that's much easier to reason about than a fully general state machine or a fully general coroutine. The control flow still proceeds the way you expect - you still execute code from top to bottom, you still return once from every function call - you just yield at some points in it. You're not interleaving your control flow with some other function that you communicate with (like a generator), and you're certainly not treating suspended control flow as something to be copied or passed around (like a coroutine). It's a much simpler concept.

What do you think about hierarchical state machine [1] ?

>Has it already been through state X?

Like everything else, logging

>Will it come back to this state in the future?

Mentioned in the comment before, there is a transition graph that you can do static analysis on, note that you might run into a halting problem

>Is there a route from state Y to state Z?

There is a transition graph

>State machines are effectively goto writ large...

quoting Prof Carl Hewitt of Actor Model, "goto is harmless"[2], function/procedure call is a goto, much like sending an named event while in particular state with/out parameters

[1] https://en.wikipedia.org/wiki/UML_state_machine

[2] https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=34m54s

note: edited for formatting

> What do you think about hierarchical state machine [1] ?

Would need to see what the implementation actually looked like and how it was worked on. The graphs may look nice, but I've never seen people program effectively by editing graphs.

> Mentioned in the comment before, there is a transition graph that you can do static analysis on, note that you might run into a halting problem

Sure, but the graph seems to be very much secondary. You don't step through the graph in a debugger, for example.

> quoting Prof Carl Hewitt of Actor Model, "goto is harmless"[2]

Not at all convinced, and he fails to make any real case for it. I've worked on actor systems, and they're bad in the same way that unstructured code is bad.

> function/procedure call is a goto,

It's not, because you still have the call stack. You know where you came from and why (and can see it in the debugger), and you know you're going back there. You can reason compositionally, because in g(f(x)) f is a black box and always will be even if it gets refactored, whereas there's no equivalent "locality" in a state machine.

> much like sending an named event while in particular state with/out parameters

That also sounds like a bad way of doing things.

Interesting points, thanks

> Would need to see what the implementation actually looked like and how it was worked on

Examples are but not limited to Simulink, Rhapsody, Unreal Blueprint, QT SCXML

>The graphs may look nice, but I've never seen people program effectively by editing graphs

graph mentioned is a state transition graph, not necessarily a visual graph, although its a nice eventuality

> You don't step through the graph in a debugger, for example.

Yes you do, if using the mentioned example

Combining a state machine with actor model has been a work wonder for me. I have to agree with you on locality in pure state machine, but if a state machine is defined as an Actor you get locality by definition. Its more of a deterministic abstract modelling, testing and execution

What do you mean by functionally equivalent?

Channels are just queues and passing methods. Rust style futures are closer to continuations. Both are powerful and can accomplish neat taska, but not ABI compatible or even used the same way.

> Channels are just queues and passing methods.

My understanding is channels in `go` are 0 sized by default- and thus turn into "rendezvous channels" that can be used to synchronize progress between threads.

This article compares Go and Rust threads and includes an example of how go uses channel's to synchronize. [0]

[0] - https://medium.com/@deckarep/paradigms-of-rust-for-the-go-de...

Async or parallel systems are deterministic when the multiple paths don't cross, and when crossed the crossing points have well defined deterministic steps and interaction.
This is a huge issue when trying to fuzz async code also.
Dropbox does, essentially, fuzzing of Rust async code and it is determistic.

I couldn't find a post about it, sadly, but I know the information is out there somewhere.

source? Can't find anything on that either
https://dropbox.tech/infrastructure/rewriting-the-heart-of-o...

> The Control thread is designed to be entirely deterministic when its inputs and scheduling decisions are fixed. We use this property to fuzz it with pseudorandom simulation testing. With a seed for our random number generator, we can generate random initial filesystem state, schedules, and system perturbations and let the engine run to completion. Then, if we fail any of our sync correctness checks, we can always reproduce the bug from the original seed. We run millions of scenarios every day in our testing infrastructure.

I am no expert and I wish there were a start from 0 guide with examples not involving external crates but my sense is that you can write your own executor and do whatever you want. (Maybe I'm wrong?)
You're not wrong: that is basically what this article is, and it includes writing an executor!
The linked article finally helped me understand how Rust implements async. Give it a shot.