Hacker News new | ask | show | jobs
by jerf 2268 days ago
Is async/await a good idea for an OS kernel, even a toy one? Cooperative multitasking tends to break down at scale, because the probability that all the "threads" you're cooperating with are playing nice goes to zero as the number of threads increases. An OS will tend to have a concentrated number of the pathological cases in it as it deals with hardware and all the other hardest timing & concurrency problems.

It's a viable option for user-space programs because you can far more tightly characterize them and program them from top to bottom to work with that paradigm nice. Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

However, this post is a question, not a statement. If, for example, a Linux kernel developer posts "nah, it's no biggie, the Linux kernel is effectively structured the same", for instance, I would not quibble.

13 comments

There's a few different pieces to your question.

Async/await is a programming paradigm that moves the state management of an asynchronous program into the compiler rather than being explicitly managed by the user. The paradigm is almost universally a win: if you don't want to write asynchronous code, just don't use it; it doesn't impact your code at all.

A second question is how the language actually drives the execution of asynchronous tasks. Most of the languages that have added async/await have incorporated it into their execution model, so you are forced to use the particular semantics of the language. There are subtle, but very important, differences here: for example, when you actually provide the answer that resolves the await, do you execute the waiting code immediately, or schedule it for a future iteration? This is what can cause the most problems with async/await, but Rust does not define any executors [1], so it's not an issue here.

The final question is if asynchronous code at all makes sense for an OS kernel. I'd argue that the answer is very much yes: communication with devices are almost invariably asynchronous (you set some memory lines to tell the device to do something, and get an interrupt telling you it's arrived). Particularly for the filesystem, where there are both synchronous and asynchronous system calls for the same code, a good executor for asynchronous code could well be beneficial for simplifying code paths. (Note that filesystems also generally have some degree of task dependency requirements--you need to ensure that the various requests happen in a particular partial order).

Now do note that this requires a far more complex executor model than most async/await toolkits (including this tutorial) provide.

[1] Logically, it would make sense to provide a utility function that synchronously runs an async function, but this isn't implemented yet.

"Now do note that this requires a far more complex executor model than most async/await toolkits (including this tutorial) provide."

I'm assuming the async/await language support in Rust is intrinsically tied to a cooperative approach, which is the part I'm questioning. Obviously a kernel needs to be able to generically work in an asynchronous fashion... the question is, is this asynchronous fashion appropriate?

If the Rust async/await support is indeed so flexible that with some work you could tie it together with a pre-emptive runtime of your own custom creation (as would make sense for an OS kernel), or some intermediate support that may not be "pre-emptive" but would be fully immune by-construction to the sort of issues I'm worried about, then, yes, by all means use it.

But a fundamentally cooperative kernel would really worry me. You'd be looking at freezing the whole kernel, and likely the whole machine, if anything, anywhere, goes wrong with the cooperation. It won't take a lot of scale before that's a problem.

And that's the worst case. Kernels strike me as very likely to also have to deal with the starvation situations that cooperative multitasking can have, where one "task" never gives up control. Even if this doesn't freeze the machine, this can cause significant, human-visible latency issues to emerge.

The Rust async/await amounts to the following:

* Magic keywords (i.e., async and await) to manage the state machine aspect of the function for you.

* A standard interface for describing an asynchronous task (core::future::Future). The magic keywords create an implementation of this interface.

* An interface for saying that task progress can be made (core::task::Waker).

That's it. There is no runtime provided, not even optionally, in the standard library (let alone core). A fair amount of this post is giving instructions for how to build the runtime to execute asynchronous tasks.

One important thing to note is that Rust uses a polling-based approach to implement tasks. That is, the core interface to a task is to run it and have it return the result or indicate that is waiting on something else. In the case of the latter, there is a callback mechanism (the Waker object) that the task is responsible for calling when it has more information to make progress--and the caller is responsible for calling the task again when that happens. Many (most?) other implementations instead use a resolution model, where the caller provides a function that is called with the result, and when the function gets called (particularly in the case of tasks that can execute quickly) differs from implementation to implementation. Rust's model definitely maps much more easily to a kernel, since the dominant pattern will be to drain a buffer, poke a device, and then wait for an interrupt to happen (the interrupt handler calling the wake function).

> Rust's model definitely maps much more easily to a kernel, since the dominant pattern will be to drain a buffer, poke a device, and then wait for an interrupt to happen (the interrupt handler calling the wake function).

What is nice about Rusts model is that it prevents the continuation from running inline the event handler which signaled the completion. That avoids a ton of reentrancy issues and questions around "where does my code actually run"? That indeed makes it also interesting to use for interrupt handlers, since it is guaranteed that the code which waits for the interrupt to happen runs purely in the executors thread and will not accidentally take over the interrupt handler.

At a high level, your concern is that OSes should not use cooperative multitasking for OS Processes. i.e. using Rust's async-await as the only technology within the OS Scheduler would be a mistake.

The article isn't discussing schedulers, but async-await within the OS codebase in general. Forgetting the scheduler, for example, Interrupt handling and driver code is generally async in nature, and at a base level needs to be cooperative.

    while let Some(byte) = keyboard.interrupt().await() {
        ...
    }
In general, certain drivers also need pre-emption, however, given all of the code is written and curated by the kernel devs, they can ensure the code cooperates well.
> I'm assuming the async/await language support in Rust is intrinsically tied to a cooperative approach, which is the part I'm questioning.

That assumption is incorrect. The blog post explains this in detail better than anyone could here - and if that does not suffice, the Rust book and the async book, as well as the Fuchsia kernel documentation, also all go into this.

> If the Rust async/await support is indeed so flexible

It is indeed even more flexible than that. You don't even need a run-time to drive progress. You can manually drive progress of async tasks by polling them throughout your code at specific points if you wanted - essentially interleaving the async processing of tasks with your main thread's task (and well, you can build any sort of executor you want, with different priority levels or whatever you feel like doing). I have an HPC app that does this to drive MPI non-blocking communication.

The blog post explains all of this. It's pointless for you to invest a lot of your and others time into speculations built on top of incorrect assumptions, when the first 2 minutes of actually reading the blog post clarify this for you.

> But a fundamentally cooperative kernel would really worry me. You'd be looking at freezing the whole kernel, and likely the whole machine, if anything, anywhere, goes wrong with the cooperation. It won't take a lot of scale before that's a problem.

This is already the case, isn’t it? If you do a ‘while(!operation_complete) {}’ anywhere in the kernel and it never happens then I’d expect it’s pretty much done.

Whereas with async/await it’d be more likely that the coroutine would simply never resume. It might wind up being a memory leak, yeah, but not freeze your system.

The parallel case with async functions should be a long running computation that does not contain any yield point, so that it cannot be suspended easily. I remember that it is possible to construct (contrived) examples of this also in Erlang.
> If the Rust async/await support is indeed so flexible that with some work you could tie it together with a pre-emptive runtime of your own custom creation

I think that's exactly what it is. AIUI it's basically an API for async/await (and maybe a default runtime?) that different back-ends can be plugged into. At least that's what I've come away with over the months of reading random bits about it here.

As described in the blog post, the rust compiler generates a state machine for every async function and generates the appropriate poll methods for the state transition. This is fundamentally at odds with the preemption, which would then have to indroduce new intermediate states into the diagram which it won't be able to do at runtime.
But if Rust is the operating system/kernel, whenever it decides to to schedule something is preemption for anything downstream, right?

I mean, you don't actually use preemption in the kernel right? Don't you have to handle all that yourself, since there's nothing higher level than you to handle it for you? In that respect, doesn't plugging in a Futures runtime that looks for device flags and stuff as appropriate and flags tasks to wake up/finish accomplish the same thing? (those are actual questions, not leading statements)

If you would write a basic scheduler, at some point you'd have to await the userspace code but you wouldn't have any way to force it to stop running. If the userspace code would enter an infinite loop it would hold the kernel thread forever. Within a constrained environment, eg. the kernel itself (and even that's sufficiently complex with loadable drivers that you might end up with bad interactions) I could see some use for async await, but you'd still need to be able to preempt untrusted code.
> Is async/await a good idea for an OS kernel, even a toy one? Cooperative multitasking tends to break down at scale,

> Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

In my reading of the post, the proposal is to do cooperative multitasking within the kernel, and preemptive multitasking between the kernel and the userspace. I think this is tenable, within the kernel, you pretty much need to play nice with the rest of the kernel anyway. Most kernels don't have effective protection for one part of the kernel causing problems with the rest; although, some do have limited protection against drivers that crash or loop.

well.. there is CONFIG_PREEMPT and the even broader CONFIG_PREEMPT_RT in linux [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

> Is async/await a good idea for an OS kernel, even a toy one?

You might want to look at Joe Duffy's posts about Microsoft's Midori OS. In particular the post "Asynchronous Everything":

http://joeduffyblog.com/2015/11/03/blogging-about-midori/

Their findings sound pretty compelling to me. Personally I'm convinced that eventually we'll see much more system-level use of async/await style mechanisms. Perhaps with Rust, perhaps with C++ once coroutines land in C++20.

You can already use C++ coroutines with C++/Winrt, if feeling like having a go at them.
An OS has many aspects to it. Scheduling tasks is only one of the aspects. Async/await is just the interface/mechanism in dealing with the tasks, in this case cooperative tasks.

Interacting with hardware, dealing with interrupts, memory mapping/managing, task isolation, etc are all other aspects of an OS that are apart from task scheduling but still needed.

Cooperative multitasking works fine as long as the users/developers understand the limitation.

"Cooperative multitasking works fine as long as the users/developers understand the limitation."

The problem is, we basically already know that is not the case. We've got years of experience with what "cooperatively multitasked" OS looks like, and it was not a "works fine" situation. You can't understand the limitations at scale as they interact with each other far, far beyond the human mind's capacity to understand.

All I/O is asynchronous. Or rather, synchronous I/O is a subset of asynchronous. Often, in kernel I/O systems, you are communicating with, say, a disk controller that will take some time to process your request. You will want to do other stuff while the device finishes and process the results only when the device signals success or failure.
Async and sync I/O are not subsets of each other, although you can use one to implement the other.

The issue OP is discussing is cooperative vs. preemptive, not async vs. sync.

> Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

Pretty much my opinion. I think a preemptive threaded environment where some threads might run multiple async tasks (in order to boost efficiency) can make a lot of sense.

Pure cooperative environments seem too error-prone for bigger systems, because they lack the necessary isolation between tasks. Any bad task can degenerate the performance of all others tasks by blocking the execution thread.

True for a general-purpose OS kernel. But in an embedded system or unikernel it could make a lot of sense.
Also in a microkernel architecture possibly; if the scheduling were cooperative within each process and preemptive between processes that would look like it could make everyone here happy.
Coopertively scheduling across your other isolation boundaries is a pain point, but it's a great architecture for within a single context (ie. within the kernel, or within a single process).

Linux has a lot of cruft and isn't as async as some would like, but NT is pretty async underneath it all (although they still allow you to do syncronous work as well).

I'm not a Linux kernel developer, but it does have a few similar mechanisms AFAIU. If a driver needs to perform more work in response to an interrupt than would fit into the actual handler, it can use cooperatively scheduled "tasklets" or "softirqs."
Probably not at all related to the OP. But I found the thought entertaining, so offering it as such.

One way to address you concern would be to guarantee that task always terminate in a bounded amount of instructions.

One way to solve that would be offer a limited buffer in which the program instructions for the task might be expressed, and a language to express them in that is strongly normalizing (think something like Morte)

The Linux kernel makes a somewhat similar distinction between code that's executing in an atomic context and code that isn't. Atomic code can get stuck in an infinite loop or whatever, but what it isn't allowed to do is to go to sleep (i.e. put itself into a sleeping state and invoke the scheduler).

Atomic tasks are generally expected to do a finite amount of work that can be finished in a short time without waiting for anything else (except maybe a spinlock, which should be held by some other similarly-atomic task that will finish in a short time).

The seL4 microkernel, written in C, has formal proofs of the maximum clock cycles for all of its syscalls. That's one way they manage to not need in-kernel preemption.
Yes, Active Oberon already had it in the mid 90's.

Symbian also had it, both called them active objects.

>Cooperative multitasking tends to break down at scale, because the probability that all the "threads" you're cooperating with are playing nice goes to zero as the number of threads increases.

That's completely wrong. Either cooperative multitasking works or it doesn't. There is no probability involved. What often happens is that foreign code that is not under your control is not yielding. It only takes a single non cooperating task to block an entire core. If you have properly written code the probability of a malfunction is 0% and this probability doesn't grow with the number of threads.

you can have a kernel thread pool dedicated to one kernel feature. This way only task created by this kernel feature will run on this pool and compete together .