by Asooka 2771 days ago
Correct me if I'm wrong, but doesn't this suffer from a race condition? What if a process requests an awake, but then gets preempted before doing the blocking system call, and then gets awakened in response to its awake request (rather than because of normal scheduling)? Its awake request was already serviced, so if it performs a blocking syscall, it will wait indefinitely.

Alternatively, if the awake request is only done when a blocking syscall is done, doesn't it then suffer from the problem that a random buggy library function could request an awake without then doing a blocking syscall (due to whatever logic bug), so then when the process does a blocking syscall that it expects to block indefinitely, it instead gets a syscall with a timeout?

Wouldn't it be better for the awake syscall to take another syscall as a parameter (pretty simple to do in assembly and should be provided as a C library wrapper), in order to guarantee atomicity?

2 comments

> Wouldn't it be better for the awake syscall to take another syscall as a parameter (pretty simple to do in assembly and should be provided as a C library wrapper), in order to guarantee atomicity?

Plus in this case the awake call could be named something more intuitive (like syscall_with_timeout or whatever).

> Plus in this case the awake call could be named something more intuitive

This is an interesting objection.

I find awake/awakened/forgivewkp intuitive names, but I'm not a native English speaker.

I'm not going to add the syscall parameter (I considered and discarded that option during the analysis), but I welcome suggestions for a better name.

Awake in itself is intuitive in some contexts, but it doesn't seem to describe the semantics you want in this case. First of all, it's not obvious that it's related to syscalls. Secondly, it doesn't really mean the process is guaranteed to awake after the specified time: if the syscall doesn't block or finishes faster, the process might well stay sleeping at the alleged awaking time. Someone who doesn't know all the details will easily get the wrong idea.

Jehanne hacker here.

> doesn't this suffer from a race condition?

This is a good question that I should probably clarify in the article, as it has been asked before, but I can't answer in that forum (see https://lobste.rs/s/fqilcv/simplicity_awakes#c_8pvo0s).

To prevent race conditions, the wakeup can occur only during a blocking system call (and not even all of them: some cannot be interrupted, to avoid unintuitive side effects).

> it then suffer from the problem that a random buggy library function could request an awake without then doing a blocking syscall (due to whatever logic bug), so then when the process does a blocking syscall that it expects to block indefinitely, it instead gets a syscall with a timeout?

This is by design.

The awake idiom described in the article is pretty simple: if you book a time slice you must release it if it didn't expire.

The operating system cannot prevent userspace bugs.
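
A minimal sketch of the behaviour described in the replies above, written as a toy single-process model rather than real Jehanne code. Everything here is an assumption made for illustration: the `sim_*` functions, the `SimProc` struct, the logical clock, and the choice to use the expiry time as the wakeup id. It also assumes that a wakeup which expires while the process is preempted stays pending and interrupts the next blocking call on entry, which is one reading of "the wakeup can occur only during a blocking system call".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one process. Hypothetical: this is not Jehanne kernel code. */
typedef struct {
    long now;       /* logical clock, in ms */
    long wakeup_at; /* expiry time of the booked wakeup, 0 when none is booked */
    long fired;     /* id of the last wakeup that interrupted a blocking call */
} SimProc;

/* Models awake(ms): book a wakeup and return its id (here, its expiry time). */
long sim_awake(SimProc *p, long ms)
{
    assert(p != NULL && ms > 0);
    p->wakeup_at = p->now + ms;
    return p->wakeup_at;
}

/* Models awakened(id): did this wakeup interrupt a blocking call? */
bool sim_awakened(const SimProc *p, long id)
{
    return p->fired == id;
}

/* Models forgivewkp(id): release a wakeup that is still booked. */
bool sim_forgivewkp(SimProc *p, long id)
{
    if (p->wakeup_at != id)
        return false;
    p->wakeup_at = 0;
    return true;
}

/* Time spent preempted or computing: a booked wakeup is never consumed here. */
void sim_run_userspace(SimProc *p, long ms)
{
    p->now += ms;
}

/* A blocking call that needs needs_ms to complete.
 * Returns true on completion, false if the booked wakeup interrupted it. */
bool sim_blocking_call(SimProc *p, long needs_ms)
{
    long done_at = p->now + needs_ms;
    if (p->wakeup_at != 0 && p->wakeup_at <= done_at) {
        if (p->wakeup_at > p->now)
            p->now = p->wakeup_at; /* sleep until the wakeup expires */
        p->fired = p->wakeup_at;
        p->wakeup_at = 0;
        return false;
    }
    p->now = done_at;
    return true;
}

/* The idiom: book a wakeup, block, then release the wakeup unless it fired. */
bool sim_call_with_timeout(SimProc *p, long timeout_ms, long needs_ms)
{
    long id = sim_awake(p, timeout_ms);
    bool completed = sim_blocking_call(p, needs_ms);
    if (!sim_awakened(p, id))
        sim_forgivewkp(p, id);
    return completed;
}

/* The userspace bug from the question: a wakeup is booked and never released. */
void sim_buggy_library(SimProc *p)
{
    sim_awake(p, 50);
}
```

In this model, a process that books a 100 ms wakeup and is then preempted for 250 ms still has its next blocking call interrupted, so it does not wait indefinitely. Calling `sim_buggy_library` and then a blocking call that was meant to wait forever shows the other case from the question: the leaked wakeup turns that call into one with a timeout.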

> Wouldn't it be better for the awake syscall to take another syscall as a parameter

This is an option I discarded during the analysis.

It's a matter of trade-offs: an additional argument would increase the complexity a lot. In particular, you would need to maintain a map of syscall->wakeups in userspace if you want to be able to `forgivewkp` the right one. And, on successful completion of a sequence of syscalls, you would have to `forgivewkp` every unexpired wakeup in that map.

Thus a single additional parameter would greatly increase the complexity of both the kernel implementation and the user space code, making several bugs harder to reproduce.
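
To make the cost concrete, the userspace bookkeeping this reply describes might look roughly like the sketch below. It is purely hypothetical: `WakeupMap`, the `wkmap_*` functions, the fixed table size, and the integer syscall numbers are all invented for illustration, and `fake_forgivewkp` only counts calls in place of the real `forgivewkp`.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_TRACKED = 16 }; /* arbitrary capacity for the sketch */

typedef struct {
    int  syscall_no; /* which syscall the wakeup was attached to */
    long wakeup;     /* wakeup id obtained when it was booked */
    int  in_use;     /* nonzero while the entry holds a live wakeup */
} WakeupEntry;

typedef struct {
    WakeupEntry entries[MAX_TRACKED];
} WakeupMap;

/* Remember that `wakeup` guards `syscall_no`.
 * Returns 0 on success, -1 if the table is full. */
int wkmap_record(WakeupMap *m, int syscall_no, long wakeup)
{
    assert(m != NULL);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (!m->entries[i].in_use) {
            m->entries[i].syscall_no = syscall_no;
            m->entries[i].wakeup = wakeup;
            m->entries[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Find and remove the wakeup attached to `syscall_no`.
 * Returns its id, or -1 if none is recorded. */
long wkmap_take(WakeupMap *m, int syscall_no)
{
    assert(m != NULL);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (m->entries[i].in_use && m->entries[i].syscall_no == syscall_no) {
            m->entries[i].in_use = 0;
            return m->entries[i].wakeup;
        }
    }
    return -1;
}

/* After a sequence of syscalls succeeds, release every wakeup still recorded.
 * `forgive` stands in for forgivewkp. Returns how many were released. */
int wkmap_forgive_all(WakeupMap *m, void (*forgive)(long wakeup))
{
    assert(m != NULL && forgive != NULL);
    int released = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (m->entries[i].in_use) {
            forgive(m->entries[i].wakeup);
            m->entries[i].in_use = 0;
            released++;
        }
    }
    return released;
}

/* Stand-in for forgivewkp that only counts how often it was called. */
static int fake_forgiven = 0;

void fake_forgivewkp(long wakeup)
{
    (void)wakeup;
    fake_forgiven++;
}
```

Even this small table has to deal with capacity, lookup, and cleanup on every code path, which is the kind of extra user space state the reply argues against.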