by nemanjaboric 3234 days ago
Yes, the trick is that the disk device is considered "fast", so it's not meaningfully selectable: it's always reported as ready to read/write, which is a lie. I _feel_ (and think; I don't have any data to prove this) that the main problem here is the file system layer, which may or may not need to block _after_ the operation is performed, a complexity that doesn't occur with sockets/pipes.
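A quick way to see the "always ready" behaviour for yourself is to poll a regular file and an empty pipe side by side. This is only a minimal sketch, assuming a Unix-like system where Python's `select.poll` is available; `ready_events` and `demo` are names made up for the illustration.

```python
import os
import select
import tempfile


def ready_events(fd, wanted):
    """Return the poll() event mask reported for fd right now (0 if not ready)."""
    poller = select.poll()
    poller.register(fd, wanted)
    events = poller.poll(0)  # timeout of 0 ms: do not wait at all
    return events[0][1] if events else 0


def demo():
    # Regular file: empty and opened with O_NONBLOCK, yet poll() still
    # reports it as ready for both reading and writing.
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDWR | os.O_NONBLOCK)
        try:
            file_mask = ready_events(fd, select.POLLIN | select.POLLOUT)
        finally:
            os.close(fd)

    # Pipe with nothing written to it: the read end is genuinely not ready.
    read_end, write_end = os.pipe()
    try:
        pipe_mask = ready_events(read_end, select.POLLIN)
    finally:
        os.close(read_end)
        os.close(write_end)

    return file_mask, pipe_mask


if __name__ == "__main__":
    file_mask, pipe_mask = demo()
    print("regular file readable:", bool(file_mask & select.POLLIN))
    print("regular file writable:", bool(file_mask & select.POLLOUT))
    print("empty pipe readable:  ", bool(pipe_mask & select.POLLIN))
```

On Linux the file is reported readable and writable while the empty pipe is not, and O_NONBLOCK makes no difference for the file. epoll goes one step further and refuses to register a regular file at all (EPERM).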

Having an always-ready state on regular files is a problem because Linux's non-blocking I/O is built around readiness. On Windows (and, I believe, on Solaris/FreeBSD), a completion model is in place: you schedule an operation, the kernel does _everything_, and you're resumed upon completion of the I/O.
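To make the shape of a completion-style interface concrete, here is a rough sketch that emulates it in user space with a thread pool. This is not how any kernel implements it; it only shows the "submit the whole operation, get resumed with the result" flow, in contrast to "wait for readiness, then do the read yourself". `submit_read` and `demo_completion` are hypothetical names, and `os.pread` is assumed to be available (Unix).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def submit_read(pool, fd, length, offset):
    """Completion style: hand the entire read to a worker, get a future back."""
    return pool.submit(os.pread, fd, length, offset)


def demo_completion():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello, completion model")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            with ThreadPoolExecutor(max_workers=1) as pool:
                future = submit_read(pool, fd, 5, 0)
                # The caller is free to do other work here; the blocking
                # read happens on the worker thread, not on this one.
                return future.result()  # "resumed upon completion"
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo_completion())
```

The caller never asks "is this file ready?"; it only learns that the operation has finished, which is why the always-ready lie stops mattering in this model.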


The most general layer in Linux (VFS) doesn't really support asynchronicity, so changing this would require a bunch of changes to every in-tree and out-of-tree FS, which is not viable. However, the usual suspects (ext, xfs) actually use a bunch of other APIs as well, where the FS itself is often not involved in simple stuff like a read(2). In theory this would allow async IO on files in many cases, and I believe there is ongoing work in this direction. For now, though, only raw / O_DIRECT is async.