Hacker News new | ask | show | jobs
by jesboat 1331 days ago
>> If you really think about it, the only real difference between main memory blocking and disk blocking is the amount of time they may block. > > This is a somewhat confusing analysis you have here. Direct read/write from memory for all intents and purposes doesn't block. Why do you say that reads and writes may also block?

Reads and writes from actual, physical, hardware memory might block, depending on how you define "block", in the sense that some reads may miss CPU cache. But once you get to that point, you could argue that every branch might block if the branch misprediction causes a pipeline stall. This is not a useful definition of "block".

The thing is, most programs are almost never low-level enough to be dealing with memory in that sense: they read and write virtual memory. And virtual memory can block for any number of reasons, including some pretty non-obvious ones like. For example:

- the system is under memory pressure and that page is no longer in RAM because it got written to a swap file

- the system is under memory pressure and that page is no longer in RAM because it was a read-only mapping from a file and could be purged

-- e.g. it's part of your executable's code

- this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now

- you're in a VM and the VMM can do whatever it wants

- the page is COW from another process

1 comments

> This is not a useful definition of "block".

I think what I'm saying is that calling file I/O "blocking" is also not a useful definition of "block". Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".

> this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now

And said allocation could block on all sorts of things you might not expect. Once upon a time I helped debug a problem where memory allocation would block waiting for the XFS filesystem driver to flush dirty inodes to disk. Our system generated lots of dirty inodes, and we were seeing programs randomly hang on allocation for minutes at a time.

> I think what I'm saying is that calling file I/O "blocking" is also not a useful definition of "block". Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".

In addition to the point elsewhere made that you're sort of implicitly denying the magnitude of the differences here - the latency differences are on the order of 1000s.

The other way of separating is if the OS (or some kind of software trap handler more generally) has to get involved. A main memory read to a non-faulting address doesn't involve the OS - ie it doesn't ever block. However faulting reads, calls to "disk" IO, and networking IO (ie just I/O in general) involving the OS/monitor/what have you are all potentially blocking operations.

It does not matter whether the OS is involved. Consider a spinlock; if it is spinning, waiting on the lock to be released, then it is blocking.

What matters is whether control returns to the process before the operation is complete. If the process waits, it is blocking (aka synchronous); if the process does not wait, it is non-blocking (and possibly also asynchronous if it checks later to see if the operation succeeded).

> Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".

The difference, conservatively, is a factor of 1000.

There are plenty of times in software engineering where scaling 1000x will force you to reconsider your architecture.

Sure, fair enough.

To be clear I do not believe that async disk I/O is never useful, I just think that it's not as useful as people at first imagine when they learn about async I/O.

Yes, it may be 1000 times slower than memory. But there's a fundamental paradigm difference from network events, in that with network events you are waiting for some other entity to take action, with no implicit expectation that they will do so in any particular timeframe. Like, if you're waiting for connections on a listen socket, there's no telling how long you will be waiting.

Disk I/O is fundamentally different in that once you submit an operation, you expect it to complete within a reasonable, finite time period.

Async disk I/O is primarily useful for implementing read-ahead / write-behind scheduling behaviors. While databases tend to be the obvious use case, the OS is often so poor at this that there are large performance improvements even for much simpler use cases that are otherwise disk I/O intensive.
I'm not sure that's the primary use case any more. Fast SSDs require high queue depths to use their full throughput, so async IO is desirable to use any time an application knows it has several IO requests to issue in parallel—one thread per request has too much overhead.
Sure, but that behavior is effectively read-ahead / write-behind on your I/O buffers. That doesn't mean much more than anticipating future I/O operations before completion of that I/O is required by the code for efficient forward progress.