by sedatk 2083 days ago
I see. Then it makes sense. Thanks.

Assuming NVMe queuing works like SATA or SCSI queuing (which I believe it does), queue entries are basically unordered [1]; the device is free to process them in any order. If you (as in, a person implementing a block layer or file system in an OS kernel, or some fancy kernel-bypass stuff) want requests A and B to be ordered before request C, then you must do something like:

1. Issue A and B.

2. Wait for A and B to complete.

3. Issue a FLUSH operation (to ensure that A and B are written from the drive cache to persistent storage), and wait for it to complete.

4. Issue C with the FUA (Force Unit Access) bit set, so that C is written to persistent storage before it completes.

5. Wait for C to complete.

Alternatively, if the device doesn't support FUA, for writing C you must instead do:

4b. Issue C.

5b. Wait for C to complete.

6b. Issue FLUSH, and wait for the FLUSH to complete.
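
To make the two sequences above concrete, here is a rough sketch against a toy in-memory device. Everything in it (`FakeDevice`, `write_ordered`, the `submit`/`wait_all`/`flush` methods) is made up for illustration and is not any real driver or kernel API; it only models "completions can arrive in any order" and "completed writes may still sit in a volatile cache".

```python
import random


class FakeDevice:
    """Toy model of a drive with a volatile write cache (hypothetical, not a real API)."""

    def __init__(self, supports_fua, seed=0):
        self.supports_fua = supports_fua
        self.pending = []       # submitted, not yet completed
        self.cache = {}         # completed, but only in the volatile cache
        self.persistent = {}    # would survive a power loss
        self.persist_log = []   # order in which writes became durable
        self._rng = random.Random(seed)

    def submit(self, name, data, fua=False):
        if fua and not self.supports_fua:
            raise ValueError("device does not support FUA")
        self.pending.append((name, data, fua))

    def wait_all(self):
        # The device is free to process queued commands in any order.
        self._rng.shuffle(self.pending)
        for name, data, fua in self.pending:
            if fua:
                self._persist(name, data)   # FUA: durable on completion
            else:
                self.cache[name] = data     # may linger in the cache
        self.pending.clear()

    def flush(self):
        # FLUSH: push everything in the volatile cache to persistent storage.
        for name, data in list(self.cache.items()):
            self._persist(name, data)
        self.cache.clear()

    def _persist(self, name, data):
        self.persistent[name] = data
        self.persist_log.append(name)


def write_ordered(dev, first, last):
    """Make every write in `first` durable before `last` is even issued."""
    for name, data in first:
        dev.submit(name, data)              # 1. issue A and B
    dev.wait_all()                          # 2. wait for A and B
    dev.flush()                             # 3. FLUSH and wait
    name, data = last
    if dev.supports_fua:
        dev.submit(name, data, fua=True)    # 4. issue C with FUA
        dev.wait_all()                      # 5. wait for C
    else:
        dev.submit(name, data)              # 4b. issue C
        dev.wait_all()                      # 5b. wait for C
        dev.flush()                         # 6b. FLUSH and wait
```

Calling `write_ordered(dev, [("A", b"a"), ("B", b"b")], ("C", b"c"))` leaves A and B durable in either order, with C always durable last, whether or not the toy device claims FUA support.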

Now, like wtallis already said, NVMe additionally has multiple queues per device, but these are independent of each other. If you somehow want ordering between different queues, you must implement that in higher-level software.
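
As a minimal sketch of what "implement that in higher-level software" could look like: the host simply does not issue the second request until it has seen the first one complete. The two thread pools below are just stand-ins for two independent queues, and `do_io` and `ordered_across_queues` are hypothetical names, not part of any NVMe driver. Note that this only orders completions; durability would still need the FLUSH/FUA steps described earlier.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def ordered_across_queues():
    completed = []  # completion order as observed by the host

    def do_io(name, latency):
        time.sleep(latency)        # stand-in for device processing time
        completed.append(name)

    # Two single-worker pools stand in for two independent submission queues.
    with ThreadPoolExecutor(max_workers=1) as q0, \
         ThreadPoolExecutor(max_workers=1) as q1:
        a = q0.submit(do_io, "A", 0.05)   # slow request on queue 0
        a.result()                        # host waits for A to complete
        b = q1.submit(do_io, "B", 0.0)    # only now is B issued on queue 1
        b.result()
    return completed
```

Without the `a.result()` wait, the fast request B on the second queue would normally finish before A, which is exactly the lack of cross-queue ordering described above.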

[1] The SCSI spec has an optional feature to enable ordered tags. But apparently almost no devices ever implemented it, and AFAIK Linux and Windows never use that feature either.