by cryptonector 337 days ago
Correct, TFA needs to wait for the completion of _all_ writes to the WAL, which is what `fsync()` was doing. Waiting only for the completion of the "completion record" does not ensure that the "intent record" made it to the WAL. In the event of a power failure it is entirely possible that the intent record did not make it but the completion record did, and then on recovery you'll have to panic.
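
To make the ordering concrete, here is a minimal sketch using plain blocking `os.write` and `os.fsync` rather than io_uring. The record contents and the `write_all` and `append_transaction` helpers are made up for illustration and are not TFA's code. The point is only that a single `fsync` on the fd is issued after both records have been written, so it covers the intent record as well as the completion record.

```python
import os
import tempfile


def write_all(fd: int, data: bytes) -> None:
    """Loop until every byte is handed to the kernel (os.write may be partial)."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def append_transaction(fd: int, intent: bytes, completion: bytes) -> None:
    """Append both WAL records, then request durability once for the whole fd."""
    # A returning write only means the kernel accepted the data;
    # it says nothing about the data having reached stable storage.
    write_all(fd, intent)
    write_all(fd, completion)
    # fsync is fd-scoped: it asks the kernel to flush everything written
    # to this file so far, which includes the intent record.
    os.fsync(fd)


# Hypothetical WAL file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "wal.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
try:
    append_transaction(fd, b"INTENT 1\n", b"COMPLETE 1\n")
finally:
    os.close(fd)

with open(path, "rb") as f:
    wal_contents = f.read()
```

Reading the file back only confirms the append order; it cannot prove durability, which is exactly why the `fsync` call is there.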

Yes, but I suspect the author and others may be conflating "io_uring completion of a write" (i.e., io_uring posts the completion queue event that corresponds to a previously submitted submission queue entry) with "fsync completion" (what you describe as "completion of all writes", though note that the fsync API is fd-scoped, while the io_uring fsync operation supports file ranges).

The CQE for a write indicates something different from the CQE for an fsync operation covering that same range.
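
The distinction can be illustrated with a toy model. This is not io_uring and not a real filesystem: `ToyDisk` and its methods are hypothetical, and the model simply assumes that acknowledged writes sit in a volatile cache from which any subset may reach the disk before power is lost. Under that assumption, a write acknowledgement and an fsync completion promise different things.

```python
class ToyDisk:
    """Toy model: writes land in a volatile cache; only fsync makes them durable."""

    def __init__(self):
        self.cache = []    # acknowledged writes that are not yet durable
        self.durable = []  # records that survive a power loss

    def write(self, record):
        self.cache.append(record)
        return "write acknowledged"  # loosely analogous to a write CQE

    def fsync(self):
        self.durable.extend(self.cache)
        self.cache.clear()
        return "fsync completed"  # loosely analogous to an fsync CQE

    def crash(self, survivors=()):
        """Power loss: only the cached records named in `survivors` were written back."""
        self.durable.extend(r for r in self.cache if r in survivors)
        self.cache.clear()
        return list(self.durable)


# No fsync: both writes were acknowledged, yet only the completion record
# happened to be written back before the crash.
disk = ToyDisk()
disk.write("intent")
disk.write("completion")
after_crash_no_fsync = disk.crash(survivors={"completion"})

# fsync completed before the crash: both records are durable.
disk = ToyDisk()
disk.write("intent")
disk.write("completion")
disk.fsync()
after_crash_with_fsync = disk.crash()
```

In the first run, recovery would see a completion record with no matching intent record, which is the situation described upthread.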