by tytso 1594 days ago
Or (3) companies that make $$$ or save $$$ from using Postgres could hire kernel developers to implement a better solution for buffered I/O. The problem is that the companies who are serious about I/O recovery use Direct I/O, so the engineers employed by those companies have plenty of other improvements to work on, such as io_uring, improving general storage performance, adding support for inline encryption engines to improve performance on mobile devices, etc., etc., etc.

People seem to forget that Open Source does not mean that users get to demand that unpaid volunteers magically do work for their pet feature requests. It just means that the source code is available and people are free to improve the code to make it better fit their use case. A proprietary OS is like a car whose hood is welded shut, and only the dealer is allowed to service it. An open source OS means that you can take the car to whomever you like, or even service the car, or improve the car, yourself. It does not mean that you get to have service or improvements to your car engine for free.

The other thing to note here is that Postgres was issuing fsync(2) calls from different processes, and some of those processes were ignoring the error return from fsync(2). If there is an I/O error, fsync(2) will tell userspace about it. However, there is nothing in POSIX which states that once a file has an I/O error associated with it, the fsync(2) system call will return errors forever and ever, Amen. So Postgres was being a bit dodgy with error returns as well, and was demanding something that POSIX clearly never promised.
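
To make the point about error returns concrete, here is a minimal sketch in Python of a caller that treats the first fsync failure as final. The helper name write_and_sync is hypothetical and this is not how Postgres is written; it only assumes that os.fsync raises OSError when fsync(2) reports an error, which is what the Python standard library does.

```python
import os


def write_and_sync(path: str, data: bytes) -> None:
    """Write data to path and fsync it, raising OSError on any failure.

    Hypothetical helper for illustration only; not Postgres code.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # os.write may write fewer bytes than requested, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # os.fsync raises OSError if fsync(2) reports an error.
        # Deliberately no catch-and-retry here: POSIX does not promise
        # that a later fsync on this file will report the error again,
        # so a retry that succeeds proves nothing about the earlier data.
        os.fsync(fd)
    finally:
        os.close(fd)
```

The design choice is simply that the error propagates to the caller, who has to assume the data did not reach stable storage and recover from some other source, rather than calling fsync again and trusting a clean return.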