by formerly_proven
1979 days ago
Around four years ago I was working on a transactional data store and ran into the problem that virtually no one tells you how durable I/O is supposed to work. There were very few articles on the internet that went beyond the basics (e.g. create file => fsync directory), and perhaps one article explaining what needs to be considered when using sync_file_range. Docs and POSIX were useless. I noticed that there seemed to be inherent problems with I/O error handling when using the page cache, i.e. whenever something other than the app itself caused write I/O, you really didn't know any more whether all the data got there. Some two years later fsyncgate happened, and since then I/O error handling on Linux has finally gotten at least some attention and people seem to have woken up to the fact that this is a genuinely hard thing to do.
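For readers who haven't seen the "create file => fsync directory" pattern mentioned above, here is a minimal Python sketch of it on a POSIX system. The function name `durable_create` is made up for illustration, and this is one common reading of the pattern, not a complete recipe for durable I/O. It fsyncs the new file, then fsyncs the parent directory so the directory entry itself is persisted.

```python
import os

def durable_create(path: str, data: bytes) -> None:
    """Hypothetical helper: create `path` containing `data`, then fsync
    both the file and its parent directory (POSIX only)."""
    # O_EXCL makes this fail if the file already exists, so the sketch
    # only covers the "create a new file" case.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested.
            offset += os.write(fd, data[offset:])
        # Flush file contents and metadata; raises OSError on failure.
        os.fsync(fd)
    finally:
        os.close(fd)

    # The new directory entry lives in the parent directory, so that
    # has to be fsynced as well.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that the sketch just lets `OSError` propagate. As the fsyncgate discussion showed, calling fsync again after a failure on Linux does not reliably tell you whether the data reached disk, so a retry loop here would be misleading.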
My experience was the same as yours.
What helped me was discovering all the fantastic storage and file system papers coming out of the University of Wisconsin-Madison, supervised by Remzi and Andrea Arpaci-Dusseau.
Their teams have studied and documented almost all aspects of what is required to write reliable storage systems, even diving into interactions between local storage failures and global consensus protocols, such as how a single disk block failure can destroy Raft and ZooKeeper. Most safety testing of these systems tends to focus on the network fault model. I think in a few years' time we'll all look back and see how today we had almost no concept of a storage fault model. It's kind of exciting to think that there's going to be a new breed of replicated databases that are far more reliable than today's systems. On the other hand, perhaps the future is already here, just not very evenly distributed.
http://pages.cs.wisc.edu/~remzi/
Their OSTEP book (Operating Systems in Three Easy Pieces) is also a great fun read: http://pages.cs.wisc.edu/~remzi/OSTEP/