Hacker News
by _vvhw 1979 days ago
I have found that planning for DIO from the start makes for a better, simpler storage system design, because it keeps the focus on logical/physical sector alignment, latent sector error handling, and caching from the beginning. It is even better to design your data layouts to work with block devices.

Retrofitting DIO onto a non-DIO design, and doing so cross-platform, is going to be more work, but I don't think that's the fault of DIO (when you're already building a database, that is).

1 comment

Is there a known library with a cross platform abstraction that could help?
I wrote this for Node.js. It's a native binding in C that exposes cross-platform functionality: https://github.com/ronomon/direct-io

Although if it's a new project and you're used to C, I would also recommend taking a good look at Zig (https://ziglang.org/), because it is much more explicit about alignment than C and makes alignment a first-class part of the type system. See this other comment of mine, which goes into more detail: https://news.ycombinator.com/item?id=25801542

Something else that will help is setting your minimum IO unit to 4096 bytes, the Advanced Format sector size, because then your Direct IO system will just work, regardless of whether sysadmins swap disks of different sector sizes out from underneath you. Since 4096 is a multiple of 512, a minimum IO unit of 4096 bytes will work not only for newer AF disks but also for any 512-byte sector disks.
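
To make the 4096-byte minimum IO unit concrete, here is a minimal sketch in C of how offsets, lengths, and buffers could be kept aligned to it. The names IO_UNIT, io_align_down, io_align_up, and io_buffer_alloc are hypothetical and are not taken from the direct-io binding. The sketch assumes a C11 libc that provides aligned_alloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed minimum IO unit: the 4096-byte Advanced Format sector size.
   4096 is a multiple of 512, so anything aligned to it is also
   aligned for 512-byte sector disks. */
#define IO_UNIT 4096u

/* Round an offset down to the start of the IO unit that contains it. */
uint64_t io_align_down(uint64_t offset) {
    return offset & ~(uint64_t)(IO_UNIT - 1);
}

/* Round a length up to a whole number of IO units.
   (Overflow for lengths near UINT64_MAX is not handled in this sketch.) */
uint64_t io_align_up(uint64_t length) {
    return (length + IO_UNIT - 1) & ~(uint64_t)(IO_UNIT - 1);
}

/* Allocate a zeroed buffer whose address and size are both multiples
   of IO_UNIT. Returns NULL on failure; release it with free(). */
void *io_buffer_alloc(size_t length) {
    /* The masking above only works for a power-of-two unit. */
    assert((IO_UNIT & (IO_UNIT - 1)) == 0);
    size_t size = (size_t)io_align_up(length);
    if (size == 0) size = IO_UNIT; /* always hand back at least one unit */
    void *buffer = aligned_alloc(IO_UNIT, size);
    if (buffer != NULL) memset(buffer, 0, size);
    return buffer;
}
```

With helpers like these, a read of 100 bytes at offset 5000 would be widened to a read of io_align_up(100 + (5000 - io_align_down(5000))) bytes starting at io_align_down(5000), into a buffer from io_buffer_alloc.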

Lastly, Direct IO is actually more a property of the file system than of the OS (e.g. Linux). You will find some file systems on Linux that return EINVAL when you try to open a file descriptor with O_DIRECT, simply because they don't support O_DIRECT (e.g. a macOS volume accessed from within a Linux VM). So that should be your way of testing for support: test the file system, not only the OS.
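
A minimal sketch in C of that kind of probe, assuming Linux with glibc, where _GNU_SOURCE must be defined before the first include for <fcntl.h> to expose O_DIRECT. The function name direct_io_supported is hypothetical. On platforms that define no O_DIRECT flag at all (macOS, for example, uses fcntl with F_NOCACHE instead), this sketch simply reports an error.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Probe whether the file system holding `path` accepts O_DIRECT.
   Returns  1 if the file opened with O_DIRECT,
            0 if open() failed with EINVAL (O_DIRECT not supported here),
           -1 on any other error, with errno describing the failure. */
int direct_io_supported(const char *path) {
#ifdef O_DIRECT
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EINVAL ? 0 : -1;
#else
    /* No O_DIRECT flag on this platform; a real port would need a
       platform-specific path here. */
    (void)path;
    errno = ENOTSUP;
    return -1;
#endif
}
```

Because the answer depends on the file system rather than the OS, it is worth calling this on a file inside the actual data directory at startup instead of assuming one result for the whole machine.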