by p_l 911 days ago
The issue is that pretty much all other filesystems, at least on Linux, are effectively implemented as swap filesystem drivers with some hierarchical structure on top, because that's the interface the Linux kernel pushes.

In userland, we tend to think in terms of streams of bytes, as provided by the original Unix and as all the docs teach us: read() and write() are the primitives, and they do byte-aligned reads and writes.

Except the actual Linux VFS has, as its core primitive, mmap() plus a pagein/pageout mechanism, with read() and write() simulated over the page cache, which treats files as mmap()ed memory regions. That's how IO caching is done on Linux, and it's a source of various issues for ZFS and for people using different architectures, because for a long time (changed quite recently, afaik) the Linux VFS only supported filesystem blocks that were page-sized or smaller. That's a bit of a problem if you're a filesystem like ZFS, where the file block size can go from 512 bytes to 4MB (or more) in the same dataset, or VMFS, which uses 1MB blocks.
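
To make the page cache point concrete, here is a rough userland sketch (not kernel code). It assumes Linux, a local filesystem behind the temp directory, and a shared writable mapping; the helper names `pages_touched` and `demo_shared_page_cache` are made up for illustration. It stores bytes through mmap() and reads them back with a byte-granular pread(), and it shows which pages a 4-byte access straddling a page boundary has to touch.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size of the running system, commonly 4096


def pages_touched(offset, length):
    """Page indices covered by the byte range [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // PAGE
    last = (offset + length - 1) // PAGE
    return range(first, last + 1)


def demo_shared_page_cache():
    """Store bytes through mmap(), read them back with pread() on the same fd."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 2 * PAGE)  # two pages of zeros
        # On Unix, mmap.mmap defaults to a shared, read/write mapping.
        with mmap.mmap(fd, 2 * PAGE) as m:
            # A 4-byte store that straddles the boundary between page 0 and 1.
            m[PAGE - 2:PAGE + 2] = b"ABCD"
            # No flush or msync here: the read goes through the same cached pages.
            seen = os.pread(fd, 4, PAGE - 2)
        return seen, list(pages_touched(PAGE - 2, 4))
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_shared_page_cache())
```

The byte-aligned call is served out of whole cached pages, which is the granularity the rest of this comment is complaining about.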

1 comments

What does any of that have to do with the bug described in the article? Presumably every filesystem is responsible for tracking the contents of sparse files and where the holes are. That's not something the Linux kernel is going to give you for free: the FS needs to tell the kernel which pages should be mapped to block addresses on disk and which pages should be simulated as contiguous blocks of zeros with no on-disk representation.
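
For reference, the hole tracking described here can be observed from userland with lseek() and SEEK_DATA / SEEK_HOLE. This is a minimal sketch assuming Linux and a Python build that exposes `os.SEEK_DATA` and `os.SEEK_HOLE`; `data_extents` and `demo_sparse_file` are hypothetical helper names. A filesystem that doesn't track holes is allowed to report the whole file as a single data extent, so the exact extent list depends on the FS.

```python
import errno
import os
import tempfile


def data_extents(fd):
    """Return [(start, end), ...] byte ranges the filesystem reports as data."""
    size = os.fstat(fd).st_size
    extents = []
    pos = 0
    while pos < size:
        try:
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # only a hole remains before EOF
                break
            raise
        # There is always an implicit hole at EOF, so this lseek succeeds.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        pos = end
    return extents


def demo_sparse_file():
    """Write 4 bytes at a 1 MiB offset, leaving everything before it unwritten."""
    fd, path = tempfile.mkstemp()
    try:
        offset = 1 << 20
        os.pwrite(fd, b"tail", offset)
        size = os.fstat(fd).st_size
        head = os.pread(fd, 16, 0)  # reading the unwritten range yields zeros
        return size, data_extents(fd), head
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_sparse_file())
```

Reads from the unwritten range come back as zeros either way; whether that range shows up as a hole or as data is the filesystem's answer, not the kernel's.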
It's related to the talk about filesystem interface metaphors in this specific subthread :)