by loeg 911 days ago
You should read my comment in the context of the one it is replying to. That comment suggested a torrent client using seeks + writes to randomly insert chunks as they were downloaded. I have summarized this approach in my comment as "sparse files," expecting charitable readers to be familiar with the context. This method of creating sparse files does not tell the filesystem anything about the intent of the application and usually creates a bunch of fragmentation under torrent-like workloads.

“Sparse files” is a specific term[1] referring to files where the filesystem tracks, and doesn’t allocate space for, unwritten file content (i.e. content that would just be zeros if read) in large preallocated files.

To use the term “sparse file” to also refer to files with large contiguous runs of zeros, created via a seek operation, is just confusing. Those are quite explicitly not sparse files; they’re just files that happen to be full of zeros (all written to disk). “Sparse files” are quite explicitly the result of the optimisation to avoid writing pointless zeros when preallocating a large file that’s going to be written into in an unordered manner.

Using the term “sparse files” to refer to both the “problem” and the “solution” is just unhelpful, and doesn’t align with the accepted meaning of the term.

[1] https://en.m.wikipedia.org/wiki/Sparse_file

It’s not about being charitable. For those unfamiliar with the terminology this is just confusing, and for those that are familiar this discussion is all fundamental and well known anyway.

Unfortunately for COW filesystems including zfs and btrfs fallocate doesn’t do anything useful for preallocation. You’re still going to get fragmentation. The two methods outlined are essentially equivalent.
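
For anyone following along, here is a rough sketch of the two approaches being compared in this thread: writing a chunk at its final offset with seek + write, versus reserving the space up front with `os.posix_fallocate` (Python's wrapper around `posix_fallocate`, Unix only). The helper names, file names, and sizes are made up for illustration. What `st_blocks` reports afterward depends on the filesystem, which is rather the point of the disagreement, so no expected output is shown.

```python
import os
import tempfile


def write_chunk_at(path, offset, data):
    """Seek + write: place one downloaded chunk at its final offset.

    Nothing here tells the filesystem how large the file will end up.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)  # small single write; a real client would loop
    finally:
        os.close(fd)


def preallocate(path, size):
    """Ask the filesystem to reserve `size` bytes before any chunk arrives."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    total = 64 * 1024 * 1024   # hypothetical torrent payload size
    chunk = b"x" * 16384       # hypothetical piece of downloaded data

    with tempfile.TemporaryDirectory() as d:
        seeked = os.path.join(d, "seeked.bin")
        reserved = os.path.join(d, "reserved.bin")

        # Approach 1: the last chunk arrives first and is written in place.
        write_chunk_at(seeked, total - len(chunk), chunk)

        # Approach 2: reserve everything first, then write the same chunk.
        preallocate(reserved, total)
        write_chunk_at(reserved, total - len(chunk), chunk)

        for p in (seeked, reserved):
            print(os.path.basename(p),
                  "size:", os.path.getsize(p),
                  "allocated:", allocated_bytes(p))
```

Both files end up with the same logical size; comparing the "allocated" numbers on your own filesystem shows whether the seek left a hole and whether the preallocation request actually reserved anything.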

> For those unfamiliar with the terminology this is just confusing, and for those that are familiar this discussion is all fundamental and well known anyway.

Eh, agree to disagree.

> Unfortunately for COW filesystems including zfs and btrfs fallocate doesn’t do anything useful for preallocation.

Both ZFS and Btrfs have "nocow" modes that are probably more suitable to this type of use case. And other filesystems are widely used.

Can you point me to docs for ZFS offering a nocow mode? I haven't used it in about a decade, but I can't see how that would work - wafl/cow is a pretty fundamental invariant in everything ZFS does.