by MrDrMcCoy 430 days ago
Drives blindly store and retrieve blocks wherever you tell them, with no awareness of how or if they relate to one another. It's a filesystem's job to keep track of what's where. Filesystems get fragmented over time, especially as they get full. The fuller they get, the more seeking and shuffling they have to do to find a place to write stuff. This will be the case even after the last spinning drive rusts out, as even flash eventually has to contend with fragmentation. Heck, even RAM has to deal with fragmentation. See the discussion from the last few weeks about the ongoing work to figure out a contiguous memory allocator in Linux. It's one of the great unsolved problems in general computing, and you and your descendants would be set for life if you could solve it.
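
To make the "fuller means more fragmented" point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the 40-block volume, the `allocate_lowest_free` policy (just grab the lowest-numbered free blocks), and the helper names. Real filesystems use much smarter allocators, but they run into the same squeeze once free space is chopped up.

```python
def allocate_lowest_free(free, count):
    """Take the `count` lowest-numbered free blocks and mark them used.

    `free` is a list of booleans (True = free). This is a deliberately
    naive, hypothetical policy, not how any particular filesystem works.
    """
    chosen = [i for i, is_free in enumerate(free) if is_free][:count]
    if len(chosen) < count:
        raise RuntimeError("volume full")
    for i in chosen:
        free[i] = False
    return chosen


def count_fragments(blocks):
    """Count the contiguous pieces in a file's block list."""
    if not blocks:
        return 0
    return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


# Fill a 40-block volume with ten 4-block files: each one is contiguous.
free = [True] * 40
files = [allocate_lowest_free(free, 4) for _ in range(10)]

# Delete every other file, leaving 4-block holes scattered across the volume.
for blocks in files[::2]:
    for b in blocks:
        free[b] = True

# Half the volume is free, but no hole is big enough for an 8-block file,
# so the new file has to be split across two holes.
big = allocate_lowest_free(free, 8)
print(big, count_fragments(big))
```

The volume is only half full at the end, yet the new file cannot be stored in one piece. The drive never notices any of this; it just sees reads and writes to block numbers.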

Not quite, AFAIK? Drive controllers may internally remap blocks to physical disk blocks (e.g. when a bad sector is detected; see the SMART attribute Reallocated Sector Count).
Logical Block Addressing (LBA) by its very nature provides no hard guarantees about where the blocks are located. However, the convention that both sides (file systems and drive controllers) recognize is that runs of consecutive LBAs generally refer to physically contiguous regions of the underlying storage (and this is true for both conventional spinning-platter HDDs and most flash-based SSDs). The protocols that bridge the two sides (like ATA, SCSI, and NVMe) use LBA runs as the basic unit of accessing storage.

So while block remapping can occur, and the physical storage has limits on its contiguity (you'll eventually reach the end of a track on a platter or of an erase block in a flash chip), the optimal way to use the storage is to put related things together in a run of consecutive LBAs as much as possible.
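
A rough way to see why runs matter: each read or write command names a starting LBA and a block count, so a file laid out in one run can be fetched with far fewer commands than one scattered around the device. The Python sketch below is only an illustration; `coalesce_runs` and the LBA values are made up, and it ignores per-command transfer limits.

```python
def coalesce_runs(lbas):
    """Group a file's LBAs (in file order) into (start_lba, block_count) runs."""
    runs = []
    for lba in lbas:
        if runs and lba == runs[-1][0] + runs[-1][1]:
            # This block directly follows the previous run: extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap or jump backwards: a new run (and a new command) starts here.
            runs.append((lba, 1))
    return runs


# Eight blocks in one run: a single (start, count) request covers the file.
contiguous = coalesce_runs(range(1000, 1008))

# The same eight blocks' worth of data scattered across the device.
fragmented = coalesce_runs([1000, 1001, 5000, 5001, 42, 9000, 9001, 9002])

print(contiguous)   # [(1000, 8)]
print(fragmented)   # [(1000, 2), (5000, 2), (42, 1), (9000, 3)]
```

Same amount of data in both cases, but four requests instead of one, and on a spinning disk probably a seek between each of them.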

Sure, but bad block tracking and error correction are pretty different from the implied file/volume awareness I was responding to.
Yes, to be clear, the drive controller generally (*) has no concept of volumes or files, and presents itself to the rest of the computer as a flat, linear collection of fixed-size logical blocks. Any additional structure comes from software running outside the drive, which the drive isn't aware of. The conventional bias that adjacent logical blocks are probably also adjacent physical blocks merely allows the abstraction to be maintained while also giving the file system some ability to encourage locality of related data.
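
For what it's worth, the "flat, linear collection of fixed-size logical blocks" view boils down to simple arithmetic on the host side. A minimal Python sketch, assuming a 512-byte logical block (4096 is also common) and a hypothetical helper name:

```python
BLOCK_SIZE = 512  # assumed logical block size in bytes


def byte_range_to_lbas(offset, length, block_size=BLOCK_SIZE):
    """Return (first_lba, block_count) covering a byte range on the device."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return first, last - first + 1


print(byte_range_to_lbas(0, 512))      # (0, 1): exactly one block
print(byte_range_to_lbas(500, 100))    # (0, 2): straddles a block boundary
print(byte_range_to_lbas(4096, 8192))  # (8, 16)
```

Nothing in that calculation says which file or volume the bytes belong to. That knowledge lives entirely in the software above the drive.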

* = There are some exceptions to this, e.g. some older flash controllers were made that could "speak" FAT16/32 and actually know if blocks were free or not. This particular use was supplanted by TRIM support.