Hacker News
by nitrogen 4758 days ago
> A) You have forgotten basic physics. The beginning of the disk is faster. Locality is desirable but is not and has never been the only thing that matters.

If you haven't actually looked at the block allocation patterns of common filesystems, then you can't say conclusively that fuzzbang is incorrect. Arguments from first principles (e.g. "basic physics") cannot override empirical evidence.

Further, locality of reference will have a much, much bigger influence on spinning disk I/O throughput than location at the front of the disk. The difference between the outer rim and inner rim might be 120MB/s to 70MB/s, so reading a contiguous 200MB file will take 1.7x as long if it's stored at the inner rim (2857ms vs 1667ms). However, if that 200MB file is stored in 100 2MB fragments, and seeking to each fragment takes 4ms, the seeks add 400ms in both cases (3257ms vs 2067ms, or about 1.6x). Fragment it further, into 1000 200KB pieces, and your reading time will be dominated by the seek time due to fragmentation (6857ms vs 5667ms, or a mere 1.2x difference).
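
A rough way to check this kind of estimate is to model read time as transfer time plus a fixed cost per seek. This is only a sketch: the function names are made up for illustration, and the flat per-seek cost is an assumption (real drives vary with seek distance and rotational latency).

```python
OUTER_MB_S = 120.0  # assumed sequential throughput at the outer rim
INNER_MB_S = 70.0   # assumed sequential throughput at the inner rim


def read_time_ms(size_mb, throughput_mb_s, seeks=0, seek_ms=4.0):
    """Transfer time plus a flat cost per seek, in milliseconds."""
    transfer_ms = size_mb / throughput_mb_s * 1000.0
    return transfer_ms + seeks * seek_ms


def inner_to_outer_ratio(size_mb, seeks=0, seek_ms=4.0):
    """How many times longer the read takes at the inner rim than at the outer rim."""
    inner = read_time_ms(size_mb, INNER_MB_S, seeks, seek_ms)
    outer = read_time_ms(size_mb, OUTER_MB_S, seeks, seek_ms)
    return inner / outer


# 200MB file: contiguous, then in 100 fragments, then in 1000 fragments.
for seeks in (0, 100, 1000):
    inner = read_time_ms(200, INNER_MB_S, seeks)
    outer = read_time_ms(200, OUTER_MB_S, seeks)
    ratio = inner_to_outer_ratio(200, seeks)
    print(f"{seeks:5d} seeks: inner {inner:.0f} ms, outer {outer:.0f} ms, ratio {ratio:.2f}x")
```

The more seeks a read needs, the closer the inner/outer ratio gets to 1, which is the point about fragmentation outweighing position on the platter.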

Based on my experience I'm inclined to accept fuzzbang's description of block allocation strategies. It used to be common wisdom that you could defragment a volume by copying all the data off, formatting it, then copying the data back on. I did this once with an NTFS volume (using ntfs-3g), then checked the resulting data in the Windows disk defragmenter. The data was primarily located around the center of the volume, with numerous gaps. Filesystems leave gaps to allow room for files to expand.

> B) You have just named three uncommon filesystems that few people will ever use in the first place, much less with TrueCrypt.

"Commonness" for the purposes of forensics is a much lower bar than for market analysis. I'd also wager that, servers included, there are at least as many ext2/ext3/ext4 volumes on the planet as NTFS volumes.

1 comments

> If you haven't actually looked at the block allocation patterns of common filesystems

I have.

> you can't say conclusively that fuzzbang is incorrect

And I can.

I'm aware of the degree of difference in speed. It is sufficient that it is standard practice for filesystems to be restricted to the first 1/4-1/2 of a spinning disk in performance-sensitive applications. Or at least it was; in the last few years we've become more likely to just use SSDs or keep everything in RAM.

> if that 200MB file is stored in 100 2MB fragments

Thank you for assuming I don't even have the knowledge of a typical computer user, it greatly increases the likelihood I'll not waste further time with you. Raises it, in fact, to 100%.

> If you haven't actually looked at the block allocation patterns of common filesystems

> I have.

And? What distribution of block allocation did you observe on said filesystems? Does it contradict the original supposition that filesystems spread out allocations to prevent fragmentation, thus possibly overwriting hidden data at the end of a partition?

> It is sufficient that it is standard practice for filesystems to be restricted to the first 1/4-1/2 of a spinning disk in performance-sensitive applications.

This has as much to do with seek times as sequential reading speed. A drive with 8ms average seek times might average 2ms if you only make the heads travel 25% of the width of the platter.
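
As a sanity check on that 8ms-to-2ms figure, here is a toy model. It assumes seek time is proportional to head travel distance (not true of real drives, which also have settle time and rotational latency) and that requests land uniformly across the allowed stroke. The function name and grid size are hypothetical, chosen only for illustration.

```python
def mean_seek_distance(stroke_fraction, steps=200):
    """Mean head travel between two positions, as a fraction of the full stroke.

    Positions come from an evenly spaced grid over [0, stroke_fraction].
    """
    points = [stroke_fraction * i / (steps - 1) for i in range(steps)]
    total = sum(abs(a - b) for a in points for b in points)
    return total / (steps * steps)


FULL_STROKE_AVG_SEEK_MS = 8.0  # average seek over the whole platter

# Assumption: seek time scales linearly with travel distance.
scale = mean_seek_distance(0.25) / mean_seek_distance(1.0)
short_stroke_avg_seek_ms = FULL_STROKE_AVG_SEEK_MS * scale
print(f"{short_stroke_avg_seek_ms:.1f} ms")
```

Under those assumptions the average travel distance shrinks in proportion to the stroke, so limiting the heads to 25% of the platter gives about 2ms.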

The fact that you have to restrict the filesystem to the beginning of the disk suggests that filesystems don't do this automatically.

> Thank you for assuming I don't even have the knowledge of a typical computer user, it greatly increases the likelihood I'll not waste further time with you. Raises it, in fact, to 100%.

I'm not sure how you got that impression. I was just providing numbers to complete the example. There's no need to become defensive, and if you find that you might be wrong, saying "That's a fair point, I'll have to do some more research" goes a lot further than continuing to beat a dead horse.

Maybe it's the fact that this thread is on an article related to law, politics, and morality that is causing needless argumentation.