Hacker News
by PaulKeeble 1492 days ago
I have seen SSDs, on Windows at least, lose a lot of performance due to file fragmentation. While there is no particular reason why the SSD would run slow once you try to read a file from the filesystem it does slow down and it can impact performance dramatically. I have seen drive performance drop to 1/5 of normal after overwriting the drive contents 10 times.

The dogma at the moment is that SSDs don't require de-fragmentation, and that is potentially true up to a point, but I think Windows actually needs the file system de-fragmented due to its overhead. I have a program to reproduce the effect and have been meaning to test ext4 and write an article about it at some point. Before I publish I need to check that it's something that happens across a range of devices and that it really is just Windows. I know defragging the files (copy away, delete files and replace) works to instantly fix performance, but it could be device/controller/firmware specific.

The other possibility is that large amounts of writes filling the device leave it with reduced working space, especially on drives with very small amounts of cache, and that causes slowdowns near the end of tests.

2 comments

The effect of fragmentation can be estimated from the standard I/O tests-- just compare "sequential read" to "64KB random read".
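
For anyone who wants to try that comparison without a benchmark suite, here is a minimal sketch in Python. The function names (`read_blocks`, `compare`) are made up for illustration, and it makes no attempt to bypass the OS page cache, so treat the numbers as a rough indication only: use a file much larger than RAM, or a freshly booted machine, if you want them to mean anything.

```python
import os
import random
import time

BLOCK = 64 * 1024  # 64KB, the block size named in the comment above


def read_blocks(path, offsets, block=BLOCK):
    """Read one block at each offset; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own read-ahead; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block))
    return total, time.perf_counter() - start


def compare(path, block=BLOCK, seed=0):
    """Read the same set of blocks in file order and in shuffled order."""
    size = os.path.getsize(path)
    offsets = list(range(0, size, block))
    shuffled = offsets[:]
    random.Random(seed).shuffle(shuffled)
    seq_bytes, seq_s = read_blocks(path, offsets, block)
    rnd_bytes, rnd_s = read_blocks(path, shuffled, block)
    return {
        "bytes": seq_bytes,
        "random_bytes": rnd_bytes,
        "sequential_s": seq_s,
        "random_s": rnd_s,
    }
```

Both passes read exactly the same bytes, so any difference in the two timings comes from access order rather than from the amount of data.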
> The dogma at the moment is that SSDs don't require de-fragmentation

They don't require regular de-fragmentation like HDDs do, because if you are just occasionally reading some files it will be fast enough, AND with the physical layer hidden behind a remap it doesn't make sense at all: a file that is logically presented to you as one continuous block could really be stored across multiple locations.[0]

> I know defragging the files (copy away, delete files and replace)

And this is the one real way to "defrag" on SSD-backed media. Tossing clusters around like it is an HDD only wastes your TBW.[1]
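
A rough sketch of that copy away / delete / replace cycle for a single file, in Python. `rewrite_file` and `scratch_dir` are hypothetical names, and this assumes the scratch directory is on a different volume with enough free space. Whether the filesystem actually lays the new copy out contiguously is up to its allocator, so this is not guaranteed to help. It is also not crash-safe: between the delete and the copy back, the only copy lives in the scratch directory.

```python
import os
import shutil
import tempfile


def rewrite_file(path, scratch_dir):
    """Copy `path` away, delete the original, then copy it back as a new file."""
    fd, tmp = tempfile.mkstemp(dir=scratch_dir)
    os.close(fd)
    shutil.copy2(path, tmp)  # copy away (contents plus timestamps)
    if os.path.getsize(tmp) != os.path.getsize(path):
        os.remove(tmp)
        raise IOError("copy size mismatch, original left untouched")
    os.remove(path)          # delete the (possibly fragmented) original
    shutil.copy2(tmp, path)  # write it back as a freshly allocated file
    os.remove(tmp)           # clean up the scratch copy
    return path
```

Note that this still costs one full write of the file on the SSD, so it only makes sense if you have measured a real slowdown first.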

> While there is no particular reason why the SSD would run slow once you try to read a file from the filesystem it does slow down and it can impact performance dramatically

There are always a couple of factors that affect performance.

There is always the question of what exactly you are reading: a bazillion < 1KB files could be anywhere on the physical storage, and while the access time for a single file could be as fast as the SSD can provide, the pattern of accessing thousands of small files not only fills the IO queue but also wastes tons of time on overhead. For every file you access there is not only "Hey SSD, grab bytes at LBA 44444 to 5555", but also the aforementioned queue for IO operations, parsing the MFT for the file's location (LBA), reading and parsing the DACL, allocating handles (and discarding them later), etc. And if you run out of caches (most notably the DRAM cache on your SSD) then of course things start to slow down to a crawl, especially if you are not only reading those files but also doing other things on the same drive at the same time.
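
To get a feel for how much of that is per-file overhead rather than raw transfer, you can read the same number of bytes once as many tiny files and once as a single big file. A minimal Python sketch; `read_each` and `make_files` are made-up helper names, and the result depends heavily on cache state and on whatever antivirus is hooked into file opens.

```python
import os
import time


def read_each(paths):
    """Open, read fully, and close every path; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    for p in paths:
        with open(p, "rb") as f:  # every open/close pays the per-file overhead
            total += len(f.read())
    return total, time.perf_counter() - start


def make_files(directory, count, size):
    """Create `count` files of `size` random bytes each; return their paths."""
    paths = []
    for i in range(count):
        p = os.path.join(directory, f"f{i:06d}.bin")
        with open(p, "wb") as f:
            f.write(os.urandom(size))
        paths.append(p)
    return paths
```

For example, compare `read_each(make_files(dir_a, 10000, 512))` against `read_each(make_files(dir_b, 1, 10000 * 512))`: the byte counts are identical, so the gap in seconds is roughly the cost of the extra opens, lookups, and handle churn.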

Also, while I'm mentioning the MFT: some small files are stored in it entirely[2], so all the overhead is processed quickly (because under normal conditions most of the MFT is cached in memory anyway), but the file has to be small enough[3].

Also don't forget that if your file is 1KB the drive doesn't read 1KB from storage. At best it reads 4KB (the default NTFS cluster size), but if your next file isn't in this block (or it is, but by the time it comes to read it the cache of this block was already flushed) then you need to wait until the previous read completes. Yes, reads are fast in theory, but again this is where the IO queue, caches, and NCQ start to matter.
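
The arithmetic, as a tiny Python sketch. It assumes reads happen in whole clusters and uses the 4KB default mentioned above; the function names are just for illustration, and files small enough to live inside the MFT are a separate case that this ignores.

```python
def clusters_needed(file_size, cluster_size=4096):
    """Number of whole clusters a file of `file_size` bytes occupies."""
    return -(-file_size // cluster_size)  # ceiling division


def bytes_transferred(file_size, cluster_size=4096):
    """Bytes read from storage if the file is fetched in whole clusters."""
    return clusters_needed(file_size, cluster_size) * cluster_size


print(bytes_transferred(1024))  # 4096: a 1KB file costs a full 4KB cluster
print(bytes_transferred(4097))  # 8192: one byte over spills into a second cluster
```

So with 1KB files you are moving about 4x the data you actually asked for, before any of the per-file overhead is counted.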

And last but not least: on Windows there is always the question of whether the antivirus software (be it the built-in Defender or a 3rd-party one) is still sane or is wasting your time rechecking all your already checked, static, non-executable files. Like a bazillion JSONs.

[0] and without TRIM support you don't have even a very loose guarantee that you really cleared the block.

[1] back in the day I used this to defrag very heavily fragmented HDDs: just Ghost the drive to another one and then restore it back. All files come back defragged, and it takes way less time because the source drive only reads instead of doing read-write-repeat.

[2] https://superuser.com/questions/1185461/maximum-size-of-file...

[3] just checked a couple of random files on my drive - the cutoff is somewhere around ~700B.