by ploek 1179 days ago
When pointing out that HDDs can outperform these SSDs, 'sequential' is the key word. I regularly pull remote backups with syncoid (i.e. `zfs send | zfs receive`) and over time that fragmented the receiving side considerably. In the end `zpool list` showed over 80% capacity and 40% fragmentation. The hard drives were seeking constantly and the syncoid task would take over eight hours to complete. I replaced the disks with SSDs and now the task completes within 20 minutes.
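For anyone wanting to watch those two `zpool list` numbers before things get this bad, here is a minimal sketch. It assumes OpenZFS's `zpool list -Hp -o name,capacity,fragmentation`, which should print tab-separated lines without a header. The function name `flag_pools` and the thresholds are made up for illustration. It reads from stdin so it can be tried without a pool. Note that the fragmentation figure ZFS reports describes free space, not the files themselves.

```shell
#!/bin/sh
# Hypothetical helper: print pools whose capacity or fragmentation
# exceeds the given percentage thresholds.
# Input (stdin): lines of "name<TAB>capacity<TAB>fragmentation".
flag_pools() {
    max_cap=$1
    max_frag=$2
    awk -F '\t' -v max_cap="$max_cap" -v max_frag="$max_frag" '
        {
            cap = $2; frag = $3
            # Tolerate a trailing percent sign in either column.
            sub(/%$/, "", cap); sub(/%$/, "", frag)
            # Skip rows whose values are not plain numbers (e.g. "-").
            if (cap !~ /^[0-9]+$/ || frag !~ /^[0-9]+$/) next
            if (cap + 0 > max_cap + 0 || frag + 0 > max_frag + 0)
                printf "%s: %d%% full, %d%% fragmented\n", $1, cap, frag
        }'
}

# Intended use on a host with ZFS (not run here):
#   zpool list -Hp -o name,capacity,fragmentation | flag_pools 80 30
```

With sample input for a pool like the one described (82% full, 40% fragmented) the function prints one line for that pool and stays silent for pools under both thresholds.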

Back in the day, a common suggestion for speeding up your PC was to defragment your HDD. I didn't start using Linux until right around the SSD transition, so I've never done it there, but for setups like this, are there not still tools to do something similar?

I'm sure you got other benefits out of swapping to SSDs, but your comment just got me thinking.

No, there is no defragmentation for ZFS, unfortunately. A way to get around that is to send the pool's content to another (fresh) ZFS pool, where it would be written sequentially. But for that you would need a set of drives of the same (or larger) capacity.
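A rough sketch of what that copy could look like, written as a dry run that only prints the commands. The pool names `tank` and `newtank`, the snapshot name `migrate`, and the helper `plan_pool_copy` are all placeholders. The `-r`, `-R`, and `-F` flags are standard `zfs` options, but whether `-F` is appropriate depends on the target pool, so treat this as an assumption to check against your own setup.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for copying a whole
# pool into a fresh one. Pool and snapshot names are placeholders.
plan_pool_copy() {
    src=$1
    dst=$2
    snap=$3
    # Recursive snapshot of every dataset in the source pool.
    echo "zfs snapshot -r ${src}@${snap}"
    # -R sends the dataset tree with its snapshots and properties;
    # -F lets the receive overwrite the new pool's empty root dataset.
    echo "zfs send -R ${src}@${snap} | zfs receive -F ${dst}"
}

plan_pool_copy tank newtank migrate
```

Printing instead of executing keeps the destructive `receive -F` step something you review and paste yourself.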

There are ideas on how one would do an actual defrag. They are generally based on a concept called block pointer rewrite, which Matt Ahrens once said could be the 'last feature ever implemented in ZFS', as it would make everything so much more complicated that it would be hard to add new features afterwards [1].

[1] https://www.youtube.com/watch?v=G2vIdPmsnTI#t=44m53s (Link to the beginning of the explanation; the 'last feature ever implemented' quote is at around 50:25)

There's no point in defragging an SSD unless the low-level controller is doing it; the controller is always presenting a false picture of the mapping between data addresses and physical location of pages.

There's no good ZFS defragging tool, although the initial send to a new pool will accomplish that. This is just a thing for COW-style filesystems.

> This is just a thing for COW-style filesystems.

It doesn't have to be.

ZFS in particular has an architecture that's very hostile to ever moving things.

BTRFS has a design that's amenable to defragmentation, but the built-in option doesn't work with snapshots and the external programs I've tried are partial and finicky.

Long ago I worked on a graphical tool that showed disk fragmentation. Of course all the devs would test on their various hardware, pre-SSD. It was true that you could change the performance of daily tasks with some fragmentation management.

In recent years I have mostly used Linux with default ext4. Linux and ext4 appear to me to regularly maintain the disk allocations somehow, but I do not have a graphical tool to show that; details welcome.
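Not graphical, but on the details: ext4 mostly avoids fragmentation at write time (delayed allocation and its multi-block allocator) rather than tidying up afterwards. `filefrag` from e2fsprogs reports how many extents a file occupies, and `e4defrag -c` reports a fragmentation score. Below is a minimal sketch that filters `filefrag` summary lines, assuming the usual "path: N extents found" output format. The helper name `many_extents` and the limit are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list files whose extent count exceeds a limit.
# Input (stdin): filefrag summary lines such as
#   "/var/log/syslog: 12 extents found"
many_extents() {
    limit=$1
    # The extent count is the third field from the end, so paths that
    # contain spaces or colons still parse.
    awk -v limit="$limit" '
        NF >= 3 && $NF == "found" && $(NF-2) + 0 > limit + 0 {
            path = $0
            sub(/: [0-9]+ extents? found$/, "", path)
            printf "%s\t%d\n", path, $(NF-2)
        }'
}

# Intended use (not run here):
#   filefrag /path/to/files/* | many_extents 10
```

Output is one tab-separated "path, extent count" line per file over the limit.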

The moment your IO has to seek on hard drives, they just suck, as you experienced.

In 'almost' every user-based usage scenario an SSD is going to perform better than an HDD. About the only time an HDD is better is when you're writing out large singular data files. But even then you have to be cautious: if the drive is shared with other read/write operations, performance again drops off a cliff.