Hacker News
by Dylan16807 5412 days ago
I tried ZFS dedup, but writing files was something like 20x slower than on ZFS without dedup, and this was on under ten gigabytes of files. I don't know if I somehow had the cache settings wrong or what the problem was, but I didn't manage to fix it, even after trying both the FUSE and kernel versions. (On Ubuntu 11.04)

Yeah, random access on hard disks is awfully slow, and dedup can cause lots of random access. If you have a bit of data, the hash table used for dedup can also be too big to fit into memory. Then ZFS puts it on disk and it gets even slower. Luckily there is a feature to use SSDs as a cache device (L2ARC) in this case.
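
To make the random-access point concrete, here is a toy sketch of block-level dedup in Python. It is not how ZFS implements its dedup table: the `DedupStore` class, the 4-byte block size, and the in-memory dict are all made up for illustration. The only point is that every block written costs a lookup by checksum, and checksums are spread uniformly over the table, so once the table lives on a spinning disk each lookup is effectively a random read.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps identical blocks once (hypothetical, not ZFS code)."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.table = {}   # checksum -> [block data, reference count]
        self.lookups = 0  # number of table lookups performed so far

    def write(self, data):
        """Split data into blocks and return the list of block checksums."""
        refs = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            key = hashlib.sha256(block).hexdigest()
            self.lookups += 1  # every block written costs one table lookup
            if key in self.table:
                self.table[key][1] += 1       # duplicate: bump the refcount only
            else:
                self.table[key] = [block, 1]  # new block: store it once
            refs.append(key)
        return refs

    def read(self, refs):
        """Reassemble the original data from its block checksums."""
        return b"".join(self.table[k][0] for k in refs)


store = DedupStore()
refs = store.write(b"AAAABBBBAAAAAAAA")
print(len(refs), len(store.table), store.lookups)  # 4 2 4
```

Four blocks are written, only two are stored, and all four writes had to consult the table.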
The tricky part seems to be 'too big to fit into memory'. From what I understood and calculated the dedup tables on my system should have been well under 100MB, and the amount of memory designated for metadata was over 350MB, yet the performance was terrible.
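
For reference, a back-of-the-envelope estimate along those lines can be sketched as below. The 320 bytes per table entry is a commonly quoted rule of thumb, not an exact figure, and the 128 KiB block size assumes the default recordsize with every block unique. Both numbers are assumptions, and `estimate_ddt_bytes` is a made-up helper name.

```python
def estimate_ddt_bytes(data_bytes, block_bytes=128 * 1024, entry_bytes=320):
    """Rough worst-case dedup table size: one entry per unique block.

    entry_bytes=320 is an assumed rule-of-thumb figure, not a measured value.
    """
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * entry_bytes


ten_gib = 10 * 1024**3
size = estimate_ddt_bytes(ten_gib)
print(f"{size / 1024**2:.1f} MiB")  # 25.0 MiB
```

Under these assumptions ten gigabytes of data comes out far below 100MB. The estimate is very sensitive to block size, though: with 8 KiB blocks the same calculation gives 400 MiB, so a dataset made of many small files or small records could look quite different.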
Based on my testing (not published anywhere, sorry), ZFS dedup works best when you enable compression. With compression, it's only slightly slower than without dedup.
I did have compression on. Good to know that in some cases dedup will perform quite well. Was that with an SSD?

My best guess is that I either ruined the configuration in some way or dedup and only dedup reacted horribly to being in a virtual machine.

ZFS is designed to have lots of horsepower and memory thrown at it: big servers, available CPU power, lots of ECC RAM. If there's going to be an SSD allocated as a cache disk, it's probably expected to be huge and enterprisey too.

ZFS is awesome, but some features will be disappointing unless you are dealing with adequate resources.