by kderbe
592 days ago
I clicked because of the bait-y title, but ended up reading pretty much the whole post, even though I have no reason to be interested in ZFS. (I skipped most of the stuff about logs...) Everything was explained clearly, I enjoyed the writing style, and the mobile CSS theme was particularly pleasing to my eyes. (It appears to be the Pixyll theme with text set to the all-important #000, although I shouldn't derail this discussion with opinions on contrast ratios...) For less patient readers, note that the concise summary is at the bottom of the post, not the top.
> As we’ve seen from the last 7000+ words, the overheads are not trivial. Even with all these changes, you still need to have a lot of deduplicated blocks to offset the weight of all the unique entries in your dedup table. [...] what might surprise you is how rare it is to find blocks eligible for deduplication are on most general purpose workloads.
> But the real reason you probably don’t want dedup these days is because since OpenZFS 2.2 we have the BRT (aka “block cloning” aka “reflinks”). [...] it’s actually pretty rare these days that you have a write operation coming from some kind of copy operation, but you don’t know that came from a copy operation. [...] [This isn't] saving as much raw data as dedup would get me, though it’s pretty close. But I’m not spending a fortune tracking all those uncloned and forgotten blocks.
> [Dedup is only useful if] you have a very very specific workload where data is heavily duplicated and clients can’t or won’t give direct “copy me!” signal
The section labeled "summary" is fairly vague and imo doesn't do the article justice. I hope these quotes from near the end of the article give a more concrete idea of why (not) to use dedup.
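To make the first quote a bit more concrete, here is a rough sketch (mine, not from the article) of how you might gauge whether a blob of data has enough repeated blocks to be worth a dedup table at all. It hashes fixed-size blocks and counts how many are unique versus repeated. The function name `dedup_stats` and the returned keys are made up for illustration, and it assumes fixed 128 KiB blocks with SHA-256; real ZFS behavior also depends on recordsize, compression, and alignment, so treat the numbers as a ballpark only.

```python
import hashlib
from collections import Counter


def dedup_stats(data: bytes, block_size: int = 128 * 1024) -> dict:
    """Count unique vs. repeated fixed-size blocks in a byte string.

    Hypothetical helper: this only approximates block-level dedup and
    ignores compression, alignment, and variable record sizes.
    """
    # One hash per block; the Counter maps each hash to how often it occurs.
    counts = Counter(
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    )
    total = sum(counts.values())
    unique = len(counts)
    return {
        "total_blocks": total,
        # Each unique block would need its own entry in a dedup table.
        "unique_blocks": unique,
        # Blocks that would not need to be stored a second time.
        "blocks_saved": total - unique,
        # Entries tracked in the table that never pay off (seen only once).
        "singleton_entries": sum(1 for c in counts.values() if c == 1),
    }
```

If `singleton_entries` dwarfs `blocks_saved`, you are in the situation the article describes: lots of table entries to track and very little space saved in return.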