| HN Mirror



	by kderbe 592 days ago
	I clicked because of the bait-y title, but ended up reading pretty much the whole post, even though I have no reason to be interested in ZFS. (I skipped most of the stuff about logs...) Everything was explained clearly, I enjoyed the writing style, and the mobile CSS theme was particularly pleasing to my eyes. (It appears to be Pixyll theme with text set to the all-important #000, although I shouldn't derail this discussion with opinions on contrast ratios...) For less patient readers, note that the concise summary is at the bottom of the post, not the top.

2 comments

Aachen 592 days ago

That being:

> As we’ve seen from the last 7000+ words, the overheads are not trivial. Even with all these changes, you still need to have a lot of deduplicated blocks to offset the weight of all the unique entries in your dedup table. [...] what might surprise you is how rare it is to find blocks eligible for deduplication are on most general purpose workloads.

> But the real reason you probably don’t want dedup these days is because since OpenZFS 2.2 we have the BRT (aka “block cloning” aka “reflinks”). [...] it’s actually pretty rare these days that you have a write operation coming from some kind of copy operation, but you don’t know that came from a copy operation. [...] [This isn't] saving as much raw data as dedup would get me, though it’s pretty close. But I’m not spending a fortune tracking all those uncloned and forgotten blocks.

> [Dedup is only useful if] you have a very very specific workload where data is heavily duplicated and clients can’t or won’t give direct “copy me!” signal

The section labeled "summary" imo doesn't do the article justice: it's fairly vague. I hope these quotes from near the end of the article give a more concrete idea of why (not) to use it.


londons_explore 592 days ago

> offset the weight of all the unique entries in your dedup table

Didn't read the 7000 words... But isn't the dedup table in the form of a bunch of bloom filters so the whole dedup table can be stored with ~1 bit per block?

When you know there is likely a duplicate, you can create a table of blocks where there is a likely duplicate, and find all the duplicates in a single scan later.

That saves having massive amounts of accounting overhead storing any per-block metadata.
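For readers who want to see the idea concretely, here is a minimal Python sketch of the scheme this comment describes: a Bloom filter flags checksums that have probably been seen before, and a later scan confirms the real duplicates exactly. It only illustrates the comment's proposal and is not how OpenZFS implements its dedup table. `BloomFilter`, `find_duplicates`, and the `num_bits`/`num_hashes` defaults are hypothetical names and values chosen for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_and_check(self, item):
        """Set the item's bits; return True if all were already set."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def find_duplicates(blocks, num_bits=1024, num_hashes=3):
    """Return sorted lists of block indexes that hold identical data."""
    bloom = BloomFilter(num_bits, num_hashes)
    candidates = set()

    # Pass 1 ("write time"): keep only the filter, plus the checksums
    # the filter says were probably seen before.
    for block in blocks:
        checksum = hashlib.sha256(block).digest()
        if bloom.add_and_check(checksum):
            candidates.add(checksum)

    # Pass 2 ("later scan"): visit every block once and group the ones
    # whose checksum is a candidate. Bloom false positives end up in a
    # group of size one and are dropped.
    locations = {}
    for index, block in enumerate(blocks):
        checksum = hashlib.sha256(block).digest()
        if checksum in candidates:
            locations.setdefault(checksum, []).append(index)

    return sorted(idxs for idxs in locations.values() if len(idxs) > 1)


if __name__ == "__main__":
    sample = [b"a", b"b", b"a", b"c", b"b", b"a"]
    print(find_duplicates(sample))
```

The filter size is a tunable here. Fewer bits per block means more false-positive candidates for the second pass to discard, so the sketch trades write-time memory for extra work in the later scan.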


emptiestplace 592 days ago

It scrolls horizontally :(


going_north 592 days ago

It's because of this element in one of the final sections [1]:

    <code>kstat.zfs.<pool>.misc.ddt_stats_<checksum></code>

Typesetting code on a narrow screen is tricky!

[1] https://despairlabs.com/blog/posts/2024-10-27-openzfs-dedup-...


ThePowerOfFuet 592 days ago

Not on Firefox on Android it doesn't.


dspillett 592 days ago

It does in Chrome on Android (1080 px wide screen, standard ppi & zoom levels), but not by enough to affect the main body text (scrolling just reveals more margin). So it may scroll for you too, just not enough that you noticed.

Since it scrolls here, even if inconsequentially, it might be worse on a smaller device with a narrower screen and/or different ppi settings.
