Hacker News new | ask | show | jobs
by blackmanta 303 days ago
I am working to improve the CLI tools to make getting this information easier but I have stored the yam repo in yams with multiple snapshots and metadata tags and I am seeing about 32% storage savings.
1 comments

Cool. I have no idea what "stored the yam repo in yams" means. What do you mean by "block-level deduplication"? What is a block?
I stored the codebase for yams in the tool. The "blocks" are content-defined blocks/chunks, not filesystem blocks. They're variable-size chunks (typically 4-64KB) created using Rabin fingerprinting to find natural content boundaries. This enables deduplication across files that share similar content.