by derefr 4049 days ago
> I think PBI does de-duplication at the package manager level by manipulating hard-links to common files, rather than installing multiple copies.

Which is, itself, a bad reinvention of Plan 9's Venti storage system. Having one, or two, or a million files on disk containing the same data should take up only as much space as having just one. "Hard links" are a policy-level way to express shared mutability; deduplication of backing storage, meanwhile, should be a mechanism-level implementation detail.
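
To make the mechanism-level idea concrete, here is a minimal in-memory sketch of content-addressed block storage, the general technique Venti is built on. It is not Venti's actual interface: the names BlockStore, write_file and read_file, the 4096-byte block size, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: each block is keyed by the hash of its bytes."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical data always maps to the same key, so it is stored only once.
        self.blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


def write_file(store: BlockStore, data: bytes, block_size: int = 4096) -> list:
    """Split data into fixed-size blocks and return the list of block keys."""
    return [
        store.put(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_file(store: BlockStore, keys: list) -> bytes:
    """Reassemble a file from its list of block keys."""
    return b"".join(store.get(k) for k in keys)
```

Each "file" here is just its own list of keys, so writing the same bytes a million times adds key lists but no new blocks, and changing one file later never affects another. Sharing stays invisible to the user, which is the distinction from hard links.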

2 comments

ZFS supports block-level deduplication, but it comes with heavy memory and performance requirements. File-level deduplication with hard links is lightweight and requires no special support (besides a filesystem that supports hard links, obviously).
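
For illustration, a rough sketch of file-level deduplication with hard links might look like the following. This is not how PBI does it; the function names, the SHA-256 comparison, and the ".dedupe-tmp" suffix are assumptions, and the sketch ignores ownership, permissions and timestamps, which a real tool would have to compare before linking.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_with_hard_links(root):
    """Replace duplicate regular files under root with hard links to the first copy seen.

    Returns the number of files replaced. Everything must live on one
    filesystem, since hard links cannot cross filesystem boundaries.
    """
    seen = {}  # content digest -> first path found with that content
    replaced = 0
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        original = seen.setdefault(file_digest(path), path)
        if original == path or os.path.samefile(original, path):
            continue  # first copy, or already linked to it
        # Create the link under a temporary name, then rename it over the
        # duplicate, so the path never goes missing partway through.
        tmp = path.with_name(path.name + ".dedupe-tmp")
        os.link(original, tmp)
        os.replace(tmp, path)
        replaced += 1
    return replaced
```

The catch is the one raised upthread: after linking, a write through one path changes every other path too, so this is only safe for files that are replaced wholesale rather than edited in place.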
Well, there is something to be said for the 'worse is better' approach.