by edanm 5618 days ago
I doubt that's true for Dropbox, at least.

Consider that Dropbox gives you 50 GB for the basic plan. I'm guessing most people don't back up videos, games or their OS using that space, but rather back up their documents, projects they're working on in whatever field, photos, and music.

Of those, only music offers a real chance for deduplication, and that's assuming you can figure out that two music files with different ID-tags are the same.

(Come to think of it, in my Dropbox, music easily takes up 70% of my quota, so maybe it is worthwhile after all.)


You need a decent hash of every file anyway (to check for changes, etc.), so it's pretty trivial to deduplicate. I don't think you'd need to do stuff like check the ID-tag.
But then my music files, which I edit the ID-tags for, will hash differently from other people's copies.
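
To make the two points above concrete, here is a minimal sketch of whole-file deduplication in Python. Nothing here is Dropbox's actual implementation: `FileStore`, the fake `TAG:` prefix, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash the entire file contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


class FileStore:
    """Hypothetical content-addressed store: one blob per distinct digest."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = file_digest(data)
        # Store the bytes only if this digest has not been seen before.
        self.blobs.setdefault(digest, data)
        return digest


# Stand-in for identical audio data shared by three users.
audio = b"\x00\x01" * 1000

song_a = b"TAG:Artist A" + audio  # user 1
song_b = b"TAG:Artist A" + audio  # user 2, byte-identical copy
song_c = b"TAG:artist a" + audio  # user 3, edited the tag

store = FileStore()
ids = [store.put(song) for song in (song_a, song_b, song_c)]

# The identical copies collapse into one blob; the edited tag forces a second.
stored_copies = len(store.blobs)
```

The byte-identical copies share one digest, so deduplication falls out of the hash for free. A single edited tag byte changes the whole-file digest, so the third copy is stored in full even though the audio is the same.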

It would be interesting for Dropbox to release numbers on how many music files are identical between different people.

I believe they could (and probably do?) de-dupe below the file level to handle this issue.
Good point. The ID3 tag is probably only in the header anyway. They'll just do it at a block level.
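
A rough sketch of what block-level deduplication could look like, in Python. The 4096-byte block size, the `BlockStore` class, and the fixed 128-byte tag header are all hypothetical; how Dropbox actually chunks files is not stated in this thread.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only
TAG_SIZE = 128     # assumed fixed-size tag header


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BlockStore:
    """Hypothetical store that keeps each distinct block once."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Store a file and return its recipe: the ordered block digests."""
        recipe = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild a file from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)


# Deterministic stand-in for 32 KiB of audio with non-repeating content.
audio = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1024))


def make_song(tag: bytes) -> bytes:
    # Pad the tag to a fixed size so the audio starts at the same offset.
    return tag.ljust(TAG_SIZE, b"\x00") + audio


song_a = make_song(b"Artist A")
song_b = make_song(b"artist a")  # same audio, edited tag

store = BlockStore()
recipe_a = store.put(song_a)
recipe_b = store.put(song_b)

# Only the first block (the one holding the tag) differs between the files.
shared = sum(1 for a, b in zip(recipe_a, recipe_b) if a == b)
```

With these assumptions each file splits into 9 blocks, 8 of which are shared, so the second upload costs one extra block rather than a whole file. Note the caveat: this only works because the padded tag keeps the audio at the same offset. If an edit changed the tag's length, every fixed-size block after it would shift and stop matching, which is the problem content-defined chunking is meant to address.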