by sigil 5542 days ago
But you could prevent reuploads with per-user de-duplication, while avoiding the privacy issue of cross-user de-duplication.

I could see why this would be more work to implement (you have to key on user+contenthash), but it would still be interesting to know how much Dropbox and its users actually benefit from cross-user de-duplication.
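To make the user+contenthash idea concrete, here is a minimal sketch of the two keying schemes. It is a toy in-memory store, not how Dropbox actually does it; `DedupStore`, `per_user`, and `bytes_uploaded` are names made up for illustration, and SHA-256 is an assumed choice of hash.

```python
import hashlib


class DedupStore:
    """Toy in-memory store (hypothetical). per_user picks the dedup key."""

    def __init__(self, per_user=True):
        self.per_user = per_user
        self.blobs = {}          # dedup key -> file bytes
        self.bytes_uploaded = 0  # bytes that actually had to be transferred

    def _key(self, user, data):
        digest = hashlib.sha256(data).hexdigest()
        # Per-user: key on (user, contenthash). Cross-user: contenthash alone.
        return (user, digest) if self.per_user else digest

    def upload(self, user, data):
        """Return True if the bytes had to be sent, False if deduplicated."""
        key = self._key(user, data)
        if key in self.blobs:
            return False
        self.blobs[key] = data
        self.bytes_uploaded += len(data)
        return True
```

With `per_user=True`, a second upload of the same bytes by the same user is skipped, but another user uploading those bytes still transfers them. With `per_user=False`, the second user's upload is skipped too, so the skip itself tells that user the file already exists on the server, which is presumably where the privacy concern comes from. A real client would hash locally and send only the hash first; this sketch skips that step.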

1 comment

I have a hard time understanding this line of argument.

So per-user deduplication would only mean that I need not upload the same file twice to my own account? What's the use of that?

I keep some of my 'paid for' software installables backed up in my Dropbox, and they tot up to ~1.5 GB (the Humble Indie Bundle games). When I started the upload, however, it took maybe 5 seconds because of cross-user deduplication, and I am super grateful to them for this feature.

I imagine this feature saves users tons of bandwidth, as most of the people I know use Dropbox for backing up important software, rare music and videos.

> I have a hard time understanding this line of argument.

It's not a line of argument. It's a line of inquiry. You've given anecdotal evidence that cross-user deduplication benefits you and people you know, but what about some actual numbers from Dropbox?

Producing actual numbers -- e.g., "cross-user deduplication saves our users 30% of their upload time and bandwidth, on average" -- seems like a great way for Dropbox to counter this issue.

> I imagine this feature saves users tons of bandwidth, as most of the people I know use Dropbox for backing up important software, rare music and videos.

We don't have to imagine! Let there be numbers!

Also -- "rare" music and videos that everyone's uploading duplicates of? ;)