Hacker News
by cks 5505 days ago
Why don't cloud storage providers encrypt and store all user content in such a way that it can only be read by the user? I doubt there are any ethical benefits to storing the data in such a way that they (the service providers) can read it. Even if there are, I doubt that users would agree to such usage.
3 comments

I think some storage providers might not be comfortable with the situation where they'd have to tell a paying customer that they can't help them at all when the customer loses the key - they choose simplicity as a feature over appealing to the security-conscious.

As for Dropbox, they need to be able to read the files to serve them to you from their website.

Can't they implement client-side encryption? A JavaScript-based encryption mechanism could be interesting.
And completely unusable for the average consumer (needing them to keep track of their key, and somehow secure their key, and somehow pass their key to every browser they use.)
Because then they can't store only one copy of each duplicate file, skip uploading duplicate files, work with file hashes knowing that the same file will have the same hash, and less pleasantly, they can't do analytics, advertising and recommendations on the types and quantities of stored files.
That's not true. As in a lot of crypto problems, there are powerful workarounds that require a lot of work. See here for one idea: http://news.ycombinator.com/item?id=2461713

Unfortunately, this starts to become a cat-and-mouse game.

I won't pretend I can follow all the implications of that scheme, but it looks like all files are encrypted with an unchangeable hash of the file as the key?

So that all the RIAA has to do is provide a sample mp3 and then DropBox can see who has AES(F, H(F)) stored. Only the files with user generated unknown content can remain mysterious to DropBox, widely used files cannot.

And since you use aes(f, h(f)) you can't change the encryption key on any particular file.
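To make the AES(F, H(F)) idea concrete, here is a minimal sketch. It assumes SHA-256 for H and uses a toy SHA-256 counter-mode keystream as a stand-in for AES, only so it runs with the standard library; the names `keystream`, `convergent_encrypt` and `convergent_decrypt` are made up for illustration and are not from the linked scheme.

```python
import hashlib
from typing import Tuple


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Stand-in for AES, not for real use."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def convergent_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt F under a key derived from F itself, i.e. key = H(F)."""
    key = hashlib.sha256(plaintext).digest()
    stream = keystream(key, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return key, ciphertext


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream recovers the plaintext."""
    stream = keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Two users encrypting the same file independently (placeholder bytes).
song = b"the same popular song bytes"
key_a, blob_a = convergent_encrypt(song)
key_b, blob_b = convergent_encrypt(song)
```

Both users end up with identical ciphertext, which is what lets the server dedup. It is also why the key for a given file can never be rotated: the key is a function of the content.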

And since the client software needs to use the local DB and since they have the list of files you uploaded, they have most of the plaintext known if they want to try to decrypt the DB maliciously.

But if they do want to, they can leak the password you type in to themselves anyway.

Also, how would this scheme interact with DropBox's differential upload and revision tracking feature?

ZFS does encryption and dedup too, so yes, it is possible, but a secure, trustworthy encrypted DropBox where they also do the encryption part?

So that all the RIAA has to do is provide a sample mp3 and then DropBox can see who has AES(F, H(F)) stored. Only the files with user generated unknown content can remain mysterious to DropBox, widely used files cannot.

No, Dropbox does not know who has what hash. The list of files you have is encrypted by your own key. I realize the scheme is not fixed and these are only ideas, since no one has actually published a paper on this.

Also, how would this scheme interact with DropBox's differential upload and revision tracking feature?

Yes, unfortunately there is always a trade-off between security and usability. Things like encrypted volumes are not very friendly or intuitive, but they provide security. Similarly, lots of neat tricks that Dropbox uses might become void. But at least dedup, which was one of their strong features, still works.

No, Dropbox does not know who has what hash.

They don't need to - you upload AES(F, H(F)), so if the RIAA give DropBox a sample "Beyonce: Pop Song #7.mp3" file, DropBox can do H(F), then do AES(F, H(F)), then say "do we have this? Yes. Who uploaded it? Accounts adambloggs1, beatricebloggs2, carltonbloggs3, delaneybloggs4".

They couldn't trawl for the RIAA by filename only, or by file hash only, but they could trawl from an example file.
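A rough sketch of that "trawl from an example file" step, assuming the provider records which account uploaded each ciphertext. `Provider`, `who_has` and `convergent_blob` are hypothetical names, and the cipher is again a deterministic toy stand-in for AES(F, H(F)).

```python
import hashlib


def convergent_blob(plaintext: bytes) -> bytes:
    """Deterministic stand-in for AES(F, H(F)): same plaintext, same ciphertext."""
    key = hashlib.sha256(plaintext).digest()
    out = bytearray()
    for offset in range(0, len(plaintext), 32):
        block = plaintext[offset:offset + 32]
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out += bytes(b ^ p for b, p in zip(block, pad))
    return bytes(out)


class Provider:
    """Server side: stores each distinct ciphertext once and remembers uploaders."""

    def __init__(self):
        self.owners = {}  # ciphertext -> set of account names (assumed to be logged)

    def upload(self, account: str, blob: bytes) -> None:
        self.owners.setdefault(blob, set()).add(account)

    def who_has(self, sample_plaintext: bytes) -> set:
        # Given a sample file, the provider can recompute the ciphertext itself.
        return self.owners.get(convergent_blob(sample_plaintext), set())


provider = Provider()
pop_song = b"widely shared mp3 bytes"      # placeholder content
diary = b"adambloggs1 private diary"       # content nobody else has
provider.upload("adambloggs1", convergent_blob(pop_song))
provider.upload("beatricebloggs2", convergent_blob(pop_song))
provider.upload("adambloggs1", convergent_blob(diary))

matches = provider.who_has(pop_song)
```

The diary stays opaque only because nobody can hand the provider a copy of it to encrypt and compare.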

The safe stuff would be your accounts - since there is nobody to provide an example file for them to hash/encrypt. (Except it wouldn't be totally safe since they could weaken the local database encryption or pass themselves the key to it, and you'd never know).

We can agree that they might be able to do it and keep DeDupe, though.

They don't need to - you upload AES(F, H(F)), so if the RIAA give DropBox a sample "Beyonce: Pop Song #7.mp3" file, DropBox can do H(F), then do AES(F, H(F)), then say "do we have this? Yes. Who uploaded it? Accounts adambloggs1, beatricebloggs2, carltonbloggs3, delaneybloggs4".

I've worked it out, and if I'm not mistaken, the table for adambloggs1, which holds (hash2(file1), hash2(file2), ..., hash2(file10)) for adambloggs1's 10 files, can be stored remotely, encrypted with the client's key (derived from his password in a secure way that Dropbox cannot reproduce). What this means is that whenever the client has to send hashes to Dropbox to sync files, he gets his encrypted database from Dropbox, decrypts it locally, and proceeds to give Dropbox the relevant hash information.
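Roughly what I mean, as a sketch only. It assumes PBKDF2 (from Python's hashlib) for the password-derived key and SHA-256 for hash2, and it uses a toy XOR keystream with no integrity check in place of a real cipher. `seal_table` and `open_table` are names I made up.

```python
import hashlib
import json
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Key derived from the password on the client; the server only sees the salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher (hash-based keystream XOR). Stand-in for AES, no authentication."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out += bytes(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal_table(table: list, password: str) -> dict:
    """Client side: encrypt the per-user hash table before storing it remotely."""
    salt, nonce = os.urandom(16), os.urandom(16)
    key = derive_key(password, salt)
    return {"salt": salt, "nonce": nonce,
            "blob": xor_stream(key, nonce, json.dumps(table).encode())}


def open_table(sealed: dict, password: str) -> list:
    """Client side: fetch the sealed table and decrypt it locally."""
    key = derive_key(password, sealed["salt"])
    return json.loads(xor_stream(key, sealed["nonce"], sealed["blob"]))


# hash2(file) for each of the user's files (placeholder contents).
file_hashes = [hashlib.sha256(f).hexdigest() for f in (b"file1 bytes", b"file2 bytes")]
sealed = seal_table(file_hashes, "correct horse battery staple")
restored = open_table(sealed, "correct horse battery staple")
```

The server only ever holds `sealed`; the plaintext table exists on the client, and only while syncing.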

There are definitely 2 problems that can compromise the system:

1. Dropbox decides to store your requests because of a subpoena (effectively they're logging you, which is not required for functionality). Then the encryption is useless.

2. Even if Dropbox does not log you, they can still collude and catch you in the act (i.e., an online attack).

So the solution is ugly but reasonable, and it has some weaknesses. Still, it is better than nothing.

This system makes sure that the RIAA cannot trawl by filename or hash only, unless Dropbox stores logs or the attack is carried out online.

There are cloud storage providers doing this. See, for instance, the tarsnap.com solution. Data is stored remotely, but the key stays on your end. In other words, it is cloud storage done right.