| This is what I was recalling; this method gives you a clever way to do it using the file itself as the key: > “Convergent encryption solves this problem in a very clever way: “The way to make sure that every unique user with the same file ends up with an encrypted version of that file that is also identical is to ensure they use the same key.
However, you can’t share keys between users, because that defeats the entire point; you need a common reference point between users that is unknown to anyone but those users. “The answer is to use the file itself: the system creates a hash of the file’s content, and that hash (a long string of characters derived from a known algorithm) is the key that is used to encrypt said file. “If every iCloud user uses this technique — and given that Apple implements the system, they do — then every iCloud user with the same file will produce the same encrypted file, given that they are using the same key (which is derived from the file itself); that means that Apple only needs to store one version of that file even as it makes said file available to everyone who “uploaded” it (in truth, because iCloud integration goes down to the device, the file is probably never actually uploaded at all — Apple just includes a reference to the file that already exists on its servers, thus saving a huge amount of money on both storage costs and bandwidth). “There is one huge flaw in convergent encryption, however, called “confirmation of file”: if you know the original file you by definition can identify the encrypted version of that file (because the key is derived from the file itself). When it comes to CSAM, though, this flaw is a feature: because Apple uses convergent encryption for its end-to-end encryption it can by definition do server-side scanning of files and exploit the “confirmation of file” flaw to confirm if CSAM exists, and, by extension, who “uploaded” it. Apple’s extremely low rates of CSAM reporting suggest that the company is not currently pursuing this approach, but it is the most obvious way to scan for CSAM given it has abandoned its on-device plan.” https://stratechery.com/2022/apple-icloud-encryption-csam-sc... |
The file encryption part was based on using a hash of the file as the key.
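For anyone who wants to see the idea concretely, here is a minimal sketch in Python. The key is the SHA-256 hash of the file contents, so two users holding the same file derive the same key and produce the same ciphertext. The function names (`derive_key`, `convergent_encrypt`, and so on) are made up for illustration, and the hash-based keystream is only a toy stand-in for a real cipher such as AES; this is an assumption-laden demo, not how any particular service implements it and not something to use in production.

```python
import hashlib


def derive_key(content: bytes) -> bytes:
    """Derive the encryption key from the file itself (its SHA-256 hash)."""
    return hashlib.sha256(content).digest()


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks.

    Assumption: this stands in for a real cipher. It is deterministic,
    which is the property convergent encryption relies on.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def convergent_encrypt(content: bytes) -> tuple[bytes, bytes]:
    """Return (key, ciphertext); both depend only on the content."""
    key = derive_key(content)
    return key, _xor(key, content)


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Recover the plaintext given the content-derived key."""
    return _xor(key, ciphertext)


# Two users encrypt the same file independently.
alice_key, alice_ct = convergent_encrypt(b"same file contents")
bob_key, bob_ct = convergent_encrypt(b"same file contents")

# Identical ciphertexts mean the server can store a single copy.
duplicates = alice_ct == bob_ct

# "Confirmation of file": anyone who already has the plaintext can
# recompute the ciphertext and check whether the server holds it.
_, probe_ct = convergent_encrypt(b"same file contents")
confirmed = probe_ct == alice_ct
```

The determinism is both the point and the weakness: it is what allows deduplication, and it is also what lets someone who already has a file confirm that a stored ciphertext corresponds to it.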
It's always nice to find out later that one's quick amateur idea was an independent rediscovery of something legit. Now that I've learned it is called "convergent encryption", Googling tells me it goes back to 1995 and a Stac patent.
[1] https://news.ycombinator.com/item?id=2461713