A hash collision allows you to create material that matches CSAM signatures, without being CSAM. This opens up a new class of attacks.
Specifically, many criminal actors don't touch CSAM because it's wrong. But some of these criminal actors will happily abuse legal systems, e.g. SWATTing.
I would gladly have a mobile phone full of memes that have been modified to match, just for the lulz. I honestly think every meme should be put through just to have "illegal memes"
The visual derivative is just a resized, very-low-resolution version of the uploaded image. "Matching the visual derivative" is completely meaningless. The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.
If enough signatures match, Apple employees can decrypt the visual derivatives, and see if these extremely low resolution images look to the naked eye like they could come from CSAM. If so, they alert the authorities. Given a way to obtain hash collisions, generating non-CSAM images that pass the visual derivative inspection is completely trivial.
Probably a mistake to say things like this, when the public documentation contradicts you.
> The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.
Bullshit.
Here is the relevant paragraph from Apple’s documentation:
“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”
I just want to be clear if I understand this... many images can result in the same hash, but the hash can and will be reversible into one image? And that image is a low res porn photo derived from the algorithm's guesswork? So once a hash matches they don't check if there was a collision and the photo is completely unrelated, they just see the CG porn? If that's the case then why even look at the derived image?
No, this is not what's going on at all. The employees never see the original photos in the government CSAM hash database. Apple doesn't even have these photos: it's precisely the kind of content that they don't want to store on their servers. If some conditions are satisfied, the employees gain access to the visual derivatives (low-resolution copies) of your photos, and they judge whether these look like they could plausibly be related to CSAM materials.
The exact details of the algorithm are not public, but based on the technical summary that Apple provided, it almost certainly goes something like this.
Your device generates a secret number X. This secret is split into multiple fragments using a sharing scheme. Your device uses this secret number every time you upload a photo to iCloud, as follows:
1. Your device hashes the photo using a (many-to-one, hence irreversible) perceptual hash.
2. Your device also generates a fixed-size low resolution version of your image (the "visual derivative"). The visual derivative is encrypted using the secret X.
3. Your device encrypts some of your personally identifying information (device ids, Apple account, phone number, etc.) using X.
4. The hash, the encrypted visual derivative, and the encrypted personally identifying information are combined into what Apple calls the "safety voucher". A fragment of your key is attached to the safety voucher, and the voucher is sent to Apple over the internet. The safety vouchers are sent in a "blinded" way (with another encryption key derived using a Private Set Intersection scheme detailed in the technical summary), so that Apple cannot link them to specific files, devices or user accounts unless there's a match.
5. Apple receives the safety voucher. If the hash in the received safety voucher matches that of known CSAM content in the government-provided hash database (as determined by the private set intersection scheme), the voucher is saved and stored by Apple, and the fragment of your secret key X is revealed and saved. (You'd assume that they filter out or discard your voucher if there's no match, but the technical summary doesn't explicitly confirm this, so they may store it and use it in the future to run further scans.)
6. If your account uploads a large number of matching vouchers, then Apple will gather enough fragments to reassemble your entire secret key X. Now that they know your secret key, they can use it to decrypt the "visual derivatives" stored in all your saved vouchers.
7. An Apple employee will then inspect the "visual derivatives", and if your photos look like CSAM (more precisely, this employee can't rule out by visual inspection that your photos are CSAM-related), they will proceed to use your secret key X (which they now know) to decrypt the personally revealing information contained in your safety voucher, and report you to the authorities.
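The key-fragment mechanism in steps 4 to 6 can be sketched with Shamir's secret sharing, a standard threshold scheme. This is only an illustration of the idea: whether Apple uses this exact scheme is an assumption, and the field size, the helper names (`split_secret`, `recover_secret`) and the threshold of 3 are all made up to keep the example short.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an assumption

def split_secret(secret, threshold, num_shares):
    """Split `secret` into shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# One fragment travels with each voucher; here any 3 fragments rebuild X.
X = secrets.randbelow(PRIME)
fragments = split_secret(X, threshold=3, num_shares=10)
assert recover_secret(fragments[:3]) == X
```

With fewer fragments than the threshold, interpolation yields an unrelated number, which is why a handful of matching vouchers reveals nothing about X.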
Keep in mind that the employee looking at the visual derivative does not, and cannot, know what the original image is supposed to look like. The only judgment they get to make is whether the low-resolution visual derivative of your photo looks like it can plausibly be CSAM-related or not. Plainly speaking, they will check if a small, say 48x48 pixel, thumbnail of your photo looks vaguely like naked people or not.
That bit you quoted seems to be actually correct. It does not mention visual derivatives at all.
That said I think your statement is a bit too strong, but generally true. A hash collision is not going to inherently be visually confusing. However you claim that it is impossible for an image to be both visually confusing and a hash collision, which seems unlikely. The real question is going to be how much more effort it takes to do both.
Unless you're relying on it being computationally infeasible, but I'm not sure we know enough to consider that true at this point. Usually when we make statements on those grounds we do so with substantial proof. I don't think we know enough to do so here. I'm not even sure how feasible it is when you throw DL into the mix.
From the docs: “as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database.”
Most people wouldn't of course. In this scenario you'd get someone to download the CSAM unknowingly. If they have iCloud sync it automatically uploads to iCloud, thereby triggering the system. At that point the authorities will be alerted by Apple, and you can inform media outlets. They in turn will ask law enforcement who will confirm the investigation, and the reputation of the person investigated will be tarnished.
Also as dannyw pointed out, you don't even have to send CSAM to trigger the system. If they found you you would still be charged, but not with possession of CSAM.
The sender would of course be charged with wasting police efforts, defamation attempts+++. In the case of false positives the receiver of course wouldn't be charged, it's more about the fact that this system can be manipulated with too much ease. Even if you're not charged, an investigation takes time away from already limited law enforcement resources. I'm also not interested in buying products from a company that blatantly spies on me. Today it's CSAM, but as others have pointed out, the hashes can be changed to look for anything.
Well that depends on the situation. Regardless the sender would be charged if found, but if they were able to get legitimate CSAM on the receiver's phone the receiver could possibly be charged too, or at least investigated. Just the idea of getting investigated in these kinds of attacks, much less being exposed publicly as being under investigation is a horrible thought.
I meant * could *. My point is that social engineering is a clear weak link in this system. They can also be sent regular photos whose hash matches the database, or use this repo to transform a regular pornographic photo's hash, making it hard for manual confirmation on Apple's part.
What kind of social engineering would lead an innocent person to install malware on their devices? Or do you think people like that want to take part in an illegal DDoS botnet?
I think there’s a difference between “I’ll click this totally legit button to protect my computer from viruses” and “I’ll save this picture of a child being raped to my photo library.”
A lot of people may not know how to avoid malware. But I don’t think very many of them would be so inept as to accidentally long press on child porn and tap “Add to Photos”.
If some commenters can be believed about their experience with the database, there are a bunch of completely innocuous images in it because they're from the same photosets or distributed alongside CSAM.
Is that enough to cause an investigation? Maybe, maybe not, but I wouldn't want it to be a risk.
Photos in the database are classified for their content. Only images classified as A1 (A: prepubescent minor, 1: sex act) are being included in the hash set on iOS. So this doesn't even include A2 (2: lascivious exhibition), B1 or B2 (B: pubescent minor) let alone images which are in the database and aren't classified as any of A1, A2, B1 or B2.
While I've no doubt that there's a lot of "before and after" images (which are still technically CSAM even if they're not strictly child porn) and possibly many innocuous images, they would not have been flagged as "A1".
I'm sure there's probably still a few images flagged as A1 which shouldn't be in the database at all, but that number is going to be small. How many of these incorrectly flagged images are going to make their way into your photo library? One? Two?
You need 30 in order for your account to be flagged.
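To put the threshold in perspective, here is a rough back-of-the-envelope sketch. It assumes each photo matches independently with some per-image false-match probability `p`. Both `p` and the library size in the example call are made-up numbers for illustration, not figures from Apple or from this thread; only the threshold of 30 comes from the discussion above.

```python
from math import lgamma, log, log1p, exp

def log_binom_pmf(i, n, p):
    """log of P(exactly i matches among n photos), computed in log space."""
    log_comb = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_comb + i * log(p) + (n - i) * log1p(-p)

def prob_at_least(k, n, p):
    """P(at least k matches among n independent photos), for 0 < p < 1."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))

# Hypothetical numbers: 20,000 photos, one-in-a-million accidental match rate.
accidental = prob_at_least(30, 20_000, 1e-6)
print(f"chance of 30+ accidental matches: {accidental:.3e}")
```

The independence assumption is exactly what an attacker breaks by deliberately sending colliding images, so this only says something about accidental matches.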
Lending your phone to someone for a call, then a quick airdrop. Legitimate-looking emails with buttons. There's probably a list somewhere of proven attack vectors.
I posted another comment that was misunderstood as well. Folks, no one is proposing to download actual CSAM images to your photo lib. You could be duped thinking you downloaded an image of a beautiful sunset which was carefully manipulated to match the hash of an actual CSAM image.
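For anyone wondering how two different pictures can share a hash at all, here is a toy example using an average hash (aHash). This is far simpler than NeuralHash and is not the algorithm Apple uses; the 2x2 pixel grids are made up purely to show that a perceptual hash is many-to-one.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if v > mean else "0" for v in flat)

# Two different grayscale "images" (hypothetical pixel values).
image_a = [[200, 10], [220, 30]]
image_b = [[130, 100], [255, 0]]

assert image_a != image_b
assert average_hash(image_a) == average_hash(image_b)  # both hash to "1010"
```

Real collision attacks work on the same principle, but search for a perturbation that steers a neural network's output instead of a brightness pattern.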
The even worse part here is that not only could it impact an image of a beautiful sunset, which would fail the human check, it could impact a low quality version of legal porn, which could easily pass the human check and get passed on to law enforcement.
A sufficiently advanced catfishing attack could probably take advantage of this to get someone raided and have all their electronics confiscated.
Just send someone a zip of photos and let them extract it...
This is the really scary part. Of course getting someone to download blobs that correlate to CSAM would be one thing, but downloading regular photos that have nefarious hashes is a trend /pol/ could start in an afternoon.
The parent was proposing to “just send known CSAM”.
But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what? The Apple reviewer would see they’re sunsets and you’d challenge the flag saying they’re sunsets. And if somehow NCMEC got involved, they’d see they’re just sunsets. And if law enforcement got involved, they’d see they’re just sunsets.
These proofs of concept might seem interesting from a ML pov, but all they do is just highlight why Apple put so many layers of checking into this.
> But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what?
A real attack would be to take legal porn images and make them collide with illegal images, so when a human goes to review the scaled down derivative images, those images very well look like they could be CSAM. Since there are many of them, they'd get sent to law enforcement. Then law enforcement would raid the victim's home and take all of their electronic devices in order to determine if they can be charged with a crime or not.
This is where the "fog of war" kicks in. What with doors being busted down, police departments making press releases, etc. I can easily imagine that the victim could be prosecuted, convicted and sent away because no-one understood the subtlety that their legal porn was not in fact CSAM.
The fog of war is largely in the realm of post-puberty minors, photos of which are not being included in Apple's corpus of hashes. I find it difficult to believe that anyone could mistake, or otherwise "fog of war", a photograph of an adult for one of a prepubescent minor.
And that's assuming someone develops a hash collision which doesn't substantially mangle the photograph like the example offered on Github.
Specifically, only images categorised as "A1" are being included in the hash set on iOS. The category definitions are:
A = prepubescent minor
B = pubescent minor
1 = sex act
2 = "lascivious exhibition"
> Specifically, only images categorised as "A1" are being included in the hash set on iOS.
Do we know that for sure?
Apple has changed their mind enough times in the last week and a half that I'm convinced they're in full on defensive "wing it and say whatever will get people off our backs!" mode.
You can't read the threat modeling PDF and conclude that it was run through the normal Apple document review process. It reads nothing like a standard Apple document - it reads like a bunch of sleep deprived people were told to whip it up and publish it.
I don't really want to do the research, so I'll take your word for it.
But by fog of war I was thinking more like the victim already has some sleazy (though marginally legal) stuff on their computer, or a search led to a find of pot in their house, or they lied to try and get out of the rap, or perhaps the FBI offered them a deal and they took it because they saw no way out, or perhaps they were simply an unlikable individual who the jury took a dislike to.
Basically that things are not always clear cut, and they come out of the wrong side of things, in a situation created by Apple's surveillance.
It would still be mentally draining to be accused of CP. Can you imagine how terrified one would be if they see a warning message with a blurred sunset? I don't know exactly how the system works but from Apple's press release, it hides the image and gives a warning to the user. This would not go well on social media.
Remember, while you are refuting all this to each party, you are actually in the process of defending yourself against one of the worst criminal accusations possible. Your life will be investigated, your devices will be investigated - the amount of stress and reputational harm this causes is insane.
As I've commented elsewhere, DoS can be easily mitigated by implementing another layer with basic object recognition to filter out false positive collisions.
> You could be duped thinking you downloaded an image of a beautiful sunset
If it was anything like the image used to demonstrate this technique on Github, it's unlikely that anyone would describe that sunset as "beautiful". They'd be more likely to describe it as "bugger, this JPEG file is corrupted."
It was quite literally less than 24h from "Oh, hey, I can collide this grey blob with a dog!" to "Hey, this thing that looks like a cat hashes to the same thing as this dog!"
You really think this is going to end at this proof of concept stage?
Of course it will get better. But it's not going to end at "Hey, this photograph of a sunset is visually unchanged" while now matching CSAM. That's just not plausible. It's not how these classifiers work.
Regardless, this whole thing is moot because there are two classifiers, only one of which has been made public. Before any matches can make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself.
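The order of checks described here can be sketched roughly as follows. This is a guess at the control flow only: the voucher fields, the exact-match lookups, and the `second_hash` callable are all hypothetical stand-ins, since the second hash and the server-side logic are not public.

```python
def select_for_human_review(vouchers, first_db, second_db, second_hash, threshold):
    """Return the vouchers whose derivatives would reach a human reviewer."""
    # Stage 1: the on-device hash is matched against the first database.
    matched = [v for v in vouchers if v["first_hash"] in first_db]
    if len(matched) < threshold:
        return []  # below the threshold nothing can be decrypted
    # Stage 2: an independent hash of each decrypted visual derivative
    # must also match before anything is shown to a reviewer.
    return [v for v in matched if second_hash(v["derivative"]) in second_db]
```

Under this reading, an image crafted to collide only under the public hash would be dropped at stage 2, which is the point being made above.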
Match the first classifier, and your file gets uploaded unencrypted to Apple. Which is fine if it's probable CSAM. But what if they switch efforts to combat, say, piracy?
So your concern is that Apple will start doing something evil at any moment without your consent. That's been true of any computer platform since the advent of software updates. You can raise such hypotheticals with any company you like.
That’s not how the technology works. The files are never decrypted. Instead, if enough hashes match, a “visual derivative” is revealed. What a “visual derivative” is hasn’t been explained, but most people seem to think it’s a low-res version of the file.