by simondotau 1773 days ago
Tens of thousands of automated detections per day? Unlikely. More likely tens per year. Remember, this isn't a porn detector combined with a child detector. It is hashing images in your cloud-enabled photo library and comparing those to hashes of images already known to child abuse authorities.
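To make the distinction concrete, here is a minimal sketch of matching against a list of known hashes, as opposed to classifying image content. It is an illustration only, not Apple's implementation: SHA-256 stands in for the fingerprint function (real systems reportedly use perceptual hashes that survive resizing and re-encoding), and the names `fingerprint`, `find_matches`, `known_hashes`, and `library` are hypothetical.

```python
import hashlib

def fingerprint(image_bytes: bytes) -> str:
    # Assumption: SHA-256 is only a stand-in. It matches byte-identical
    # files, whereas a perceptual hash would also match near-duplicates.
    return hashlib.sha256(image_bytes).hexdigest()

def find_matches(library: dict, known_hashes: set) -> list:
    # Return the names of library items whose fingerprint is in the known set.
    # Nothing here inspects what the image depicts.
    return [name for name, data in library.items()
            if fingerprint(data) in known_hashes]

# Hypothetical data: placeholder byte strings instead of real image files.
known_hashes = {fingerprint(b"known-image-1"), fingerprint(b"known-image-2")}
library = {"dog.jpg": b"family dog", "copy.jpg": b"known-image-1"}

matches = find_matches(library, known_hashes)
print(matches)
```

An image that is not already in the known set can never match, no matter what it shows.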

In addition, consider how monumentally unlikely it is for any CSAM enthusiast to copy these illicit photos into their phone's general camera roll alongside pictures of their family and dog. This is only going to catch the stupidest and sloppiest CSAM enthusiast.


For comparison to your "likely tens per year" number, Facebook is running the same kind of detectors and reports ~20 million instances a year: https://twitter.com/durumcrustulum/status/142377627884745113...
That doesn't seem to be the same kind of detectors at all.

"21.4 million of these reports were from Electronic Service Providers that report instances of apparent child sexual abuse material that they become aware of on their systems."

So those 20M seem to be images that Facebook looked at and determined to be CP. Apple's system is about comparing hashes against already known CP.

For the record: I don't support Apple's system here, but it's not the same kind of detection at all. Let's try to not make up random facts.

From the same thread: https://twitter.com/alexstamos/status/1424017125736280074

> The vast majority of Facebook NCMEC reports are hits for known CSAM using a couple of different perceptual fingerprints using both NCMEC's and FB's own hash banks.

Ah, I see. My apologies.
Facebook looked at them after they hash matched known CP. That is how all these providers do it.

If you think that this is 20 million people mashing the report button, that is almost certainly wrong.

That's a summary number of many kinds of reports, of which CSAM hash matches would be one part.

That summary number also includes accusations of child sex trafficking and online enticement. I wouldn't be surprised if reported allegations of trafficking and enticement were in excess of 99.9% of Facebook's reporting. But since they don't break it out, I can only guess.

Given that guesses aren't useful to anyone, it would be interesting if you know of any statistics from any of the major tech vendors, of the reporting frequency of just CSAM hash matches.

> of which CSAM hash matches would be one part.

The majority part:

https://twitter.com/alexstamos/status/1424017125736280074

> The vast majority of Facebook NCMEC reports are hits for known CSAM using a couple of different perceptual fingerprints using both NCMEC's and FB's own hash banks.

Fascinating. Thank you for providing the clarification. I still find that number to be perplexingly huge. If it's indeed correct, one hopes that Apple know what they're getting themselves in for.
> If it's indeed correct

Just admit you are wrong and leave it at that without continuing to try to put a false light on this.

Thanks for the kind suggestion, but I'm not going to concede anything on the basis of an assertion made by one person in one tweet, with zero supporting evidence, zero specificity, zero context.

Assuming that number is correct, it means there are orders of magnitude more reports than there are entries in the CSAM database. So even if I conceded that Facebook were reporting over 10 million CSAM images, how many distinct images does this represent? More than four? We have no idea.

How many of those four were actually illegal? Remember, there's a Venn diagram of CSAM and illegal. A non-sexual, non-nude photograph of a child about to be abused is CSAM but not illegal.

This is a serious topic; you don't seem to be taking it seriously.

Google is probably a better comparison. I can't find the source atm, but IIRC it was ~500k/year.
That wouldn't surprise me as Google's reporting would include everything seen by GoogleBot as it crawls the internet.
Ten thousand iOS users doing something stupid or sloppy per day (noting they don’t have to be stupid or sloppy in general for that to happen) would not meet the "monumentally unlikely" criterion for me. Also, this is not counting false positives, which are the premise of this thread.
Yes, being sloppy is common.

I don't know about anyone else but I've never had any issue with regular porn sloppily falling into my camera roll. And that's just regular legal porn. Maybe I'm more diligent than others but regardless, it's just not something that happens to me.

Being sloppy with material which you know is illegal? Material which, if stumbled upon by a loved one, could utterly ruin your life whether or not authorities are notified? Material which (I optimistically assume) is difficult to acquire and you'd know to guard with the most extreme trepidation? We're seriously expecting tens of thousands of CSAM enthusiasts to be sloppy with their deepest personal secret and have this stuff casually fall into their camera roll?

I'm not buying that.

A false positive will not have any effect. The threshold system they have means that they won’t be able to decrypt the results unless there are many separate matches.
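One way a threshold like that can be enforced cryptographically is with secret sharing. The sketch below is a toy Shamir-style scheme and an assumption on my part, not Apple's actual protocol: each match releases one share of a decryption key, and the key can only be reconstructed once `threshold` shares exist. The names `make_shares` and `recover` and the constants are hypothetical.

```python
import random

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus

def make_shares(secret: int, threshold: int, count: int, rng: random.Random):
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover(shares):
    # Lagrange interpolation at x = 0. Yields the secret only when at least
    # `threshold` distinct shares are supplied.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Toy parameters: key is recoverable only after 3 matches.
rng = random.Random(0)  # seeded for repeatability; not cryptographically secure
key = 123456789
shares = make_shares(key, threshold=3, count=5, rng=rng)

print(recover(shares[:3]) == key)  # enough shares
print(recover(shares[:2]) == key)  # below the threshold
```

With fewer shares than the threshold, interpolation produces an unrelated value, so a single stray match reveals nothing about the key. This needs Python 3.8+ for the modular inverse via `pow(den, -1, PRIME)`.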