Hacker News
by marsRoverDev 1492 days ago
My understanding of this technology was it was usually a hash or fingerprint of some form of the material, to be compared against a database of hashes of known-bad material. I doubt EU countries have the infrastructure or budget to run models across everything, no?
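
To make the "compare against a database of hashes" idea concrete, here is a minimal sketch. It assumes an exact cryptographic hash (SHA-256) and a made-up in-memory set called KNOWN_BAD; real scanning systems reportedly use perceptual hashes instead, so treat this only as an illustration of the lookup step, not of any actual deployment.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a hex SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical database: in practice this would be a list of hashes
# supplied by some outside agency, not something built locally.
KNOWN_BAD = {
    fingerprint(b"known-bad-file-1"),
    fingerprint(b"known-bad-file-2"),
}

def is_flagged(data: bytes) -> bool:
    """Flag a file only if its hash is already in the database."""
    return fingerprint(data) in KNOWN_BAD

print(is_flagged(b"known-bad-file-1"))   # exact copy of a listed file
print(is_flagged(b"known-bad-file-1 "))  # one extra byte changes the hash
```

The lookup itself is a cheap set membership test, which is why this approach needs far less compute than running a model over everything. The catch is that an exact hash misses any file that differs by even one byte.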
8 comments

An article about this was posted yesterday, and the EU wants to take it further than detecting known CSAM content, using AI to detect new content too.

Of course, it's all a pretence for getting access to read all of your data. It might be a language thing, but the wording of the report was such that they were almost stating this explicitly.

Honestly, what.the.hell are these clowns thinking?!

The article says they already have technology that detects "grooming", whatever that means. That's seriously frightening, who knows what else they can detect? Maybe a few years from now we'll be reading articles about people getting arrested for expressing prohibited thoughts, subversive ideas, political opposition in what they thought was a private communications channel.
The problem with hashes is that you cannot prove what the original image was without actually getting caught with CSAM.

So basically, a whistleblower takes photos of some very incriminating documents, someone gets accused of e.g. money laundering, hashes of those images get added to the "CSAM" database, and you find the first person (via metadata) who had that image on their phone.

Also, in some more repressive countries, an image of a famous cartoon bear photoshopped to look like some president can be added, and all the people who look at, have, or download images (memes) like that can get put on a "list".

But would the system not insist images are hashed by it? I mean, I would think it’s not a weird demand that if this is to protect children (which it is not, at least not only), the image hashes are solely for that purpose, or otherwise not valid and thrown out by at least an AI. I know it’s naive, but it seems you want the image with the hash, and if someone gets flagged and it’s some money laundering doc, then it should be dismissed before you get put on a list.
The hashes are likely provided by some outside agency who is fine with transferring the hashes, but would have qualms about transferring multi-petabytes of the matching CSAM images.
They are effectively low resolution hashes - obviously they would be completely useless if they failed to detect a re-encoded, slightly transformed, or cropped image. In this sense it's a hash with a very high collision rate... probably even higher if you limit it to innocent but otherwise visually similar images, like the pictures parents and grandparents send of their own toddlers, as GP describes.

This is when it gets scary, demonising people based on a hash collision with no evidence or context.
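
A rough sketch of why a "low resolution hash" tolerates re-encoding but also collides: the code below uses a simple average hash over a tiny grayscale grid. This is a stand-in technique chosen for illustration, not the algorithm any real scanner is known to use, and the pixel grids, the `matches` helper, and its threshold are all invented for the example.

```python
def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def matches(a, b, threshold=2):
    """Hypothetical matching rule: hashes within `threshold` bits count as a hit."""
    return hamming(a, b) <= threshold

# Made-up 4x4 grayscale "images" (values 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# Simulated re-encode: every pixel slightly brighter.
reencoded = [[min(255, p + 5) for p in row] for row in original]
# A different image with the same coarse layout (dark left, bright right).
lookalike = [
    [40, 60, 180, 170],
    [50, 30, 190, 160],
    [35, 45, 175, 185],
    [55, 65, 165, 195],
]
# A clearly different image: bright left, dark right.
mirrored = [list(reversed(row)) for row in original]

h = average_hash(original)
print(hamming(h, average_hash(reencoded)))  # small change, hash survives
print(hamming(h, average_hash(lookalike)))  # unrelated pixels, same hash
print(hamming(h, average_hash(mirrored)))   # different layout, far apart
```

The same property that lets the hash survive the re-encode is what makes the unrelated look-alike collide: only the coarse brightness layout is kept, so anything sharing that layout lands on the same bits.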

Even with a database of hashes the risk remains. If someone gets off on nude children, then there is nothing stopping them from collecting innocent photographs from social media¹ and reposting these on seedy or oniony websites, with the hashes then ending up in the huge black box of CSAM hashes via an automated scraper. I am not under the impression that such a database is curated too closely.

1: This is not something I'm personally at risk of, because my parents are tech savvy enough not to post nude pics of a toddler anywhere but in a private chat with his parents (i.e., me and my wife).

That’s the implementation used by Apple for client-side scanning of iCloud uploads. NeuralHash or something. The EU may (and likely will) build an unrelated system, although it may share ideas with prior work.
Unless the exact algorithm and data used are described in the law, you don't know what the implementation will look like (or what it will look like in ten years).
I mean, at some point the amount of data & CPU needed to run a CV neural network is less than downloading all existing hashes and comparing them...
I think you are underestimating the power of modern servers and GPUs for scanning this stuff. It's mind boggling.