Hacker News
by mimsee 1774 days ago
Yes. This reminds me of when typing or receiving certain text would make an iPhone crash. But now getting your account deleted is a feature. For example, WhatsApp automatically downloads media to the camera roll, which then gets uploaded to iCloud. Of course that can be turned off beforehand, but this is like what happens with backups. People want to back up, but don't invest the time in it. That is, until it's too late: they have lost their data and now want their stuff back.

Backup is a good point:

- Apple: “Backup your phone to iCloud, it will be safe there.”

- 5 minutes later: “We’ve wiped your account because of a photo of (porn actor here), which is not CP, but she was technically a minor at the time she filmed.”

- “Also we’ve wiped your iPhone because we couldn’t knowingly let you keep that. Good luck contacting your parents, we’ve deleted your contacts. Good luck! PS: We’ve reported you to the police.”

- Also you can’t connect to your iMac now.

Or photos of your own children.

We have a Tumblr set up for family to view pics of the kids. Several photos and videos of our kids when they were under 2 were taken down either temporarily or permanently by their CP algo.

These were a pic or video of kids in the bath or without a shirt. In none of them could you see bum or bits. Just a semi naked baby.

Algorithms like this get things wrong all the time.

This is not the kind of algorithm that Apple will be using. That one only scans for already known CSAM in NCMEC's database.
Quite funnily and disturbingly, one of the databases of "known CSAM" hashes also apparently includes a picture of a clothed man holding a monkey[1].

[1]: https://www.hackerfactor.com/blog/index.php?/archives/929-On...

That was just an MD5 collision: an image that has the same MD5 hash as some other image (in this case some CP). This is an uncommon yet possible thing - see this example[0].

[0] https://natmchugh.blogspot.com/2014/11/three-way-md5-collisi...

I think a flawed process where the monkey image ended up in the database is more likely than a random unintentional hash collision.
Yes, hash collisions definitely occur. There is no such thing as collision-free hashes, and MD5 is definitely broken.

Even though the author says there were 3 million MD5 hashes the second time, the first time he calls them SHA1 and MD5 hashes (and SHA1 is considered weak too).
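
For a rough sense of scale on the accidental-collision question, here is a back-of-the-envelope sketch in Python. It assumes MD5 behaves like a uniformly random 128-bit function on unrelated inputs and takes the 3 million figure from the linked post at face value. It says nothing about deliberately crafted collisions, which are practical for MD5.

```python
from fractions import Fraction

MD5_BITS = 128
DB_SIZE = 3_000_000  # database size as quoted from the linked post

# Upper bound on the chance that ONE unrelated image accidentally matches
# any of DB_SIZE uniformly random 128-bit hashes (union bound).
p_single = Fraction(DB_SIZE, 2**MD5_BITS)

# Birthday approximation: chance that ANY two entries inside the database
# share a hash, roughly n*(n-1) / 2^(bits+1).
p_pair = Fraction(DB_SIZE * (DB_SIZE - 1), 2 ** (MD5_BITS + 1))

print(f"one image vs. database:   {float(p_single):.2e}")
print(f"any pair inside database: {float(p_pair):.2e}")
```

Both numbers come out astronomically small (roughly 1e-32 and 1e-26), which is why a flawed intake process or a deliberately constructed collision looks far more plausible than pure chance.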

I wonder what kind of hashes Apple is planning to use. Will it be whatever is made available to them or will they only accept (what is now considered) secure standards?

Which may contain the hashes of their photos, because they've been taken down in the past, which means they probably have been added to certain blacklists that may have been integrated into the blackbox of NCMEC's database.
Photographs of your naked child in the bath are not illegal, are not CSAM, and are not going to be in the NCMEC's database.
NCMEC's CSAM database already includes images that are not necessarily illegal. If _your particular_ photos have been flagged in the past, they may well be part of the database.
Step 1: Get copies of pictures of the target's kid in the bath from their phone/SNS

Step 2: Manipulate pictures so that hash collides with CSAM

Step 3: Get the pictures back onto the target's phone so they get scanned.

I don't have the skills or understanding of how the hashes are created but would this be possible?

This isn't an ML algorithm. It's a hash. It only matches already known material.
It is a hash created with ML. So it’s both. But yes, it only matches already known material.
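
To make "only matches already known material" concrete, here is a minimal sketch of a lookup against a set of known fingerprints. Everything in it is hypothetical: `fingerprint`, `known` and `is_known` are made-up names, and SHA-256 is only a stand-in for whatever hash the real system uses.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    # Stand-in fingerprint for the demo; SHA-256 is an assumption, not the real scheme.
    return hashlib.sha256(data).hexdigest()

# Hypothetical database: fingerprints of material that was flagged earlier.
known = {
    fingerprint(b"previously flagged file A"),
    fingerprint(b"previously flagged file B"),
}

def is_known(data: bytes) -> bool:
    # A pure lookup: nothing here judges what the content depicts.
    return fingerprint(data) in known

print(is_known(b"previously flagged file A"))        # already in the set
print(is_known(b"brand new photo nobody has seen"))  # never seen before
```

The first call prints `True` and the second `False`: a lookup like this cannot flag a new photo on its own, but it will flag anything whose fingerprint was added to the set, rightly or wrongly.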
I haven't seen anyone claim that any of this algorithm was "created with ML". I'm interested in learning more so do you have a citation for that?

Regardless, it's not both. Setting aside how the algorithm was created, it's incorrect to say that an algorithm "created with ML" is itself an ML algorithm.

NeuralHash was so named because it was optimised to run on the Apple Neural Engine for the sake of speed and power efficiency.

It’s both because it’s a multi-step process.

The image is not fed directly into the hashing function, like taking an MD5 hash of a file or something.

Rather, the image is first evaluated by a neural net that looks at specific visual details, and has been trained to match even if the image has been cropped or anything like that. The results of the neural net evaluation are what is then input for the hashing function.

This is explained in detail in Apple’s documentation they released with the announcement.
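
A toy illustration of why hashing a visual summary behaves differently from hashing the file bytes. This is not NeuralHash: the neural net stage is replaced here by a simple "average hash" (one bit per pixel, brighter than the mean or not), and the 4x4 grayscale "image" is made up for the demo.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if it is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def md5_hex(pixels):
    """Cryptographic hash of the raw pixel bytes."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()

# Made-up 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# Same picture, slightly brightened (as a filter or re-encode might do).
brightened = [[min(p + 5, 255) for p in row] for row in original]

print(average_hash(original) == average_hash(brightened))  # visual summary unchanged
print(md5_hex(original) == md5_hex(brightened))            # raw bytes differ
```

This prints `True` then `False`: the visual summary survives the small edit, while the byte-level hash does not. That tolerance is the point of the two-step design described above, and it is also why near-matches and false positives become possible in a way they are not with exact file hashes.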