by eurasiantiger 1765 days ago
This basically invalidates any claims Apple made about accuracy, and brings up an interesting point about the hashing mechanism: it seems two visually similar images will also have similar hashes. This is interesting because humans quickly learn such patterns: for example, many here will know what dQw4w9WgXcQ is without thinking about it at all.

> it seems two visually similar images will also have similar hashes

This is by design. The whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are, so I don't think it invalidates any claims.

A perceptual hash is different from a cryptographic hash, where any change in the message completely changes the hash.
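
To make the difference concrete, here is a minimal Python sketch. It uses a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) as an illustrative stand-in; it is not Apple's NeuralHash, and the 4x4 pixel grid and helper names are made up for the example.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: 1 bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def sha256_bits(pixels):
    """Cryptographic hash of the raw pixel bytes, expanded to a list of bits."""
    data = bytes(p for row in pixels for p in row)
    digest = hashlib.sha256(data).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(8)]

def hamming(a, b):
    """Number of bit positions where the two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [[10, 20, 200, 210],
         [15, 25, 205, 215],
         [12, 22, 202, 212],
         [18, 28, 208, 218]]

# The same image with a single pixel nudged slightly.
tweaked = [row[:] for row in image]
tweaked[0][0] += 3

perceptual_distance = hamming(average_hash(image), average_hash(tweaked))
crypto_distance = hamming(sha256_bits(image), sha256_bits(tweaked))

print("average hash bits changed:", perceptual_distance)
print("SHA-256 bits changed:", crypto_distance, "of 256")
```

The toy perceptual hash does not change at all for the nudged pixel, while SHA-256 should flip roughly half of its 256 bits.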

> The whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are

If that is the case, then the word "hash" is terribly misapplied here.

"Hash" is applied correctly here. A hash function is "any function that can be used to map data of arbitrary size to fixed-size values." Being an (essentially) unique fingerprint, or having small changes in input cause large changes in output, are properties of cryptographic hashes. Perceptual hashes do not have those properties.
Good explanation, thanks. I only knew about cryptographic hashes, or those that are used for hash tables where you absolutely do not want to have collisions. Anyhow, I'm not really comfortable with this usage of the word "hash". It is the complete opposite of the meaning I'm used to.
Maybe the term "fingerprint" is better.
It greatly increases the collision space if you only have to get near a bad number.
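
A rough way to put numbers on that, as a sketch only: count how many n-bit values lie within a given Hamming distance of one target hash. The 96-bit length and the idea of a matcher that accepts anything within a small radius are assumptions for illustration, not a description of how Apple's matching actually works.

```python
from math import comb

def hashes_within(n_bits, radius):
    """Count n-bit values that differ from a fixed target in at most `radius` bits."""
    return sum(comb(n_bits, k) for k in range(radius + 1))

def match_fraction(n_bits, radius):
    """Fraction of the whole hash space that a near-match radius would accept."""
    return hashes_within(n_bits, radius) / 2 ** n_bits

N_BITS = 96  # assumed hash length, for illustration only

for radius in (0, 1, 2, 8):
    print(radius, hashes_within(N_BITS, radius), match_fraction(N_BITS, radius))
```

With exact matching (radius 0) only one value hits the target; each extra bit of tolerance multiplies the number of accepted values, although the accepted fraction of the full space stays tiny for small radii.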
> The whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are

This has already been proven inaccurate. Adversarial hashes and collisions are possible in the system. You don't have to be very skeptically minded to think that this is intentional. Links to examples have already been posted in this thread.

You are banking on an ideal scenario for this technology, not the reality.

EDIT: Proof is on the front page of HN right now: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...

I think you may have misread my comment: I did not mean that the similarity of hashes invalidates any claims.