by amckenna 4457 days ago
An idea: this would require more storage, but would it be possible to hash an image and take snapshots of the hash state as it processes the image, say after each block (hash functions digest one block, such as 64 bytes, at a time)? You could then compare the lists of snapshots for two images and pick a statistical threshold for a "similar image". In the case of the two cats, the images differ only in the nose, so the first half of the image, up to the nose, would produce the same list of snapshots. Everything after the nose would differ, since each snapshot depends on all the blocks before it.
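To make the snapshot idea concrete, here is a rough sketch in Python. It is only an illustration under some assumptions of mine: `hashlib` does not expose the internal hash state, so the running digest (`h.copy().digest()`) stands in for a "snapshot"; the function names and the use of SHA-256 are my own choices; and the input is treated as raw bytes in a fixed order, which a real image format may not give you.

```python
import hashlib
from typing import List

BLOCK_SIZE = 64  # bytes per step; SHA-256 also processes 64-byte blocks


def snapshot_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Return the running digest after each block of `data` is absorbed."""
    h = hashlib.sha256()
    snapshots = []
    for offset in range(0, len(data), block_size):
        h.update(data[offset:offset + block_size])
        # copy() lets us read a digest without finalizing the running hash
        snapshots.append(h.copy().digest())
    return snapshots


def matching_fraction(a: bytes, b: bytes) -> float:
    """Fraction of snapshot positions where the two inputs agree."""
    snaps_a = snapshot_hashes(a)
    snaps_b = snapshot_hashes(b)
    total = max(len(snaps_a), len(snaps_b))
    if total == 0:
        return 1.0  # two empty inputs are trivially identical
    matches = sum(1 for x, y in zip(snaps_a, snaps_b) if x == y)
    return matches / total


# Hypothetical stand-in for two images: 1024 bytes = 16 blocks,
# with a single byte changed inside block 10 (0-indexed).
original = bytes(range(256)) * 4
altered = bytearray(original)
altered[640] ^= 0xFF

print(matching_fraction(original, bytes(altered)))  # 0.625 (blocks 0..9 match)
```

Only the 10 snapshots before the changed block match, so the score is 10/16. A threshold on this fraction would be the "statistical threshold" part, and picking it is left open here.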

You could also hash forwards, backwards, and starting at several random midpoints, so that someone can't throw off the comparison simply by changing the first block.
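A possible sketch of the multi-pass variant, again in Python and again with assumptions of mine: the running SHA-256 digest stands in for a state snapshot, the "random" midpoints come from a seeded generator so that both images use the same start points, both inputs are assumed to be the same length, and all function names are made up for illustration.

```python
import hashlib
import random
from typing import List

BLOCK_SIZE = 64  # bytes per step


def snapshots(data: bytes) -> List[bytes]:
    """Running SHA-256 digest after each block of `data`."""
    h = hashlib.sha256()
    out = []
    for i in range(0, len(data), BLOCK_SIZE):
        h.update(data[i:i + BLOCK_SIZE])
        out.append(h.copy().digest())
    return out


def all_passes(data: bytes, midpoints: int = 3, seed: int = 0) -> List[List[bytes]]:
    """Snapshot lists for a forward pass, a backward pass, and midpoint passes."""
    n_blocks = max(1, len(data) // BLOCK_SIZE)
    rng = random.Random(seed)  # same seed and length give the same start points
    starts = [rng.randrange(n_blocks) * BLOCK_SIZE for _ in range(midpoints)]
    passes = [snapshots(data), snapshots(data[::-1])]
    passes += [snapshots(data[s:]) for s in starts]
    return passes


def pass_scores(a: bytes, b: bytes) -> List[float]:
    """Per-pass fraction of matching snapshots: [forward, backward, mid...]."""
    scores = []
    for sa, sb in zip(all_passes(a), all_passes(b)):
        total = max(len(sa), len(sb))
        same = sum(1 for x, y in zip(sa, sb) if x == y)
        scores.append(same / total if total else 1.0)
    return scores


# Hypothetical attack: change only the very first byte of a 16-block input.
original = bytes(range(256)) * 4
altered = bytearray(original)
altered[0] ^= 0xFF

scores = pass_scores(original, bytes(altered))
print(scores[0])  # 0.0    (forward pass: every snapshot is thrown off)
print(scores[1])  # 0.9375 (backward pass: only the last block differs)
```

The forward pass is defeated by the first-block change, but the backward pass still matches on 15 of 16 blocks. How to combine the per-pass scores into one verdict is a design choice this sketch leaves open.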