Hacker News
by Corendos 1339 days ago
This was my internship subject (at another company) just before I graduated. I wonder what they used for the perceptual hash; ours was SIFT features. Happy to see that what I implemented would have been able to scale that much!
2 comments

When you run SIFT on an image, you get dozens (maybe hundreds) of SIFT features back. The trouble is that each individual SIFT feature is a local image descriptor: it describes a single point in the image. You can't just append the two lists of SIFT descriptors together and do a Hamming comparison on them, because it's not guaranteed that both images will have all of the same SIFT descriptors, nor that they would be in the same order. To compare images using local descriptors, you must compare every local feature in one image with every local feature in every other image. This is great for comparing two images, or for finding where one image is located in another (homography matching), but it does not scale to large image sets.
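To make the cost concrete, here is a rough sketch of brute-force local descriptor matching in plain Python. The function name `match_local_descriptors`, the toy 4-dimensional vectors, and the 0.75 ratio threshold are my own illustrative choices (real SIFT descriptors are 128-dimensional and you would normally use a library matcher). The point is only that one image pair already costs len(a) * len(b) distance computations.

```python
import math


def match_local_descriptors(desc_a, desc_b, ratio=0.75):
    """Brute-force match two lists of local descriptors (equal-length float vectors).

    Returns (matches, comparisons). `matches` holds (index_a, index_b) pairs that
    pass the ratio test; `comparisons` counts the distance computations performed.
    """
    matches = []
    comparisons = 0
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every descriptor in the other image.
        dists = []
        for j, db in enumerate(desc_b):
            dists.append((math.dist(da, db), j))
            comparisons += 1
        dists.sort()
        if len(dists) >= 2:
            best, second = dists[0], dists[1]
            # Ratio test: keep the match only if it clearly beats the runner-up.
            if best[0] < ratio * second[0]:
                matches.append((i, best[1]))
        elif dists:
            matches.append((i, dists[0][1]))
    return matches, comparisons


# Toy descriptors; note the two images have different counts and orderings.
image_a = [[0.0, 0.0, 1.0, 1.0], [5.0, 5.0, 0.0, 0.0]]
image_b = [[5.1, 4.9, 0.0, 0.1], [9.0, 9.0, 9.0, 9.0], [0.1, 0.0, 1.0, 0.9]]

matches, comparisons = match_local_descriptors(image_a, image_b)
print(matches, comparisons)
```

With a few hundred descriptors per image, that inner loop runs tens of thousands of times per pair, and it has to be repeated for every candidate image in the set.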

In contrast, descriptors like perceptual hashes look at the entire image, and so are a _global_ image descriptor.
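As a minimal illustration of a global descriptor, here is an average hash, which is a simpler relative of the DCT-based pHash and not pHash itself. The helper names and the 4x4 toy images are assumptions for the example; a real pipeline would first shrink and grayscale the image.

```python
def average_hash(pixels):
    """Hash a small grayscale image (list of rows) into one integer.

    Each bit records whether a pixel is brighter than the image mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Left half dark, right half bright.
original = [[10, 10, 200, 200] for _ in range(4)]
# Same picture with every pixel brightened a little.
brighter = [[p + 20 for p in row] for row in original]
# A different picture: top half dark, bottom half bright.
different = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]

h_original = average_hash(original)
h_brighter = average_hash(brighter)
h_different = average_hash(different)

print(hamming(h_original, h_brighter))   # small distance: likely the same image
print(hamming(h_original, h_different))  # large distance: different image
```

Each image becomes one fixed-size value, so comparing two images is a single XOR and bit count regardless of image content.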

There are ways to convert local SIFT image descriptors into a single global image descriptor for more rapid lookup (Bag of Visual Words is one technique that comes to mind), but SIFT and pHash really belong to two different categories.
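A rough sketch of the Bag of Visual Words idea, assuming a codebook of "visual words" already exists (normally it would be learned by clustering descriptors from many images, e.g. with k-means). The function name, the 2-dimensional toy descriptors, and the three-word codebook are all made up for illustration.

```python
import math


def bovw_histogram(descriptors, codebook):
    """Turn a variable-length list of local descriptors into one fixed-length vector.

    Each descriptor votes for its nearest codebook entry; the votes are then
    normalized so images with different descriptor counts are comparable.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: math.dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical codebook of three visual words.
codebook = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]

# Two images with different numbers of local descriptors.
image_a = [[0.5, 0.2], [9.0, 1.0], [9.5, -0.5], [0.1, 9.0]]
image_b = [[10.2, 0.3], [0.0, 0.4], [1.0, 11.0], [8.8, 0.0], [0.3, 0.1]]

hist_a = bovw_histogram(image_a, codebook)
hist_b = bovw_histogram(image_b, codebook)
print(hist_a, hist_b)
```

Both histograms have the same length as the codebook, so they can be indexed and compared like any other global descriptor, at the cost of throwing away where in the image each feature was found.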

More info on pHash: https://hackerfactor.com/blog/index.php%3F/archives/432-Look...

Example of SIFT for fine-grained image matching: https://docs.opencv.org/3.4/d1/de0/tutorial_py_feature_homog...

I have found ML image categorization models to be an excellent method of extracting a unique descriptor. It is possible to compress the image into a compact signature for matching and storage.

I did it here: https://github.com/starkdg/phashml

https://github.com/starkdg/pyphashml

It is available as a Python module that uses a TensorFlow model.
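For readers wondering what "compact signature" can mean in general, here is a generic sketch of binarizing a model embedding into a bit string. This is not necessarily how phashml does it; the median threshold, the function names, and the made-up 8-value embeddings are assumptions purely for illustration.

```python
import statistics


def binarize(embedding):
    """Pack a float embedding into an integer, one bit per dimension.

    A bit is 1 when the value is above the embedding's own median
    (an illustrative thresholding choice, not a specific library's method).
    """
    threshold = statistics.median(embedding)
    bits = 0
    for v in embedding:
        bits = (bits << 1) | (1 if v > threshold else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


# Made-up embeddings standing in for a model's output.
emb_a = [0.9, -0.2, 0.4, -0.7, 0.1, 0.8, -0.5, 0.3]
emb_b = [0.8, -0.1, 0.5, -0.6, 0.0, 0.9, -0.4, 0.2]   # near-duplicate of emb_a
emb_c = [-0.9, 0.7, -0.3, 0.6, 0.2, -0.8, 0.5, -0.1]  # unrelated image

sig_a, sig_b, sig_c = binarize(emb_a), binarize(emb_b), binarize(emb_c)
print(hamming(sig_a, sig_b), hamming(sig_a, sig_c))
```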

Feel free to message me.

thanks, first link 404s