by cyanoacry 4457 days ago
This is a bad approach for a couple of reasons: 1) There isn't much information left in an image once it has been resized to icon size and converted to greyscale, so you run into a much higher likelihood of collisions/false positives.

2) It's not too hard to program a Haar wavelet[1] transform[2] (basically iterative scaled thresholding). This has worked well over at IQDB[3], where they do reverse image lookups on databases of several million images via a modified Haar wavelet.
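
To give a feel for how little code a Haar transform takes, here is a minimal sketch in plain Python. It is not IQDB's code or their modified variant. The function names (`haar_1d`, `haar_2d`, `signature`) and the `keep` parameter are made up for illustration, and it assumes a square greyscale image whose side is a power of two, given as a list of rows. The signature step (keep the positions and signs of the few largest non-DC coefficients) is one common way to use the output, not necessarily what any particular site does.

```python
import math


def haar_1d(values):
    """Full 1D Haar decomposition; length must be a power of two."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    root2 = math.sqrt(2)
    while n > 1:
        half = n // 2
        # Scaled pairwise averages (coarse part) and differences (detail part).
        avgs = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:n] = avgs + diffs
        n = half  # recurse on the averages only
    return out


def haar_2d(image):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(image, keep=4):
    """Hypothetical signature: (y, x, is_positive) for the `keep` largest
    non-zero coefficients, skipping the DC term at (0, 0)."""
    coeffs = haar_2d(image)
    ranked = sorted(
        (
            (abs(c), y, x)
            for y, row in enumerate(coeffs)
            for x, c in enumerate(row)
            if (y, x) != (0, 0) and c != 0
        ),
        reverse=True,
    )
    return {(y, x, coeffs[y][x] > 0) for _, y, x in ranked[:keep]}
```

Two images could then be compared by counting how many signature entries they share, e.g. `len(signature(a) & signature(b))`. Skipping the DC term means a uniform brightness shift should leave the signature mostly unchanged, up to floating-point rounding.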

You can't beat this algorithm for simplicity, though. Have you guys done any false positive checks with this algorithm? The saving grace might be that icons are fairly small and limited in detail/color.

[1] http://en.wikipedia.org/wiki/Haar_wavelet

[2] http://stackoverflow.com/questions/1034900/near-duplicate-im...

[3] http://iqdb.org/