by pippy360
3227 days ago
Hey guys, I'm the guy who made this. Let me know if anything isn't clear or if you have any questions.

I developed this after I saw how poorly Google (and every other reverse image search engine I tested, like Bing and TinEye) performed on rotated images. For example, Google doesn't match this image [0] to its original [1] (Google's neural network will find that it is a cat but won't match it to the original).

After playing around with trying to make my algorithm scale invariant (uniform and nonuniform) and rotation invariant, I found that I was able to make it fully 2D affine invariant (pretty much by luck). I developed all this in my spare time, actively for about a year.

I also have a C++ implementation that lets you reproduce all this, but it's just a proof of concept, so it's quite slow. You can check that out on my GitHub here [2]. There are loads of ways this can be improved, but I wanted to get the idea out quickly.

[0] https://github.com/pippy360/transformationInvariantImageSear...
[1] https://github.com/pippy360/transformationInvariantImageSear...
[2] https://github.com/pippy360/transformationInvariantImageSear...
|
Great work there!
One question: why are we transforming the triangles to equilateral triangles?
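For context on that question: any non-degenerate triangle can be mapped onto any other by a 2D affine transform, so warping every triangle onto one fixed reference shape puts its contents into a canonical frame that an affine distortion of the image does not change. Here is a minimal Python sketch of that general idea. It is not taken from the C++ implementation; `REFERENCE`, `affine_from_triangles` and `apply_affine` are hypothetical names, the unit-side equilateral reference is an arbitrary choice, and the vertex order is assumed to be known (a real system would have to try the possible orderings).

```python
import math

# Reference equilateral triangle with unit side length (arbitrary choice).
REFERENCE = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))


def affine_from_triangles(src, dst):
    """Return (a, b, c, d, tx, ty) with x' = a*x + b*y + tx, y' = c*x + d*y + ty
    such that src[i] is mapped to dst[i] for i = 0, 1, 2."""
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, v0), (u1, v1), (u2, v2) = dst
    # Edge vectors of both triangles, measured from their first vertex.
    e1x, e1y, e2x, e2y = x1 - x0, y1 - y0, x2 - x0, y2 - y0
    f1x, f1y, f2x, f2y = u1 - u0, v1 - v0, u2 - u0, v2 - v0
    det = e1x * e2y - e2x * e1y
    if abs(det) < 1e-12:
        raise ValueError("degenerate (collinear) source triangle")
    # Linear part M = F * E^-1, where E and F hold the edge vectors as columns.
    a = (f1x * e2y - f2x * e1y) / det
    b = (f2x * e1x - f1x * e2x) / det
    c = (f1y * e2y - f2y * e1y) / det
    d = (f2y * e1x - f1y * e2x) / det
    # Translation chosen so that the first vertex lands on dst[0].
    tx = u0 - (a * x0 + b * y0)
    ty = v0 - (c * x0 + d * y0)
    return a, b, c, d, tx, ty


def apply_affine(t, p):
    a, b, c, d, tx, ty = t
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


# A triangle plus one image point, and the same data after an affine distortion.
triangle = ((10.0, 20.0), (50.0, 30.0), (25.0, 70.0))
point = (30.0, 40.0)
distortion = (1.3, 0.4, -0.2, 0.8, 5.0, -7.0)  # made-up example transform
warped_triangle = tuple(apply_affine(distortion, v) for v in triangle)
warped_point = apply_affine(distortion, point)

# Normalizing both versions onto REFERENCE sends the point to the same place.
p1 = apply_affine(affine_from_triangles(triangle, REFERENCE), point)
p2 = apply_affine(affine_from_triangles(warped_triangle, REFERENCE), warped_point)
print(p1, p2)
```

The two printed points agree up to floating-point error, which is the property that lets the normalized triangle contents be hashed and compared regardless of the distortion.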
Just throwing out a few ideas; I'd like to discuss what potential flaws they might have.
1. For keypoint detection, what if you used ASIFT (Affine SIFT), which is designed to be robust to affine transformations? In that case, you'd probably save the time spent doing the rotations. Given the huge number of proposals we get, we might still need to filter out some keypoints, maybe with a rule like: if too many keypoints fall within some 2D window, keep the one that is farthest from all edges of the window (very rough, I don't know if there are better ways).
2. With the final set of keypoints, I propose doing a Delaunay triangulation, in the hope of getting a collection of triangles that covers the complete surface area of the image, giving a spatially equidistant breakdown of the image pixels.
3. Hash those triangles (maybe? how?)
4. Now, given a query image, perform steps 1-3 and find the triangles that match triangles from a database image. If the fraction of matches is above a given threshold, then the database image is a potential candidate for the search result.
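
To make step 4 concrete, here is a rough Python sketch of the match-fraction test. It assumes each triangle has already been reduced to an integer hash by some method (step 3 leaves open how), and it treats two hashes as matching when their Hamming distance is at most `max_distance`; that tolerance, the `threshold` default, and the function names are all assumptions for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def match_fraction(query_hashes, db_hashes, max_distance=0):
    """Fraction of query triangle hashes that have a match in db_hashes."""
    if not query_hashes:
        return 0.0
    matched = sum(
        1
        for q in query_hashes
        if any(hamming(q, d) <= max_distance for d in db_hashes)
    )
    return matched / len(query_hashes)


def find_candidates(query_hashes, database, threshold=0.5, max_distance=0):
    """Return (image_name, score) pairs scoring at least threshold, best first.

    database maps an image name to the list of its triangle hashes.
    """
    scored = (
        (name, match_fraction(query_hashes, hashes, max_distance))
        for name, hashes in database.items()
    )
    return sorted(
        (pair for pair in scored if pair[1] >= threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Toy data: 4-bit hashes standing in for real triangle hashes.
database = {"cat": [0b1010, 0b1111, 0b0001], "dog": [0b0110]}
query = [0b1010, 0b1111, 0b1000, 0b0000]

exact = find_candidates(query, database, threshold=0.5)
fuzzy = find_candidates(query, database, threshold=0.5, max_distance=1)
print(exact)  # [('cat', 0.5)]
print(fuzzy)  # [('cat', 1.0)]
```

This brute-force comparison is quadratic in the number of triangles, so a real database would need some kind of index over the hashes instead of a nested loop.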