by tw1010 3231 days ago
Really cool! I love how more and more mathematical ideas (like translation invariance) are seeping into the programming community. I suspect that, of everything the current machine learning hype brings, this will have a bigger and longer-lasting impact on software engineering than any particular technology it may produce.

These invariants are standard practice in computer vision. In fact, you won't get anywhere without translation and brightness invariance; more often than not you also need (uniform) scale and rotation invariance, plus limited tolerance to projective transforms.
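
To make the brightness part concrete, here is a minimal sketch, assuming brightness changes can be modelled as a gain plus an offset (new = a * old + b, with a > 0). The helper names `normalize_patch` and `ncc` are made up for illustration and patches are flat lists of intensities. It covers brightness only, not translation, scale or rotation.

```python
import math

def normalize_patch(patch):
    """Return the patch shifted to zero mean and scaled to unit standard deviation."""
    n = len(patch)
    mean = sum(patch) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in patch) / n)
    if std == 0:
        return [0.0] * n  # flat patch: no structure left to compare
    return [(p - mean) / std for p in patch]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length patches, in [-1, 1]."""
    na, nb = normalize_patch(a), normalize_patch(b)
    return sum(x * y for x, y in zip(na, nb)) / len(a)

patch = [10, 40, 20, 90, 60, 30]
brighter = [1.5 * p + 25 for p in patch]  # assumed gain/offset brightness change
score = ncc(patch, brighter)              # stays at (numerically) 1.0
```

Subtracting the mean cancels the offset and dividing by the standard deviation cancels the gain, so the score between a patch and its brightened copy does not change.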

There are various ways to deal with this in ML (I'm not really an expert in ML), but AFAIU in most cases you get candidate transforms via model-based methods and then make the decision using ML-based methods trained to allow for limited transformations (convolutional nets are good at that).

I think that to handle truly arbitrary, non-affine transformations we will have to resort to ML. Then we could have matching very similar to how humans do it (where we really don't care whether the transformation is affine or non-affine; we just care whether it's such a huge transformation that it makes the image unrecognisable). But I really don't know much about ML.

OkCupid talked about this in an article about image hashing [0] and they have a nice quote:

"The end-to-end approach of training a Convolutional Net to embed images such that the positive / negative case distances separate nicely is probably the cool new hypebeast way forward. This would have the added advantage of being able to handle whatever transformations you can think of at training time."

[0] https://tech.okcupid.com/evaluating-perceptual-image-hashes-...
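
For anyone wondering what "positive / negative case distances separate nicely" could look like as a training objective, here is a rough sketch using a triplet margin loss. That choice is an assumption on my part (the quote does not name a loss), and `euclidean`, `triplet_loss` and the toy 2-D embeddings are purely illustrative.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that reaches zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings (hypothetical): a transformed copy should land near the anchor,
# an unrelated image should land far away.
well_separated = triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0])
badly_separated = triplet_loss([0.0, 0.0], [2.0, 0.0], [1.0, 0.0])
```

In an actual setup the embeddings would come from the network, and the positives would be transformed copies (crops, rotations, recompressions) of the anchor image, which is where "whatever transformations you can think of at training time" comes in.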

I've been out of the Machine Vision space for a while, so my knowledge is somewhat out of date.

What research is out there on general affine-invariant vision algorithms and techniques?

For context, my practical experience ended in the mid-2000s, when "jittering" was a bit of a thing, along with some occasional closed-form estimators based on basic linearization approximations.
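
By "jittering" I mean, roughly, generating slightly perturbed copies of each sample so the matcher tolerates small misalignments. A minimal sketch, assuming translation-only jitter on a small 2-D grid with zero fill (the `shift` and `jitter` helpers are illustrative names, not from any library):

```python
def shift(image, dx, dy, fill=0):
    """Translate a 2-D list-of-lists image by (dx, dy) pixels, padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy  # source pixel that lands on (x, y)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out

def jitter(image, max_shift=1):
    """Return every translated copy with |dx|, |dy| <= max_shift (original included)."""
    offsets = range(-max_shift, max_shift + 1)
    return [shift(image, dx, dy) for dy in offsets for dx in offsets]

image = [[1, 2],
         [3, 4]]
copies = jitter(image)  # 9 variants for max_shift=1
```

The same idea extends to small rotations or scale changes, at the cost of interpolation.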

ASIFT, or Affine SIFT, is one of the affine-invariant versions of SIFT. Take a look here: http://www.cmap.polytechnique.fr/~yu/research/ASIFT/demo.htm....
It's actually perspective-transform invariant, which is a more general class. The name is misleading.

SIFT and its multiple descendants, yes, and also things like Random Ferns; google "Boosted Random Ferns for Object Detection" (it's damn difficult to get a clean link to the PDF from Google...)