by ansgri
3232 days ago
These invariants are standard practice in computer vision. In fact, you won't get anywhere without translation and brightness invariance; more often than not you also need (uniform) scale and rotation invariance, plus limited tolerance to projective transforms. There are various ways to deal with this in ML (I'm not really an expert in ML), but AFAIU in most cases you get candidate transforms via model-based methods and then make the decision using ML-based methods trained to tolerate limited transformations (convolutional nets are good at that).
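To make the translation and brightness part concrete, here is a minimal sketch in plain Python. It assumes a brute-force sliding search over offsets (for translation) scored with zero-mean normalized cross-correlation (which ignores gain and offset changes in brightness). The names `zncc` and `find_template` and the toy 5x5 scene are made up for illustration; this is not how any particular library does it.

```python
import math

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0:
        return 0.0  # flat patch: correlation is undefined, treat as "no match"
    return sum(x * y for x, y in zip(da, db)) / denom

def find_template(image, template):
    """Try every offset and return (best_score, row, col)."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best = (-2.0, -1, -1)  # ZNCC lies in [-1, 1], so -2 is below any real score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = zncc(patch, flat_t)
            if score > best[0]:
                best = (score, r, c)
    return best

# Hypothetical data: a 2x2 template, and a 5x5 scene containing the template
# shifted to row 2, col 1 with brightness changed by gain 2 and offset 30.
template = [[10, 50], [90, 20]]
scene = [[30] * 5 for _ in range(5)]
for i in range(2):
    for j in range(2):
        scene[2 + i][1 + j] = 2 * template[i][j] + 30

score, row, col = find_template(scene, template)
print(row, col, round(score, 6))
```

The search over offsets handles translation explicitly, while the mean subtraction and normalization cancel the brightness change. Scale and rotation would need either more candidate transforms in the search or a matcher trained to tolerate them.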
OkCupid talked about this in an article about image hashing [0], and they have a nice quote:
"The end-to-end approach of training a Convolutional Net to embed images such that the positive / negative case distances separate nicely is probably the cool new hypebeast way forward. This would have the added advantage of being able to handle whatever transformations you can think of at training time."
[0] https://tech.okcupid.com/evaluating-perceptual-image-hashes-...
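The quote doesn't name a loss function, so take this as an assumption: one common way to make the "positive / negative case distances separate nicely" is a triplet margin loss, where the positive is the embedding of a transformed copy of the anchor image and the negative is the embedding of a different image. A minimal sketch in plain Python, with hand-written toy embeddings standing in for the output of a conv net (the `l2` and `triplet_loss` helpers are hypothetical, not from the article):

```python
import math

def l2(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor
    than the positive is; otherwise the size of the shortfall."""
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)

# Toy 2-D embeddings (assumed values, not real network outputs).
anchor = [0.0, 0.0]         # original image
transformed = [0.0, 1.0]    # same image after, say, a crop or rotation
other_image = [3.0, 4.0]    # unrelated image

well_separated = triplet_loss(anchor, transformed, other_image)
badly_separated = triplet_loss(anchor, other_image, transformed)
print(well_separated, badly_separated)
```

Whatever transformations you generate for the positives at training time are the ones the embedding gets pushed to tolerate, which is the advantage the quote is pointing at.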