by thaumasiotes 2775 days ago
> It is not that uncommon for humans to be uncertain about what they are looking at, but the first thing about such occurrences is that the human is usually aware of the fact that they are having a problem, and the second thing is that they take steps to resolve it

This is true, but step one is "move your head" (or in your words, "get a better view" -- but you get more value from just the fact that your head is in a different place than from the possibility of a better angle on whatever you're looking at).

That strategy doesn't work at all when you're trying to classify static images rather than physical objects.

2 comments

That raises the interesting question of how object recognition in streams of images is progressing, beyond being just object recognition within the individual frames. Humans are capable of extracting a lot of additional information in such situations, and are actually helped when the perspective on a given object changes. One cannot give current machine vision a pass if, through lacking this capability, it is under-performing.

And moving one's head to get a better view is only one thing that a human might do. Firstly, of course, we must recognize that we are having a difficulty, and current machine vision seems to be somewhat deficient in this regard. Then, even without being able to get a different perspective, we will do things like make guesses as to what might be there (using our extensive semantic models of the world) and figure out if they might be a good fit to what we see, and/or we might try to extract specific features of the problematic area and search our memories for objects that might plausibly match, bearing in mind that the object might be seen from a different perspective than we are accustomed to. We are also quite good at estimating whether an object might be a problem for us, even if we have not positively identified it. There is a lot more to it than just moving one's head.

GP's statement applies as much to observing objects in 3D space as it does to looking at photos, where just moving your head ain't gonna help you much. Optical illusions are great for studying this process, because most of them are delivered in the form of flat images on paper or a computer screen.
Optical illusions are delivered as flat images because moving your head doesn't affect those.