|
|
|
|
|
by MAXPOOL
2595 days ago
|
|
Learning multilayer convolutional representations of statistical features is roughly equal to taking few first few layers in visual cortex and stacking them. Creating higher and higher stacks is not going to solve vision. We are essentially building a frog with better and better visual perception in the hope that it could become a taxi driver. It will become a totally amazing super-frog with super-vision, but it's still just a frog with frog-like visual perception and limits. Using pre-attentive feature recognition stage equivalent for complex object recognition can fake human like object recognition when we force it, but it's wrong approach. We get these catastrophic failures because we hit the limits. Features seem to exist independently from one another in the early processing stages of human perception. They are not associated with a specific object either. Human perception is not gradually turning features into objects like we do in deep learning. Properly distinguishing feature integration from detection and how to do it is a open question. |
|