by notahacker
2854 days ago
A human sees a 3D scene with objects, creatures, texture and lighting (and evaluates the scene based on these concepts and how they relate to each other, even if they've never seen green fields, sheep, dry stone walls or fog before). The computer generally sees a set of pixel values, and takes plenty of training to distinguish between "sheep" and "field the same shade of green as usually found in images containing sheep", because it doesn't have an innate concept of animate objects and habitats, how they relate to each other and which is more important. Whilst the computer's busy seeing a white pile of stones as a false positive for the presence of sheep, the human's looking at the way the stones are piled as possible evidence of human activity, and noting that the droppings in the foreground might mean sheep were here recently.

(Of course, it's not entirely impossible for computer vision systems to deal with higher levels of abstraction: autonomous vehicles model the world in 3D, classifying objects as vehicles in order to predict their near-future behaviour and as signals in order to regulate their own behaviour, but that goes well beyond mere learning processes. And of course a pixel-by-pixel understanding of the world has its use cases in spotting changes in colour and texture so subtle that humans abstract them away, like crop discolouration on satellite images or cracks in rough surfaces.)

We're much better at abstraction than other animals too: show us a 20,000-year-old cave painting and we'll easily grasp that it was produced by humans and that the lines represent shapes of animals broadly similar to today's livestock. Same goes for 2,000-year-old marble bas-reliefs. You might well be able to train an algorithm to recognise "paintings of animals" and "carvings of animals", but you'll struggle with a training set consisting purely of photographs of real-world livestock.
https://en.wikipedia.org/wiki/Optical_illusion