| * For example, algorithms tend to be built from a pipeline in which the first stage is feature detection. * Yeah, computer vision is an amazingly hard field that I've only done a small amount of work in. Basically, everything here seems "broken". A lot of the paradigm involves layers (a "pipeline", etc.). The researcher is, just for example, supposed to "segment" an image into objects or boundaries and then use the segments for further processing. "Segmentation" then is the "low-level" step, to be followed by clever stuff later. But the "segmentation" problem itself is entirely unsolved in any conventional sense of solved. It's not even solvable, since there's no real criterion for whether a segmentation algorithm has succeeded or failed other than what a later algorithm might think it wants.
In my case, the final criterion was someone looking at the lines I drew and deciding they "looked right" - terrible relative to a "real" test but also inevitable, since a thousand "correctly segmented" images could only be produced by a person making a judgement call while drawing lines on each image (which isn't much less arbitrary). And then there's the question of whether a segmented image even carries the information the next step really needs, etc. http://en.wikipedia.org/wiki/Image_segmentation It seems that human (or animal) vision is an amorphous computational process with sight, judgement and action all run together. Descriptions like this article are interesting for this: http://www.wired.com/wiredscience/2009/11/fly-eyes/ But I lack the optimism of the article. Any algorithm we create has to attempt to simulate these amorphous processes, but with modular systems that in the end can't really hope to do so. Edit: Also, the video is hilarious and typical in that the effort is totally ad hoc: stare at it and tweak it till it works. "Scalable and repeatable it ain't." |