Hacker News
by agitator 3029 days ago
I think this might also be due to the fact that the compute for neural nets and the complexity of the networks are still in their infancy. The neural nets in these cases are all simple classifiers, working on a fixed image resolution with fixed training data. What do you expect? You can't train an intelligent machine if your architecture is dumb to begin with.

If you showed me an image of a green field with a bunch of fur balls on it, I'd go "Oh look! Floofy sheep!" but then maybe, upon closer inspection, I'd go "heeyyy.... that's actually a herd of cats!" But a neural net isn't designed to make decisions, to say "hey, maybe I should investigate further", etc. It's just a black box that spits out class probabilities. I think if we want more sophisticated judgements and something nearer to real intelligence, we would need something like nets of neural nets, and ways to interconnect them. Like, here is a model for sheep, which also has interconnections with the environment, and here is another model for a sheep's facial features, etc. And maybe a net for decision making, or for asking questions if confidence is lacking or the result is ambiguous.

I can see a toddler going "oooh sheep!" as well, and then a parent going, "no, look closer, those are kittens!" And then the kid learns: oh, maybe I shouldn't be so quick to conclude! Sometimes I may be deceived!
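
To make the "ask questions if confidence is lacking" idea a bit more concrete, here is a rough sketch of a classifier that refuses to commit when its top probability is low. Everything in it is an assumption for illustration: the `decide` helper, the 0.8 threshold and the made-up logits do not come from any real model or library.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - m) for label, v in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def decide(logits, threshold=0.8):
    """Return the top label, or ask for a closer look if confidence is low.

    `threshold` is an arbitrary, hypothetical cutoff chosen for this sketch.
    """
    probs = softmax(logits)
    best = max(probs, key=probs.get)
    if probs[best] < threshold:
        return "not sure, look closer"
    return best

# Made-up scores: fur balls on a field, sheep and cat nearly tied.
ambiguous = decide({"sheep": 2.0, "cat": 1.8, "dog": 0.1})
# Made-up scores: one class clearly dominates.
confident = decide({"sheep": 5.0, "cat": 1.0, "dog": 0.0})
print(ambiguous)
print(confident)
```

This is only the "notice that I'm unsure" half. What the system does after it says "look closer" (a second model, a higher resolution crop, asking a human) is the hard part and is not shown here.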

1 comments

I think this might also be due to the fact that the compute for neural nets and the complexity of the networks are still in their infancy.

Well, neural nets may be just starting out, but I don't think one can say their approximation process is not complex. They are very complex in the sense of having many layers and many pseudo-neurons in each layer.

What's happening is that the networks are mapping images to a high-dimensional "feature space" and then drawing dividing lines in that space between matching and non-matching images. It is a vastly complicated but heuristic process. Essentially, the divisions between image types are based on both meaningful and meaningless differences between the images. The examples classified as "a boy holding a dog" (when it was a goat) and "a herd of giraffes in trees" (when it was goats that had climbed trees) happened to have more random characteristics in common with the assigned classification than with what they really were.
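
A toy version of that "dividing line in feature space" picture, as a sketch only. Real networks learn thousands of features on their own. Here I assume two hand-picked, hypothetical features (`green_background`, `woolliness`) and a simple nearest-centroid rule, which gives a straight-line boundary between the two classes. The numbers are invented.

```python
def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit_line(pos, neg):
    """Nearest-centroid boundary written as a line: score = w.x + b."""
    mu_p, mu_n = centroid(pos), centroid(neg)
    w = tuple(a - c for a, c in zip(mu_p, mu_n))
    b = -(sum(a * a for a in mu_p) - sum(c * c for c in mu_n)) / 2
    return w, b

def classify(x, w, b):
    """Positive side of the line is 'sheep', the other side is 'cat'."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "sheep" if score > 0 else "cat"

# Hypothetical features: (green_background, woolliness).
# In this invented training set every sheep is on grass and every cat is indoors.
sheep = [(0.9, 0.8), (1.0, 0.7), (0.95, 0.9)]
cats = [(0.1, 0.3), (0.0, 0.2), (0.05, 0.4)]

w, b = fit_line(sheep, cats)

cat_indoors = classify((0.05, 0.3), w, b)    # looks like the training cats
cats_on_field = classify((0.95, 0.3), w, b)  # cat-like fur, but on grass
print(cat_indoors)
print(cats_on_field)
```

The cats on the field land on the sheep side of the line, because in this made-up data the background separates the classes more strongly than the fur does. The meaningless difference ends up carrying most of the decision.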

The thing is, the method can be made relatively better, but for absolute improvement you'd want not just more approximation but a way to get rid of garbage approximations, garbage conclusions and so forth. I suspect that would imply both different algorithms and a different training cycle.