Hacker News
by gjm11 2307 days ago
It's not really "only a speedup thing" because the training process is different: as a CNN learns to (say) recognize dog-noses in the top left portion of the image, it's simultaneously learning to recognize dog-noses everywhere else too. A fully-connected MLP with the same layer structure doesn't have that property.
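To make the weight-sharing point concrete, here is a toy sketch (my own illustration, not anything from a real framework). It assumes a 1-D "image" of length 8, a 3-tap kernel, and a single gradient-ascent step that raises the response to a pattern at the left edge. The names `place`, `conv_response`, `PATTERN` and the learning rate are all made up for the example.

```python
# Toy illustration: one update on a shared kernel vs. one update on a dense layer.
N = 8                       # input length (assumed for the example)
PATTERN = [1.0, -1.0, 1.0]  # stand-in for a "dog-nose" feature
M = len(PATTERN)
LR = 1.0                    # hypothetical learning rate

def place(pos):
    """Return a length-N input that is zero except for PATTERN at `pos`."""
    x = [0.0] * N
    x[pos:pos + M] = PATTERN
    return x

def conv_response(x, k):
    """Valid cross-correlation: one output per position the kernel fits."""
    return [sum(x[i + j] * k[j] for j in range(len(k)))
            for i in range(len(x) - len(k) + 1)]

x_left, x_right = place(0), place(5)

# Shared kernel: d(output[0]) / d(k[j]) = x_left[j], so one ascent step gives:
k = [0.0] * M
k = [k[j] + LR * x_left[j] for j in range(M)]
conv_at_right = conv_response(x_right, k)[5]

# Dense layer with the same output shape: each output has its own row of weights,
# so the same step on output 0 only touches row 0.
W = [[0.0] * N for _ in range(N - M + 1)]
W[0] = [W[0][j] + LR * x_left[j] for j in range(N)]
dense_at_right = sum(w * v for w, v in zip(W[5], x_right))

print(conv_at_right, dense_at_right)
```

After one step that only ever saw the pattern on the left, the shared kernel already responds to it on the right, while the dense layer's output for that position has not moved.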

It's true that once you've trained your CNN you could make a non-convolutional NN that computes exactly the same things but less efficiently, but the point of an NN is not just what it can compute -- there are lots of systems that can, given enough parameters, approximate arbitrary functions well -- but how you train it.
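A minimal sketch of that equivalence, again in 1-D with made-up numbers: copy a (hypothetically already trained) kernel into every row of a dense weight matrix, shifted by one column per row, and the dense layer computes the same outputs with many more parameters. `conv1d_valid`, `dense_equivalent` and `dense_forward` are names I picked for the example.

```python
def conv1d_valid(x, k):
    """Valid cross-correlation, the operation CNN layers call convolution."""
    m = len(k)
    return [sum(x[i + j] * k[j] for j in range(m))
            for i in range(len(x) - m + 1)]

def dense_equivalent(k, n):
    """Dense weight matrix computing the same map on length-n inputs."""
    m = len(k)
    rows = n - m + 1
    W = [[0.0] * n for _ in range(rows)]
    for i in range(rows):
        for j in range(m):
            W[i][i + j] = k[j]  # same kernel in every row, shifted by i
    return W

def dense_forward(W, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

x = [0.5, -1.0, 2.0, 0.0, 3.0, -2.5, 1.0, 4.0]  # arbitrary input
k = [1.0, 0.5, -2.0]                            # stands in for a trained kernel

W = dense_equivalent(k, len(x))
conv_out = conv1d_valid(x, k)
dense_out = dense_forward(W, x)

conv_params = len(k)               # 3
dense_params = len(W) * len(W[0])  # 6 * 8 = 48

print(conv_out == dense_out, conv_params, dense_params)
```

The conversion only works in this direction after training: nothing forces the 48 dense weights to keep that shifted-copy structure if you train them freely.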

1 comment

Yes, that's why I said "very similar". Without a convolution, you would have to replicate a lot of the network structure, and you'd have to train with shifted versions of your data. But fundamentally the convolution only gives an advantage in speed and memory, not in functionality.