Hacker News
by MillenialMan 1839 days ago
Neural nets generalise because they have to approximate the data at a lower resolution, not because they're constrained to only learn what is useful. They're lossy compressors, but with a property that most lossy compressors don't have. They cannot learn all the properties of the input data, partly because they can't hold that much information, but uniquely because neurons cannot be modified in isolation. A change in one neuron changes the influence that every other neuron in that layer has on the next layer. So it's difficult to learn granular properties of specific examples, because the entire net is affected when you do that (and many granular properties that are learned will be unlearned on subsequent examples). The deeper the net, the less able earlier layers are to extract granular information from the input. They have to extract very abstract information, and they will gradually converge on an abstraction strategy that works.
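
To make the "cannot be modified in isolation" point concrete, here's a toy sketch in plain Python. It's only a single linear unit with made-up weights and examples (all the names and numbers are my own assumptions, not a real net), but it shows the mechanism: a gradient step taken to fit example A also moves the prediction on example B, because both go through the same weights.

```python
def predict(w, x):
    """Output of a single linear unit: dot product of weights and input."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd_step(w, x, y, lr):
    """One gradient step on the squared error (predict(w, x) - y) ** 2."""
    err = predict(w, x) - y
    return [wi - lr * 2.0 * err * xi for wi, xi in zip(w, x)]

# Hypothetical weights and two hypothetical training examples (input, target).
w = [0.5, -0.5, 1.0]
example_a = ([1.0, 2.0, 0.0], 3.0)
example_b = ([1.0, 0.0, 1.0], 1.5)

# The unit already fits B exactly before we touch anything.
before_b = predict(w, example_b[0])

# Update the weights using example A only.
w_after = sgd_step(w, example_a[0], example_a[1], lr=0.1)

# B's prediction has moved even though B was never shown to the update,
# because A and B share the first input feature (and therefore a weight).
after_b = predict(w_after, example_b[0])
drift_b = after_b - before_b
```

The drift is proportional to the overlap (dot product) between the two inputs, so detail that is specific to one example tends to get overwritten, while whatever many examples share keeps being reinforced.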

That's why residual blocks are interesting. They pass that low-level information to later blocks (which have an easier time processing the granular details) while also leveraging the ability of earlier blocks to extract abstract information. It allows you to extract and combine information at multiple levels of granularity (or abstraction).
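
A minimal sketch of the skip connection idea, again in plain Python. The two-layer branch, the 3-element input and the random weights are all assumptions for illustration; real residual blocks use convolutions, normalisation and so on. The only thing this is meant to show is `y = x + F(x)`: the input is carried forward unchanged and the learned branch only adds to it.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, v)) for row in w]

def branch(x, w1, w2):
    """The learned transform F(x): a hypothetical two-layer branch."""
    return matvec(w2, relu(matvec(w1, x)))

def residual_block(x, w1, w2):
    """y = x + F(x): the skip connection adds the raw input back in."""
    return [xi + fi for xi, fi in zip(x, branch(x, w1, w2))]

def zero_weights(n):
    return [[0.0] * n for _ in range(n)]

x = [0.5, -1.25, 2.0]

# With the branch weights at zero the block is the identity,
# so the low-level detail in x reaches the next block untouched.
identity_out = residual_block(x, zero_weights(3), zero_weights(3))

# With arbitrary (here random) weights, the output is still x plus a correction.
rng = random.Random(0)
w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w2 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
out = residual_block(x, w1, w2)
correction = [o - xi for o, xi in zip(out, x)]
```

So a later block sees both the original signal and whatever abstraction the earlier branch computed, rather than only the latter.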

Convnets are also invariant to certain transformations of the input (e.g. translation, and to some degree scale), which I think is a better definition than "can only learn something useful." They're forced to learn information that is more general, which increases the usefulness of each bit, which means you get a higher density of usefulness per FLOP. But you also lose specific information in that process. What if location is meaningful? For example, audio spectrogram analysis can suffer from that property, because specific location on the Y axis is highly meaningful.
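
Here's a rough 1D illustration of that trade-off in plain Python. The signal, the kernel and the circular padding are assumptions I picked so the arithmetic is exact; with zero padding the property only holds away from the borders. The convolution output shifts along with the input, and once you pool over positions the shift disappears entirely, which is exactly the location information you might have wanted to keep.

```python
def circular_conv1d(signal, kernel):
    """Sliding dot product with wrap-around (circular) padding."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def shift(signal, k):
    """Rotate a list k positions to the right."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)

# Hypothetical input (think: one column of a spectrogram) and filter.
signal = [0, 0, 1, 3, 1, 0, 0, 0]
kernel = [1, -1, 2]

features = circular_conv1d(signal, kernel)
shifted_features = circular_conv1d(shift(signal, 3), kernel)

# Shifting the input just shifts the feature map by the same amount.
same_up_to_shift = shifted_features == shift(features, 3)

# Global max pooling throws the position away: both inputs look identical.
pooled_same = max(features) == max(shifted_features)
```

For an image that's usually what you want. For a spectrogram, a pattern moved up the Y axis is a different frequency, and after pooling the net can no longer tell the two apart.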

1 comment

What I meant by "forced to learn something useful" is what you put more clearly: being forced to generalize.