by radarsat1 834 days ago

Yes, this is one view of machine learning, the idea that you are training some function to map input to output, similar to "looking up" what output is addressed by some input.

And that's why the concept of generalization is so important in machine learning, and as a consequence, why the internal representation of that "lookup" matters.

By definition, a lookup table can only return data it was given. The idea of ML systems, however, is to predict outputs for inputs that are similar to, but not present in, their training data.
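To make the contrast concrete, here is a minimal sketch. The data, the rule y = 2x + 1, and the names `lookup`, `fit_line` and `model` are all made up for illustration, and a hand-written least-squares line stands in for "an ML system"; it is only meant to show the difference in behaviour on unseen inputs.

```python
# Toy training data, assumed to come from the rule y = 2x + 1.
train = {0.0: 1.0, 1.0: 3.0, 2.0: 5.0, 3.0: 7.0}

def lookup(x):
    """A literal lookup table: it can only answer for inputs it has stored."""
    return train.get(x)  # None for anything unseen

def fit_line(data):
    """Ordinary least squares for y = a*x + b, written out by hand."""
    n = len(data)
    mean_x = sum(data) / n            # iterating a dict yields its keys
    mean_y = sum(data.values()) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data.items())
    var = sum((x - mean_x) ** 2 for x in data)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line(train)

def model(x):
    """The fitted function: defined everywhere, not just at stored keys."""
    return a * x + b

print(lookup(1.5), model(1.5))    # unseen input between training points
print(lookup(10.0), model(10.0))  # unseen input outside the training range
```

The table returns `None` for both queries, while the fitted line gives an answer for each. Whether that answer is any good depends entirely on whether the assumed form (a straight line here) matches the data.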

Interpolation and extrapolation are key to applying ML systems to new data, and therefore critical for actual use. They are enabled by internal representations that model the space between and around data points. It so happens that multilayer neural networks accomplish this through general, smoothed (due to regularization tricks and inductive biases) iterative warpings of the representation (embedding) space.

Under the manifold hypothesis, we can interpret this as finding underlying, semantically meaningful subspaces and unfolding them to perform generalized operations, such as logical manipulations and drawing classification boundaries in some relatively smooth semantic space, then refolding things to drive some output representation (pixels, classes, etc.).
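A tiny, classic illustration of "warp the space, then draw a simple boundary" is XOR. The sketch below uses hand-picked weights (an assumption for illustration, not learned and not taken from any particular network), and the names `hidden` and `readout` are hypothetical. XOR is not linearly separable in the raw inputs, but after one ReLU layer a purely linear readout gets it right.

```python
def relu(z):
    """Rectified linear unit: the nonlinearity that makes the warp possible."""
    return max(0.0, z)

def hidden(x1, x2):
    """Warp the 2-D input into a new 2-D representation (hand-picked weights)."""
    return relu(x1 + x2), relu(x1 + x2 - 1.0)

def readout(h1, h2):
    """A purely linear readout on the warped representation."""
    return h1 - 2.0 * h2

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
for x1, x2 in points:
    h = hidden(x1, x2)
    print((x1, x2), "->", h, "->", readout(*h))
```

In the warped space the two "true" inputs land on the same point and the two "false" inputs sit on either side of it along a line that the readout can score with a single linear function. A trained network would have to find such weights itself, and real embedding spaces are far higher-dimensional, so treat this only as a picture of the mechanism.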

Another view on this is that these manipulations allow a kind of compression by optimizing the representation to make manipulations easier. In other words, they re-express the data in a form that allows algorithmic evaluation of some input program. This makes it possible to model intrinsic relationships such as infinite sequences as vector programs. (Here I mean things like mathematical recursions, etc.) When this is accomplished, and it happens due to the pressure to optimally compress data, you could say that "understanding" emerges, and the result is a program that extrapolates to unseen values of such sequences. At this point you could say that while the input-output relationship is like a lookup table, functionally it is not the same thing, because the need to compress these input-output relationships has led to some representation which allows for extrapolation, aka "intelligence" by some definitions.

The fact that these systems are still very dumb sometimes is simply because, for a variety of reasons, we have not yet managed to develop these representations as well as we would like. But in theory, this is why emergence might occur in an NN but not in a lookup table.