by mikewarot 945 days ago
0. Ahead of the pattern recognition is a set of layers (with an intentional bottleneck in the middle) that has been fed a ton of tokens in small random chunks and trained to reproduce its input despite the bottleneck. This network is an autoencoder[1]. In my opinion, autoencoders are almost magic, and it's amazing to me that they work at all.

The autoencoder is then split into an encoder and a decoder, so that tokens going in can be converted to an "embedding" (the values passed through the bottleneck).

It's that bottleneck layer that does the grunt work of placing similar words near each other in the encoded values.
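To make the encoder/decoder split concrete, here is a minimal sketch of a toy autoencoder in plain Python. Everything in it is an assumption for illustration: the 2-in, 1-bottleneck, 2-out shape, the data lying on the line x1 = 2 * x0, the purely linear layers, the learning rate, and the names `encode` and `decode`. It is not how any real language model is trained; it only shows a network learning to reproduce its input through a bottleneck, and then being used as two separate halves.

```python
# Toy linear autoencoder: 2 inputs -> 1 bottleneck value -> 2 outputs.
# Assumption: the data lies on the line x1 = 2 * x0, so a single number
# is enough to describe each point.
data = [[t, 2.0 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

enc = [0.1, 0.1]  # encoder weights (2 -> 1), small arbitrary starting values
dec = [0.1, 0.1]  # decoder weights (1 -> 2)

def encode(x):
    """Squeeze a 2-value input down to the single bottleneck value."""
    return enc[0] * x[0] + enc[1] * x[1]

def decode(z):
    """Expand the bottleneck value back to 2 values."""
    return [dec[0] * z, dec[1] * z]

def loss():
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x))
        total += sum((h - v) ** 2 for h, v in zip(x_hat, x))
    return total / len(data)

initial_loss = loss()

lr = 0.02  # learning rate, chosen by hand for this toy problem
for _ in range(3000):
    g_enc = [0.0, 0.0]
    g_dec = [0.0, 0.0]
    for x in data:
        z = encode(x)
        resid = [h - v for h, v in zip(decode(z), x)]
        # Gradient of the squared error with respect to the bottleneck value.
        dz = sum(2.0 * r * d for r, d in zip(resid, dec))
        for i in range(2):
            g_dec[i] += 2.0 * resid[i] * z / len(data)
            g_enc[i] += dz * x[i] / len(data)
    # Full-batch gradient descent step on both halves.
    for i in range(2):
        enc[i] -= lr * g_enc[i]
        dec[i] -= lr * g_dec[i]

final_loss = loss()

# After training, the two halves can be used separately.
near = abs(encode([0.5, 1.0]) - encode([0.6, 1.2]))
far = abs(encode([0.5, 1.0]) - encode([-1.0, -2.0]))
print("loss before:", initial_loss, "after:", final_loss)
print("embedding distance, similar inputs:", near)
print("embedding distance, dissimilar inputs:", far)
```

In this sketch `encode` alone plays the role of the embedding step: inputs that are close together get bottleneck values that are close together, and `decode` is only needed while training.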

re #4. Neural networks are multiple layers of matrix multiplies with biases, with a nonlinear function applied to each layer's output. The nonlinear part is important, otherwise you could just do the algebra and collapse all the layers down to one matrix multiply (plus a single bias).
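The collapse can be checked numerically. The sketch below, in plain Python, uses made-up layer sizes and random weights (all assumptions, as are the helper names `affine`, `matmul`, `relu` and `abs_net`). It stacks two layers with no nonlinearity and compares them against the single layer W = W2 W1, b = W2 b1 + b2. It then shows a two-layer ReLU network that computes |x|, which no single matrix multiply plus bias can do.

```python
import random

def matvec(W, x):
    """Matrix-vector product with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Matrix-matrix product A @ B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def affine(W, b, x):
    """One layer without its nonlinearity: W x + b."""
    return [y + bi for y, bi in zip(matvec(W, x), b)]

def relu(v):
    return [max(0.0, t) for t in v]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
W1, b1 = rand_matrix(rng, 4, 3), [rng.uniform(-1, 1) for _ in range(4)]
W2, b2 = rand_matrix(rng, 2, 4), [rng.uniform(-1, 1) for _ in range(2)]
x = [rng.uniform(-1, 1) for _ in range(3)]

# Two layers, no nonlinearity in between.
two_linear = affine(W2, b2, affine(W1, b1, x))

# The same map as a single layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = [y + bi for y, bi in zip(matvec(W2, b1), b2)]
collapsed = affine(W, b, x)

max_gap = max(abs(p - q) for p, q in zip(two_linear, collapsed))
print("largest difference between stacked and collapsed:", max_gap)

def abs_net(t):
    """Two layers with a ReLU in between: relu(t) + relu(-t) = |t|."""
    hidden = relu(affine([[1.0], [-1.0]], [0.0, 0.0], [t]))
    return affine([[1.0, 1.0]], [0.0], hidden)[0]

# A single affine map f with f(2) == f(-2) must be constant, so it cannot
# also give f(0) == 0. The ReLU network manages all three.
print(abs_net(2.0), abs_net(-2.0), abs_net(0.0))
```

The first comparison should agree up to floating-point rounding; the second shows the kind of kink that only appears once a nonlinearity sits between the layers.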

The autoencoder is what makes the autocomplete on steroids actually useful.

[1] https://en.wikipedia.org/wiki/Autoencoder