by jbay808
1120 days ago
I'm glad to see someone express this view, because I think this gets to the heart of the question. How does a stochastic parrot learn to sort lists? Embeddings are part of the compression-by-abstraction that I'm explaining in the first two parts, but the embeddings generated by an LLM go beyond the word2vec picture that most people have of embeddings, and I believe they are closer to whatever "understanding" would mean if it could be formally defined. It would be quite a coincidence if GPT-4 happened to solve the riddle merely by virtue of "Moonling" and "cabbage" being closely located vectors.
We refer to algorithms like quicksort as 'reasoning' about the input, so it's fine to apply the word in the same sense to stochastic parrots.
The difference between an LLM learning how to sort things and compiling an implementation of an algorithm like quicksort is not terribly large, from a certain perspective.
I suppose something I'm interested in is whether an LLM that can't sort numbers could be taught how in a prompt and then do so.
There are some examples of similar phenomena (the one with some kids' made-up language was interesting), which suggest that LLMs dedicate a lot of capacity to dynamic pattern selection within their context windows (somewhat tautological), so that prompts can tune the selection for other layers.
And, of course, lack of plasticity is really interesting.