Replace "LLM" with "student" and read that again. You don't just blindly hand students output; you teach them, which is what you are supposed to do with an LLM.
There are some similarities, but they are absolutely overwhelmed by the differences. A handful of superficial similarities is not enough to draw a meaningful comparison. The act of teaching a human is very different from “training” an LLM because humans have the power of the whole brain and body, not just some information-integration part that the brain and LLMs may (or may not) share. Humans can be creative in ways that LLMs manifestly can’t be. Humans can act like mere token predictors, but we can (and routinely do) also transcend that, question it, play with it. LLMs can’t.
> but we can (and routinely do) also transcend that, question it, play with it. LLMs can’t.
Maybe not in a single inference, but you can have an LLM question itself by running another inference using its previous prompt as input. You can easily see this in a deep research agent loop: it finds some data, goes looking for other data to back that up, finds that it was actually incorrect, and changes its mind.
First, modern LLMs are not "a huge table of phrases". They are neural networks with billions of learned parameters that generate tokens by computing probability distributions over vocabulary given prior context. There is no lookup table of stored sentences.
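To make "computing probability distributions over vocabulary" concrete, here is a minimal sketch of the final step only. The four-word vocabulary and the logit values are made up for illustration; in a real model the logits come out of the network and the vocabulary has tens of thousands of entries.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores, not taken from any real model.
vocab = ["cat", "dog", "sat", "mat"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
# Greedy decoding: pick the most probable token. Real systems often sample instead.
next_token = vocab[probs.index(max(probs))]
```

Nothing here looks up a stored sentence: the next token is picked from a distribution that is recomputed for every context.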
Second, Eliza-style bots used explicit scripted pattern matching rules. LLMs instead learn statistical representations from large corpora and can generalize to produce novel sequences that were never present in the training data.
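For contrast, a toy sketch of what "explicit scripted pattern matching rules" look like. The two rules and the fallback reply are invented for this example and are not taken from any historical Eliza implementation.

```python
import re

# Hand-written rules: (pattern, reply template). Both rules are illustrative.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
]
DEFAULT_REPLY = "Please tell me more."

def respond(text):
    """Return the reply of the first rule that matches, else a canned fallback."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured fragment back, minus trailing punctuation.
            return template.format(match.group(1).rstrip(".!?"))
    return DEFAULT_REPLY
```

Everything such a bot can say is spelled out in the rule list, so input that no rule anticipates falls through to the canned fallback.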
Kent Pitman's Lisp Eliza from MIT-AI's ITS History Project (sites.google.com):
Third, while "pattern matching" is sometimes used informally, it’s misleading technically. Transformers perform high-dimensional vector computations and attention over context to model relationships between tokens. That’s very different from rule-based pattern matching.
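As a rough illustration of "attention over context", here is a minimal single-head scaled dot-product attention in plain Python. It is a sketch under simplifying assumptions: no learned projection matrices, no masking, no batching, and the function names are my own.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    Weights are softmax(q . k / sqrt(d)) over all keys, where d is the key size.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs
```

Each output is a blend of all value vectors, weighted by how well the query lines up with each key, which is a continuous computation rather than a match against a fixed rule.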
You can certainly debate whether LLMs "think", but describing them as "Eliza with a big phrase table" is not an accurate description of how they work.
You have the resources available at your fingertips to learn what the truth is, how LLMs actually work. You could start with Wikipedia, or read Stephen Wolfram's article, or simply ask an LLM to explain how it works to you. It's quite good at that, while an Eliza bot certainly can't explain to you how it works, or even write code.
Enough with this analogy. It's flawed on so many levels. First and foremost, stop devaluing humanity and hyping up AI companies by parroting their party line. Second, LLMs don't learn. They can hold a very limited amount of context, as you know. And every time you need to start over. So fuck no, "teaching" an LLM is nothing like teaching an actual human.
“Fitting” is still too nice of a word choice, because it implies that it’s easy to identify the best solution.
I suggest “randomly adjusting parameters while trying to make things better” as that accurately reflects the “precision” that goes into stuffing LLMs with more data.
It was called learning already back when the field was called cybernetics and foundational figures like Shannon worked on this kind of stuff. People tried to decipher learning in the nervous system and implement the extracted principles in machines, such as Hebbian learning, the Perceptron algorithm etc. This stuff goes back to the 40s/50s/60s, so things must have gone south pretty early then.
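For anyone who hasn't seen it, the classic perceptron learning rule fits in a few lines. This is a minimal sketch assuming two inputs, a step activation, and the logical AND function as training data; the learning rate and epoch count are arbitrary choices for the example.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: nudge weights by lr * error * input after each sample."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights[0] += lr * error * x[0]
            weights[1] += lr * error * x[1]
            bias += lr * error
    return weights, bias

# Logical AND is linearly separable, so the rule converges on it.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
```

The updates are driven by the prediction error on each example, and that error-driven adjustment of parameters is what the field has been calling "learning" since that era.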
That isn't learning. It can read things in its context and generate material to help answer further prompts, but that doesn't change the model weights. It is just updating the context.
Unless you are actually fine tuning models, in which case sure, learning is taking place.
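The distinction can be shown with a deliberately tiny toy. `ToyModel`, its single weight, and both methods are hypothetical stand-ins, not how any real LLM is structured: inference reads the context and leaves the weight alone, while a fine-tuning step actually changes it.

```python
class ToyModel:
    """One-parameter stand-in for a model. Purely illustrative."""

    def __init__(self):
        self.weight = 0.5

    def infer(self, context, prompt):
        # The output depends on the context, but nothing is written back to the weight.
        return self.weight * (sum(context) + prompt)

    def fine_tune_step(self, x, target, lr=0.1):
        # One gradient-descent step on the squared error 0.5 * (weight * x - target) ** 2.
        grad = (self.weight * x - target) * x
        self.weight -= lr * grad

model = ToyModel()
before = model.weight
model.infer(context=[1.0, 2.0], prompt=3.0)  # "updating the context"
after_inference = model.weight               # same as before
model.fine_tune_step(x=2.0, target=3.0)      # fine-tuning
after_fine_tune = model.weight               # now different
```

Whether context-only adaptation deserves the word "learning" is exactly what this thread is arguing about; the sketch only shows where the state lives in each case.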
i don't know why you think it matters how it works internally. whether it changes its weights or not is not important. does it behave like a person who learns a thing? yes.
if i showed a human a codebase and asked them questions with good answers - yes i would say the human learned it. the analogy breaks at a point because of limited context but learning is a good enough word.
Maybe because I work on a legacy programming language with far less material in the training data? For me it makes a difference because the model partly needs to "learn" the language itself and have that in the context, along with codebase-specific stuff. For a model that already knows the language and only needs the codebase-specific stuff, it might feel different.