|
|
|
|
|
by fnordpiglet
793 days ago
|
|
There’s a difference between certainty of the next token given the context and the model evaluation so far and certainty about an abstract reasoning process being correct given it’s not reasoning at all. These probabilities and stuff coming out are more about token prediction than “knowing” or “certainty” and are often confusing to people in assuming they’re more powerful than they are. |
|
When you train a model on data made by humans, then it learns to imitate but is ungrounded. After you train the model with interactivity, it can learn from the consequences of its outputs. This grounding by feedback constitutes a new learning signal that does not simply copy humans, and is a necessary ingredient for pattern matching to become reasoning. Everything we know as humans comes from the environment. It is the ultimate teacher and validator. This is the missing ingredient for AI to be able to reason.