|
|
|
|
|
by gpm
101 days ago
|
|
I'd disagree, the other training on top doesn't alter the fundamental nature of the model that it's predicting the probabilities of the next token (and then there's a sampling step which can roughly be described as picking the most probable one). It just changes the probability distribution that it is approximating. To the extent that thinking is making a series of deductions from prior facts, it seems to me that thinking can be reduced to "pick the next most probable token from the correct probability distribution"... |
|
(With this perspective, I can feel my own brain subtly oferring up a panoply of possible responses in a similar way. I can even turn up the temperature on my own brain, making it more likely to decide to say the less-obvious words in response, by having a drink or two.)
(Similarly, mimicry is in humans too a very good learning technique to get started -- kids learning to speak are little parrots, artists just starting out will often copy existing works, etc. Before going on to develop further into their own style.)