by totorovirus
844 days ago
Proves my point that LLMs are simply next-token predictors. There are many interesting properties that we read as the "emergence" of intelligence, but I think it's just humans' inability to hold that much knowledge in active memory.
GPT-4 is at a high enough level of performance that simple statistics alone aren't really helping it do any better; it really is developing structures, especially in the middle layers, that perform some amount of high-level understanding.
I don't think that pure next-token prediction will always be the optimal way to train and enhance these behaviors, but it's not fair to say that it's unrelated. If this really were just stochastic parroting, then LLMs would have topped out way before the level they're at now.