by dangus
299 days ago
I feel like the logic of your question is actually inverted from reality. It’s kind of like you’re saying “prove god doesn’t exist” when it’s supposed to be “prove god exists.” If a problem isn’t documented, LLMs simply have nowhere to go. They can’t really handle the knowledge boundary [1] at all: since they have no reasoning ability, they just hallucinate or run around in circles, trying the same closest solution over and over. It’s awesome that they get some stuff right frequently and can work fast like a computer, but it’s very obvious that there really isn’t anything in there that we would call “reasoning.”
[1] https://matt.might.net/articles/phd-school-in-pictures/
|
I don't want to directly address your claim about lack of generalization, because there's a more basic issue with the GP statement. Though I will say, today's models do seem to generalize quite a bit better than you make it sound.
But more importantly, you and GP don't offer any evidence that this is due specifically to using next token prediction as a mechanism.
Why would it not be possible for a highly generalizing model to use next token prediction for its output?
That doesn't follow to me at all, which is why the GP statement reads so weird.