by svara
305 days ago
Not at all. I don't want to directly address your claim about lack of generalization, because there's a more basic issue with the GP statement, though I will say that today's models do seem to generalize quite a bit better than you make it sound. More importantly, you and GP don't offer any evidence that the lack of generalization is due specifically to using next token prediction as a mechanism. Why would it not be possible for a highly generalizing model to use next token prediction for its output? That doesn't follow to me at all, which is why the GP statement reads so weird.
|
Again, inverted burden of proof. We don’t have to prove that next token prediction is unable to do things that it currently cannot do, when there is no compelling roadmap that would lead us to believe it ever will do those things.
It’s perhaps a lot like Tesla’s “we can do robocars with just cameras” manifesto. They are just saying that they can do it because humans use eyes and nothing else. But they haven’t actually shown their technology working as well as even impaired human driving, so the burden of proof is on them to prove naysayers wrong. Put up or shut up: their system is approaching a decade behind what they promised.
To my knowledge, Tesla is still failing simple collision avoidance tests while their competitors are operating revenue service.
https://www.carscoops.com/2025/06/teslas-fsd-botches-another...
This other article, which is critical of the test methodology, actually still ends up excusing (defending?) the Tesla system by saying that it’s not reasonable to expect Tesla to train the system on unrealistic scenarios:
https://www.forbes.com/sites/bradtempleton/2025/03/17/youtub...
That really gets back to my exact point: AI implemented the way it is today (e.g. next token prediction) can’t handle anything it has no training data for, while the human brain is amazingly good at making new connections without first having to be fed thousands of examples of that new discovery.