Hacker News new | ask | show | jobs
by sidlls 984 days ago
That's more or less exactly what these models do by design, otherwise the predictors/estimators for the next token in a sequence would fail spectacularly. These models aren't just plucking random tokens out of a bag and ordering them based on some brute force large-scale memory lookup or whatever. There is an internal representation of the tokens in a more meaningful sense. That sense, however, is limited to the statistical/mathematical framework upon which the models are built. It's a huge (in my opinion completely unjustified, wishful thinking exercise) leap to call it "reasoning."