by dekhn
1016 days ago
During the big GPT-4 news cycle, I think a bunch of folks posted claims that were outrageously good: "language model passes medical exams better than humans", etc.
When I looked into them, in nearly all cases the claims were boosted far beyond the reality. And the reality seemed much more consistent with a fairly banal interpretation: LLMs produce realistic-looking text but have no real ability to distinguish truth from fabrication (which is a step beyond bullshit!). The one example that still interests me is math problem solving. Can next-token predictors really solve generalized math problems as well as children? https://arxiv.org/abs/2110.14168