

	by dekhn 1016 days ago
	During the big GPT-4 news cycle, a bunch of folks posted claims that were outrageously good: "language model passes medical exams better than humans", etc. When I looked into them, in nearly all cases the claims were boosted far beyond the reality. And the reality seemed much more consistent with a fairly banal interpretation: LLMs produce realistic-looking text but have no real ability to distinguish truth from fabrication (which is a step beyond bullshit!). The one example that still interests me is math problem solving. Can next-token predictors really solve generalized math problems as well as children? https://arxiv.org/abs/2110.14168


haimez 1016 days ago

LLMs are spitting out responses based on their inputs. It is (or was) shockingly effective, but there is no generalized math processing going on. That’s not what LLMs are, and that’s not how they work.


dekhn 1016 days ago

And yet, trained on a large corpus of correct math statements, they produce responses that are more often right than wrong (I am taking this as true; it might not be), which simply raises more questions about the nature of math.


haimez 1016 days ago

…or the nature of the question and corpus?
