by ProjectArcturis 1017 days ago
Surely the reason LLMs fail here is that this is an adaptation of a common word problem, except your version has been tweaked so that there is a trivial answer.

Yes, that's the point of the question. We want to know if it's actually doing some reasoning, or if it has just memorized an answer.
It's the latter. For every LLM out there. They are trained to memorize, not reason. It will take radically different training techniques to make these networks reason in a human-like way.
Memorising is so trivial we've been doing it by default since forever, whether that means magnetic core memory, the Jacquard Loom, the Gutenberg press, the ceramic movable type China had for a few centuries before Gutenberg, or using a stick to smudge words into soft clay tablets that were accidentally made permanent by a house fire.

AIs like this aren't just memorising.

They almost certainly don't think like us — even if they did at a low level, the training regime would take the equivalent of hundreds of human lifetimes, and the number of parameters in the larger models is a thousandth of the number in a human brain.

Then how do you explain zero-shot performance?