  > hah

And? That's not the issue with LLMs.

The issue is an inability to reason. Sure, a human might also have difficulty with river crossing problems, even trivial ones, but I can't get a person to tell me that all the animals fit in the boat, put all the animals into the boat, and then proceed to make multiple trips across the river. If a person gets the first two steps right, they always get the right answer. That is not true for an LLM. That's a very clear demonstration of a lack of reasoning and the lack of a world model.

It's not about coaching or finding the right prompt; it's that the logic is inconsistent and unreasonable (yes, humans will fail at logic, but *reasoning doesn't mean getting the correct answer*). It fails to meet the basic definition of reasoning.

The whole fucking goal is generalization. That's the G in AGI and the most important of those 3 letters. We don't have strong evidence of generalization. For GI we want out-of-distribution generalization, but we're not even doing so well at in-distribution generalization. That's demonstrated by the river crossing puzzles, Cheryl's birthday, and the recently famous 9.8 vs 9.11 (https://x.com/sainingxie/status/1834300251324256439).
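
For reference, the 9.8 vs 9.11 comparison has one answer as arithmetic. A minimal Python sketch follows. The second comparison, which reads the strings as version numbers, is purely my assumption about how "9.11 is larger" could come out; it is not something the linked post establishes.

```python
from decimal import Decimal

a, b = "9.8", "9.11"

# Read as decimal numbers: 9.80 > 9.11, so 9.8 is the larger value.
larger_as_number = str(max(Decimal(a), Decimal(b)))

# Assumed alternative reading (illustration only): treat each string as a
# dotted version number and compare the parts as integers, (9, 8) vs (9, 11).
def version_key(s: str) -> tuple:
    return tuple(int(part) for part in s.split("."))

larger_as_version = max(a, b, key=version_key)

print("as number: ", larger_as_number)
print("as version:", larger_as_version)
```

The two readings disagree, which is the point: the question was about numbers, and only one reading applies.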

Yes, the next iteration will get better. But better in which direction? Being dismissive of what it fails at just means you don't get better in that direction unless you get lucky.