by jdietrich 557 days ago
I just don't see how anyone can look at a study comparing the reasoning abilities of various LLMs, see that large LLMs have better reasoning abilities, and conclude that LLMs can't reason. LLMs don't have human-like reasoning abilities, but it's just obviously true that they have some capacity for reasoning; that ability seems to scale roughly linearly with model size and training FLOPs.

Yes, but is human reasoning on the same spectrum as LLM reasoning? That is, will scale alone turn the latter into the former?

No definitive answer yet, but my bet is on no.

Agreed, and I think the answer is pretty clear.

The large models that are successful today have dodged recurrent architectures, which are harder to train but allow an open-ended number of inference steps. That would make it straightforward to scale to any number of reasoning steps.

At some point, recurrent connections are going to get re-incorporated into these models.

Maybe with two-stage training. In the first stage, the model learns to integrate as much information as well as possible, without recurrence, as is happening now. In the second stage, that model is embedded in a larger iterative model and trained for variable-step reasoning.

Finally, successful iterative reasoning responses can be used as further examples for the non-iterative module.

This would be similar to how we reason in steps at first in unfamiliar areas, but quickly learn to reason with faster, direct responses as we gain familiarity.

We continually fine-tune our fast mode on the successes of our own more powerful slow mode.
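
To make the loop above concrete, here is a toy sketch in Python. It is only an analogy, not a trainable model: the fixed-cost "forward pass" is stood in for by a single Euclid step, the "fine-tuning" is stood in for by a plain dictionary cache, and every name in it (`gcd_step`, `is_done`, `IterativeReasoner`, `fast_cache`, `max_steps`) is made up for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

State = Tuple[int, int]


def gcd_step(state: State) -> State:
    """Stand-in for one fixed-cost forward pass: a single Euclid step."""
    a, b = state
    return (b, a % b) if b else (a, 0)


def is_done(state: State) -> bool:
    """Halting condition: the remainder has reached zero."""
    return state[1] == 0


class IterativeReasoner:
    """Wraps a one-step module in a variable-length loop (the 'slow mode')
    and stores successful results for direct reuse (the 'fast mode')."""

    def __init__(
        self,
        step: Callable[[State], State],
        done: Callable[[State], bool],
        max_steps: int = 64,
    ) -> None:
        self.step = step
        self.done = done
        self.max_steps = max_steps
        # Assumption: a lookup table stands in for fine-tuning the fast module.
        self.fast_cache: Dict[State, State] = {}

    def slow(self, state: State) -> Tuple[Optional[State], int]:
        """Apply the step until the halting condition holds or the budget runs out."""
        steps = 0
        while not self.done(state):
            if steps == self.max_steps:
                return None, steps  # gave up: not a successful trace
            state = self.step(state)
            steps += 1
        return state, steps

    def answer(self, query: State) -> Tuple[Optional[State], int]:
        """Answer directly if this query was solved before, else iterate."""
        if query in self.fast_cache:
            return self.fast_cache[query], 0
        result, steps = self.slow(query)
        if result is not None:
            # Only successful iterative runs become examples for the fast path.
            self.fast_cache[query] = result
        return result, steps


reasoner = IterativeReasoner(gcd_step, is_done)
first = reasoner.answer((48, 18))   # solved iteratively
second = reasoner.answer((48, 18))  # answered from the fast path
```

Here `first` is `((6, 0), 3)`, three iterative steps, and `second` is `((6, 0), 0)`, a direct response. A real version would replace the cache with gradient updates to the non-iterative module, which is the hard part this sketch skips.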

Lol, imagine being downvoted for asking a couple of questions.

Still 5k points to go, though! :D