by HarHarVeryFunny 593 days ago

Reasoning basically means multi-step prediction, but to be general the reasoner also needs to be able to:

1) Realize when it's reached an impasse, then backtrack and explore alternatives

2) Recognize when no further progress towards the goal appears possible, and switch from exploiting existing knowledge to exploring/acquiring new knowledge to attempt to proceed. An LLM has limited agency, but could for example ask a question or do a web search.

In either case, prediction failure needs to be treated as a learning signal so the same mistake isn't repeated, and any new knowledge acquired needs to be remembered. This learning would need to persist beyond the current context for the LLM to be able to build on it in the future - e.g. to acquire a job skill that may take a lot of experience/experimentation to master.

It doesn't matter what you call it (basic or advanced), but it seems that current attempts at adding reasoning to LLMs (e.g. OpenAI's o1) are built around 1), a search-like strategy, with learning that is in-context and ephemeral. General animal-like reasoning also needs to support 2) - resolving impasses by targeted new knowledge acquisition (and/or just curiosity-driven experimentation) - as well as continual learning.
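
To make the distinction concrete, here is a toy sketch in Python. It is an illustration only, not how o1 or any real LLM works. Everything in it is made up for the example: the `Reasoner` class, the `ask` callback (a stand-in for asking a question or doing a web search), and the idea of representing knowledge as a graph of "state -> next states". Depth-first search with backtracking plays the role of 1), the fallback to `ask` plays the role of 2), and the `known` and `asked` fields are what persists between problems.

```python
from typing import Callable, Dict, List, Optional, Set

Graph = Dict[str, List[str]]


class Reasoner:
    """Toy reasoner: backtracking search plus knowledge acquisition at an impasse."""

    def __init__(self, known: Graph, ask: Callable[[str], List[str]]):
        self.known: Graph = {s: list(n) for s, n in known.items()}
        self.ask = ask                # hypothetical stand-in for a question / web search
        self.asked: Set[str] = set()  # persists across solve() calls: never re-ask

    def _search(self, state: str, goal: str, path: List[str],
                visited: Set[str]) -> Optional[List[str]]:
        # 1) Multi-step prediction with backtracking (plain depth-first search).
        if state == goal:
            return path
        visited.add(state)
        for nxt in self.known.get(state, []):
            if nxt in visited:
                continue
            found = self._search(nxt, goal, path + [nxt], visited)
            if found is not None:
                return found
        return None  # impasse at this state: the caller tries its next alternative

    def solve(self, start: str, goal: str, max_rounds: int = 10) -> Optional[List[str]]:
        for _ in range(max_rounds):
            visited: Set[str] = set()
            path = self._search(start, goal, [start], visited)
            if path is not None:
                return path
            # 2) Existing knowledge is exhausted: switch from exploiting to exploring.
            learned = False
            for state in sorted(visited - self.asked):
                self.asked.add(state)
                for nxt in self.ask(state):
                    successors = self.known.setdefault(state, [])
                    if nxt not in successors:
                        successors.append(nxt)  # remembered for future problems
                        learned = True
            if not learned:
                return None  # nothing new to learn, so give up
        return None


# Demo: "E" is unreachable with the initial knowledge; the oracle fills the gap.
oracle: Graph = {"B": ["X"], "D": ["E"]}
calls: List[str] = []


def ask(state: str) -> List[str]:
    calls.append(state)
    return oracle.get(state, [])


reasoner = Reasoner({"A": ["B", "C"], "B": [], "C": ["D"]}, ask)
first = reasoner.solve("A", "E")   # needs one round of knowledge acquisition
calls_after_first = len(calls)
second = reasoner.solve("A", "E")  # solved from memory, no new questions asked
print(first, second, calls_after_first, len(calls))
```

The second `solve` call is the point of the example: because the acquired edges live on the object rather than in a per-call scratchpad, the reasoner does not have to rediscover them. If `known` and `asked` were reset on every call, you would have the "in-context and ephemeral" behaviour described above.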