by dchichkov 498 days ago
There are examples of learning reasoning from scratch with reinforcement learning.

Emergent tool use from multi-agent interaction is a good example - https://openai.com/index/emergent-tool-use/

2 comments

Now you are asking for a perfect model of the system. Reinforcement learning works by discovering boundaries through trial and error.
Now try rediscovering that way which plants are and aren't poisonous to most people.
I've suggested that the long context should be included in the prompt.

In your particular case, the prompt would look something like: <pubmed dump> What are the plants that aren't poisonous to most people?

A general reasoner would recover language and a relevant world model from the PubMed dump, and would then reason over them to perform the task.
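
To make the prompt shape concrete, here is a minimal sketch of the "<pubmed dump> question" layout. The function name and the stand-in corpus are hypothetical, made up for illustration; nothing here calls a real model or a real PubMed export.

```python
def build_long_context_prompt(corpus: str, question: str) -> str:
    """Put the whole corpus first and the question last (hypothetical helper)."""
    return f"{corpus.strip()}\n\n{question.strip()}"


# Stand-in text only; a real PubMed dump would be vastly larger.
corpus = "Abstract 1: ...\nAbstract 2: ..."
question = "What are the plants that aren't poisonous to most people?"

prompt = build_long_context_prompt(corpus, question)
```

The point is only the ordering: everything the reasoner needs to recover language and a world model comes before the task itself.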

It doesn't look like a particularly efficient process.