

by juliangoldsmith 9 hours ago

It should improve agents' action selection by allowing them to evaluate actions' effects before performing them.

An agent using only a regular LLM has no real way to predict the results of its actions. It has to just take an action based on its training data and hope it's the right one. With a world model like this, it could do a second pass before each action to catch mistakes.

I don't know if this actually delivers yet, but if it does, it might help make agents more usable.
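To make the "second pass" idea concrete, here is a minimal sketch, not anything from the project being discussed. It assumes a hypothetical `world_model(state, action)` that predicts the next state and a hypothetical `score(state)` that rates how good a state is; the toy versions below are stand-ins so the example runs.

```python
from typing import Callable, Sequence

State = int   # toy state: a single number (assumption for illustration)
Action = int  # toy action: an amount to add to the state


def choose_action(
    state: State,
    candidates: Sequence[Action],
    world_model: Callable[[State, Action], State],
    score: Callable[[State], float],
) -> Action:
    """Simulate every candidate with the world model and return the action
    whose predicted outcome scores highest."""
    return max(candidates, key=lambda a: score(world_model(state, a)))


def toy_world_model(state: State, action: Action) -> State:
    # Hypothetical stand-in for a learned model: the action is added to the state.
    return state + action


def toy_score(state: State) -> float:
    # Hypothetical goal: get as close to 10 as possible.
    return -abs(10 - state)


best = choose_action(7, [-1, 1, 2, 5], toy_world_model, toy_score)
print(best)
```

Without the world model, the agent would have to commit to one of the candidates blind; with it, each candidate is checked against a predicted outcome first.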

1 comment

jvanderbot 6 hours ago

Yeah, the fun part is the lookahead search. And here we are, back at classical action-space fanout search, except, I guess, emulated in an LLM.
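For anyone who hasn't seen the classical version, a depth-limited fanout search looks roughly like the sketch below. Everything here is an assumption for illustration: `world_model` and `score` are hypothetical callables, and the toy "double" / "add3" actions stand in for whatever an LLM-backed model would simulate.

```python
from typing import Callable, List, Sequence, Tuple


def lookahead(
    state: int,
    depth: int,
    actions: Sequence[str],
    world_model: Callable[[int, str], int],
    score: Callable[[int], float],
) -> Tuple[float, List[str]]:
    """Exhaustively expand every action sequence of length `depth` and return
    (best leaf score, the action sequence that reaches it)."""
    if depth == 0:
        return score(state), []
    best_value, best_plan = float("-inf"), []
    for action in actions:
        next_state = world_model(state, action)  # simulated, never executed
        value, plan = lookahead(next_state, depth - 1, actions, world_model, score)
        if value > best_value:
            best_value, best_plan = value, [action] + plan
    return best_value, best_plan


def toy_world_model(state: int, action: str) -> int:
    # Hypothetical dynamics: "double" multiplies by 2, anything else adds 3.
    return state * 2 if action == "double" else state + 3


def toy_score(state: int) -> float:
    # Hypothetical goal: land exactly on 14.
    return -abs(14 - state)


value, plan = lookahead(2, 3, ["double", "add3"], toy_world_model, toy_score)
print(value, plan)
```

The catch is the usual one: the number of simulated leaves grows as (number of actions) ** depth, and each simulation here would be a model call, so in practice you would want pruning or sampling rather than full expansion.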