by torginus 455 days ago
This isn't a novel idea - some people tried the exact same thing the day GPT4 came out.

And going back even further, there's Goal Oriented Action Planning - an old-timey video game AI technique that basically searches through the solution space to construct a plan:

https://medium.com/@vedantchaudhari/goal-oriented-action-pla...

(not to mention that almost all old-timey AI is state-space search)
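
To make the "searching through solution space to construct a plan" part concrete, here is a minimal sketch of that kind of planner. It is not taken from the linked article: the `Action` type, the `plan` function, and the toy actions are all made up for illustration, and it uses a plain forward breadth-first search, whereas real GOAP implementations usually add action costs and A*.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """Hypothetical action type: facts are plain strings."""
    name: str
    preconditions: frozenset  # facts that must hold before the action runs
    adds: frozenset           # facts the action makes true
    removes: frozenset        # facts the action makes false


def plan(start, goal, actions):
    """Breadth-first search over world states.

    Returns the shortest list of action names that reaches a state
    containing every goal fact, or None if no such plan exists.
    """
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for action in actions:
            if action.preconditions <= state:
                nxt = (state - action.removes) | action.adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [action.name]))
    return None


# Toy world, invented for this example.
ACTIONS = [
    Action("get_axe", frozenset(), frozenset({"has_axe"}), frozenset()),
    Action("chop_wood", frozenset({"has_axe"}), frozenset({"has_wood"}), frozenset()),
    Action("make_fire", frozenset({"has_wood"}), frozenset({"warm"}), frozenset({"has_wood"})),
]

if __name__ == "__main__":
    print(plan(set(), {"warm"}, ACTIONS))
```

The agent only states the goal ("warm"); the ordering of actions falls out of the search, which is the whole appeal of the technique.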

1 comment

What's new is applying that to LLMs.

> This isn't a novel idea - some people tried the exact same thing the day GPT4 came out.

What do you mean? Since GPT4's weights aren't available, you can't run RL on it by yourself. Only OpenAI can.