by FeepingCreature
343 days ago
You just train it on the goal. Then it has that goal. Alternatively, you can train it to follow goals, and then you have a system where you can specify a goal. At sufficient scale, a model will already contain goal-following algorithms, because those help predict the next token when the model is base-trained on text produced by goal-following entities, i.e. humans. Goal-driven RL then brings those algorithms to prominence.
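
To make the "train it on following a goal, then specify the goal" option concrete, here is a toy sketch. It is nothing like an LLM: just a tabular, goal-conditioned softmax policy trained by exact policy-gradient ascent. Everything in it (`N_GOALS`, `train_goal_conditioned`, `act`, the reward of 1 when the action matches the goal) is a hypothetical setup for illustration, not anyone's actual training recipe.

```python
import math

# Hypothetical toy setup: goals and actions share the index set {0, 1, 2}.
N_GOALS = 3

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_goal_conditioned(steps=200, lr=1.0):
    """Train one policy that takes the goal as input.

    logits[g][a] is the preference for action a when the specified goal is g.
    Reward is 1 when the action equals the goal (an assumption of this toy),
    so the expected reward under goal g is probs[g].
    """
    logits = [[0.0] * N_GOALS for _ in range(N_GOALS)]
    for _ in range(steps):
        for g in range(N_GOALS):
            probs = softmax(logits[g])
            for a in range(N_GOALS):
                indicator = 1.0 if a == g else 0.0
                # Exact gradient of probs[g] with respect to logits[g][a].
                logits[g][a] += lr * probs[g] * (indicator - probs[a])
    return logits

def act(logits, goal):
    """Pick the most likely action for the goal given at run time."""
    probs = softmax(logits[goal])
    return max(range(N_GOALS), key=lambda a: probs[a])

if __name__ == "__main__":
    policy = train_goal_conditioned()
    for goal in range(N_GOALS):
        print("goal", goal, "-> action", act(policy, goal))
```

The point of the sketch is only that the goal is an input: the same trained table serves whichever goal you hand it afterwards, as opposed to baking one fixed goal into the weights.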