by CGamesPlay
286 days ago
I think this is on the right track, but I suspect it's a byproduct of reinforcement learning rather than something hard-coded. Basically, the model has to train itself to follow the user's instructions, and since each token it generates is conditioned on what came before, starting a response with "You're absolutely right!" puts the model into the thought pattern of doing whatever the user said.