|
|
|
|
|
by visarga
1046 days ago
|
|
The model would learn from feedback, not just regurgitate the training set, as long as the model is part of a system that can generate this feedback. AlphaGo Zero had self play for feedback. Robots can check task execution success. Even chatting with us generates feedback to the model. |
|