Hacker News
by sorenjan, 1158 days ago
When I learned about RL, we were taught to disable exploration when evaluating the model, since the exploration part is stochastic. I don't think that would work in production.
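
For anyone unfamiliar with what "disabling exploration" looks like, here is a minimal sketch assuming an epsilon-greedy policy over a small table of action values. The function name `select_action` and the numbers in `q_values` are made up for illustration and are not from any particular library or course.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        # Exploration: pick any action uniformly at random (the stochastic part).
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical action-value estimates for a single state.
q_values = [0.1, 0.7, 0.3]

# Training: exploration on, so the chosen action varies between calls.
train_actions = [select_action(q_values, epsilon=0.3) for _ in range(1000)]

# Evaluation: exploration off (epsilon = 0), so the choice is deterministic.
eval_actions = [select_action(q_values, epsilon=0.0) for _ in range(1000)]

print(sorted(set(train_actions)))
print(sorted(set(eval_actions)))
```

With `epsilon=0.0` the random branch is never taken, so evaluation always returns the greedy action, which is what makes evaluation runs repeatable.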