| HN Mirror



	by sorenjan 1158 days ago
	When I learned about RL, we were taught to disable exploration when evaluating the model, since the exploration part is stochastic. I don't think that would work in production.
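	For anyone unfamiliar with what "disabling exploration" means in practice, here is a minimal sketch assuming an epsilon-greedy policy over a small table of action values. The function name `select_action` and the example `q_values` are hypothetical, chosen only for illustration; setting `epsilon` to 0 at evaluation time removes the random branch so the policy always picks the greedy action.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        # Exploration: pick any action uniformly at random (stochastic).
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical action-value estimates for a single state.
q_values = [0.1, 0.7, 0.2]

# Training: keep some exploration, so the chosen action can vary between calls.
train_action = select_action(q_values, epsilon=0.1)

# Evaluation: epsilon = 0 disables exploration, so the choice is deterministic.
eval_action = select_action(q_values, epsilon=0.0)
```

	With `epsilon=0.0` the random branch is never taken, so repeated evaluation calls on the same values return the same action.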