


	by saulrh 3840 days ago
	Yes. In theory, an offline supervised learner should never beat an online reinforcement learner. Giving the learner a set of actions A that it can use to bias future examples in a predictable way is an advantage that should yield better convergence properties in almost all scenarios, simply because it lets the learner gain more information per observation.
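
	To make the "more information per observation" point concrete, here is a toy sketch. It is not reinforcement learning proper, just the simplest case I can think of where choosing your next example helps: locating a hidden threshold on [0, 1]. The function names, the threshold value, and the sample count are all made up for illustration. The passive learner is handed uniformly random points; the active learner picks each query itself (its "action") by bisecting the interval it is still unsure about.

```python
import random


def passive_interval(threshold, n, rng):
    """Width of the remaining uncertainty after n uniformly random labelled points."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()          # the learner has no say in which x it sees
        if x < threshold:         # label says: threshold lies above x
            lo = max(lo, x)
        else:                     # label says: threshold lies at or below x
            hi = min(hi, x)
    return hi - lo


def active_interval(threshold, n):
    """Width of the remaining uncertainty after n queries chosen by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = (lo + hi) / 2         # the action: query the midpoint
        if x < threshold:
            lo = x
        else:
            hi = x
    return hi - lo


if __name__ == "__main__":
    rng = random.Random(0)        # fixed seed so the comparison is repeatable
    n = 20
    print("passive:", passive_interval(0.3, n, rng))
    print("active: ", active_interval(0.3, n))
```

	Each bisection query halves the interval, so the active learner's uncertainty shrinks like 2^-n, while the passive learner's only shrinks roughly like 1/n on average. Real problems are noisier and the action set is rarely this convenient, so treat this as an intuition pump rather than a proof.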