by saulrh
3840 days ago
Yes, in theory an offline supervised learner should never beat an online reinforcement learner. Adding a set of actions A that can be used to bias future examples in a predictable manner is an advantage that should yield better convergence properties in almost all scenarios, simply because it lets you gain more information per observation.
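To make the "more information per observation" point concrete, here is a toy sketch, not a real RL setup. It assumes a noiseless 1-D threshold classifier on [0, 1], and the function names `passive_interval` and `active_interval` are made up for illustration. The passive learner gets labelled points it has no control over; the active learner's single "action" is choosing where the next example comes from.

```python
import random


def passive_interval(threshold, n, rng):
    """Observe n labelled points drawn uniformly at random (no control over x).

    Returns the width of the interval that must still contain the threshold.
    """
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if x >= threshold:      # label 1: threshold is at or below x
            hi = min(hi, x)
        else:                   # label 0: threshold is above x
            lo = max(lo, x)
    return hi - lo


def active_interval(threshold, n):
    """Choose each query point by bisecting the interval still consistent
    with the labels seen so far, so every label halves the uncertainty."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = (lo + hi) / 2
        if x >= threshold:
            hi = x
        else:
            lo = x
    return hi - lo


if __name__ == "__main__":
    rng = random.Random(0)
    threshold, n, trials = 0.3714, 20, 1000  # arbitrary illustrative values
    passive = sum(passive_interval(threshold, n, rng) for _ in range(trials)) / trials
    active = active_interval(threshold, n)
    print(f"passive mean width after {n} labels: {passive:.6f}")
    print(f"active width after {n} labels:       {active:.8f}")
```

Under these assumptions the active learner's uncertainty shrinks by half with every label, while the passive learner's shrinks only roughly in proportion to 1/n. With label noise or a richer hypothesis class the gap is usually smaller, so treat this as the best case for the argument rather than a general result.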