by blahblah3
2751 days ago
The third criterion, immediate feedback, seems too stringent. In chess, feedback is often not immediate. If one were to run a reinforcement learning algorithm on chess with no human-coded rewards, the only objective feedback would come at the end of the game (win, loss, or draw). Certainly, the more immediate the feedback, the better.

For example, in the stock market it's not even clear to me what the total scope of effects is, immediate or not, that any single action I take as a private investor will have on the market. Before I can even start to learn how effective my actions are, I am hamstrung by my inability to see an immediate, comprehensive effect of each action.