by blahblah3 2751 days ago
The third criterion, immediate feedback, seems too stringent. In chess, feedback is often not immediate. If one were to run a reinforcement learning algorithm on chess with no human-coded rewards, the only objective feedback would come at the end of the game (win, loss, or draw).

Certainly the more immediate the feedback the better though.
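
To make the end-of-game point concrete, here is a rough sketch of what "no human-coded rewards" could look like. Everything in it is an assumption for illustration: `play_episode` is a hypothetical stand-in that picks a random outcome instead of playing chess, and the discount factor `gamma` is just one common way to spread a final result back over earlier moves, not something any particular chess engine is known to use.

```python
import random


def play_episode(num_moves, rng):
    """Hypothetical stand-in for one game (assumes num_moves >= 1).

    Every intermediate move gets reward 0.0; only the final move carries
    the outcome: +1.0 win, 0.0 draw, -1.0 loss. The outcome is random
    here because no actual chess is being played.
    """
    outcome = rng.choice([1.0, 0.0, -1.0])
    return [0.0] * (num_moves - 1) + [outcome]


def discounted_returns(rewards, gamma=0.99):
    """Propagate the terminal reward backwards through the episode.

    returns[t] = rewards[t] + gamma * returns[t + 1]
    """
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    rng = random.Random(0)
    rewards = play_episode(num_moves=40, rng=rng)
    # The per-move signal is all zeros until the very last entry.
    print(rewards[-3:])
    print(discounted_returns(rewards)[:3])
```

The learner only sees a nonzero number on the last move, so any credit for earlier moves has to be inferred, which is why more immediate feedback makes learning easier.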

1 comment

I think what matters more is that each input produces an immediate, readily observable change in the state of the world. The feedback does not also need to carry perfect information about the exact utility of your previous action.

For example, in the stock market it's not even clear to me what the full scope of effects is, immediate or otherwise, that any single action I take as a private investor has on the market. Before I can even start to learn how effective my actions are, I am hamstrung by being unable to see an immediate, comprehensive effect of each one.