Hacker News
by judk 4514 days ago
If you learn that all actions `a` from state `s_i` have very low reward, does that propagate backward to the pairs `(s_j, a)` that feed into `s_i`?
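
To make the question concrete, here is a minimal sketch, assuming a tabular setting with a bootstrapped (Bellman-style) backup, where the value of `(s_j, a)` is the reward plus the discounted best value available in the next state. Under that assumption the low values at `s_i` do flow back, because the max over a set of uniformly bad actions is still bad. The toy MDP, the state `s_k`, the action names, the rewards, and `GAMMA` are all made up for illustration.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical deterministic MDP: TRANSITIONS[state][action] = (next_state, reward).
# "end" is terminal and has no outgoing actions.
TRANSITIONS = {
    "s_j": {"to_s_i": ("s_i", 0.0), "to_s_k": ("s_k", 0.0)},
    "s_i": {"a0": ("end", -10.0), "a1": ("end", -9.0)},  # every action is bad
    "s_k": {"a0": ("end", 1.0)},
}


def q_value_iteration(transitions, gamma=GAMMA, sweeps=10):
    """Repeatedly apply Q(s, a) = r + gamma * max_a' Q(s', a')."""
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(sweeps):
        for s, acts in transitions.items():
            for a, (nxt, r) in acts.items():
                # Terminal states contribute no future value.
                best_next = max(q[nxt].values()) if nxt in q else 0.0
                q[s][a] = r + gamma * best_next
    return q


q = q_value_iteration(TRANSITIONS)
for state, actions in q.items():
    print(state, actions)
```

In this sketch the action in `s_j` that leads to `s_i` ends up with a clearly negative value, while the one leading to `s_k` stays positive, so a greedy policy would steer away from `s_i` without ever being told directly that `s_i` is bad.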