by gabrielgoh
3268 days ago
I agree completely that what's happening is nothing more than brute-force search. I still think it's interesting, though, since the reward here is potentially much better conditioned than the rewards in RL. That said, there are situations where this will fail completely, e.g. maze solving, where the goal is not to play to keep playing but to play to reach the end.