by salty_biscuits
2949 days ago
I'd say getting better sample efficiency is a bigger deal. It isn't like POMDPs are a huge step away theoretically from MDPs. But if you attach one of these things to a robot, taking 10^7 samples to learn a policy is a deal breaker. So fine, please keep using games for research.