by hardmaru
3848 days ago
Yeah, reinforcement learning and policy gradients could probably attack the same problem. For a simple task like this demo, though, I stand by simple conventional neuroevolution, which would work best and is easy to train since the nets are quite small.

They may be easy to train, but I'm not sure about 'work best'. Watching your final trained agents, I see a lot of dumb, obvious mistakes where the agent runs right into a line, and I see little higher-level strategic planning, like trying to maximize the free space around itself or running away from an incoming line.