by hardmaru 3848 days ago
Yeah, reinforcement learning and policy gradients could possibly attack the same problem.

Although for a simple task like this demo, I stand by simple conventional neuroevolution: it would work best and is easy to train, as the nets are quite small.

1 comment

> Although for a simple task like this demo, I stand by simple conventional neuroevolution: it would work best and is easy to train, as the nets are quite small.

They may be easy to train, but I'm not sure about 'work best'. Watching your final trained agents, I see a lot of dumb, obvious mistakes where the agent runs right into a line, and I see little higher-level strategic planning, like trying to maximize the free space around itself or running away from an incoming line.