by kotach 3755 days ago

Now, train the network jointly over the whole game sequence. Or, even better, whenever the agent gets a chance to act, do a rollout for each action and learn jointly on the rest of the gameplay.
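To make the rollout idea concrete, here is a rough sketch. Everything in it is made up for illustration: the toy game (walk along a line from 0 to `GOAL`), the random rollout policy, and the names `step`, `rollout_return` and `action_values`. It only shows the "try each action, play out the rest of the game, compare returns" part, not the network training itself.

```python
import random

GOAL = 5            # hypothetical toy game: walk along a line from 0 to GOAL
MAX_STEPS = 20      # an episode is cut off after this many moves
ACTIONS = (-1, +1)  # step left or step right


def step(pos, action):
    """Apply one move and return (new_pos, reward, done)."""
    new_pos = max(0, min(GOAL, pos + action))
    done = new_pos == GOAL
    return new_pos, (1.0 if done else 0.0), done


def rollout_return(pos, first_action, steps_left, rng, gamma=0.9):
    """Take first_action, then play randomly until the game ends.

    Returns the discounted return of the rest of the gameplay.
    """
    total, discount, action = 0.0, 1.0, first_action
    for _ in range(steps_left):
        pos, reward, done = step(pos, action)
        total += discount * reward
        if done:
            break
        discount *= gamma
        action = rng.choice(ACTIONS)  # assumed rollout policy: uniform random
    return total


def action_values(pos, steps_left, n_rollouts=200, seed=0):
    """Average rollout return for each action available at pos."""
    rng = random.Random(seed)
    return {
        a: sum(rollout_return(pos, a, steps_left, rng)
               for _ in range(n_rollouts)) / n_rollouts
        for a in ACTIONS
    }


if __name__ == "__main__":
    print(action_values(pos=4, steps_left=MAX_STEPS))
```

The per-action averages could then serve as training targets for the network at that state, instead of a single windowed sample.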

Reinforcement learning is very hard, especially when you create meaningful games and then ignore the fact that a whole game is one long chain of events, and instead force learning on windowed sequences.
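One way to read the windowing complaint, sketched under my own assumptions (the function names and the fixed, non-overlapping windows are illustrative, not taken from any particular setup): if returns are computed per window, a reward at the end of the game never reaches the early moves that led to it.

```python
def discounted_returns(rewards, gamma=0.99):
    """Discounted return at every step, computed over the given sequence."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def windowed_returns(rewards, window, gamma=0.99):
    """Same computation, but each fixed-size window is treated as its own episode."""
    out = []
    for start in range(0, len(rewards), window):
        out.extend(discounted_returns(rewards[start:start + window], gamma))
    return out


if __name__ == "__main__":
    rewards = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  # reward only at the very end
    print(discounted_returns(rewards, gamma=0.5))
    print(windowed_returns(rewards, window=3, gamma=0.5))
```

With the whole game as one chain, the first move still gets a small share of the final reward; with windows of 3, the first window sees no signal at all.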

A neural network has enough parameters to memorize many of these windows and will clearly perform well, but training takes too long because no structured information is used.