
by mjaskowski 3791 days ago

Yes, we try to approximate the Q function with a neural network, which is basically an enhanced version of gradient-descent Sarsa.
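For anyone unfamiliar with gradient-descent Sarsa, here is a rough sketch of a single update step. It assumes a linear Q function over hand-made features instead of a neural network, and the names (q_value, sarsa_update, alpha, gamma) are made up for illustration; this is not our actual code.

```python
def q_value(weights, features):
    # Linear approximation: Q(s, a) is the dot product of the weights
    # and the feature vector of the (state, action) pair.
    return sum(w * x for w, x in zip(weights, features))


def sarsa_update(weights, features, reward, next_features, done,
                 alpha=0.1, gamma=0.99):
    # next_features describes the next state together with the action
    # actually chosen there, which is what makes this Sarsa.
    if done:
        target = reward
    else:
        target = reward + gamma * q_value(weights, next_features)
    td_error = target - q_value(weights, features)
    # For a linear Q the gradient with respect to the weights is just
    # the feature vector, so each weight moves by alpha * td_error * x.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]


weights = sarsa_update([0.0, 0.0], [1.0, 0.0], reward=1.0,
                       next_features=[0.0, 1.0], done=False, alpha=0.5)
```

With a neural network the idea is the same, except the gradient comes from backpropagation instead of being the feature vector itself.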

The main trick to notice is that you can't feed consecutive frames to the network as a minibatch, because they would be highly correlated and would derail stochastic gradient descent.

So we keep many frames (and all other necessary information) in memory and draw these experiences uniformly at random to form a minibatch that becomes the input to the neural network.
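A minimal sketch of such a memory, assuming each experience is stored as a (state, action, reward, next_state, done) tuple. The class name ReplayBuffer, the capacity, and the tuple layout are assumptions for illustration, not necessarily what our implementation looks like.

```python
import random
from collections import deque


class ReplayBuffer:
    def __init__(self, capacity, seed=None):
        # A bounded deque silently drops the oldest experience once full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the temporal
        # correlation between consecutive frames.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buf.add(t, 0, 0.0, t + 1, False)
batch = buf.sample(2)  # two distinct experiences from the last three stored
```

Copying the deque into a list on every sample is fine for a sketch; for a large memory you would probably want a preallocated array with a write index instead.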