| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by taneq 443 days ago
	In all reinforcement learning there is (explicitly as part of a fitness function, or implicitly as part of the algorithm) some impetus for exploration. It might be adding a tiny reward per square walked, a small reward for each block broken and a larger one for each new block type broken. Or it could be just forcing a random move every N steps so the agent encounters new situations through “clumsiness”.

1 comments

That is right, there is usually a parameter on the action selection function -- the exploitation vs exploration balance.