	by rtkwe 1816 days ago
	There are several possible incentive fixes: change the negative incentive into a factor that discounts the reward for catching a sheep, add a negative incentive for dying, or add a positive incentive for being alive at the end of the simulation. The failure here was that they didn't think about what happens when the agent can't achieve a positive score, i.e. can't catch a sheep.
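
	To make the three fixes concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: it treats the "negative incentive" as a per-step time penalty, and the Episode fields, the constants, and the function names are all made up rather than taken from the original simulation.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    sheep_caught: int  # sheep caught during the episode
    steps_alive: int   # time steps the agent survived
    died: bool         # True if the agent died before the episode ended


# Hypothetical constants, chosen only to make the comparison visible.
CATCH_REWARD = 10.0
TIME_PENALTY = 0.1     # assumed form of the "negative incentive"
DISCOUNT = 0.99
DEATH_PENALTY = 25.0
SURVIVAL_BONUS = 25.0


def original_reward(ep: Episode) -> float:
    # Assumed original scheme: catch reward minus a per-step penalty.
    return CATCH_REWARD * ep.sheep_caught - TIME_PENALTY * ep.steps_alive


def discounted_reward(ep: Episode) -> float:
    # Fix 1: time shrinks the catch reward instead of subtracting from it,
    # so the score can never go below zero.
    return CATCH_REWARD * ep.sheep_caught * DISCOUNT ** ep.steps_alive


def death_penalty_reward(ep: Episode) -> float:
    # Fix 2: dying costs more than the time penalty it avoids.
    return original_reward(ep) - (DEATH_PENALTY if ep.died else 0.0)


def survival_bonus_reward(ep: Episode) -> float:
    # Fix 3: being alive at the end pays more than the time penalty costs.
    return original_reward(ep) + (0.0 if ep.died else SURVIVAL_BONUS)


# Two episodes in which the agent cannot catch a sheep.
die_early = Episode(sheep_caught=0, steps_alive=5, died=True)
survive = Episode(sheep_caught=0, steps_alive=200, died=False)

for fn in (original_reward, discounted_reward,
           death_penalty_reward, survival_bonus_reward):
    print(f"{fn.__name__}: die_early={fn(die_early):.1f}, "
          f"survive={fn(survive):.1f}")
```

	With these made-up numbers the original scheme scores dying early higher than surviving, the discounted version makes the agent indifferent, and the other two favor surviving. Note that fixes 2 and 3 only work if the penalty or bonus outweighs the time penalty accumulated over a full episode.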