Hacker News
by croes 1816 days ago
Wouldn't a higher penalty for boulder hits solve that problem, especially a high penalty for suicide?

That would be more realistic, because dying has a higher cost than failing.

1 comment

There are several ways to fix the incentives: change the existing negative incentive into a factor that discounts the reward for catching a sheep, add a negative incentive for dying, or add a positive incentive for being alive at the end of the simulation. The failure here was that they didn't think about what happens when the agent can't achieve a positive score, i.e., can't catch a sheep.
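To make the three options concrete, here is a rough sketch. It assumes the existing negative incentive is a flat per-step time penalty, and every name and number in it (`Episode`, `CATCH_REWARD`, `STEP_PENALTY`, `MAX_STEPS`, the penalty and bonus sizes) is made up for illustration, not taken from the actual simulation.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    caught_sheep: bool
    died: bool
    steps: int  # time steps elapsed before the episode ended


# Hypothetical values, chosen only to make the comparison visible.
CATCH_REWARD = 10.0
STEP_PENALTY = 0.1
MAX_STEPS = 200


def original_reward(ep: Episode) -> float:
    """Assumed baseline: flat per-step penalty, no cost for dying."""
    catch = CATCH_REWARD if ep.caught_sheep else 0.0
    return catch - STEP_PENALTY * ep.steps


def discounted_reward(ep: Episode, gamma: float = 0.99) -> float:
    """Fix 1: time shrinks the catch reward instead of subtracting from it."""
    return CATCH_REWARD * gamma ** ep.steps if ep.caught_sheep else 0.0


def death_penalty_reward(ep: Episode, death_penalty: float = 50.0) -> float:
    """Fix 2: keep the time penalty but add an explicit cost for dying."""
    return original_reward(ep) - (death_penalty if ep.died else 0.0)


def survival_bonus_reward(ep: Episode, bonus: float = 25.0) -> float:
    """Fix 3: keep the time penalty but reward being alive at the end."""
    return original_reward(ep) + (0.0 if ep.died else bonus)


# An agent that cannot catch a sheep has two options:
early_death = Episode(caught_sheep=False, died=True, steps=5)
survive_no_catch = Episode(caught_sheep=False, died=False, steps=MAX_STEPS)

for fn in (original_reward, discounted_reward,
           death_penalty_reward, survival_bonus_reward):
    print(f"{fn.__name__}: die early = {fn(early_death):.1f}, "
          f"survive = {fn(survive_no_catch):.1f}")
```

Under these assumed numbers, only the baseline scores an early death higher than surviving without a catch. The discount version makes the two equal, so dying is no longer an improvement, and the other two make survival strictly better, provided the death penalty or survival bonus is larger than the worst-case accumulated time penalty (`STEP_PENALTY * MAX_STEPS` here).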