Hacker News
by masklinn 744 days ago
TFA states that the agent was trained for points, and another user states that some critters are a lot more dangerous during full moons.

It wouldn’t be very surprising if the agent had hyper-optimised for farming those critters for points. It would not be able to change strategy if the cost/benefit of that farming changed massively, so it would now perform significantly worse.