by godelski 908 days ago
Reinforcement learning. They're referring to a concept known as reward hacking (see Robert Miles's videos for a high-level explanation). You may already be familiar with the underlying idea, though: see Goodhart's Law.
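If a concrete toy helps, here's a minimal sketch of the idea. Everything in it is made up for illustration (the 1-D track, the checkpoint tile, the `run`, `intended`, and `hacker` names); it isn't taken from any particular paper or library. The designer wants the agent to reach the goal, but the reward only pays out for stepping onto a checkpoint, so a policy that bounces on the checkpoint forever scores higher than one that actually finishes.

```python
def run(policy, steps=10, checkpoint=1, goal=5):
    """Return (proxy_reward, reached_goal) for a policy on a 1-D track."""
    pos, proxy_reward = 0, 0
    for t in range(steps):
        # The policy returns -1 or +1; clamp the position to the track.
        pos = max(0, min(goal, pos + policy(pos, t)))
        if pos == checkpoint:
            proxy_reward += 1  # proxy: paid every time the agent lands on the checkpoint
        if pos == goal:
            return proxy_reward, True  # what the designer actually wanted
    return proxy_reward, False


def intended(pos, t):
    # Hypothetical "well-behaved" policy: always move toward the goal.
    return 1


def hacker(pos, t):
    # Hypothetical reward-hacking policy: bounce between 0 and the checkpoint.
    return 1 if pos == 0 else -1


intended_result = run(intended)
hacker_result = run(hacker)
print("intended:", intended_result)
print("hacker:  ", hacker_result)
```

The hacking policy wins on the measured reward and never reaches the goal, which is Goodhart's Law in miniature: optimize the measure hard enough and it stops tracking what you cared about.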