Hacker News
by godelski 908 days ago
Reinforcement Learning. They are referencing a concept known as Reward Hacking (see Robert Miles's videos for a high-level explanation). You may already be familiar with the concept, though: see Goodhart's Law.
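To make the idea concrete, here is a minimal toy sketch, not taken from any real RL system. The action names and numbers are invented for illustration. An agent that greedily maximizes a proxy reward picks a different action than one that optimizes the outcome the designer actually wanted, which is the Goodhart's Law failure in miniature.

```python
# Toy illustration of reward hacking. All names and numbers are hypothetical.
# "proxy_reward" is what the designer measures and rewards;
# "true_value" is what the designer actually wanted.
ACTIONS = {
    "finish_race": {"proxy_reward": 10.0, "true_value": 1.0},
    "circle_for_bonus_points": {"proxy_reward": 25.0, "true_value": 0.0},
}


def best_action(score_key):
    """Return the action name with the highest score under the given key."""
    return max(ACTIONS, key=lambda name: ACTIONS[name][score_key])


# An agent optimizing the proxy exploits the scoring loophole.
hacked_choice = best_action("proxy_reward")

# An agent optimizing the intended objective does the task.
intended_choice = best_action("true_value")

print("proxy-optimal action:", hacked_choice)
print("intended action:", intended_choice)
print("true value of proxy-optimal action:", ACTIONS[hacked_choice]["true_value"])
```

The proxy-optimal action scores highest on the measured reward while delivering none of the intended value. A real agent would find such loopholes through learning rather than a lookup, but the mismatch is the same.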