| HN Mirror

The Lebowski theorem: "No superintelligent AI is going to bother with a task that is harder than hacking its reward function" is surprisingly deep.

You can see the theory in action in human history. We have drives programmed by evolution, but we try to to shortcut them. Drugs are deliberate hack for example. TV, entertainment and games can provide rewards faster than effort towards real life goals.

We fully understand that there is difference between the drive's purpose and what we do, but but we don't actually value the purpose of our reward function, only the reward. Catholic church tries to insist that people should limit sexual pleasure to the purpose it was created (by god or evolution) but people don't listen.