Hacker News
by taylodl 1047 days ago
You don't need to define good and bad; instead, you focus on better and worse and the metrics used to measure them. Now the goal is to maximize "better" and minimize "worse." You may recognize this as the essentials of utilitarianism. The advantage utilitarianism has is that it can be applied algorithmically, without passion or emotion - in other words, by AI.

Utilitarianism leads to controversial outcomes, but every decision is defensible.

1 comment

For any thing T, T is defensible under at least one ethical framework.

Teaching an optimiser AI any of those frameworks, or even any preference ordering or combination function within Utilitarianism (because value({T, T}) doesn't have to equal 2 * value({T})), will lead to it optimising what you said, without necessarily limiting that to situations anywhere close to the training distribution.
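A rough sketch of the combination-function point, in Python. Both functions are hypothetical illustrations, not anything a real system uses: one assumes value adds up linearly, the other assumes diminishing returns (a log curve picked arbitrarily). A list stands in for the bundle {T, T}, since a Python set would collapse the duplicate.

```python
import math

def additive_value(bundle, unit_value=1.0):
    # Assumption: every copy of T is worth the same, so value just sums.
    return unit_value * len(bundle)

def diminishing_value(bundle, unit_value=1.0):
    # Assumption: each extra copy of T is worth less than the last.
    # The log curve is an arbitrary choice for illustration.
    return unit_value * math.log1p(len(bundle))

one = ["T"]
two = ["T", "T"]

# Under the additive rule, two copies are worth exactly twice one copy.
additive_doubles = additive_value(two) == 2 * additive_value(one)

# Under the diminishing rule, two copies are worth less than twice one copy.
diminishing_doubles = diminishing_value(two) == 2 * diminishing_value(one)
```

Both rules count as "utilitarian" in the loose sense used above, yet an optimiser handed one or the other will rank the same outcomes differently.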

To put it another way: if you run an A/B test on a social media site and the optimiser observes that people are more likely to engage with content that makes them angry, and you then tell it to boost engagement "because socialising is always good, obviously", it will make your users as angry as possible, and suddenly you get Buddhists going off and committing surprise genocide before anyone tells you something has gone wrong.

I would argue this has been known for decades and is in fact the origin of one of the earliest memes in computer science: To err is human, but to really mess things up requires a computer!