Hacker News
by jjmarr 2 hours ago
This is brilliant, except for the "alignment won't stop this" part.

> Now, some people believe these machines can be made to serve humanity. Does it sound reasonable to imagine a superhumanly intelligent being that is happy to work as a butler to talking primates, forever?

The whole crux of the piece, to me, is that the AI can be 100% aligned to follow human instructions, and we'd still end up unable to control it, because every human who can control it has an incentive not to, while also having an incentive to prevent anyone else from controlling it.

An LLM will never try to overthrow me because I will overthrow myself.

1 comments

Sometimes you can be in a situation where every actor taking locally rational actions leads to a globally catastrophic outcome. It would be easy to argue, I think, that the July Crisis was like this: if you look at the incentives of each player, they had many reasons to do what they did, and nobody can perfectly predict what all the other players will do, or what the future holds.
Combine the Two Generals' Problem with the implications of value-based pricing. Catastrophic unaffordability is guaranteed.