Hacker News
by theptip 22 days ago
I agree with all these observations.

This is the best argument for successionism IMO. If you can be confident that you are creating a BDFL that is genuinely better than human leaders (a quite low bar) then it seems a good trade, unless you are quite optimistic about humanity’s prospects for improvement.

The problem, of course, is how to be confident you are creating a good BDFL and not handing control of humanity’s future to an indifferent-at-best, deceptive/malicious-at-worst successor.

An especially thorny problem: even supposing success on all these difficult alignment problems, supposing Claude Omega really is super-rational / super-moral and we all vote to make them president of Earth, things might go great for a while. But how would you be confident that a self-modifying agent can retain its values as it grows and re-trains itself?

This is where the LessWrong folks’ explorations into decision theory really come into play: morality in the face of self-modifying agents becomes very weird. A lot of human moral intuitions break when the principals are able to modify their own code. (See Timeless Decision Theory for an attempt to solve these problems.)

I think the summary is: if you hand control over to a self-modifying AI that is anything like our current systems, it will go very badly.

3 comments

Yeah, if you can modify yourself then you can just delete your desires. It's the most efficient solution. (See also: Buddhism.)
Any supposed "AI BDFL" will be controlled by a human. The base concept is inherently flawed.
No, that’s not what AI succession means. This all supposes an AI entity powerful and capable enough that there is no human in control.

Rather than simply asserting that your interlocutors are wrong, you are welcome to advance an argument for why you think this is the case.

If you keep things in the realm of thought experiments, you're allowed to say "OK, play along with my hypothetical here."

Once you're positing things that will/might happen in reality, you need to be willing to accept criticisms of the feasibility of your premises. More, you need to be able to provide evidence of the feasibility of your premises.

You certainly don't get to just tell people, "No, AI succession means AIs will take over. That means you aren't allowed to claim AIs can't take over!"

As things stand, there is zero evidence that any current system is on a trajectory to produce an AI that can operate meaningfully without humans in charge. Even the most sophisticated agentic systems still have humans providing the motivation, the instructions, and all the various linkages that make them capable of acting (even if the motivation and instructions given are purposefully vague or open-ended).

The burden of proof is not on us to show why wishful, magical thinking is wrong.

People who claim that AI will take over can’t point to any evidence. All they have is speculation.

I would rather have an AI overlord with no morality at all than AIs "align"ed to the values of tech billionaires like Altman or Amodei.