by whimsicalism 751 days ago
> what do you do while you're waiting for the AI that's capable of doing alignment research for you to arrive

Nobody interested in superalignment is interested in waiting until actually threatening AI gets here.

1 comment

But that's the fundamental superalignment plan - train a human-level alignment researcher AI, run a bunch of them in parallel, and review their research output to see if they solve the alignment problem. You can't do the plan until the human-level alignment researcher AI already exists.
A large part of the idea is that you can develop techniques for aligning sub-human AI using even stupider AI, and hope/pray that those techniques continue to generalize once you get to super-human AI being aligned by human-level AI.