Isn't agreeing to only run an AGI with whatever theoretical alignment controls we come up with also a social agreement? It seems we will have to figure that out one way or another.
The airgap option is “everybody agrees to leave lots of money on the table in the name of safety”. The alignment option is “some people invest lots of money to invent technology that everyone can use to safely make lots of money”.
If you solve alignment, the regulatory nudge required to get folks to use it is probably not that big; let’s be really pessimistic and say it costs 10x the compute to run all the supervision. It’s probably mostly criminals that want the fully unaligned models, so most entities can agree to the spend.
In the airgap case, I’m pretty confident that it is many orders of magnitude more profitable to run the “unsafe” / non-airgapped version. The utility of baking AI into everything is massive. Maybe in the best case you can have your airgapped ASI build provably-safe non-AGI tool AIs and capture much of that value, but I’m pretty skeptical it’s safe to transport anything that complex across the airgap.
So the alignment research path is more sustainable since it requires paying a much lower “safety premium”.
But I’m all for pursuing both. I don’t think they interfere with each other; quite the opposite, taking either one seriously reinforces the need for both of them.
Another advantage of the alignment path is we can meaningfully make progress now. It’s really hard to get people to agree ahead of time that they will run their AGI in an airgap.