Hacker News
by drdeca 1173 days ago
If the safety groups figure out how to do things in a way that we can be confident is safe, that at least makes it possible for capabilities researchers to do things that way. I would imagine people prefer to do things safely, all else being equal. So, if safety researchers find methods with small enough capabilities costs, then presumably the people who should use those methods would tend to do so?

That does nothing for the intentionally malicious actors.
Bad humans taking over the world is still better than some inhuman alien optimization process taking over the world.
It's not an either-or. Malicious actors will disregard the guardrails to achieve their objectives, but in the process, they will create that very "inhuman alien optimization process" and give it the keys.
Following the safety techniques should be helpful to the goals of most malicious actors, provided that they prefer being in control over an alien optimizer being in control?

Granted, if the safety/alignment techniques cost too much in capabilities/effectiveness, then said malicious actors might be willing to accept a greater chance of an alien optimizer gaining control if it also means a higher chance of gaining control themselves.