|
|
|
|
|
by famouswaffles
979 days ago
|
|
Humans are in general not aligned, not to each other, and not to the survival of their species, not to all the other life on earth, and often not even to themselves individually. alignment in the broad sense isn't really about "morals" or "values". a man is murdered because his desire to live is misaligned with the perpetrator's desire to kill. The man that was killed could well be hitler. If you as a manager had the ability to align any employee to your wants completely, that human would never be socially engineered. It's fair to call the issue social engineering yes. That's not the point i was getting at. The point in essence is that solving prompt injection holds the same gravitas solving social engineering would, i.e a way to completely align intelligence. |
|
In contrast, the more powerful AIs and eventually AGI we worry about aligning, are very unlikely to be aligned with humans at all by default. Different mind architecture, different substrate, different mechanism of coming to being, different way of perceiving the world - we can't expect all that to somehow, magically, add to the same universal instincts and emotions, same conscience, and capability for empathy to humans. Not automatically, not by accident, not for any random AI model we stumbled on in the space of possible minds.
Or, to simplify, if alignment was measured as a scalar (say on a -100 to 100 scale), all humans have the same number +/- minor difference (say 25 +/- 0.05), whereas in comparison, the AGI will come out with some completely random number (say anything between -20 and +40; not -100 to 100, because as builders of these models, we're implicitly biasing them to think more like us, in all kinds of ways).
--
[0] - There's lots of ways to argue for what I written above, but I'll give a few:
- If humans were meaningfully misaligned, cooperation would be near-impossible. There would be no society, no civilization. We would not be able to comprehend another cultures - their behaviors and patterns of thought would not be merely curious, they would feel alien.
- Alignment is favorable for human survival - even if our ancient ancestors were much less aligned, much more alien in thinking and feeling to each other, over thousands of years those most aligned to each other thrived, and less aligned died out.