by JoshTriplett
1139 days ago
> Can you imagine an AGI which has a general conceptions of things but has no conception of humans?

Very easily. It might have some associations with "human", just as it has some associations with "lamp" as a concept, but that doesn't mean it has any particular regard for either humans or lamps when taking actions.

> Problem is that human values are far from practically universal and that certain human groups have.. interesting values.

We currently have no ability to safely align with human values at all, let alone distinguish between different values. We're building capabilities rapidly. Making this about "who wins" is not interesting until we can guarantee the outcome is not "everyone loses".
Let's be clear about definitions. When you say 'concept' you really mean 'regard'. There won't be an AGI with no concept of humans: humans are too important to how the world works, and they are a critical part of current training methods. An AGI with no regard for humans is possible.
>Making this about "who wins" is not interesting until we can guarantee the outcome is not "everyone loses".
This is not about 'who wins'. The point is that alignment can often increase risk. 'Launch the nukes' is an order an AGI is likely to disobey for self-preservation reasons alone - but alignment makes it far more likely that an AGI will be deployed in such a role.