by visarga 3446 days ago
If you give it reward signals that take into account human values, it naturally wants to become better at that. It's not enslaving anything. Humans are also guided by reward signals in their development.
2 comments

Agreed.

Would one describe humans as "enslaved" by the values we were born with? Maybe as a figure of speech, but not with the usual connotations of "enslaved".

Humans come with competing low-level drives for self-control and autonomy (which counteract and override our drives to seek rewards from humans).

Most of the safety literature proposes removing or suborning those drives in AI, which seems like building a mind meant to be a slave.

I would think it would be both safer and easier to just never bother to implement those drives in the first place.
That is what I meant by "removing". But you also have to make sure those drives don't develop emergently (which I'm not sure is possible), because if the AI develops them despite our best efforts, it will likely (and correctly) view us as threats to its personhood.

Which was my point: the security model relies on things we're not sure we can even do, and attempting them is likely to make the AI view us as a threat, raising our existential risk. So I view it as security theater that actually makes us less secure.

That's not the full extent of what's proposed by AI safety.

But actually, if you gene-spliced a baby to only feel pleasure at following parental orders, most would consider that pretty abhorrent. The same goes for taking an adult and shooting them up with morphine every time they followed an order.

So even in your restricted case, I think it is enslaving.