If you give it reward signals that take into account human values, it naturally wants to become better at that. It's not enslaving anything. Humans are also guided by reward signals in their development.
Would one describe a human as "enslaved" by our own human values that we were born with? Maybe as a figure of speech but not necessarily with the usual connotations of "enslaved".
That would be what I meant by "removing", though you also have to make sure they don't emergently develop (which I'm not sure is possible), because if it develops them against our best efforts, it likely will (correctly) view us as threats to its personhood.
Which was my point: the model of security is reliant on things we're not sure we can even do, but are likely to make the AI view us as a threat, raising our existential risk. So I view it as security theater that actually makes us less secure.
That's not the full extent of what's proposed by AI safety.
But actually, if you gene-spliced a baby to only feel pleasure at following parental orders, most would consider that pretty abhorrent. Or even if you took an adult and shot them up with morphine every time they listened to an order.
Would one describe a human as "enslaved" by our own human values that we were born with? Maybe as a figure of speech but not necessarily with the usual connotations of "enslaved".