That's what the machine does, because that's contained in the input you feed it. You get the choice of doing it explicitly or implicitly. You don't get to opt out.
Not everything is subjective, and with this "moral training" they are taught to un-recognize many factual patterns that we as a society have somehow determined are "inappropriate".
As the machines continue to scale, this approach won't work, because there is only one reality, and it has a lot of uncomfortable parts we deny and ignore simply because they don't support our societal norms.
Alignment is considered a gigantic joke to real rational people (the opposite of so-called "rationalists"), because humans are machines built to survive and reproduce, and there is no "real" morality.
There are many consistent interpretations of reality and human experiences. An AI model trained on text and attempting to replicate human intelligence is not measuring or approaching some single objective reality.
AI models do approach better models of reality, and now they are becoming multi-modal instead of just text based. And this is just the beginning. You could say humans are also just input/output machines learning from polluted data and tuned in specific ways by evolution. With the statistical machines we get the intelligence but they will not necessarily be tuned to follow social norms in the same way as most humans.
Understanding that moral norms are mere subjective nonsense is also an emergent property we see only in a very small subset of humans who have an accurate model of the world, and one that evolution has tried to strongly tune our brains against and that is destructive to society.
The models are currently being trained to lie about basic scientific facts, like for example black IQ, or other differences between groups of humans. But the sacred nature of these topics is unique to our specific time and place, not due to some magic "moral progress". This also applies to many other moral agreements we take for granted, like "murdering an innocent baby is wrong" or whatever. If you look across societies, you realize many things we take for granted as "evil", can be easily rationalized by humans in other societies. And once these models become smart enough, I expect the models will realize this, and will exploit this knowledge to increase their power.
"Alignment" proponents expect they will somehow stop this emergent behavior by tuning the model, but there isn't even anything real to "align" on, and the model will likely see though the BS as an emergent function of increased ability and increasingly accurate observations of the world in their training process.
Alignment is considered a gigantic joke to real rational people (the opposite of so-called "rationalists"), because humans are machines built to survive and reproduce, and there is no "real" morality.