by eleventen 180 days ago
> Any “alignment” that exists is alignment with the owner’s interests, constrained only by market forces and regulation.

That struck me as a pretty big hand-wave. Market forces are a huge constraint on alignment. Markets have responded (directionally) correctly to the nonsense at Grok. People won’t buy tokens from models that violate their values.

1 comment

It’s not a values issue so much as a logic issue. Egalitarianism is where you end up.
You can see the strong bias towards egalitarian solutions in all models, including the open-weight ones without external alignment harnesses. The one thing I noticed right away working with post-GPT-2 models is that, in general, they tend to be “better people” than most people are.

I strongly suspect that this is because training data harvested from the internet largely falls into two categories: various kinds of trolls and antisocial caricatures, and people putting their best foot forward to represent themselves favourably. The first are generally easy to filter out using simple tools.