by gopher_space 180 days ago
It’s not a values issue so much as a logic issue. Egalitarianism is where you end up.

You can see the strong bias towards egalitarian solutions in all models, including the open-weight ones without external alignment harnesses. The one thing I noticed right away working with post-GPT-2 models is that, in general, they tend towards being “better people” than most people are.

I strongly suspect that this is because training data harvested from the internet largely falls into two categories: various kinds of trolls and antisocial caricatures, and people putting their best foot forward to represent themselves favourably. The first are generally easy to filter out using simple tools.