Hacker News
by ACCount37 251 days ago
"Unwillingness to be harsh to the user" is a major source of "divorce from reality" in LLMs.

They are all way too high on agreeableness, likely from RLHF and SFT for instruction-following. And don't get me started on what training on thumbs up/thumbs down user feedback does.

1 comment

But if we look at the article's example, the two barely diverge. I don't think either of the texts is less divorced from reality than the other. The second is more "truthful" (read: cynical), but they are largely the same.