Hacker News
by agnosticmantis 9 days ago
I believe sycophancy is a side effect of RLHF and of whatever reward function it optimizes, explicitly or implicitly.