Hacker News
by rzmmm 78 days ago
It's likely the RLHF process, since models differ significantly on this.