Hacker News
How RLHF Preference Model Tuning Works (and How Things May Go Wrong) (assemblyai.com)
3 points by mr-ai 1055 days ago