Hacker News
by hkab 1184 days ago
No, RLHF only helps align the model with human preferences.