Hacker News
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback (arxiv.org)
1 point by maccaw 1015 days ago