Hacker News
by rrr_oh_man 860 days ago
RLHF = Reinforcement Learning from Human Feedback