Hacker News
by bornfreddy 501 days ago
RLHF == Reinforcement Learning from Human Feedback