Hacker News
by m-s-y 302 days ago
RLHF -> Reinforcement Learning from Human Feedback

It’s not defined until the 13th paragraph of the linked article.