Hacker News
by drsim 477 days ago
It is RLHF if I understand correctly.
2 comments

The Venn diagram of people for whom this comment contains no new information and people who know what "RLHF" means is almost a perfect circle.

For anyone not part of that intersection: RLHF means reinforcement learning from human feedback.

Well, HF.
Right!

I suppose it's roughly as much "training AI models" as labeling training data is "training supervised models".

Have fun