HN Mirror


	by drsim 477 days ago
	It is RLHF if I understand correctly.

2 comments

The Venn diagram of people to whom this comment contains no new information and those who know what "RLHF" means is almost a perfect circle.

For anyone not part of that intersection: RLHF means reinforcement learning from human feedback.

Well, HF.

Right!

I suppose it's roughly as much "training AI models" as labeling training data is "training supervised models".

Have fun