by fnordpiglet 1118 days ago
I’m sorry, what’s RLHP? I’m not able to Kagi that.
2 comments

The P should be an F: it's RLHF, reinforcement learning from human feedback.
Reinforcement learning from human feedback.

Took me a bit of searching too.