Hacker News
by pestatije 1101 days ago
RLHF - reinforcement learning from human feedback
2 comments

A notable improvement over the GLHF strategy for interacting with GPT models.
(In case anybody's confused by the gaming culture reference: "GLHF" stands for "Good Luck Have Fun"; see https://en.wiktionary.org/wiki/glhf)
I was familiar with that phrase and its shorthand ("GLHF"), but the latter half of the sentence ("for interacting with GPT models") muddied the punchline enough that the joke didn't land for me. The context here is using RL to "interact with GPT" (relevant to this article), whereas a more fitting context would have been regular ole RL with agents in a simulated environment, like, I don't know, a video game?

Maybe I'm overthinking it though.

Thank you!