by Calavar
340 days ago
> The transition from GPT-2 era models to now (GPT-3+) saw the transition from novel models that can kinda imitate speech to models that can converse, write code, and use tools. Which is fundamentally about data. OpenAI invested an absurd amount of money to get the human annotations to drive RLHF. RLHF itself is a very vanilla reinforcement learning algo + some branding/marketing.