by nmfisher
1273 days ago
Just wanted to point out that ChatGPT is more than just a language model: according to OpenAI's (very brief) description, it was also trained with reinforcement learning to select/rank the "best" answer [0]. I think the distinction is important because I suspect it explains why ChatGPT succeeds at certain tasks where previous pure language models failed miserably. [0] https://openai.com/blog/chatgpt/
|
I'd still describe it as a language model, just one with "filtered" output.
I'm not sure whether ChatGPT itself has been documented in any detail, but it's very similar to OpenAI's InstructGPT, which they have described and which they still refer to as a language model.
> We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which are trained with humans in the loop, are now deployed as the default language models on our API.
https://openai.com/blog/instruction-following/