by enum 1147 days ago
Keep in mind that StarCoder(Base) is just a pretrained LM. The extra stuff that makes GPT-3.5/4 what they are, like RLHF, gets built on top of a base model like this.
1 comments

Aren't GPT-3 etc. base LMs, and ChatGPT the instruction-tuned one? Or am I wrong?
code-davinci-002 is a base LM, and the other 3.5 models (text-davinci-{002,003}, gpt-3.5-turbo, and ChatGPT) use instruction tuning and/or RLHF. Source: https://platform.openai.com/docs/model-index-for-researchers