Hacker News
by
qwerty3344
1147 days ago
I think it's still significantly behind GPT-3.5/4, both of which can get 67% on HumanEval, and 88% with Reflexion.
enum
1147 days ago
Keep in mind that StarCoder(Base) is just a pretrained LM. The extra stuff that makes 3.5/4 what they are, like RLHF, gets built on top of a base model like this.
manojlds
1147 days ago
Aren't GPT-3 etc. the base LMs and ChatGPT the instruction-tuned one? Or am I wrong?
dpf
1146 days ago
code-davinci-002 is a base LM, and the other 3.5 models (text-davinci-{002,003}, gpt-3.5-turbo, and ChatGPT) use instruction tuning and/or RLHF. Source:
https://platform.openai.com/docs/model-index-for-researchers