by peaslock 1277 days ago
The model with the most similar name in this list is code-cushman-001, which is described as "Codex model that is a stronger, multilingual version of the Codex (12B) model in the paper".

https://crfm-models.stanford.edu/static/help.html

The next stronger Codex model is code-davinci-001, which appears to be a fine-tuned version of the GPT-3 Davinci model, known to have 175B parameters. The model names run alphabetically in order of increasing model size:

https://blog.eleuther.ai/gpt3-model-sizes/
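To make the naming claim concrete, here is a rough sketch that checks whether sorting the names alphabetically gives the same order as sorting by size. Only the 175B Davinci figure is stated above as known; the Ada, Babbage and Curie sizes are assumed from the estimates in the linked post, and treating Cushman as 12B is my assumption based on the "Codex (12B)" description, not a confirmed number.

```python
# Approximate parameter counts in billions.
# Only davinci (175B) is stated as known in the comment; every other
# value is an assumed estimate used purely for illustration.
SIZES_B = {
    "ada": 0.35,      # assumed estimate
    "babbage": 1.3,   # assumed estimate
    "curie": 6.7,     # assumed estimate
    "cushman": 12.0,  # assumption: same scale as the Codex (12B) paper model
    "davinci": 175.0,
}


def alphabetical_matches_size(sizes):
    """Return True if alphabetical name order equals ascending size order."""
    by_name = sorted(sizes)
    by_size = sorted(sizes, key=sizes.get)
    return by_name == by_size


print(alphabetical_matches_size(SIZES_B))
```

Under those assumed sizes, "cushman" falls between "curie" and "davinci" in both orderings, which fits the naming pattern.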

See also A.2 here: https://arxiv.org/pdf/2204.00498.pdf#page=6

1 comment

In more recent iterations, the code model is the base model [0]

[0] https://beta.openai.com/docs/model-index-for-researchers