|
|
|
|
|
by sacred_numbers
1004 days ago
|
|
Based on my research, GPT-3.5 is likely significantly smaller than 70B parameters, so it would make sense that it's cheaper to run. My guess is that OpenAI significantly overtrained GPT-3.5 to get as small a model as possible to optimize for inference. Also, Nvidia chips are way more efficient at inference than M1 Max. OpenAI also has the advantage of batching API calls which leads to better hardware utilization. I don't have definitive proof that they're not dumping, but economies of scale and optimization seem like better explanations to me. |
|