Why is ChatGPT so expensive to operate?
82 points
by beavis000
1248 days ago
Altman has said "it's a few cents per chat", which probably means it's closer to high single-digit cents per chat. Does that estimate include amortization of upfront development costs, or is it actually the marginal cost of a chat?
Meta released their OPT model, which they claim is comparable to GPT-3. Guidance for running that model [1] suggests a LOT of memory: at least 350GB of GPU memory, which is roughly 4 to 5 80GB A100s, and those are pricey.
Running this on AWS with the above suggestion would cost $25/hr, just for one model running. That’s roughly $0.42 a minute. If you imagine it takes a few seconds to run the model for one request… you’ll easily hit $0.05 per request once you factor in the rest of the infra (storage, CDN, etc.), the engineering cost, the research cost, and the fact that they probably have to scale to hundreds of instances for heavy traffic, which may mean buying capacity less efficiently.
OpenAI has a sweetheart deal with Azure, but this is roughly the cost structure for serving requests. And this doesn’t include the upfront cost of training.
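To make the back-of-envelope math above easy to poke at, here is a rough sketch in Python. The $25/hr rate and the 350GB figure come from the numbers in this comment; the 80GB-per-GPU size, the function names, and the example request durations are my own assumptions for illustration, and it assumes one request at a time per instance with no batching.

```python
import math

# Figures taken from the comment above.
HOURLY_RATE_USD = 25.0   # quoted AWS cost for one model instance
MODEL_MEMORY_GB = 350    # minimum GPU memory from the serving guidance

# Assumption: 80GB of memory per GPU (not stated in the comment).
GPU_MEMORY_GB = 80


def gpus_needed(model_memory_gb, gpu_memory_gb):
    """Smallest whole number of GPUs whose combined memory fits the model."""
    return math.ceil(model_memory_gb / gpu_memory_gb)


def compute_cost_per_request(hourly_rate_usd, seconds_per_request):
    """Raw compute cost of one request, assuming the instance serves
    a single request at a time (no batching, no idle time)."""
    return hourly_rate_usd / 3600 * seconds_per_request


print("GPUs needed:", gpus_needed(MODEL_MEMORY_GB, GPU_MEMORY_GB))
print("Cost per minute: $%.3f" % (HOURLY_RATE_USD / 60))

# Hypothetical request durations, in seconds.
for seconds in (1, 3, 5):
    cost = compute_cost_per_request(HOURLY_RATE_USD, seconds)
    print("%ds request: $%.4f compute only" % (seconds, cost))
```

Under these assumptions a request of a few seconds is on the order of a couple of cents in raw compute, so the gap up to $0.05 would have to come from the overheads listed above (other infra, engineering, idle capacity). Batching several requests per instance would push the per-request number down.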
[1] https://alpa.ai/tutorials/opt_serving.html