by simonw
248 days ago
I think the opposite is much more likely to be true: that vendors who charge money for inference are charging more than it costs them to service a prompt. I've heard from sources that I trust that both AWS and Google Gemini charge more than it costs them in energy to run inference. You can get a good estimate for the truth here by considering open weight models. It's possible to determine exactly how much energy it costs to serve DeepSeek V3.2 Exp, since that model is open weight. So run that calculation, then take a look at how much providers are charging to serve it and see if they are likely operating at a loss. Here are some prices for that particular model: https://openrouter.ai/deepseek/deepseek-v3.2-exp/providers
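A rough sketch of what that calculation could look like, covering electricity only. The function names and every input number below are illustrative placeholders, not measurements for DeepSeek V3.2 Exp or for any provider; real values would have to come from benchmarking the model on actual hardware and from the provider's listed price.

```python
def energy_cost_per_million_tokens(power_kw: float,
                                   tokens_per_second: float,
                                   usd_per_kwh: float) -> float:
    """Electricity cost in USD to generate one million tokens.

    power_kw:          average power draw of the serving node (assumed input)
    tokens_per_second: aggregate throughput across all batched requests (assumed input)
    usd_per_kwh:       electricity price (assumed input)
    """
    seconds = 1_000_000 / tokens_per_second   # time to produce 1M tokens
    kwh = power_kw * seconds / 3600           # energy used in that time
    return kwh * usd_per_kwh


def price_to_energy_ratio(price_per_million_usd: float,
                          energy_cost_usd: float) -> float:
    """How many times the listed price exceeds the electricity cost."""
    return price_per_million_usd / energy_cost_usd


# Placeholder inputs only, NOT real figures for any model or vendor.
cost = energy_cost_per_million_tokens(power_kw=10.0,
                                      tokens_per_second=1000.0,
                                      usd_per_kwh=0.10)
ratio = price_to_energy_ratio(price_per_million_usd=0.40, energy_cost_usd=cost)
print(f"energy cost per 1M tokens: ${cost:.4f}, price / energy cost: {ratio:.2f}x")
```

A ratio above 1 would only mean the price covers electricity; it says nothing about hardware, staffing, or training costs.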
|
This is like saying solar power is free if you ignore the equipment and installation costs.
Worse still, model creators are in an arms race. They can't release a model and call it a day, waiting for it to start paying for itself. They need to immediately jump on to the next version of the model or risk falling behind.