by booty
27 days ago
Prevailing wisdom is that serving LLMs at a profit is achievable... it's when you factor in the cost of training them that prices get astronomical real fast.

Open-source model inference providers (who do not have to bear the cost of training) seem able to do it at much lower prices.

https://www.together.ai/pricing
https://fireworks.ai/pricing#serverless-pricing (scroll down to headline models)

Of course, it's possible that they are burning through investor cash as well, and apples-to-apples comparisons are not possible because AFAIK Google does not mention the size/paramcount for 3.5 Flash.

But if the prevailing wisdom is true, I think it's actually encouraging. It suggests that OpenAI and Anthropic could perhaps, if they need to, achieve profitability by slowing down model development and focusing on tooling etc. instead. If true, that's probably good news for everybody w.r.t. preventing a bursting of this economic bubble.

...my opinions here are, of course, conjecture built on top of conjecture....
I think you're right that releasing models at a slower cadence would bring down costs to some degree, but it's not clear how much. All of these companies could significantly reduce their opex, but at the risk of falling behind the frontier.