|
|
|
|
|
by surgical_fire
112 days ago
|
|
> For our language model benchmarking, we note that we consider endpoints to be serverless when customers only pay for their usage, not a fixed rate for access to a system. Typically this means that endpoints are priced on a per token basis, often with different prices for input and output tokens. Okay, correct me if I am wrong, so this is measuring the inference costs for clients of AI services, not the the inference costs that the AI service itself has when they offer the service? I mean, the other guy's claim is that inference costs had come down 20x-30x. But the analysis, if I understood correctly, is based on how much clients are paying for it, not how much it actually costs. I can charge you 20x less for a service and have massive losses for it. |
|
Its easier to just admit that technological advances helped decrease the cost instead of coming up with more complicated reasons like VC funding, subsidies and so on.
For instance take Deepseek and other opensource models - even they have reduced their costs by a huge margin. What explanation is there for opensource models?