by howinator
316 days ago
I could be wrong, but I think this pricing is the first to admit that cost scales quadratically with the number of tokens. It’s the first time I’ve seen nonlinear pricing from an LLM provider, which implicitly mirrors the inference scaling laws I think we're all aware of.
[1] https://cloud.google.com/vertex-ai/generative-ai/pricing
[2] https://openai.com/api-priority-processing/
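
To make the quadratic intuition concrete, here is a toy sketch. It assumes plain causal self-attention with no sparse-attention or caching tricks, ignores the parts of the model whose cost is linear in tokens, and models no real provider's price list. The function names are made up for illustration.

```python
def attention_pairs(n_tokens: int) -> int:
    """Count query-key pairs in full causal self-attention over n_tokens.

    Token i attends to tokens 1..i, so the total is n(n+1)/2,
    which grows quadratically with n_tokens.
    """
    return n_tokens * (n_tokens + 1) // 2


def per_token_cost_ratio(short: int, long: int) -> float:
    """Ratio of average attention work per token: long prompt vs short prompt."""
    return (attention_pairs(long) / long) / (attention_pairs(short) / short)


if __name__ == "__main__":
    # Doubling the prompt length roughly doubles the attention work *per token*,
    # which is the kind of growth a flat per-token price cannot express.
    print(per_token_cost_ratio(100_000, 200_000))
```

Under these assumptions, total attention work roughly quadruples when the prompt doubles, so a per-token price that steps up with prompt length is one way a provider could pass that on.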