by farkin88
319 days ago
Even though tokens are getting cheaper, I think the real killer of "unlimited" LLM plans isn't token cost itself; it's the shape of the usage curve. These products see a Zipf-like distribution: thousands of casual users nibble a few hundred tokens a day while a tiny group of power automations devour tens of millions. Flat pricing works fine until one of those whales drops a repo-wide refactor or a 100 MB PDF into chat and instantly torpedoes the margin. Unless vendors turn those extreme loops into cheaper, purpose-built primitives (search, static analyzers, local quantized models, etc.), every "all-you-can-eat" AI subscription is just a slow-motion implosion waiting for its next whale.
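To put rough numbers on that shape, here is a minimal sketch. Everything in it is an assumption for illustration: the 10,000 subscribers, the Zipf exponent of 1.2, the 50M tokens/day for the heaviest user, the $20 plan, and the $1 per million tokens serving cost are made-up values, not data from any vendor.

```python
# Illustrative only: every number below is an assumption, not measured data.

def zipf_usage(n_users, exponent, top_user_tokens):
    """Daily tokens per user; rank 1 is the heaviest, tokens ~ 1 / rank**exponent."""
    return [top_user_tokens / rank ** exponent for rank in range(1, n_users + 1)]


def top_share(usage, fraction):
    """Share of all tokens consumed by the heaviest `fraction` of users."""
    ranked = sorted(usage, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def monthly_cost(daily_tokens, cost_per_million_tokens, days=30):
    """Serving cost for one user over a month at a flat per-token cost."""
    return daily_tokens * days / 1_000_000 * cost_per_million_tokens


# Hypothetical population: 10,000 subscribers, heaviest one at 50M tokens/day.
usage = zipf_usage(n_users=10_000, exponent=1.2, top_user_tokens=50_000_000)

if __name__ == "__main__":
    PLAN_PRICE = 20.0        # assumed flat monthly price, in dollars
    COST_PER_MILLION = 1.0   # assumed blended serving cost per million tokens
    print(f"top 1% share of tokens: {top_share(usage, 0.01):.0%}")
    print(f"heaviest user: ${monthly_cost(usage[0], COST_PER_MILLION):,.0f}/month "
          f"vs a ${PLAN_PRICE:,.0f} plan")
    print(f"lightest user: ${monthly_cost(usage[-1], COST_PER_MILLION):.4f}/month")
```

With these made-up inputs the top 1% of users account for well over half of all tokens, and the single heaviest user costs far more to serve than the flat price brings in. Change the exponent and the picture shifts, which is kind of the point: the margin depends on the tail, not the average.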
I'd prefer that it just specify a number of tokens rather than vary with demand. I see that the variable approach lets them be more generous during low-demand periods, but the opacity of it all sucks. I have 5-minute time-of-use pricing on my electricity and can look up the current rate on my phone in an instant. Why not simply provide an API to look up the current "demand factor" for Claude (along with the rules for how the demand factor can change, such as min and max values) and let it be fully transparent?
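Something like the following is what I have in mind. To be clear, this is a purely hypothetical sketch: as far as I know no such endpoint exists, and the names `DemandFactorRules`, `demand_factor_response`, and every field in the response are invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemandFactorRules:
    """Published bounds on the demand factor (hypothetical, not a real API)."""
    min_factor: float
    max_factor: float

    def clamp(self, raw_factor: float) -> float:
        """Keep the factor inside the published min/max range."""
        return max(self.min_factor, min(self.max_factor, raw_factor))


def demand_factor_response(raw_factor: float, rules: DemandFactorRules,
                           base_tokens: int) -> dict:
    """Build the payload a transparent lookup might return.

    Assumption: a factor above 1 means high demand and shrinks the allowance;
    a factor below 1 means low demand and grows it.
    """
    factor = rules.clamp(raw_factor)
    return {
        "demand_factor": factor,
        "min_factor": rules.min_factor,
        "max_factor": rules.max_factor,
        "effective_token_allowance": int(base_tokens / factor),
    }


if __name__ == "__main__":
    rules = DemandFactorRules(min_factor=0.5, max_factor=2.0)  # assumed bounds
    print(demand_factor_response(3.7, rules, base_tokens=1_000_000))
```

The useful part is less the current value than the published bounds: if I know the factor can never exceed `max_factor`, I know the worst-case allowance I'm paying for.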