Hacker News new | ask | show | jobs
by Aurornis 121 days ago
> Anthropic's optimization target is getting you to spend tokens, not produce the right answer.

The majority of us are using their subscription plans with flat rate fees.

Their incentive is the precise opposite of what you say. The less we use the product, the more they benefit. It's like a gym membership.

I think all of the gambling addiction analogies in this thread are just so strained that I can't take them seriously. Even the basic facts aren't even consistent with the real situation.

2 comments

Thats a bit naive. Anthropic makes way more money if they gey you to use past your plans limit and wonder if you should get the next tier or switch to tokens
The price jump between subscription tiers is so high that relatively few people will upgrade instead of waiting a few more hours, and even if somebody does upgrade to the next subscription level, Anthropic still has an incentive to provide satisfactory answers as quickly as possible, to minimize tokens used per subscription, and because there is plenty of competition so any frustrated users are potential lost customers.

I swear this whole conversation is motivated reasoning from AI holdouts who desperately want to believe everybody else is getting scammed by a gambling scheme, that they don't stop and think about the situation rationally. Insofar as Claude is dominant, it's only because Claude works the best. There is meaningful competition in this market, as soon as Anthropic drops the ball they'll be replaced.

And we're still in the expansion phase, so LLM life is actually good... for now.
It's not going to get worse than now though. Open models like GLM 5 are very good. Even if companies decide to crank up the costs, the current open models will still be available. They will likely get cheaper to run over time as well (better hardware).
>Open models like GLM 5 are very good. Even if companies decide to crank up the costs, the current open models will still be available.

https://apxml.com/models/glm-5

To run GLM-5 you need access to many, many consumer grade GPUs, or multiple data center level GPUs.

>They will likely get cheaper to run over time as well (better hardware).

Unless they magically solve the problem of chip scarcity, I don't see this happening. VRAM is king, and to have more of it you have to pay a lot more. Let's use the RTX 3090 as an example. This card is ~6 years old now, yet it still runs you around $1.3k. If you wanted to run GLM-5 I4 quantization (the lowest listed in the link above) with a 32k context window, you would need *32 RTX 3090's*. That's $42k dollars you'd be spending on obsolete silicon. If you wanted to run this on newer hardware, you could reasonable expect to multiply that number by 2.

I mean it would make sense to see this as a hardware investment into a virtual employee, that you actually control (or rent from someone who makes this possible for you), not as private assistant. Ballparking your numbers, we would need at least an order of magnitude price-performance improvement for that I think.

Also, how much bang for the buck do those 3090s actually give you compared to enterprise-grade products?

That's good to hear. I'm not really up-to-date on the open models, but they will become essential, I'm sure.