At the enterprise level, though, it's going to be hard to want to use a service whose costs are not predictable, and where keeping those costs under control requires employee training.
So your costs scale with the number of users you have.
That's an opex that you can explain.
Tokens for developers are maybe closer, cost/outcome-wise, to hiring an external consulting company to write your code: money paid scales with work done, no promise of delivery, arbitrary unpredictable external price changes.
It's not quite the same, though it's similarly lucrative for consultants.
2 months ago: no limits. 1 month ago: we had a leaderboard for whoever had the highest token spend, not taking into account what was actually produced. This week: “everyone is using Opus too much, just use it for planning.”
I've worked at so many places where the gap between the propaganda/marketing and the reality on the ground is so disorienting/shocking that I don't really expect this to be any different...
Since those headlines started, I've felt it just encouraged inefficiency. "Say as much as you can without saying anything." If you were accomplishing your task, the need for more would end, thus there is an incentive to never succeed.
You can put a limit on token spend and provide training (and even pre-configured workflows) on how to limit token spend.
Like the other commenter said: cloud spend can also spin out of control if you don't pay attention, yet we've found ways to keep it under control (training, guardrails, limits, transparency).
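To make the "limits and guardrails" idea concrete, here is a minimal sketch of a per-user monthly token cap with a warning threshold. Everything in it (the `TokenBudget` class, the 80% warning fraction, the status strings) is made up for illustration and is not any vendor's actual API; a real setup would presumably sit in a proxy in front of the model provider.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Hypothetical per-user monthly token cap (illustrative only)."""
    monthly_limit: int            # assumed cap, in tokens per user per month
    warn_fraction: float = 0.8    # assumed threshold for an early warning
    spent: dict = field(default_factory=dict)

    def record(self, user: str, tokens: int) -> str:
        """Try to charge `tokens` to `user`; return 'ok', 'warn' or 'blocked'."""
        total = self.spent.get(user, 0) + tokens
        if total > self.monthly_limit:
            # Over the cap: reject the request and leave the tally unchanged.
            return "blocked"
        self.spent[user] = total
        if total >= self.warn_fraction * self.monthly_limit:
            return "warn"
        return "ok"


budget = TokenBudget(monthly_limit=1_000)
print(budget.record("alice", 500))  # under the warning threshold
print(budget.record("alice", 300))  # reaches 80% of the cap
print(budget.record("alice", 300))  # would exceed the cap
```

The "warn" state is the part that matters for transparency: people find out they are close to the cap before they hit it, rather than being cut off mid-task.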
The problem that I see is what you do if someone runs out of tokens. It doesn't work very well to say "well I guess you just get fired because you can't work at full speed for the rest of the month".
Personally, this feels like it's just trying to push the managers' work of allocating resources onto developers, so that developers have more work to do and can be blamed if anything goes wrong.
To be fair, the cost of software development has always been fairly unpredictable. What may be different is that the cost used to be roughly proportional to man-hours spent, while now the number of agents running in parallel may be less predictable.
> To be fair, the cost of software development has always been fairly unpredictable.
Yes, but in an "oops this is gonna take another two months to finish" kind of way, not the "oops this is the 12th time this month 8 developers have burned $2K in tokens in a single day and no one really knows how it happened" kind of way.
A belt-loaded spinwheel machine gun, where there is some chance the next bullet is a dummy round or goes in the wrong direction. And every time you reload, a new soldier is in charge of the gun.
You don't need that analogy, as the normal use of an automatic gun in war is not to kill, it is to suppress: stop the enemy from moving. If you are hit by a gun in automatic mode it is your own stupid fault. When you want to kill someone you switch to single shot or maybe 3-round bursts.
The cost per month is 100% known and always has been. What has been variable is the rate of delivery. AI spend is different: it varies month to month, and it can be substantial relative to salaries in countries with lower wages.
There have also been winners of a slot machine gamba, so the analogy holds quite well. I would even argue that there are considerably more slot machine gamba winners than real-world examples of actual LLM work.
There’s actually been a ton of research on how to optimize “slot machines,” at least in a generalized sense. For more reading, check out the literature on multi-armed bandits.
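For anyone who hasn't seen the bandit setup, here is a rough sketch of epsilon-greedy, one of the simplest strategies from that literature. The payout probabilities, the 10% exploration rate and the function name are all made up for illustration. It also assumes each "machine" pays out with a fixed probability, which is precisely the assumption that may not carry over to agentic coding.

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Play a simulated multi-armed bandit with an epsilon-greedy strategy.

    payout_probs: hypothetical fixed win probability of each arm.
    Returns (pull counts per arm, estimated win rate per arm, total wins).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    counts = [0] * n_arms
    means = [0.0] * n_arms   # running estimate of each arm's win rate
    total = 0
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda i: means[i])  # exploit
        reward = 1 if rng.random() < payout_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]     # incremental mean
        total += reward
    return counts, means, total


counts, means, total = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(m, 2) for m in means], total)
```

With fixed odds like these, the strategy ends up spending most of its pulls on the best arm. The estimates are only meaningful because the odds don't move between pulls.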
Yes, because in video games there is always a chance to win so you can optimize your strategy around that chance. If you have a 1% chance to drop a legendary weapon, the question becomes how do I manufacture 100 chances for a weapon drop in the shortest possible time. With agentic coding there is no such guaranteed chance - in a way it's worse than a slot machine that is guaranteed to pay out eventually. You could spend hundreds of millions of tokens and still not get what you asked for.
> If you have a 1% chance to drop a legendary weapon, the question becomes how do I manufacture 100 chances for a weapon drop in the shortest possible time.
Side note, but I hope everyone realizes that 100 is kind of arbitrary here and does not mean the total chance to get something is 100%.
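To put numbers on that: assuming each attempt is independent with the same drop chance, the chance of at least one drop in n attempts is 1 - (1 - p)^n. A quick sketch (the function names are just for illustration):

```python
import math


def chance_of_at_least_one(p: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries."""
    return 1 - (1 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest number of independent tries giving at least `target` chance."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


print(round(chance_of_at_least_one(0.01, 100), 3))  # 0.634
print(attempts_needed(0.01, 0.99))                  # 459
```

So 100 tries at 1% gets you to roughly 63%, and you need about 459 tries to reach 99%. The independence assumption is what makes this calculation possible in the first place.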
You’re right, the ARPG analogy isn’t great, it’s too simplistic. I was trying to come up with something heavily stochastic where people are coming up with strategies to get the odds in their favor. Maybe closer to speculating on the real estate market? But even that feels too simplistic compared to LLMs. Even the definition of a win isn’t well defined.
Actually, it’s really its own thing. I don’t think the slot machine analogy works too well: with a slot machine you also have fixed odds (and you know they aren’t in your favor) and a binary output.
The analogy to a slot machine is that you're spending your own resources in hope of a reward. So you're ultimately bound by your resources, and your strategy doesn't count for much in the grand scheme of things.
With employees, there are a lot of punishments in place that make people not want to mess up: loss of wages and reputation, prison time,... Startups do not fail because they have a bug-ridden product, they fail because of the market.
With AI, all bets are off. The models are not aligned with your goals, and it's very hard to discern when they go off track unless you're an expert. And if you are one, at best it's just a slight boost in typing, especially given all the other work involved in software development.
Odd, I train teams (at large companies) to use harnesses effectively. So some training does exist.
I get the anti/skeptic sentiment. I've been called a lot of horrible things by a vocal contingent when they hear that I help train folks to learn software engineering best practices and then apply AI to that.
Isn't this a (mildly exaggerated) description of AWS, which is a very successful service?