by yturijea 3 hours ago
I am using perhaps 15% of my usage allowance on Claude with just the normal subscription. And I do full-time software engineering and would say I use quite a lot of AI input on thoughts, designs and code drafts.

So how these companies and people manage to use these absurd amounts of tokens is a mystery to me. It feels like they are just feeding huge amounts of non-vetted data to the LLMs and/or running loops against the LLMs which only produce fractional results, if not wasted results, for insane cost.

So really it is the equivalent of just burning money, or heating your house in the winter while having all your windows open.

6 comments

Same for me, comfortable with a single Max sub, launching "claude --effort max" with Opus 4.8 (alas poor Fable, please come back!).

But try running Claude Opus at API prices through a 'clever' RAG-based intermediate system 'managing' a 2024-era context window that is completely unaligned with 2026 frontier model tool use expectations, and that results in a 100% cache miss rate and destroyed content coherency on every single interaction. There's your typical 'Enterprise Agreement' GenAI setup.

I only really discovered this when trying to find out why my Enterprise friends' AI experiences were so completely opposite from my own successes. I could not believe how poor their results were, even though on the surface it looked like we were using the same model, and I know they aren't 'bad' software engineers and developers.

> So how these companies and people manage to use these absurd amount of tokens is a mystery to me.

Fire and forget. They run multiple agents in parallel 24/7. AI isn't just a rubber ducky for them, it's their main (only) tool at that point.

If you don't reset sessions eagerly or compact regularly, it is easy to consume billions of input tokens while Claude churns away.

What is a "normal" subscription? Are you using Claude Code, or just Claude?

Just Claude. I have seen the weird hallucinations these LLMs make (yes, that also means Opus, Fable etc.), so I don't trust it to just run on its own.

Yes, that also means I get to inspect and confirm every step of the way, to ensure the design is followed, we are not making unnecessary changes, and we have thought about edge cases, testing etc. And I also keep an understanding of what is produced, because I will manually copy it in and I will manually read through it. I will do a secondary review of it myself in PRs or wherever.

But I guess a lot of people just don't, and just run Claude Code through the roof in an ad libitum infinite loop?

On subscription, I just checked: I have the Pro Plan, which I believe is Claude's equivalent of the normal one?

>>So how these companies and people manage to use these absurd amount of tokens is a mystery to me.

Absolutely!

I know some colleagues who are routinely spending thousands of dollars' worth of tokens. I can't seem to even max out the subscription limits even if I'm working all the time. Curiously enough, their output is lower too.

I mean have you tried to tokenmax?

It is not that hard. Just launch 10 different windows and make sure to loop back in after every turn and you will be burning billions of tokens per month in no time.
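
A rough back-of-the-envelope sketch of that claim, in Python. Every number here is a made-up assumption (10 windows, 5 sessions per window per day, 22 working days, 100 turns per session, about 2,000 new tokens per turn), and it assumes each turn resends the whole conversation so far as input. It counts tokens sent, not cost, and ignores prompt caching and context window limits.

```python
def session_input_tokens(turns, tokens_per_turn):
    """Input tokens for one session where every turn resends the full history."""
    context = 0  # tokens currently in the conversation
    total = 0    # input tokens sent so far
    for _ in range(turns):
        context += tokens_per_turn  # history grows by one turn
        total += context            # the whole history is sent again
    return total


def monthly_input_tokens(windows, sessions_per_day, days, turns, tokens_per_turn):
    """Scale one session up to several parallel windows over a month."""
    sessions = windows * sessions_per_day * days
    return sessions * session_input_tokens(turns, tokens_per_turn)


# Hypothetical workload, not measured data.
per_session = session_input_tokens(turns=100, tokens_per_turn=2_000)
per_month = monthly_input_tokens(
    windows=10, sessions_per_day=5, days=22, turns=100, tokens_per_turn=2_000
)

print(f"{per_session:,} input tokens per session")
print(f"{per_month:,} input tokens per month")
```

The growth per session is quadratic in the number of turns, which is why long-running sessions that are never reset or compacted add up so quickly.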

The question is what work you do that burns so many tokens.

Are you sending it to work in nested for loops? If yes, what sort of work would that be?