by wrxd 26 days ago
I’m at a FAANG and we have $300/day token quota. Personally I don’t use that much of it but management is pushing really hard for it. “the quota has been raised for a reason, use it”. Any task: “have you tried working on it with Claude?”. Every meeting “now engineer x and y will show you what he did with AI”.

It’s not all useless but most of the days I think I would be more productive if some processes were streamlined rather than if I had to throw tokens at them and still fail.

Of all the showcases I’ve seen the best are the ones written by people assuming that the token bonanza will not last so they used AI to build tools they wished they had. AI used to build the tool but by no means used by the tool, so if/when token quota gets reduced we still have a functional tool.

10 comments

300 a day?? 7K dollars a month? No wonder they need to lay people off!
At Nvidia, we have no limit for Anthropic or Open AI models (for now) and are heavily encouraged to use them as much as possible.
The fact that they've started promoting using the Caveman mode tells me that the unlimited usage policy is taking its toll.
Fwiw, nobody has ever suggested to me that I employ token compression in my daily workflow. I don't pay full attention in all the AI workflow demos I'm supposed to attend, but I don't recall that even being discussed. Is this an Nvidia blog or tweet you're referencing? I'm actually interested to see what they have to say.
What is Caveman mode?
Please don’t tell me you’re writing RTL
I'm not, I work higher level products. I've talked to a few people who do but I don't recall if they have different standards.
By myself, I now use more than 15 accounts combined across all providers, plus API access for external providers: more than $50K equivalent a month in API tokens. My team is doing the same thing. It's not really that much once you've figured out the real automation loops and workflows; solving 300 issues a day with guarantees is common.

I feel that a lot of users are still stuck on Claude Code or tools like it and don't really have a real argument about why they are even following the thread at all. Everything has to be async for serious automation. You shouldn't even be seeing what Claude or any other model is replying; it's irrelevant (everything has to be digested by another model to increase the relevancy and accuracy of the message so you can read faster, like a bot). Put a human in the loop only when a decision must be made. The rest has to be loops across all models: typical e2e, regression, computer-use tests, video into frames into an all-model loop, and so on.

That's interesting. What is the input into the process? Don't you need a PRD or a requirement doc to start with?
> No wonder they need to lay people off!

He clearly works at Apple, and they aren't laying people off.

I'm not aware of a limit in my current role. There is, however, a leaderboard.
Well, presumably (hopefully) they aren't expected to work weekends.
No days off for the agents.
Yes, the cost of AI is a big contributing factor.
The unsubsidised costs can't be revealed soon enough.
That’s funny I’ve been doing that too

Trying to crank out all the tools I never had time to build because I think we’re going to get cut off eventually

This seems seductive, but how do you get past the wall of "fixing XYZ or adding convenience ABC isn't on our pre-planned roadmap" so you can't get buy in from people who have to sign-off or deploy stuff?

Maybe that type of awkwardness is specific to my firm, but that's sort of what killed my drive to try to do that. We used to have one day every second week for that sort of work, but since it was scattered around, the tasks ended up disappearing-- nobody reviewed them and they didn't get merged.

So now they're trying to do a week-long internal hackathon to recover that vision, but I feel like that's going to produce a handful of big-bang ideas and not the 25 tiny tools that would actually streamline things.

Same. I've used it for debugging failed canary tests which required scripts and very specific knowledge on the canary platform that I wouldn't have ever spent time on.

I also have scripts to fetch specific database assets and forward them to slack channels so I can easily share them with a group rather than manually running a query and generating them.

I had a theory about improving a product. I asked it to build an offline simulation setup to try various implementations. The results were a bit fishy but I decided to give it a try and A/B testing is showing similar results.

And now I'm vibecoding a locally hosted dashboard. This one is less useful for anything specific, and more of a minor quality of life improvement, but it's fun to just vibe code and see changes happen occasionally. It's not a critical thing.

I find it very useful for debugging tasks like that but it always ends up costing me like $3 despite doing incredible work. And then one of the other engineers at my company will rack up like $200 in tokens in one day producing tens of thousands of SLOC and we end up actually shipping about the same stuff. Sometimes I wonder if it's bad agent use discipline (just pointing it at massive codebases and having it read it all from scratch each time) and sometimes I wonder if they're just using it for personal projects. Because none of that code seems to land in prod, and I've found that cranking out 10s of thousands of SLOCs at a time is a recipe for a mess.
But depending on how much you get paid hourly, $3 would be very little comparatively, no?
Yeah that's my point. You can get a ton of value for a few bucks so I'm not sure what these people are doing to torch hundreds of dollars. It's possible they haven't figured out patterns to make AI work on large codebases, and it's also possible they're just churning endlessly on massively bloated AI-written codebases.
I don't think we will. I think this level of token cost/availability will trend cheaper and faster, long term. These companies that spent too big and too fast might try to limit it and raise the prices and they might be temporarily successful but they'll very quickly be taken over if they keep doing it.
May I ask what tools did you make so far? And what is on your roadmap?
Not OP, but a very simple example: I use AI to review my work before opening a PR for my colleagues to review. I ask it to review the commits in my branch. Instead of consuming tokens just to instruct it how to use git operations and other tools to find the commits since the base commit, I asked AI to create a little bash script to make patch files commit1.patch, commit2.patch, commit3.patch, etc, for all the commits in my branch since the base commit. Now I just use this script to prepare the context of commits to review.

I feel like an imposter here, I’m definitely not using AI as much as it seems everyone is :( I can’t imagine using hundreds of dollars of tokens a day. But maybe this little tip for reviews might be helpful to someone.
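
For readers who want to try the review tip above, here is a minimal sketch of such a helper, written in Python rather than bash. It is not the commenter's script. The function names (`list_commits`, `plan_patches`, `write_patches`) and the `review` output directory are hypothetical, and it assumes `git` is on the PATH and that the base is given as a ref such as a branch name. Merge commits are skipped, since they do not produce a useful single patch.

```python
import subprocess
from pathlib import Path


def list_commits(base, repo="."):
    """Return the hashes of non-merge commits in base..HEAD, oldest first."""
    result = subprocess.run(
        ["git", "rev-list", "--reverse", "--no-merges", f"{base}..HEAD"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def plan_patches(shas):
    """Pair each commit with its file name: commit1.patch, commit2.patch, ..."""
    return [(sha, f"commit{i}.patch") for i, sha in enumerate(shas, start=1)]


def write_patches(base, out_dir="review", repo="."):
    """Write one patch file per commit since `base` and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for sha, name in plan_patches(list_commits(base, repo)):
        # Capture raw bytes so patches with non-UTF-8 content survive intact.
        patch = subprocess.run(
            ["git", "format-patch", "-1", sha, "--stdout"],
            cwd=repo, check=True, capture_output=True,
        ).stdout
        path = out / name
        path.write_bytes(patch)
        written.append(path)
    return written
```

Calling `write_patches("main")` from inside a feature branch would leave the numbered patch files in `review/`, ready to hand to the model as context.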

> Instead of consuming tokens just to instruct it how to use git operations

Claude already knows how to use git and jj, very well.

I also find it useful for review, and sometimes I use multiple passes to review for different categories. Like security, performance and so on.
Not OP, made a tool to convert Microsoft OneNote notes to Obsidian canvas and Markdown. First it used a Python lib which was too limiting. Then it used the Windows API to plug into OneNote and read the doc in its original XML form. That made the conversion correct and fully featured.
Not OP, but I've been focusing on linting and automation.

Custom lint rules to encode best practices that previously relied on astute/alert code reviewer to call attention to. This is handy not just for humans but it steers the bots too. Or turning on some existing rule that required a big cleanup/migration to be compliant with. Now I just throw an LLM at it, since they're often laborious but mechanical changes. Which is the sweet spot for an LLM.

Also automating everything I can. That annoying release process that everyone hates but wasn't quite long/arduous enough to justify the time before? It's now automated. GitHub workflows for all the things.

This kind of stuff will forever be useful, even if the bottom drops out and the bubble bursts. And none of it is reliant on AI to run
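
As an illustration of the custom lint rule idea above, here is a minimal sketch using Python's standard `ast` module. The rule itself (flagging bare `except:` clauses) and the function name `find_bare_excepts` are assumptions chosen for the example, not something the commenter described. Like the tools in the comment, it needs no AI to run.

```python
import ast


def find_bare_excepts(source, filename="<string>"):
    """Return one message per bare `except:` clause found in `source`."""
    tree = ast.parse(source, filename=filename)
    return [
        f"{filename}:{node.lineno}: bare 'except:' hides errors; "
        "catch a specific exception"
        for node in ast.walk(tree)
        # An ExceptHandler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]
```

A check like this can run in CI or a pre-commit hook, so the practice is enforced for both human and bot authors.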

"AI used to build the tool but by no means used by the tool" is a really good way to put it. Feels like the smart play right now is treating these credits as temporary subsidy and building stuff that still works when the bill comes due.
Seems like people are spending more time building tools than doing actual work. Lots of overlap too
In all fairness, doing actual work in this current slice of time is not what companies are prioritizing as of now.
It is fairly easy to tokenmax by having an inefficient automation set up.

Not something I would do personally. But it is surprisingly easy to set up a claw that eats half of your token budget in a meaningless "research" task. Set it up as a cron job and you will soon be promoted for being an AI visionary

Innovation signalling.
> $300/day token quota

Are companies using per-token billing? Why - is there some reason they can’t buy the $200/mo Claude plan for every employee?

The $200/mo Claude plan is not available for every employee. You can buy the $100/mo plan for up to 150 people, and then you have to switch to API billing.
Max 20x is for individuals only. (could probably have emps get it themselves, and reimburse)
If they do individual billing, the business doesn't get token reporting.
> could probably have emps get it themselves, and reimburse

They can’t track token use this way. Also it’s a massive violation of the model provider’s ToS.

Yes, token use can be tracked the same way, just have to MITM everything. The ToS is a non-issue as it's not a legal issue, unless you plan to do business with Anthropic, not really an issue as you can always go to API later-on, in which case, Anthropic can't supposedly "ban you" as they are saying they don't record prompts.
Huh? I believe it’s completely fine for a company to pay for regular Claude subscriptions for employees, as long as they don’t share logins.
Not fine as per the ToS.
I can't find anything in the ToS that it would go against. I even asked Claude to check its own ToS and tell me if it's okay.
Most startups do this (multiple accounts per employee).
Those plans are going the way of the dinosaur, ai provider loses money on them. Most enterprise offerings are already there, Anthropic changed theirs to $20/seat plus token usage a couple weeks back
I’m curious what FAANG is actually doing per-token billing? I’m guessing not google or amazon (since my wife and I aren’t aware of that).
Compliance
I'm pretty sure with AI there is nothing that complies to anything.

Starting with the fact that the whole industry is based on copyright infringement.

You're welcome to have opinions on that, but the answer to the person's question is objectively compliance. A corp can't get enterprise features like ZDR without switching to token based billing. That's why they aren't using subs.

This isn't some kind of new thing. There's always been an enterprise tax, like SSO.

How do you even use that much daily?
I have an unrelated question, please. I am trying to make a post and get this error: "Sorry, your account isn't able to submit this site.", you know why or have a solution for it?
>we have $300/day token quota.

Unless other FAANG companies have the exact same amount, this is going to be Apple.

And no wonder why the quality of Apple software has gone downhill.

Apple in software development and design used to be very conservative. BSD like. Especially the lower end of the stack.

Now it is no different to other Silicon Valley companies.

Also at a FAANG here. Surprised you don't manage to use $300 in a whole day. It's almost trivial to productively use that much in under an hour.

Leadership is not being dumb, at least on this topic. If your token usage is that low, you just aren't using AI that much (even if you think you are.)

I use $30 a day to produce a decent amount of code. Certainly more than we need - thinking about/designing the correct solution/distilling requirements is still the bottleneck. How can you possibly even review $300/day worth of output?
It doesn’t have to be $300/day worth of output tokens. It could be like $290/day worth of input tokens to teach both you and the model about the problem you are solving and then $10/day worth of output tokens.
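
To make the input/output split above concrete, here is a small sketch of the arithmetic. The prices are illustrative placeholders, not figures from this thread or from any provider's price list, and the function names are hypothetical. Per-token billing typically charges input and output tokens at different rates per million tokens.

```python
# Illustrative placeholder prices in dollars per million tokens (assumptions).
INPUT_PRICE_PER_M = 3.0
OUTPUT_PRICE_PER_M = 15.0


def token_cost(input_tokens, output_tokens,
               input_price_per_m=INPUT_PRICE_PER_M,
               output_price_per_m=OUTPUT_PRICE_PER_M):
    """Dollar cost of a workload, given per-million-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def tokens_for_budget(budget_dollars, price_per_m):
    """How many whole tokens a dollar budget buys at a given price."""
    return int(budget_dollars / price_per_m * 1_000_000)


# The split described above: $290 of input and $10 of output.
input_tokens = tokens_for_budget(290, INPUT_PRICE_PER_M)
output_tokens = tokens_for_budget(10, OUTPUT_PRICE_PER_M)
print(f"{input_tokens:,} input tokens, {output_tokens:,} output tokens")
```

Under these assumed prices the input side is tens of millions of tokens while the output side is well under a million, which is why a day's spend says little about how much code there is to review.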
And what about when you know the problem and the solution, but are just worrying about the impact downstream? Most of my time is spent managing that. I know the exact code to be published. And sometimes I already have it committed in my local branch. Then you need to make everyone aware of what it entails and that's usually how you can spend days on a simple bug or a change request.

Software is a big graph of interlocked rules. And if you can grasp the whole or the part you own (and you should be able to), it's often very easy to see the control points. You don't have a coding bottleneck anymore, you have a communication bottleneck[0]. Which is an organizational issue, not anything relevant to engineering.

[0]: See Naur's Programming as Theory Building and Brooks's The Mythical Man-Month.

It could be thinking tokens or tokens passed in via RAG.
If you give it $290 of input tokens for $10 of output tokens, you are doing something wrong. I.e. you paste the whole CI output into the prompt instead of giving it a link to the file, and then the AI greps its way through it (using a fraction of the tokens).

Sometimes AI overdoes things and it re-runs the whole testsuite because the tail command didn't have enough lines, but the other way round messes up the context so much so that in the end all that context is useless.
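
The grep-style approach described above can be sketched as a small pre-filter that keeps only the failing lines of a CI log plus a little context, instead of pasting the whole output into the prompt. The pattern and the function name `extract_failures` are assumptions for illustration; real logs will need their own keywords.

```python
import re

# Hypothetical keywords; adjust to whatever your CI system prints on failure.
FAILURE_PATTERN = re.compile(r"\b(error|failed|failure|traceback)\b", re.IGNORECASE)


def extract_failures(log_text, context=2):
    """Return matching lines with `context` lines around each, numbered from 1."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if FAILURE_PATTERN.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))

    out = []
    previous = None
    for i in sorted(keep):
        # Mark gaps so the reader can see that lines were skipped.
        if previous is not None and i != previous + 1:
            out.append("...")
        out.append(f"{i + 1}: {lines[i]}")
        previous = i
    return "\n".join(out)
```

The trade-off raised in the comment still applies: too little context forces a rerun, so `context` is worth tuning per project.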

I used Claude about a week ago to do a pretty intensive refactoring. Cleanup, initial modularisation, beginnings of a test suite, and better isolated build. In a span of couple of hours, and over a sequence of 20+ new commits, I burned a hair over $100 in tokens.

If you are working on a seriously large legacy code base, I can see how you'd get to >$250 on a bad day.

If you build your own reviewer layer/tool it will burn a ton of tokens. Millions of tokens of input.
You, review bots and first pass bots can chew through tokens. Also if you haven't put effort into your harnesses, the agent will have to spend more time and tokens figuring things out again and again
Use expensive models at high effort
Also, you said this regarding Claude usage limits:

> Before the doomers come in, you get $200 in API credits every month for claude -p usage. Usage counts against those API credits.

So which is it: $300/day is trivial to consume, or $200/month is a completely reasonable limit? It can't be both.

Do you even realize how insane your comment is?

"If you aren't donating at least your salary's worth of company money to another company every day, are you even working?"

Used to think exactly like you. That's why I know you all will "get it" eventually. Most companies and orgs are just so far behind the curve.
You might want to put this statement in your AI and ask if it was logically sound
Of course it's logically sound. The AI skepticism crowd is trying to tell me the reality I see before my very eyes and work with every day simply does not exist.

I know for sure that reality exists, and that they will either catch up or be left behind. Don't really need to explain myself beyond this.

Believing something to be real that isn't is basically what psychosis means.
What's more likely is that you are rationalising your religion. Some people break their conditioning others don't.
give some examples or real insights, otherwise it's difficult to take you seriously
"Used to think exactly like you until I accepted the love of Jesus, our Savior, in my heart."

No AI believer ever gives any concrete examples or evidence of what they’re doing with all the tokens and how it’s objectively helping them make the world a better place. Even for the shareholders (excluding the shareholders of Anthropic, of course), never mind the rest of us.

Why don't you explain it to us then? What are you actually doing with it? What type of products are you working on?
I've noticed that these comments (which are common) never seem to get a reply.. wonder why :)
This is a very common pattern with AI psychosis victims (and with crypto and NFT evangelists before). Comments whose haughtiness is matched only by their lack of content.
It's the same people. They didn't just up and vanish, they just moved over to AI!
Apparently both devs and the AI are vulnerable to the Dunning-Kruger effect.
Wouldn't they save an enormous amount of money by getting rid of either you and the token quota, or a bunch of other people to continue paying your salary plus this insane quota?
If you are burning through $2400 a day, you’re just wasting tokens on idiotic tasks.
He's rewriting Bun from Rust to Python now.
How are you able to get to $300/hr productively? (I’m assuming this isn’t fast mode tax).
not hard, massive elo stuff. every decision point needs to think up and implement 25 ideas and then rank them.
I am glad I am not on your team, the amount of slop they have to deal with coming from you must be overwhelming
How? I struggle to use the 1000 Kiro tokens I get a month, and that only costs $20. And I use it more than anyone else on my team. Maybe we're just massively behind?
300 an hour, that's insane
Not really, if you use the most expensive models and you have a large codebase stuffed into the context window
You must be using a really bad harness or just writing very vague prompts. 20 Million tokens is a lot.