
by wrxd 26 days ago

I’m at a FAANG and we have a $300/day token quota. Personally I don’t use that much of it, but management is pushing really hard for it: “The quota has been raised for a reason, use it.” For any task: “Have you tried working on it with Claude?” In every meeting: “Now engineers X and Y will show you what they did with AI.”

It’s not all useless, but most days I think I would be more productive if some processes were streamlined rather than having to throw tokens at them and still fail.

Of all the showcases I’ve seen the best are the ones written by people assuming that the token bonanza will not last so they used AI to build tools they wished they had. AI used to build the tool but by no means used by the tool, so if/when token quota gets reduced we still have a functional tool.

10 comments

mancerayder 26 days ago

300 a day?? 7K dollars a month? No wonder they need to lay people off!
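
A quick sanity check on the monthly figure, as a sketch. The 22 working days per month is an assumption, not something stated in the thread:

```python
DAILY_QUOTA_USD = 300  # the quota quoted in the original post


def monthly_spend(daily_usd: float, days: int) -> float:
    """Total spend if the full daily quota is used on `days` days."""
    return daily_usd * days


print(monthly_spend(DAILY_QUOTA_USD, 22))  # working days only (assumed 22)
print(monthly_spend(DAILY_QUOTA_USD, 30))  # every calendar day
```

At 22 working days this comes to $6,600, close to the 7K guessed above; spending the full quota every calendar day would be $9,000.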

Conscat 25 days ago

At Nvidia, we have no limit for Anthropic or Open AI models (for now) and are heavily encouraged to use them as much as possible.

mindv0rtex 25 days ago

The fact that they've started promoting using the Caveman mode tells me that the unlimited usage policy is taking its toll.

Conscat 24 days ago

Fwiw, nobody has ever suggested to me that I employ token compression in my daily workflow. I don't pay full attention in all the AI workflow demos I'm supposed to attend, but I don't recall that even being discussed. Is this an Nvidia blog or tweet you're referencing? I'm actually interested to see what they have to say.

andruby 24 days ago

What is Caveman mode?

soupspaces 24 days ago

https://news.ycombinator.com/item?id=47639077

checker659 25 days ago

Please don’t tell me you’re writing RTL

Conscat 25 days ago

I'm not, I work on higher-level products. I've talked to a few people who do, but I don't recall if they have different standards.

pixel_popping 24 days ago

I alone now use more than 15 accounts across all providers, plus API access for external providers: more than $50K equivalent a month in API tokens. My team is doing the same thing. It's not really that much once you have figured out the real automation loops and workflows; solving 300 issues a day with guarantees is common.

I feel that a lot of users are still stuck on Claude Code or tools like it and don't really have a real argument about why they are even following the thread at all. Everything has to be async for serious automation. You shouldn't even be seeing what Claude or any other model is replying; everything has to be digested by another model to increase the relevance and accuracy of the message so you can read faster (like a bot). Only put a human in the loop when a decision must be made. The rest has to be loops across all models: typical e2e, regression, and computer-use tests, video split into frames and fed into the all-model loop, and so on.

boomzilla 23 days ago

That's interesting. What is the input into the process? Don't you need a PRD or a requirement doc to start with?

harambae 25 days ago

> No wonder they need to lay people off!

He clearly works at Apple, and they aren't laying people off.

gregoryl 25 days ago

I'm not aware of a limit in my current role. There is, however, a leaderboard.

delecti 26 days ago

Well, presumably (hopefully) they aren't expected to work weekends.

ruraljuror 26 days ago

No days off for the agents.

chezelenkoooo 26 days ago

Yes, the cost of AI is a big contributing factor.

jpfromlondon 21 days ago

The unsubsidised costs can't be revealed soon enough.

ex-aws-dude 26 days ago

That’s funny I’ve been doing that too

Trying to crank out all the tools I never had time to build because I think we’re going to get cut off eventually

hakfoo 25 days ago

This seems seductive, but how do you get past the wall of "fixing XYZ or adding convenience ABC isn't on our pre-planned roadmap" so you can't get buy in from people who have to sign-off or deploy stuff?

Maybe that type of awkwardness is specific to my firm, but that's sort of what killed my drive to try to do that. We used to have one day every second week for that sort of work, but since it was scattered around, the tasks ended up disappearing-- nobody reviewed them and they didn't get merged.

So now they're trying to do a week-long internal hackathon to recover that vision, but I feel like that's going to produce a handful of big-bang ideas and not the 25 tiny tools that would actually streamline things.

Brystephor 26 days ago

Same. I've used it for debugging failed canary tests, which required scripts and very specific knowledge of the canary platform that I wouldn't have ever spent time on.

I also have scripts to fetch specific database assets and forward them to slack channels so I can easily share them with a group rather than manually running a query and generating them.

I had a theory about improving a product. I asked it to build an offline simulation setup to try various implementations. The results were a bit fishy, but I decided to give it a try and A/B testing is showing similar results.

And now I'm vibecoding a locally hosted dashboard. This one is less useful for anything specific, and more of a minor quality of life improvement, but it's fun to just vibe code and see changes happen occasionally. It's not a critical thing.

viccis 26 days ago

I find it very useful for debugging tasks like that but it always ends up costing me like $3 despite doing incredible work. And then one of the other engineers at my company will rack up like $200 in tokens in one day producing tens of thousands of SLOC and we end up actually shipping about the same stuff. Sometimes I wonder if it's bad agent use discipline (just pointing it at massive codebases and having it read it all from scratch each time) and sometimes I wonder if they're just using it for personal projects. Because none of that code seems to land in prod, and I've found that cranking out 10s of thousands of SLOCs at a time is a recipe for a mess.

mewpmewp2 25 days ago

But depending on how much you get paid hourly, $3 would be very little comparatively, no?

viccis 25 days ago

Yeah, that's my point. You can get a ton of value for a few bucks, so I'm not sure what these people are doing to torch hundreds of dollars. It's possible they haven't figured out patterns to make AI work on large codebases, and it's also possible they're just churning endlessly on massively bloated AI-written codebases.

esperent 25 days ago

I don't think we will. I think this level of token cost/availability will trend cheaper and faster, long term. These companies that spent too big and too fast might try to limit it and raise the prices and they might be temporarily successful but they'll very quickly be taken over if they keep doing it.

hosteur 26 days ago

May I ask what tools did you make so far? And what is on your roadmap?

caarmen 26 days ago

Not OP, but a very simple example: I use AI to review my work before opening a PR for my colleagues to review. I ask it to review the commits in my branch. Instead of consuming tokens just to instruct it how to use git operations and other tools to find the commits since the base commit, I asked AI to create a little bash script to make patch files commit1.patch, commit2.patch, commit3.patch, etc, for all the commits in my branch since the base commit. Now I just use this script to prepare the context of commits to review.

I feel like an imposter here, I’m definitely not using AI as much as it seems everyone is :( I can’t imagine using hundreds of dollars of tokens a day. But maybe this little tip for reviews might be helpful to someone.
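
For anyone who wants to try the patch-file tip above, here is a minimal sketch. The commenter's helper is a bash script; this is a Python equivalent written for illustration, and the function names and the `review` output directory are hypothetical:

```python
import subprocess
from pathlib import Path


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def commits_since(base: str) -> list[str]:
    """Hashes of commits reachable from HEAD but not from `base`, oldest first."""
    return git("rev-list", "--reverse", f"{base}..HEAD").split()


def patch_name(index: int) -> str:
    """File name for the index-th commit (1-based): commit1.patch, commit2.patch, ..."""
    return f"commit{index}.patch"


def write_patches(base: str, out_dir: str = "review") -> list[Path]:
    """Write one patch file per commit since `base` into `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, sha in enumerate(commits_since(base), start=1):
        path = out / patch_name(i)
        # `format-patch -1 --stdout <sha>` prints that single commit as a patch
        path.write_text(git("format-patch", "-1", "--stdout", sha))
        written.append(path)
    return written
```

Calling `write_patches("main")` from inside a repository (assuming the base branch is named `main`) would leave `review/commit1.patch`, `review/commit2.patch`, and so on, ready to hand to the reviewer as context.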

yencabulator 15 days ago

> Instead of consuming tokens just to instruct it how to use git operations

Claude already knows how to use git and jj, very well.

capitol_ 25 days ago

I also find it useful for review, and sometimes I use multiple passes to review for different categories. Like security, performance and so on.

smusamashah 26 days ago

Not OP, but I made a tool to convert Microsoft OneNote notes to Obsidian canvas and Markdown. First it used a Python library, which was too limiting. Then it used the Windows API to plug into OneNote and read the doc in its original XML form. That made the conversion correct and fully featured.

WickyNilliams 26 days ago

Not OP, but I've been focusing on linting and automation.

Custom lint rules to encode best practices that previously relied on an astute/alert code reviewer to call attention to. This is handy not just for humans; it steers the bots too. Or turning on some existing rule that required a big cleanup/migration to be compliant with. Now I just throw an LLM at it, since these are often laborious but mechanical changes, which is the sweet spot for an LLM.

Also automating everything I can. That annoying release process that everyone hates but wasn't quite long/arduous enough to justify the time before? It's now automated. GitHub workflows for all the things.

This kind of stuff will forever be useful, even if the bottom drops out and the bubble bursts. And none of it is reliant on AI to run
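
As an illustration of the custom lint rules mentioned above, here is a small sketch built on Python's standard `ast` module. The rule itself (flagging bare `except:` clauses) and the names `BareExceptRule` and `lint` are hypothetical examples, not the commenter's actual rules:

```python
import ast


class BareExceptRule(ast.NodeVisitor):
    """Hypothetical rule: flag `except:` clauses that name no exception type."""

    def __init__(self) -> None:
        self.violations: list[tuple[int, str]] = []

    def visit_ExceptHandler(self, node: ast.ExceptHandler) -> None:
        # node.type is None only for a bare `except:`
        if node.type is None:
            self.violations.append((node.lineno, "bare except: name the exception"))
        self.generic_visit(node)


def lint(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for every violation in `source`."""
    rule = BareExceptRule()
    rule.visit(ast.parse(source))
    return rule.violations


sample = """\
try:
    risky()
except:
    pass
"""
print(lint(sample))
```

Once a rule like this runs in CI, it needs no model to operate, which matches the "built with AI, not dependent on AI" point made in the thread.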

daniel3303 25 days ago

"AI used to build the tool but by no means used by the tool" is a really good way to put it. Feels like the smart play right now is treating these credits as temporary subsidy and building stuff that still works when the bill comes due.

nfRfqX5n 26 days ago

Seems like people are spending more time building tools than doing actual work. Lots of overlap too

johnnyanmac 26 days ago

In all fairness, doing actual work in this current slice of time is not what companies are prioritizing.

throwa356262 25 days ago

It is fairly easy to tokenmax by having an inefficient automation set up.

Not something I would do personally. But it is surprisingly easy to set up a claw that eats half of your token budget in a meaningless "research" task. Set it up as a cron job and you will soon be promoted for being an AI visionary

utopiah 26 days ago

Innovation signalling.

nojs 26 days ago

> $300/day token quota

Are companies using per-token billing? Why - is there some reason they can’t buy the $200/mo Claude plan for every employee?

stavros 26 days ago

The $200/mo Claude plan is not available for every employee. You can buy the $100/mo plan for up to 150 people, and then you have to switch to API billing.

novaleaf 26 days ago

Max 20x is for individuals only. (could probably have emps get it themselves, and reimburse)

If they do individual billing, the business doesn't get token reporting.

Our_Benefactors 25 days ago

> could probably have emps get it themselves, and reimburse

They can’t track token use this way. Also, it’s a massive violation of the model provider's ToS.

pixel_popping 24 days ago

Yes, token use can be tracked the same way, just have to MITM everything. The ToS is a non-issue as it's not a legal issue, unless you plan to do business with Anthropic, not really an issue as you can always go to API later-on, in which case, Anthropic can't supposedly "ban you" as they are saying they don't record prompts.

Toutouxc 25 days ago

Huh? I believe it’s completely fine for a company to pay for regular Claude subscriptions for employees, as long as they don’t share logins.

pixel_popping 24 days ago

Not fine as per the ToS.

Toutouxc 23 days ago

I can't find anything in the ToS that it would go against. I even asked Claude to check its own ToS and tell me if it's okay.

pixel_popping 24 days ago

Most startups do this (multiple accounts per employee).

verdverm 26 days ago

Those plans are going the way of the dinosaur; AI providers lose money on them. Most enterprise offerings are already there. Anthropic changed theirs to $20/seat plus token usage a couple of weeks back.

seanmcdirmid 26 days ago

I’m curious which FAANG is actually doing per-token billing. I’m guessing not Google or Amazon (since my wife and I aren’t aware of that).

goosejuice 26 days ago

Compliance

LtWorf 25 days ago

I'm pretty sure with AI there is nothing that complies with anything.

Starting with the fact that the whole industry is based on copyright infringement.

goosejuice 25 days ago

You're welcome to have opinions on that, but the answer to the person's question is objectively compliance. A corp can't get enterprise features like ZDR without switching to token based billing. That's why they aren't using subs.

This isn't some kind of new thing. There's always been an enterprise tax, like SSO.

JessieJanie 21 days ago

How do you even use that much daily?

samuelarogbonlo 21 days ago

I have an unrelated question, please. I am trying to make a post and get this error: "Sorry, your account isn't able to submit this site.", you know why or have a solution for it?

ksec 25 days ago

>we have $300/day token quota.

Unless other FAANGs have the exact same amount, this is going to be Apple.

And no wonder the quality of Apple software has gone downhill.

Apple in software development and design used to be very conservative. BSD like. Especially the lower end of the stack.

Now it is no different to other Silicon Valley companies.

solenoid0937 26 days ago

Also at a FAANG here. Surprised you don't manage to use $300 in a whole day. It's almost trivial to productively use that much in under an hour.

Leadership is not being dumb, at least on this topic. If your token usage is that low, you just aren't using AI that much (even if you think you are.)

denkmoon 26 days ago

I use $30 a day to produce a decent amount of code. Certainly more than we need - thinking about/designing the correct solution/distilling requirements is still the bottleneck. How can you possibly even review $300/day worth of output?

zomglings 26 days ago

It doesn’t have to be $300/day worth of output tokens. It could be like $290/day worth of input tokens to teach both you and the model about the problem you are solving and then $10/day worth of output tokens.

skydhash 26 days ago

And what about when you know the problem and the solution, but are just worried about the impact downstream? Most of my time is spent managing that. I know the exact code to be published, and sometimes I already have it committed in my local branch. Then you need to make everyone aware of what it entails, and that's usually how you can spend days on a simple bug or a change request.

Software is a big graph of interlocked rules. And if you can grasp the whole or the part you own (and you should be able to), it's often very easy to see the control points. You don't have a coding bottleneck anymore, you have a communication bottleneck[0]. Which is an organizational issue, not anything relevant to engineering.

[0]: See Naur's Programming as Theory Building and Brooks's The Mythical Man-Month.

seanmcdirmid 26 days ago

It could be thinking tokens or tokens passed in via RAG.

est31 26 days ago

If you give it $290 of input tokens for $10 of output tokens, you are doing something wrong. I.e. you paste the whole CI output into the prompt instead of giving it a link to the file, and then the AI greps its way through it (using a fraction of the tokens).

Sometimes AI overdoes things and it re-runs the whole testsuite because the tail command didn't have enough lines, but the other way round messes up the context so much so that in the end all that context is useless.
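
To make the "grep instead of paste" point above concrete, here is a rough sketch of trimming a CI log down to the failing lines plus a little context before it goes anywhere near a prompt. The function name, the default pattern, and the sample log are all made up for illustration:

```python
import re


def failing_lines(log: str, pattern: str = r"error|fail", context: int = 1) -> list[str]:
    """Return lines matching `pattern`, plus `context` lines on each side."""
    lines = log.splitlines()
    matcher = re.compile(pattern, re.IGNORECASE)
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if matcher.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Emit kept lines in their original order, without duplicates
    return [lines[i] for i in sorted(keep)]


ci_log = "step 1 ok\nstep 2 ok\nERROR: test_login timed out\ncleanup\ndone"
print(failing_lines(ci_log))
```

On a real log with thousands of lines, the kept slice is typically a small fraction of the input, which is where the token saving comes from.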

bostik 25 days ago

I used Claude about a week ago to do a pretty intensive refactoring. Cleanup, initial modularisation, beginnings of a test suite, and better isolated build. In a span of couple of hours, and over a sequence of 20+ new commits, I burned a hair over $100 in tokens.

If you are working on a seriously large legacy code base, I can see how you'd get to >$250 on a bad day.

look_lookatme 26 days ago

If you build your own reviewer layer/tool it will burn a ton of tokens. Millions of tokens of input.

verdverm 26 days ago

You, review bots and first pass bots can chew through tokens. Also if you haven't put effort into your harnesses, the agent will have to spend more time and tokens figuring things out again and again

globnomulous 25 days ago

Use expensive models at high effort

eaglelamp 26 days ago

Also you, regarding Claude usage limits:

> Before the doomers come in, you get $200 in API credits every month for claude -p usage. Usage counts against those API credits.

So which is it: $300/day is trivial to consume, or $200/month is a completely reasonable limit? It can't be both.

Sharlin 26 days ago

Do you even realize how insane your comment is?

"If you aren't donating at least your salary's worth of company money to another company every day, are you even working?"

solenoid0937 26 days ago

Used to think exactly like you. That's why I know you all will "get it" eventually. Most companies and orgs are just so far behind the curve.

mrits 26 days ago

You might want to put this statement in your AI and ask if it was logically sound

solenoid0937 25 days ago

Of course it's logically sound. The AI skepticism crowd is trying to tell me the reality I see before my very eyes and work with every day simply does not exist.

I know for sure that reality exists, and that they will either catch up or be left behind. Don't really need to explain myself beyond this.

desdenova 25 days ago

Believing something to be real that isn't is basically what psychosis means.

expedition32 25 days ago

What's more likely is that you are rationalising your religion. Some people break their conditioning others don't.

robotpepi 25 days ago

give some examples or real insights, otherwise it's difficult to take you seriously

Sharlin 25 days ago

"Used to think exactly like you until I accepted the love of Jesus, our Savior, in my heart."

No AI believer ever gives any concrete examples or evidence of what they’re doing with all the tokens and how it’s objectively helping them make the world a better place. Even for the shareholders (excluding the shareholders of Anthropic, of course), never mind the rest of us.

basisword 25 days ago

Why don't you explain it to us then? What are you actually doing with it? What type of products are you working on?

weakfish 21 days ago

I've noticed that these comments (which are common) never seem to get a reply.. wonder why :)

viccis 26 days ago

This is a very common pattern with AI psychosis victims (and with crypto and NFT evangelists before). Comments whose haughtiness is matched only by their lack of content.

gregoryl 25 days ago

It's the same people. They didn't just up and vanish, they just moved over to AI!

krona 25 days ago

Apparently both devs and the AI are vulnerable to the Dunning-Kruger effect.

mancerayder 26 days ago

Wouldn't they save an enormous amount of money by getting rid of either you and the token quota, or a bunch of other people so they can keep paying your salary plus this insane quota?

kortilla 26 days ago

If you are burning through $2400 a day, you’re just wasting tokens on idiotic tasks.

k4rli 26 days ago

He's rewriting Bun from Rust to Python now.

haneul 26 days ago

How are you able to get to $300/hr productively? (I’m assuming this isn’t fast mode tax).

blazespin 25 days ago

not hard, massive elo stuff. every decision point needs to think up and implement 25 ideas and then rank them.

sumeno 26 days ago

I am glad I am not on your team, the amount of slop they have to deal with coming from you must be overwhelming

Miner49er 26 days ago

How? I struggle to use the 1000 Kiro tokens I get a month, and that only costs $20. And I use it more than anyone else on my team. Maybe we're just massively behind?

upcoming-sesame 26 days ago

300 an hour, that's insane

peab 26 days ago

Not really, if you use the most expensive models and you have a large codebase stuffed into the context window

never_inline 25 days ago

You must be using a really bad harness or just writing very vague prompts. 20 Million tokens is a lot.
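
The 20 million figure appears to come from dividing $300 by a blended price of about $15 per million tokens. That price is inferred from the numbers in this thread, not stated by the commenter, and real prices differ by model and by input versus output tokens. A sketch of the arithmetic:

```python
def tokens_for_budget(budget_usd: float, usd_per_million: float) -> float:
    """Number of tokens a budget buys at a flat price per million tokens."""
    return budget_usd / usd_per_million * 1_000_000


# $300 at an assumed $15 per million tokens
print(tokens_for_budget(300, 15))
```

At a lower assumed price per million tokens the same $300 buys proportionally more, so the conclusion depends heavily on which model and which token type dominate the bill.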