Hacker News
by mg 263 days ago

    Fiber networks were using less
    than 0.002% of available capacity,
    with potential for 60,000x speed
    increases. It was just too early.
I doubt we will see unused GPU capacity. As soon as we can prompt "Think about the codebase over night. Try different ways to refactor it. Tomorrow, show me your best solution." we will want as much GPU time at the current rate as possible.

If a minute of GPU usage is currently $0.10, a night of GPU usage is 8 * 60 * 0.1 = $48. Which might very well be worth it for an improved codebase. Or a better design of a car. Or a better book cover. Or a better business plan.
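The arithmetic above can be checked with a minimal sketch. The function name `overnight_gpu_cost` is made up for illustration, and the $0.10 per minute rate and 8-hour night are the comment's own assumptions, not real pricing.

```python
def overnight_gpu_cost(rate_per_minute: float, hours: float) -> float:
    """Cost in dollars of running one GPU for `hours` at `rate_per_minute`."""
    return hours * 60 * rate_per_minute


# Assumed figures from the comment: $0.10 per GPU-minute, an 8-hour night.
cost = overnight_gpu_cost(rate_per_minute=0.10, hours=8)
print(f"${cost:.2f}")
```

Changing the rate or the number of hours shows how quickly the nightly bill scales if several GPUs run in parallel.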

9 comments

> I doubt we will see unused GPU capacity

I'd argue we very certainly will. Companies are gobbling up GPUs like there's no tomorrow, assuming demand will remain stable and continue growing indefinitely. Meanwhile LLM fatigue has started to set in, models are getting smaller and smaller and consumer hardware is getting better and better. There's no way this won't end up with a lot of idle GPUs.

>Meanwhile LLM fatigue has started to set in

Has it?

I think there is this compulsion to think that LLMs are made for senior devs, and if devs are getting wary of LLMs, the experiment is over.

I'm not a programmer, my day job isn't tech, and the only people I know who express discontent with LLMs are a few programmer friends of mine. Which I get, but everyone else is using them gleefully for all manner of stuff. And now I am seeing the very first inklings of completely non-technical people making bespoke applets for themselves.

From OpenAI, programming is ~4% of ChatGPT's usage. That's 96% being used for other stuff.

I don't see any realistic or grounded forecast that includes a diminishing demand for compute. We're still at the tip of adoption...

You should get on Reddit, people hate AI with a passion there. People I meet in real life hate it also. I think the public actually hates AI more than it should now.
I spent 13 years chronically on reddit before stumbling into an exit hatch of the bubble chamber.

Those people (well really it's teens and college kids) live on reddit, they are so far from an accurate representation of reality it's insane.

Worse when you find out there’s a couple dozen of the same moderators running nearly all the top 500 subreddits.
So they own the media...
That makes some sense. Are they paid for this?
"i quit reddit but I'm 100% bullish in llms that just distil reddit posts to me"

oh you

Everyone should learn the concept of a Skinner Box. [1]

Reddit is a Skinner Box. HN is too, though to a much lesser extent [2]. Every Skinner Box has one dominant opinion on every matter, which means, by simply using the product, your beliefs on any matter will shift towards the dominant opinion of the platform.

I was a chronically online Reddit user once. I can spot any chronically online Reddit user in just a few minutes in any social event by their mannerisms and the way they talk. I’ll ask and without fail indeed they are a daily Reddit user. It’s even more obvious in writing where you can spot them in just a few always-grammatically-correct text messages flavored with reddit-funny remarks and snarks and jokes.

Same goes for chronic X users. Their signature behavior is talking about social/political issues unprompted. It’s even easier to spot them.

I think the main reason behind platforms shaping user behavior is this: The most upvoted content will always surface to the top, where it will be seen by most users, meaning, its belief-shaping impact is exponential instead of linear. In the same manner unpopular opinions will be pushed to the bottom, and will have exponentially small impact. Some opinions will even be banned or shadowbanned, which means they are beyond the Overton Window of the specific platform.

This way, the platform both nudges you towards the dominant opinion and limits the range of possible opinions you will be exposed to. Over time, this affects your personality and character.

1: https://en.m.wikipedia.org/wiki/Operant_conditioning_chamber

2: The HN moderators and the algorithm both actively resist the effect and try to increase diversification.

The irony of this is so much of Reddit comments these days are AI generated.
It's a pretty biased sample. Not to mention that people who are neutral and are just using AI won't bother to comment. So you only ever see one extreme or another.
I had to quit Reddit after a decade of heavy use because of the doomerism. It's a place you go if you want to kill your spirit. It's just not healthy.
The view I have is that people hate having AI slop spewed at them, but will find value in asking an LLM about things they're interested in / help with things.
> From OpenAI, programming is ~4% of ChatGPT's usage. That's 96% being used for other stuff.

I think it's important to remember that a good bunch of this is going to be people using it as an artificial friend, which is not really productive. Really that's destructive, because in that time you could be creating a relationship with an actual person instead of a context soon to be deleted.

But on the other hand, some people are using it as an artificial smart friend, asking it questions that they would be embarrassed to ask to other people, and learning. That could be a very good thing, but it's only as good as the people who own and tune the LLMs are. Sadly, they seem to be a bunch of oligarchs who are half sociopaths and half holy warriors.

As for compute, people using it as an artificial friend are either going to have a low price ceiling, or in an even worse case scenario they are not and it's going to be like gambling addiction.

Productive or destructive, demand is there, so it isn’t late bubble. It’s still early. (Which is scary, I’ll readily admit.)
But demand isn't there (or rather, proven to be there.) Demand is measured in dollars, and right now VC is paying. This is peak bubble - farthest distance from valuation and income.
Microsoft and Meta are not VCs and they’re spending money on data centers like there’s no tomorrow, which doesn’t seem like very low demand.
Even if it's there, will it be in 10 years?
Test time compute has made consumption highly elastic. More compute = better results. Marginal cost of running these GPUs when they would otherwise be idle is relatively very low. It will be utilized.
> There's no way this won't end up with a lot of idle GPUs.

Nvidia is betting the farm on reinventing GPU compute every 2 years. The GPUs won't end up idle, because they will end up in landfills.

Do I believe that's likely, no, but it is what I believe Nvidia is aiming for.

What’s the lifetime of these things once they’ve been running hot for 2-3 years?
This. I just found out that for my MCP needs, Qwen3 4B running local is good enough! So I just stopped using Gemini API.
Your bet is that people will simply use less compute, for the first time in the history of the human race?
No, mostly less external compute
Look at the human body.

2% of it is dedicated to thinking.

My guess is that as a species, we will turn a similar percentage of our environment into thinking matter.

If there are a billion houses on planet earth, 2% of that is 20 million datacenters we still have to build.

An analogy is not proof. It is not even evidence.
> As soon as we can prompt "Think about the codebase over night. Try different ways to refactor it. Tomorrow, show me your best solution." we will want as much GPU time at the current rate as possible.

That is nothing. Coding is done via text. Very soon people will use generative AI for high resolution movies. Maybe even HDR and high FPS (120 maybe?). Such videos will very likely cost in the range of $100-$1000 per minute. And will require lots and lots of GPUs. The US military (and I bet others as well) are already envisioning generative AI use for creating a picture of the battlespace. This type of generation will be even more intensive than high resolution videos.

> "Try different ways to refactor it. Tomorrow, show me your best solution."

The cost/benefit analysis doesn't add up for two reasons:

First, a refactored codebase works almost the same as non-refactored one, that is, the tangible benefit is small.

Second, how many times are you going to refactor the codebase? Once and... that's it. There's simply no need for that much compute for lack of sufficient beneficial work.

That is, the present investments are going to waste unless we automate and robotize everything, I'm OK with that but it's not where the industry is going.

> improved codebase

I've seen lots of claims about AI coding skill, but that one might be able to improve (and not merely passably extend) a codebase is a new one. I'd want to see it before I believe it.

It depends what you're fitting to. At the simplest, you can ask for a reduction in cyclomatic/cognitive complexity measured using a linter, extraction of methods (where a paragraph of code serves no purpose other than to populate a variable) or complex conditionals, move from an imperative to a declarative approach, etc. These are all things that can be caught through pattern matching and measured using a linter or code review tool (CodeRabbit, Sourcery or Codescene).
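As a rough illustration of "measured using a linter", here is a minimal sketch that approximates cyclomatic complexity with Python's standard `ast` module. It is an assumption-laden stand-in, not what CodeRabbit, Sourcery, Codescene or any particular linter actually computes; the function name and the `BEFORE`/`AFTER` snippets are hypothetical.

```python
import ast

# Node types counted as decision points in this rough approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes two extra paths.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical code before and after an agent's refactor.
BEFORE = """
def label(x):
    if x is None:
        return "none"
    else:
        if x > 0 and x < 10:
            return "small"
        else:
            return "other"
"""

AFTER = """
def label(x):
    if x is None:
        return "none"
    return "small" if 0 < x < 10 else "other"
"""

print(approx_cyclomatic_complexity(BEFORE), approx_cyclomatic_complexity(AFTER))
```

Under this approximation the score drops from 4 to 3, which is the kind of number an agent could be asked to reduce while a human still reviews whether the change is actually an improvement.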

Other things might need to be done in two stages. You might ask the agent to first identify where code violates CQRS, then for each instance, explain the problem, and spawn a sub-agent to address that problem.

Other things the agent might identify this way: multiple implications, use of conflicted APIs, poor separation of concerns at a module or class level.

I don't typically let the agent do any of this end to end, but I would typically manually review findings before spawning subagents with those findings.

Claude will refactor but more than that, it can add documentation. And it can be asked about a codebase too. "Where does FOO happen?" "How does BAR work?".
This is such a short-sighted take, glaringly omitting a crucial ingredient in learning or improvement, for humans and machines alike: feedback loops.

And you can't really hack / outsmart feedback loops.

Just because something is conceptually possible, interaction with the real rest of the world separates a possible from an optimal solution.

The low-hanging fruit / obvious incremental improvements might be quickly implemented by LLMs based on established patterns in their training data.

That doesn't get you from 0 to 1 dollar, though and that's what it's all about.

this. Was highlighted by Sutton in a recent podcast rather starkly.

LLMs are a great tool. But, the real world is far too nuanced to be captured in text and tokens. So, LLMs will be a great productivity boosting tool like a calculator or a spreadsheet. Expecting it to do more is science fiction.

With improvements on the algorithm side and new techniques, even older hardware will become useful.
I get what you're saying and the reasoning behind it, but older hardware has never been useful where power consumption is part of determining usefulness.
This is the biggest threat to the GPU economy – software breakthroughs that enable inference on commodity CPU hardware or specialized ASIC boards that hyperscalers can fabricate themselves. Google has a stockpile of TPUs that seem fairly effective, although it’s hard to tell for certain because they don’t make it easy to rent them.
I don't think we will need to wait for anything as unpredictable as a breakthrough. Optimizing inference for the most clearly defined tasks, which are also the tasks where value is most readily quantified, like coding, is underway now.
More efficient inference = more reasoning tokens. Hyperscaler ASICs are closing the gap at the hardware/system level, yes.
> As soon as we can prompt…

This is the fundamental error I see people making. LLMs can’t operate independently today, not on substantive problems. A lot of people are assuming that they will some day be able to, but the fact is that, today, they cannot.

The AI bubble has been driven by people seeing the beginning of an S-curve and combining it with their science-fiction fantasies about what AI is capable of. Maybe they’re right, but I’m skeptical, and I think the capabilities we see today are close to as good as LLMs are going to get. And today, it’s not good enough.

Getting gold in the math Olympiad is a pretty strong indicator of operating independently on substantive problems.

A year ago they needed an extensive harness to get silver, and two years ago they could hardly multiply 1000x10000.

Terence Tao tweeted yesterday about using GPT5 to help quickly solve a problem he was working on.

Yes but why did ChatGPT work on math Olympiad problems? Because it got a prompt giving it the instruction and context etc.

Why did GPT5 help Terence Tao solve a math problem? Because he gave it a prompt and the context etc.

None of these models are useful without a human prompting them and giving them tasks, goals, context etc. They don't operate independently, they don't get ideas of work to be done, they don't operate over long time horizons, and they can't accept long term goals and sub-divide those goals into sub goals and sub tasks.

They are useless without humans telling them what to do.

Why don't you stick them in a robot, give them agency, continuously train them, and see what happens? Be careful what you ask for.
You should see what happens when you let them talk to each other
Errors compound? Context drift?
Try it, and let them pick the topic. Though they will probably pick AI development, mysteriously it seems to be their favorite topic...
I’ve never understood why time is the metric people are using here. If LLMs get so much better we can “run them overnight”, what makes you think that they won’t also get faster and so they accomplish exactly what you’re talking about in 5 minutes?
I just had to double check (have not been paying attention for a couple of years) but indeed it seems GPU underutilization remains a fact and the numbers are pretty significant. The main issue is workloads being memory bound, so the compute sits idle.
The actual computation speed isn't as important nowadays but it doesn't really change the conclusion with respect to whether they're underutilized.

Because the main reason for the price premiums in AI-class GPUs is the gobs of insanely fast memory, and that is very much not underutilized. AI companies grab GPUs with as much memory (at the fastest memory bandwidth) as possible and underclock the GPU to save on power. Linus Tech Tips had a great video about the H200 that touched on this this week: https://www.youtube.com/watch?v=lNumJwHpXIA

Tasks being memory bound is not the same thing as GPUs being idle for economic reasons though.
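For anyone who wants to put numbers on "memory bound", here is a minimal roofline-style sketch. All figures are hypothetical assumptions chosen for round numbers (a made-up accelerator with 1e15 FLOP/s of peak compute and 4e12 bytes/s of memory bandwidth, and a workload doing about 1 FLOP per byte moved, roughly what token-by-token decoding with 16-bit weights looks like); none of them describe a specific GPU.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte read from or written to memory."""
    return flops / bytes_moved


def is_memory_bound(intensity: float, peak_flops: float, peak_bandwidth: float) -> bool:
    """A kernel is memory bound when its intensity is below the hardware ridge point."""
    ridge_point = peak_flops / peak_bandwidth  # FLOPs per byte
    return intensity < ridge_point


def compute_utilization(intensity: float, peak_flops: float, peak_bandwidth: float) -> float:
    """Fraction of peak compute attainable under the simple roofline model."""
    attainable = min(peak_flops, intensity * peak_bandwidth)
    return attainable / peak_flops


# Hypothetical accelerator, not a real product's spec sheet.
PEAK_FLOPS = 1e15       # FLOP/s
PEAK_BANDWIDTH = 4e12   # bytes/s

# Assumed workload: 2 FLOPs for every 2-byte weight read, so about 1 FLOP per byte.
intensity = arithmetic_intensity(flops=2.0, bytes_moved=2.0)

print(is_memory_bound(intensity, PEAK_FLOPS, PEAK_BANDWIDTH))
print(compute_utilization(intensity, PEAK_FLOPS, PEAK_BANDWIDTH))
```

Under these assumptions the compute units are mostly waiting on memory even though the card is fully rented and busy, which is the distinction being drawn above between memory-bound tasks and economically idle GPUs.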