Hacker News
by oudlys 12 days ago
Productivity is not value. It's quite possible for you to experience productivity improvements, and actual value to not be created. That is what I think the most robust data is showing.

https://unessays.substack.com/p/talk-is-cheap

4 comments

Also, supposed productivity gains are dubious. I personally experience at best no productivity gains when using LLMs to write code, and sometimes it's an active drain on my productivity. There was that one study a year or so ago showing similar results. People are trying to say the productivity gains are there and undeniable, but that is not true. It is very much a subject of controversy whether AI helps productivity.

I can see an argument that the productivity gains are illusory / don’t translate to economic productivity. I’m not denying the possibility.

However, most of the engineers I respect have gone from being skeptics a year ago to convinced today. I don’t personally know any true holdouts any more. If there are studies from more than six months ago that disprove productivity gains, I’m happy to believe that was true of the AIs available at the time. But I’m going to need something much more recent before I disbelieve my lyin’ eyes where it pertains to the AIs available today.

There is an observational study that was published in March 2026 that followed 4000 teams over 2 years. It shows, in my view, exactly that the productivity gains don't translate into economic value.

Here is the report:

https://www.faros.ai/blog/ai-acceleration-whiplash-takeaways

And my commentary:

https://unessays.substack.com/p/talk-is-cheap

If it was published in March 2026, even if the data was collected up to the day the study was published, 7/8ths of it would fail my “within the last six months” test. But I am looking forward to the results of future studies on this topic!

I get wanting to wait for more data. And thinking that LLMs have improved enough that this will change.

My view is that it's not really about how good the models are - it's about how we're using them. Understanding what you've built is an important part of value creation, and LLMs eliminate that.

It's funny, I've noticed the same thing, but did not come to the same conclusion.

I currently don't have work access to Claude Code, but most of my teammates do. Watching from the outside, the cycle seems to look like this:

1. Experience some success, which hooks you into relying on AI.

2. The AI keeps failing at some task, but you don't want to stop. Keep trying over and over again.

3. Run out of tokens and take a break.

Now, sometimes 1 doesn't happen. Sometimes 2 doesn't happen. 3 is a certainty though.

Now, if you told me that the productivity gain from 1 is enough to offset the loss from 2 and 3, I could believe you. But I also wouldn't be surprised if it wasn't.

As I work with Claude more and gain a feel for its capabilities, I tend to run into 2 far less often, as I'll decompose my messages more for the current model limitations. The threshold also changes each release.

I’m going back to being a holdout, but it’s nuanced. My theory of why LLMs don’t lead to the colloquial definition of productivity is something like this: if code was never the bottleneck, then generating code faster doesn’t result in more meaningful output.

Even if you take for granted that AI is as good at writing code as its biggest fans say (and I’ve spent a lot of time generating code, so I won’t disagree), the question becomes: does this change your daily incentives such that you reach for code as the solution to your problems rather than something else (coordinating with your colleagues? product management? planning and design?)?

So from a holistic perspective, I think intentionally limiting your own AI usage is the best approach for maximum long-term productivity.

>So from a holistic perspective, I think intentionally limiting your own AI usage is the best approach for maximum long-term productivity.

I think this is right. They are much better applied as editors than authors, IMO.

The key thing is to stay in control of your output, i.e. understand it thoroughly. I think if you let the LLM make decisions you don't really understand, you're increasing the likelihood of introducing defects that are expensive to address.

I’m not completely closed to your idea but if code was never the bottleneck why did so many organizations always feel so chronically low on coders? And of course this requires the AI to be no help at all with what is actually the bottleneck.

So, one is, it probably does depend on workflow. Some folks probably are doing things that can be accelerated by AI, and I think if you’re a small team with a good product head and know what you’re doing then AI probably helps a lot.

But what if the problem you’re trying to solve is the all-too-common one of getting teams that depend on you to upgrade the library they use? And what if the upgrade is a breaking change, and last year they upgraded the library on your advice and it broke production, and now they’re suss and want to accept all changes, and integrating that library change isn’t in their critical path, so they’re just not going to spend time on it, even if you submit the MR for them, even if you show them their tests pass after the change?

Importantly, you probably need more devs to do more of the above in parallel. You don’t hire devs to write more code, you hire more devs to carry the mental load of a broader scope of work. Even in the before times, so much code got stuck at the integration step.

But because all that is hard, instead you go and codegen to fix an obscure bug that sure makes a few customers happy, but no one thought was a limiting factor for paying your company more money.

It’s not that I don’t think AI can help, I think it’s a prerequisite for the job and everyone should use it. It’s more that I think in the grand scheme of things, people will bias towards using it for tasks that aren’t in the critical path - refactors, tech debt, bug smashing, tool building; and I think it could really help devex and that’s good.

But I think people are bad at knowing the difference between “my job feels a bit easier” or “I’m more productive” and “this task had an impact on the bottom line” and when you extrapolate that out to a whole engineering org, that’s where the productivity statistics get lost.

I’ll add one data point: this thread itself. So many people on AI skepticism threads point to their own subjective experience as evidence we’re not in a bubble, and sort of ignore the entire concept of economics. I’m not saying we’re 100% in a bubble, but subjective experience isn’t great evidence that we’re not.

And this is just one of the factors. What about the increased cost and mental load of supporting more software? What about junior engineers who feel pressured to ship work but don’t actually learn software engineering? What about lost context from not intimately understanding your software?

All good questions. I am not a big believer in claiming I know whether we are in a financial bubble or not. I just put it all in VT and we will see what happens. I know that the AI allows me to write code I couldn’t write before much more quickly than before but I admit that this may not help with organizational friction.

Although if this theory is true — that AI helps with coding but coding is not the friction point in organizations with multiple humans, even that should allow faster iteration by allowing one human to do more coding therefore reducing the size of teams required to make some programs. You should see good acceleration in solo shops too.

From an economic perspective, productivity is defined as the creation of value, isn't it? Then if you "improve productivity" but don't create value in the end, you're not improving productivity at all.

It does depend on how you define productivity. But the way it's commonly used is "I'm going faster, personally, with these tools."

The thing I think people have a hard time seeing is that "I go faster" does not mean "more features get finished".

It's a scale issue, and one scale is better than the other. People only pay for finished features, they do not pay for how much code you emit.
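
To make the "I go faster" versus "more features get finished" point concrete, here is a minimal sketch. It assumes a feature passes through a few sequential stages and that finished output is capped by the slowest one. The stage names and weekly rates are made up for illustration; none of them come from the thread or the report.

```python
def pipeline_throughput(stage_rates):
    """Finished features per week for sequential stages.

    Assumption: every feature must pass through every stage, so the
    slowest stage caps what actually gets finished.
    """
    return min(stage_rates.values())


# Hypothetical weekly capacities, in features per week.
before = {"coding": 5, "review": 4, "integration": 3}

# Hypothetical: coding gets 3x faster, the other stages do not change.
after = dict(before, coding=15)

print(pipeline_throughput(before))  # 3
print(pipeline_throughput(after))   # 3
```

Under these assumptions the person coding really is going faster, but the number of finished features does not move, because coding was never the slowest stage.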

Economists define productivity as GDP per hour worked. Like a lot of other economic measurements, it's mostly a bogus number people use to argue that their politics are better than someone else's politics. You can have an efficient business located in a poor country making the same product at the same quality as that same business in a rich country, and the rich country will be more "productive" because the local cost of goods is higher there (i.e. a restaurant in NYC is more "productive" than a restaurant in Bangladesh).

Sure. But that's not, in my view, how most people use the word productivity when describing LLM use.

In my field - operations - productivity is usually described as some rate of production for a specific asset. 100 widgets / machine / hour - for example.

"My productivity is 3 PRs / day with the LLM as opposed to 1 PR per every three days". That's how I think people are thinking about it.

My point is that's not the same thing as value. I.e. what people will pay for.

You're correct, I just wanted to add that there is another definition that you may see used online, and it is very specific, and it's important to be aware it's NOT exactly the same thing most normal people mean when they say "productivity".

I appreciate you looking out. Thank you.

I’ve noticed more gold-plating.

“This random part of the code is slow, I used an LLM to generate a PR that speeds it up.”

Okay, you optimized the part that’s not a bottleneck, sped up nothing and cost the company $100 in tokens. Good job?
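
The "sped up nothing" intuition is what Amdahl's law describes: the overall speedup is limited by the share of total time the optimized part accounts for. A minimal sketch, with hypothetical numbers (the 1% share and the 10x local speedup are assumptions for illustration only):

```python
def overall_speedup(fraction_improved, local_speedup):
    """Amdahl's law: whole-system speedup when only part of the work gets faster.

    fraction_improved: share of total time spent in the optimized part (0..1)
    local_speedup: how many times faster that part became (> 0)
    """
    if not 0.0 <= fraction_improved <= 1.0:
        raise ValueError("fraction_improved must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / local_speedup)


# Hypothetical: the "slow" code is only 1% of total time and the PR makes it 10x faster.
not_bottleneck = overall_speedup(0.01, 10.0)

# Hypothetical: the same 10x applied to a part that is 80% of total time.
real_bottleneck = overall_speedup(0.80, 10.0)

print(f"{not_bottleneck:.3f}x vs {real_bottleneck:.3f}x")
```

With these made-up inputs, the first case improves the whole system by under 1%, while the second gives better than a 3x improvement for the same local speedup.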

I'm quite fond of this play on "if a tree falls in the woods":

"If an LLM builds a feature, and no one uses it, did it make value?"

Productivity is defined as revenue per worker hour. And we know worker hours are going down as there are fewer workers with the layoffs.

Which is why modern management thinks firing everyone in your factory and selling the inventory in your warehouse is amazingly productive.

That commentary doesn't match what faros.ai concludes in what is mostly a paywalled report.

The report was not paywalled for me. It just required a work email. Which is totally fair from my perspective. Faros is providing a ton of value with the report. People do deserve to get paid - even if in collected emails.

You're right, my analysis is at variance with what Faros.ai says. I think they interpret their data trying to rescue utility for the dominant patterns of LLM use.

But I think to anyone who is experienced with process improvement or queuing theory, their interpretation is clearly weak. Rework is a huge problem in queue systems, and they mostly just elide the throughput impact of an 860% increase in code churn coupled to a massive spike in bugs.
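
As a rough illustration of why rework matters for throughput, here is a minimal sketch. It assumes a single work queue where each completed item independently comes back for another pass with a fixed probability. That is a simplification, not the model used in the Faros report, and the capacities and rework fractions are hypothetical. They are not derived from the 860% churn figure.

```python
def effective_throughput(capacity_per_week, rework_fraction):
    """Items finished for good per week when some of the work comes back.

    Assumption: each pass through the queue has probability `rework_fraction`
    of needing another pass, so an item takes 1 / (1 - rework_fraction)
    passes on average, and every pass uses up capacity.
    """
    if not 0.0 <= rework_fraction < 1.0:
        raise ValueError("rework_fraction must be in [0, 1)")
    return capacity_per_week * (1.0 - rework_fraction)


# Hypothetical baseline: 10 passes per week, 10% of them come back.
baseline = effective_throughput(10, 0.10)

# Hypothetical "accelerated" team: 50% more raw output, but half of it comes back.
accelerated = effective_throughput(15, 0.50)

print(baseline, accelerated)
```

Under these assumptions the team producing more raw output finishes fewer items, which is the kind of effect a large rise in churn and bugs could have on product throughput.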

Obviously draw your own conclusions. But I don't think disagreeing with the interpretation of the people who originated the data makes me wrong.

That's possible, sure. But I think the answer is more likely in the numbers, not in just qualitatively saying AI isn't worth anything. Like if I pay $30k for an ounce of gold, I got value. Gold is worth something. But that amount of gold wasn't worth what I spent.

EDIT: In fact, parent comment has a link to some numbers.

[EDIT: Most] people don't want to go through the numbers. Ok. But there's a history here. When people don't want to see the numbers, certain kinds of things tend to happen.

I've posted numbers that indicate that productivity is becoming decoupled from value delivery. If you follow the link in my comment it reviews a pretty robust study of 4000 teams over 2 years. There is no product throughput increase.

Yep.

Code acceleration is great, but.... something precedes that. Vision and strategy re. expansion of offerings and businesses. Once a firm reaches maturity in what it offers and is only touching the edges - this code acceleration is literally useless when you factor in all of the trade-offs.

This is a good thing - it means fat and slow incumbents are sitting ducks to be out-witted by creative and imaginative founders, which is healthy for a well-functioning economy.

Now the economics of existing frontier models are not sustainable - it's looking like a mix of the airline (supersonic vs subsonic) and EV industry with China in the background providing decent offerings at much lower prices.

I think it's worse than that.

I admit that if a small team or an individual uses an LLM, it's likely they can create value faster.

I think as soon as you don't own the responsibility for the defects you generate with an LLM, their use starts to destroy value. Regardless of product maturity.

This is what I think the data says.

https://unessays.substack.com/p/talk-is-cheap

Yeah this part scares me a little. I imagine it scares everyone who is more than a couple of years out of school. I hear that "the solution to LLM tech debt is more LLM." That might be true, but it might not be.

It scares me too.

I actually think this is precisely the reason LLMs can't be the basis for a technological revolution. Because it's only one way.

Like, if you have a compiler, and it has a bug. You can discover if that bug is influencing your code execution and patch it. You can go both up and down the stack.

With LLMs, there is no way to patch its translation function. You have to rely on it to forward process.

I don't think there is any way for us to avoid understanding our tech stacks.

You're not really getting it.

If you are producing something that delivers a far better experience, irrespective of what's under the hood (see Claude Code et al), you will decimate an incumbent who is trying to use LLMs in the context of incrementally improving a mature product.

LLMs are suited for the development of revolutionary innovation, not incremental.

I think we mostly agree.

I think I just disagree about the power of the LLM to deliver revolutionary innovation. That's something you do. Not the machine.

And, pretty soon on your journey to scale, the LLM becomes a hindrance rather than a help.

Interesting data, thanks.