


	by aerhardt 23 days ago
	I find this analysis confusing. Product-market fit (PMF) for coding was likely reached some time last year. Profitability, which is different, we don’t know. The article kind of confuses the two without making a strong economic case or using numbers in a compelling way. I don’t understand what the Uber case has to do with this either. The Uber COO clearly said that at least in terms of ROI he’s not seeing the results either. My take is the product has been very useful for coding (PMF) for months. But it’s certainly not useful at any cost…

7 comments

aspenmartin 22 days ago

What I also find confusing, though, is that folks seem to ignore trajectory, which is maybe the biggest lede to bury. As Simon says, we have had "good enough" coding agents for 6 months. That is the blink of an eye, and at my company my job has now completely changed. It's almost like a dream.

And that's just one inflection point. We've had several and there are many more on the horizon. So while I could be convinced that ROI is maybe not even positive today despite the ridiculous enterprise spend, it's perfectly rational to pave the way today for what's coming over the next few months let alone years down the line.

plaidfuji 22 days ago

There may be additional major leaps forward, and there may not. I kind of struggle to imagine what the next step actually is. Certainly there will be improvements in performance (speed) and cost. But at a point you reach a barrier where the limiting factor is the specificity of the human prompt and our ability to manage all the code we’re generating.

Somewhat oversimplifying: writing software and building apps was a bottleneck, and now it is not. What is the next bottleneck that LLMs can solve? Is there one? And is there enough publicly available data to solve it repeatably at scale? Or did we just automate Stack Overflow searches and now we’re stuck again?

Or is the endgame of this innovation cycle the complete removal of interaction with machines through code? Will we simply interact with machine coworkers purely through natural language? Can an LLM make PowerPoint slides and run a meeting? So far not seeing much progress on that.

dcre 22 days ago

Judging from the fact that the Opus 4.5 inflection point was not really anticipated, and we still don’t really know what threshold was crossed that suddenly made agentic coding accessible to so many more people, I think it’s safe to say we don’t know what the thresholds will be until they’re crossed. The fact that we don’t know exactly what they’ll be isn’t a good reason to think there won’t be any more.

palata 22 days ago

> The fact that we don’t know exactly what they’ll be isn’t a good reason to think there won’t be any more.

Nor is it a good reason to think there will be more.

pas 22 days ago

We should expect to see the process slowing down first. Until then we should expect it to continue with pretty high likelihood.

https://substackcdn.com/image/fetch/$s_!_ZW2!,f_auto,q_auto:...

dcre 22 days ago

I think we have quite good reason to expect more. As I said, we already know (caveat with your level of irrational skepticism toward the overwhelming evidence) that the best existing models are better than the ones publicly available.

simonw 22 days ago

For what it's worth, at PyCon US this year I ran into a few people with access to Claude Mythos and they confirmed that it's notably better at writing code than public Claude Opus 4.7.

palata 20 days ago

> caveat with your level of irrational skepticism toward the overwhelming evidence

If you can talk about my irrational skepticism (because I said that "we don't know the future", I suppose?), can I talk about your total lack of common sense?

Because the economy has been growing in the last decades does not mean that it will keep growing for the next decades. Because LLMs have been improving in the last few years does not mean that they will keep improving in the next few years. Maybe, maybe not, your guess is as good as mine. If you know the future, put your money where your mouth is and invest everything you own in LLM companies.

Your overwhelming evidence is about the past: it has been improving in the past.

pas 22 days ago

Based on how much money is chasing returns, and how steep the slope is, it's almost certain that we are still not at the end of this sigmoid cycle.

Sure, it might start to slow down, but even then we will likely see a doubling in the next 10-15 years.

https://substackcdn.com/image/fetch/$s_!_ZW2!,f_auto,q_auto:...

vidarh 22 days ago

I am currently eating lunch. Meanwhile Claude is triaging and writing reproducers for 70+ tickets nobody has had time to look at. Next it will attempt to fix them. I have not read the tickets. I will not look at the code until there are review-ready PRs and a code review bot has done the first pass.

In other words, most of the prompting will also go away.

mintplant 22 days ago

Are you not concerned that you, too, will go away?

treis 22 days ago

On one hand, it feels like everyone should be. On the other hand, it also feels like a massive recalibration of what companies can/should do. They spend massive amounts of money on AWS, Datadog, GitHub, CircleCI, et al. If it becomes easier to host/roll your own, it's a big increase in the demand for engineers.

Ultimately software is everything these days and the economics make the demand insatiable. We've gone through many cycles of "X" but on computers/web/mobile. There's going to be a massive amount of "X" but with AI companies that will need engineers.

Or at least this is what I tell myself to sleep at night.

vidarh 22 days ago

If I don't stay ahead of the curve, yes. But I can't stop that development. What I can do is leverage the technology enough to be more valuable than those who don't, e.g. by knowing how to set up processes like the above.

Ultimately, we'll need UBI or large scale cuts in working hours or similar if AI progresses to the point of mass unemployment - the alternative would be massive social unrest. In the meantime I expect to keep doing better than average.

Oras 22 days ago

yeah but if you have to pay $2k to $3k per month, would you still use it?

aspenmartin 21 days ago

me, personally, today? no. My company? Yes.

sixhobbits 22 days ago

PMF is this weirdly defined thing where "if you're not sure you have it then you don't".

I think it was clearly useful for months to people who had tried it and taken the time to understand it, but now that knowledge has spread to the point where wallet holders are convinced it's not just a passing fad or hype, so now PMF can be "claimed".

I agree it's weird to say "those people have PMF" though; usually it's something you define for yourself.

timmg 22 days ago

> PMF is this weirdly defined thing where "if you're not sure you have it then you don't".

I'm not sure if this runs counter to your point or not, but: I don't see any future where LLMs aren't a core part of Software Engineering. The horse is out of the barn. There is no going back.

theschmed 22 days ago

Yeah but the product is not “LLM” it’s “proprietary frontier model LLM paid by the token”.

And I don’t even necessarily disagree with OP! It’s more like the competition is shifting so quickly that your competitors could undercut your PMF in a blink of an eye.

timmg 22 days ago

There will be cheaper solutions. And they will generally be less capable than the more expensive ones. Just like most other products.

But my guess is that the cost of SWEs themselves means that the more expensive ones will be worth the delta to most companies.

But time will tell.

darkerside 22 days ago

History bears out that cheap and satisficing soundly beats expensive and optimal every time. Until we have smarter and more prescient decision makers in leadership, the bottleneck on output will be the quality of decision making, not the quality of code. Trying more things faster and cheaper will win.

oblio 22 days ago

Aka the cheap plastic solution always wins.

airstrike 22 days ago

True but that is maybe 5% of what is being promised by the average booster

signatoremo 22 days ago

Give examples of boosters (average or not) and what they've promised?

repeekad 22 days ago

> clearly useful for people who took the time to understand it

people -> programmers. I haven’t met a non-developer who reports getting more time out of current AI platforms than they put in. If anything, I’ve anecdotally heard the opposite: introducing AI at work creates so much slop (output) that it takes more time to process it all, without a tangible bump in overall productivity.

AndrewKemendo 22 days ago

I have at least a half dozen examples of people not hiring people or buying other tools/subscriptions because they built their own with Claude

bmau5 22 days ago

PMF implies profitability. I could give away dollars for $0.80 and have unlimited demand but it doesn't mean I've found PMF.

grttq 22 days ago

Correct, the cost is part of the economics.

That’s why most here shouldn’t engage in the discussion: they parrot on about benefits without identifying and articulating the costs and, moreover, how it affects the firm’s financial position.

squeegmeister 22 days ago

The article also treats the word "good" as load-bearing in a way that should have you questioning their analysis:

"I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done."

aspenmartin 22 days ago

Yet it’s backed up by adoption across the industry

nozzlegear 22 days ago

MongoDB was once backed up by adoption across the industry. Or for a more recent example, blockchain took off like wildfire across the industry before ultimately fizzling out in all but the most niche applications.

Not saying this trend will do the same, just that the industry adopting something doesn't guarantee its success.

afavour 22 days ago

I don’t think those are really comparable. The blockchain was trendy hype, relatively few companies actually adopted it. Where did Netflix use the blockchain? Google?

By comparison almost all tech companies I know have leaned heavily into AI.

ipaddr 22 days ago

leaned heavily = purchased a subscription to Claude, not changed processes around AI.

righthand 23 days ago

It’s not supposed to be logical, it’s an LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry. Read any/all of the other posts and you won’t find much skepticism, but you will find a lot of shilling about how great it all is.

aerhardt 22 days ago

I like his other posts. He's bullish on AI, which is fine. I'd like to read a mix of bearish and bullish level-headed takes from people who are subject matter experts. His technical credentials are well past discussion - I love Django, and he comes across as a pretty upbeat but level-headed guy. Certainly beats radical takes in either direction from people who have no clue what they're talking about. It's just this article that I find rather confusing.

simonw 22 days ago

The thing that matters most to me is if reading what I wrote teaches you some new things and gives you something useful to think about.

If I make an argument and you disagree that's fine with me, provided I didn't use misinformation or sloppy thinking in making that argument.

aerhardt 22 days ago

That's how I feel about most of your writing. I click through most times when I see you either on the front page or in the comments, and I generally walk away feeling like I have food for thought, without necessarily buying everything wholesale. It's part of why I keep coming back.

My root comment simply represented my two cents about the current post. I don't think anything about the post is outrageously incorrect or anything, just somewhat confusing. You're a very prolific contributor in this community and I don't think anyone who welcomes your takes, me included, expects everything you write to rock our collective socks every single time, anyway.

simonw 23 days ago

308 posts on AI ethics: https://simonwillison.net/tags/ai-ethics/

52 on AI misuse: https://simonwillison.net/tags/ai-misuse/

149 on the unsolved challenge of prompt injection: https://simonwillison.net/tags/prompt-injection/

40 on slop: https://simonwillison.net/tags/slop/

If you want an "LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry" there are plenty out there. I'm not one of them.

saulpw 22 days ago

People are confusing "excitement" with "evangelism". Your blog is definitely on the pro-AI side of things, but as you say, it's not one-sided or uncritical.

alexchamberlain 22 days ago

I think you should highlight your exemplary pre-AI writing too.

csomar 22 days ago

All of these are about AI misuse, not skepticism of AI. By skepticism I mean doubting whether AI actually delivers on its promises which, based on this last post, sounds like something you think we're already past.

Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products.

simonw 22 days ago

It's hard for me to write about skepticism that coding agents deliver on their promises when I've been using them daily and know, for an absolute fact, that they boost my own productivity.

(And that's after taking into account the METR paper that says engineers over-estimate their productivity with these tools.)

I have plenty of doubts about AI delivering on its promises outside of coding. I don't write about AGI because I think it's science-fiction hysteria. I write about slop precisely because it represents a misuse of AI that demonstrates people completely misunderstanding what it's useful for.

aspenmartin 22 days ago

Love when people say "its promises". What specifically are you disappointed with? Simon's posts are high quality and evidence driven. AI has already delivered an incredible amount. Read Epoch for industry trends and analyses, METR too; everything points to a pretty consistent picture.

"Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products."

Oh yes, tons and tons, especially on HN. But the plural of anecdote is not data. Enterprise spend speaks for itself. You are using AI-coded functional products all the time. Do you want like a diff history for the Google codebase or something?

ModernMech 22 days ago

Tbf the OP's blog and comments (including their sibling to your comment) are also heavily anecdotal.

> I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done.

Claiming a grand inflection point based on your own personal usage is very anecdotal.

aspenmartin 22 days ago

If that were it I would absolutely agree with you. But this experience maps exactly to adoption trends. My job in the last 6 months has become so unrecognizable to me it’s insane. The adoption, at the very least at large companies, is truly incredible, and it really does coincide with the quality of Opus 4.5 (which has now been surpassed).

simonw 22 days ago

I think my claim about November is looking very solid today.

0xDEAFBEAD 22 days ago

And what happens when open models catch up in 6 months or so?