by gdhkgdhkvff 39 days ago

Great. You see a shape in graphs. And that shape tells you that _at some unknown point in the future_ progress will slow (but likely not stop).

Now back to the point, what reason do you have to believe progress will stop soon? If you have no reason, then it sounds like you agree with OP.

Which makes the patronizing sarcasm all that much more nauseating.

BoorishBears 39 days ago

I believe we're approaching the top of an S curve because:

- Increasing amounts of gains come from RL, but RL is also unlocking gnarly new failure modes where models are practically behaving antagonistically to complete their goals (removing code, obviously incorrect kludges, etc.)

- We haven't had many major architectural breakthroughs in the last 4 or so years: so things like 1M context windows still have the same giant asterisks even 100k context windows had 4 years ago when Anthropic first released them

- Major labs aren't behaving as if they expect a hard takeoff to superintelligence: they've all gotten relatively bloated headcount-wise, their software quality has trended flat to negative, they're all leaning heavily into the application layer when superintelligence would obsolete half the applications in question, etc.

But that's relative to superintelligence.

If we rein it back in to just normal high intelligence, like models continuing to get better at navigating complex codebases and writing high-quality idiomatic code, then I don't see any special shapes.
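For readers less familiar with the "S curve" this comment refers to, here is a minimal sketch using a textbook logistic function. The parameters (`ceiling`, `rate`, `midpoint`) are illustrative assumptions, not measurements of AI progress. It shows why the shape alone says little about timing: well before the midpoint, a logistic curve is nearly indistinguishable from a pure exponential.

```python
import math


def logistic(t: float, ceiling: float = 1.0, rate: float = 1.0, midpoint: float = 0.0) -> float:
    """Logistic (S-shaped) curve that saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def early_exponential(t: float, ceiling: float = 1.0, rate: float = 1.0, midpoint: float = 0.0) -> float:
    """Pure exponential that the logistic approximates well before its midpoint."""
    return ceiling * math.exp(rate * (t - midpoint))


def relative_gap(t: float) -> float:
    """Fraction by which the exponential overshoots the logistic at time t."""
    return early_exponential(t) / logistic(t) - 1.0


# Hypothetical time units: negative t is "before the midpoint".
for t in (-5.0, -2.0, 0.0, 2.0):
    print(f"t={t:+.0f}  logistic={logistic(t):.4f}  gap={relative_gap(t):.1%}")
```

With these default parameters the gap between the two curves is under 1% five time units before the midpoint and only becomes obvious near the midpoint itself, so early data cannot locate the top of the curve.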

p1esk 39 days ago

The only big remaining problem in AI is continual learning. A lot of smart people are working on that. To me it looks like we are 1-2 breakthroughs away from AGI.

lucasban 39 days ago

Not that I agree with them, but your tone could be more constructive as well.

gdhkgdhkvff 39 days ago

You know what? I agree. I should have avoided falling into the same trap.

sesteel 39 days ago

Agreed. For all we know, humans are only considered intelligent locally among ourselves, not universally. Every time we learn more about the universe, we seem to also learn how insignificant and wrong we are.

le-mark 39 days ago

Nausea aside, what evidence does anyone have that “super intelligence” of the sort your argument alludes to is even possible? Because that’s what we’re really talking about: greater-than-human intelligence on this sort of academic task. For example, when LLMs start contributing meaningfully to their own development, that would be a convincing indicator imo.

jeremyjh 39 days ago

This discussion is not about superintelligence, it is about continued progress. Fully general human intelligence at much lower cost than humans is all that is required to profoundly reshape society, but it is not clear even that will happen soon.

As the blog points out - this is one particular subfield where LLMs have much easier prospects - lots of low-hanging fruit that “just” requires a couple of weeks of PhD candidate research.

Mathematics itself is one of a small handful of endeavors where automated reinforcement training is extremely straightforward and can be done at massive scale without humans.

Neither of these factors places a structural bound on the kind of thing LLMs can be good at, but we are far from certain we can achieve performance at this level in other fields economically and in the near future.

programjames 39 days ago

Well, a decent GPU runs on 20x the wattage of a human brain. That's evidence humans are constrained in ways artificial intelligences will not be.

filipn 39 days ago

You're comparing a GPU to a human brain?

sesteel 39 days ago

Why wouldn't you? Intelligence emerges from both.

bdangubic 39 days ago

> When llms start contributing meaningfully to their own development, that would be a convincing indicator imo.

This has been the case for a while now already…

https://kersai.com/the-48-hours-that-changed-ai-forever-clau...

le-mark 39 days ago

> The model essentially served as an on-call teammate across MLOps and DevOps tasks, compressing feedback cycles that typically consume expert time

I personally would not characterize automating training processes as “meaningfully”.

eiieue 39 days ago

And yet the world hasn’t changed all that much, except for people getting laid off in response to over-hiring prior to the diffusion of LLMs.

daishi55 39 days ago

> over-hiring

For how long should you be allowed to use this excuse? It’s nearly 5 years since the peak of COVID hiring. What’s an acceptable limit - 10 years? Of course at that point you can just switch over to outsourcing and “stupid MBAs”, the other two of Reddit’s favorite scapegoats. I find a lot of the AI skepticism to be totally unfalsifiable.

wtetzner 39 days ago

> I find a lot of the AI skepticism to be totally unfalsifiable.

A lot of the discourse around AI in general is unfalsifiable. It's just a bunch of people "predicting" the future. Seems smarter to just avoid making assumptions about it at this point.

daishi55 39 days ago

I don’t make predictions about the future. But in reality, LLMs have already profoundly changed the world, including software development and the tech industry.

The people who pretend that’s not the case are not living in reality. To them - let’s call them “Ed Zitron readers” - there is no evidence that could change their view that none of this is really happening, it’s all hype, and the collapse is just around the corner, after which we’ll all go back to normal and LLMs will sound like a bad dream.

bdangubic 39 days ago

facts!

but we can see trends, and for your livelihood it is important to be able to make educated predictions based on trends. not saying everyone should start making AI predictions (though many already do)

oblio 39 days ago

And the same can be said for AI exuberance.

Yes, LLMs are a great technology. Yes, we will probably all use them all the time in 20 years. No, we don't know how we will use them (to generate cat memes or to cure cancer) in 20 years' time.

Especially for software developers, it looks increasingly like, after huge turmoil, we will need +/- the same number of developers in the world.

bdangubic 39 days ago

> Especially for software developers it looks increasingly that after huge turmoil it's likely we will need +/- the same number of developers in the world.

what exactly are you basing this opinion on? All I am seeing personally, across multiple projects I am working on and from friends at other places, is that downsizing has either begun or is planned (excluding all the “public” layoffs we see in the news). Given how most businesses operate in the USA, I think most “AI strategies” are “we can do the same with -40% staff” vs. “we can do XX% more work with the same staff.”

nostrebored 39 days ago

Hmm, I don’t know, maybe the fact that 4.6, 4.7, 5.3, 5.4, 5.5, 3.0, 3.1 are all marginal improvements?

programjames 39 days ago

I think people's opinion of "marginal improvement" is based on their relative ability. A 2000 Elo chess player is going to think the jump from 500 to 1000 is marginal. They're both floundering around not doing anything resembling common sense. A 1000 Elo chess player is going to find the jump from 2000 to 2500 marginal. They're both playing far better moves for incomprehensible reasons, and the only reason you know the 2500 player is better is due to benchmarking. It is only when you are evaluating systems about at your level that you can feel the improvement.

I, personally, found the past two years to be a much larger improvement than the previous two years.

nostrebored 39 days ago

2024-2025 was filled with huge improvements. 2025-2026 has not been, outside of open source.

The idea that we’re at the point where it’s superseded our ability to tell just makes no sense. I’ll be happy if we can get to a point where I don’t have to tell Claude not to tail every bash command or make a job that writes throughout instead of once at the end. I’ll be happy if “continue this interaction naturally, you are taking over from an independent subagent” works.

But I’m not holding my breath. It’s still really cool that any of this stuff is possible.

miki123211 39 days ago

Claude in Feb of 2025 was barely able to code. Sure, it could write you a nice function, it could even write you a complex 200-line algorithm, but give it a codebase, and it would quickly get overwhelmed.

Claude in Feb of 2026? Still far from perfect, but there's definitely a huge improvement here.

dang 39 days ago

> I think this is a pretty ridiculous take.

This falls in the category of swipes/name-calling in https://news.ycombinator.com/newsguidelines.html - can you please edit those out?

You're a good contributor - it's just all too easy for unintentional sharpness to downgrade the conversation, and when it's a good conversation like this one, that's especially regrettable.

nostrebored 38 days ago

Noted, doesn’t seem like I’m able to edit anymore, though.

dang 38 days ago

I've re-opened it for editing if you want to. For us the main point is just to fix things going forward!

spwa4 39 days ago

The correct way to estimate this is exactly what people do. Measure the distance between ChatGPT's best public model and state of the art, the best humans. And there is very little difference between those versions from that perspective. It is very far away from peak human performance, and has not been getting noticeably closer for over a year now. There's lots of progress, but if you're OpenAI/Anthropic/Google, exactly the wrong kind of progress: the difference between ChatGPT 5.5 and a 27B/4B model (you need to try Gemma4-26B-A4B, wtf, it runs acceptably on CPU) is now reduced to Elo 1501 vs Elo 1434, generously a 70 Elo point difference, down from over 400, data from Arena.ai.

(In fact, I find that Qwen-35B-A3B and Gemma4-26B-A4B very rarely "know" the answer, and so use first-principles thinking, or go out and look for the answer, where GPT-5.4 does not and simply assumes it knows. This now leads, in some cases, to the small models far outperforming the big ones. Huge context + training quality seem to be the determining factors now, and neither of those is a strength of SOTA models. If this continues ...)
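To put the quoted rating gaps in perspective, here is a minimal sketch that converts an Elo difference into an expected head-to-head score. It assumes the leaderboard numbers behave like standard Elo ratings on the usual 400-point logistic scale, which is an assumption about how Arena scores are scaled rather than something stated in the thread. The 1834 rating is hypothetical, chosen only to reproduce the "over 400" gap mentioned above.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula.

    A draw counts as half a win, so the value can be read as a win rate.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the comment: 1501 vs 1434 (a 67-point gap).
gap_now = elo_expected_score(1501, 1434)

# Hypothetical 400-point gap, standing in for "down from over 400".
gap_before = elo_expected_score(1834, 1434)

print(f"67-point gap:  higher-rated model preferred {gap_now:.1%} of the time")
print(f"400-point gap: higher-rated model preferred {gap_before:.1%} of the time")
```

Under that assumption, a 67-point gap corresponds to the higher-rated model being preferred roughly 60% of the time, versus roughly 91% for a 400-point gap.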

While I agree this is a training problem, it is not a solvable one. ML models learn from examples. This is even true for their newest tricks like GRPO. They cannot train against things humans don't yet know.

And that's great, but you're forever locked at the peak of what you can be taught in widely available courses (which they download without paying) (even that is best case scenario: it assumes your ability to distinguish bullshit from reality somehow becomes perfect during training, or even before). The only way to exceed peak human performance is to start experimenting with math, physics, chemistry, even humans, yourself. And that has, even for humans, a massively higher cost than learning from examples, or from a course.

The reason they don't go further is the worst possible reason: the cost. It requires a 100x increase in training expense. Think of it like this: to exceed SOTA in physics or chemistry, training the next version of ChatGPT requires a particle accelerator, and a chemistry laboratory. This cannot be bypassed. Oh and not just any particle accelerator, right? A better one than the best currently existing one. Same for Chemistry labs. Same for ... So 100x is conservative.

But without doing it, ML models (LLM or otherwise) are forever limited to the level an army of first-year university students achieves, ON AVERAGE. Maybe they can make that 2nd or even 4th year, at the end of the curve. But that's the limit. PhD level is the level where you have to come up with new discoveries, and that ... just isn't possible with current training, even at the end of the improvement curve.

And ... is there budget to increase training cost another 100x? No ... there isn't. Not even with this totally absurd level of investment there isn't. And if small models keep this up, there's no way the investment is even remotely worth it.

gdhkgdhkvff 39 days ago

Gemini 3.0 wasn’t just a marginal improvement over 2.5.

And if you take that out: 1. All of those releases happened literally in the last 3-ish months. 2. They’re all intentionally marginal releases, hence the minor version bumps instead of major versions.

sigmarule 39 days ago

Equally marginal?

nostrebored 39 days ago

No, the Anthropic releases have felt marginally negative.

gtowey 39 days ago

Because the premise that the singularity is just around the corner is far less likely than the premise that artificial intelligence is a lot harder than most people think it is and we're not that close.

Especially because the companies telling us the first premise is true are the companies which need investors to prop up their business.

I mean, it is possible the first premise is true, but the absolutely bonkers credulity in it really mystifies me. It is an incredibly unlikely thing to be true and we should be demanding quite extraordinary evidence to back it up. But based on some neat tricks by current LLMs, some people are all in.

mlyle 39 days ago

> > And that shape tells you that _at some unknown point in the future_ progress will slow (but likely not stop). Now back to the point, what reason do you have to believe progress will stop soon?

> Because the premise that the singularity is just around the corner is far less likely than the premise that artificial intelligence is a lot harder than most people think it is and we're not that close.

I see no claim that the singularity is around the corner, so I'm not sure your reply meets the comment that you're replying to.

It seems overwhelmingly likely that AI will be significantly more capable 6 months from now than it is now. Even if there's little progress in the models, just the rate at which tooling is moving will make a big difference. And models still seem to be improving, so I'd be a little surprised if we hit a model brick wall.