Hacker News
Prediction: Claude 5 will be a major regression
3 points by cadabrabra 132 days ago
At this point it should be completely obvious to everyone that there’s an approximately linear relationship between model cost and model performance. Anthropic is claiming that Claude 5 Sonnet will cost about half as much as their current SOTA models. Therefore, expect about half the performance. This is Anthropic’s version of GPT-5, i.e. a way to fool their customers into using a less compute-intensive model, almost purely for the benefit of the company. But as usual, they will rig the benchmarks and make it appear as though the model is better at certain things, like coding.

It’s an illusion, folks. You’re being played. Wake the hell up.

Also, I can’t believe that people still talk about SWE-Bench when there is a paper proving that the benchmark is completely useless because models regurgitate memorized answers.

Again, please, wake up.

https://arxiv.org/abs/2506.12286

2 comments

> Anthropic is claiming that Claude 5 Sonnet will cost about half as much as their current SOTA models. Therefore, expect about half the performance.

That's not how LLM quality works.

Maybe not in theory but definitely in practice, as we’ve seen with GPT-5. These companies are lighting money on fire. If they reduce the cost, expect a proportional decrease in quality. All of the GPT-5 anecdotes confirm this. When the data and anecdotes disagree, the anecdotes are usually right, and the data is usually bullshit.
GPT-5's issues were due to router shenanigans which Claude models do not do.
No dude, the latest versions of the models it routes to are markedly poorer in performance than their predecessors.

I’m observing a law that states: There appears to be a direct relationship between model performance and cost, such that whenever a company claims to have reduced inference costs, customers immediately notice a corresponding decline in model performance.

> It’s an illusion, folks. You’re being played.

How are they "being played" if Claude 5 isn't even out yet?

It’s already obvious that it will be a scam. Higher benchmark scores and lower cost are two signs that customers are about to get scammed. We saw it with GPT-5.
Respectfully,

Claude 3 Opus: $15.00 (Input) / $75.00 (Output) per 1M tokens

Claude 4 Opus: $15.00 (Input) / $75.00 (Output) per 1M tokens

Claude 4.1 Opus: $15.00 (Input) / $75.00 (Output) per 1M tokens

Claude 4.5 Opus: $5.00 (Input) / $25.00 (Output) per 1M tokens

This actually proves my point because if you read the anecdotes, you will notice a marked decline in performance. The version number goes up but the actual performance declines. The benchmarks can tell any story you want them to.
Is it? It might be possible that it's a scam, but for something to be "obvious" it has to release first.

There are plenty of ways to reduce inference cost for a high-intelligence model. Making the weights sparser, for example, lets you increase the parameter count while reducing inference cost and latency.
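
To make that concrete, here's a rough back-of-the-envelope sketch in Python. I'm assuming "sparser weights" means mixture-of-experts-style sparsity, where each token only runs through a few of the stored expert blocks. Every number and name below (`ModelConfig`, the parameter counts, the expert counts) is hypothetical and not taken from any real model, and the "2 FLOPs per active parameter per token" figure is just the usual rule of thumb for a forward pass.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Hypothetical model description; all values are illustrative only."""
    shared_params: float      # parameters every token uses (attention, embeddings)
    expert_params: float      # parameters in one expert feed-forward block
    num_experts: int          # expert blocks stored in the model
    experts_per_token: int    # expert blocks actually evaluated per token

    def total_params(self) -> float:
        # Everything that has to be stored.
        return self.shared_params + self.expert_params * self.num_experts

    def active_params(self) -> float:
        # Only what a single token actually touches.
        return self.shared_params + self.expert_params * self.experts_per_token

    def flops_per_token(self) -> float:
        # Rule-of-thumb estimate: ~2 FLOPs per active parameter per token.
        return 2.0 * self.active_params()


# A dense model is the special case of one "expert" that is always used.
dense = ModelConfig(shared_params=20e9, expert_params=80e9,
                    num_experts=1, experts_per_token=1)

# A sparse model stores many experts but routes each token to only two.
sparse = ModelConfig(shared_params=20e9, expert_params=10e9,
                     num_experts=32, experts_per_token=2)

param_ratio = sparse.total_params() / dense.total_params()
flops_ratio = sparse.flops_per_token() / dense.flops_per_token()

print(f"sparse/dense total parameters: {param_ratio:.2f}x")
print(f"sparse/dense FLOPs per token:  {flops_ratio:.2f}x")
```

With these made-up numbers the sparse model stores more parameters than the dense one while doing less compute per token. That says nothing about quality either way; it only shows that per-token cost and parameter count don't have to move together.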

I get what you’re saying, but I still think that it will be a scam. Bookmark this thread and let’s continue the conversation after it’s released.
I think you're driven more by an emotional interest than a technical one here. You've written several posts like this, and many of them make astronomically unlikely predictions.
Ok but didn’t Karpathy make it clear that we live in the vibe era? I’m inclined to trust vibes more than technical jargon, and boy are the vibes off with what’s been happening!

Let’s see what happens :)