Hacker News
by fcoury 293 days ago
It's sad to see that they have their focus on these while their flagship, the once-SOTA CLI solution, is rotting away by the day.

You can check the general feeling on X, but it's almost unanimous that the quality of both Sonnet 4 and Opus 4.1 is diminishing.

I didn't notice this quality drop until this week. Now it's really, really terrible: it's not following instructions, it's pretending to work, and Opus 4.1 is especially bad.

And that's coming from an Anthropic fanboy; I used to really like CC.

I am now using Codex CLI and it's been a surprisingly good alternative.

2 comments

They had a 56-hour "quality degradation" event last week, but things seem to be back to normal now. Been running it all day and getting great results again.

I know that's anecdotal, but anecdotes are basically all we have with these things.

Oh I wasn't aware of that. I will try it again. Thank you for letting me know!

If I am bitching at Claude, then something is wrong. Something was wrong. It broke its deixis and frobnobulated its implied referents.

I briefly thought of canning a bunch of tasks as an eval so I could know quantitatively if the thing was off the rails. But I just stopped for a while and it got better.

... and I totally agree: anecdotes are all we have indeed.

"The model is getting worse" has been rumored so often by now. Shouldn't there be some trusted group(s) continually testing the models so we have evidence beyond anecdote?
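
For what it's worth, the "can a bunch of tasks as an eval" idea mentioned upthread doesn't need much machinery. Here is a minimal sketch in Python, not anyone's actual harness. The task list, the `fake_model` stub, the baseline of 1.0, and the 0.1 tolerance are all made up for illustration; you would swap `fake_model` for a real call to whatever tool you are testing and record your own baseline on a day when things feel fine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def pass_rate(tasks: List[Task], run_model: Callable[[str], str]) -> float:
    """Run every canned task once and return the fraction that passed."""
    passed = sum(1 for t in tasks if t.check(run_model(t.prompt)))
    return passed / len(tasks)


def regressed(current: float, baseline: float, tolerance: float = 0.1) -> bool:
    """Flag a drop of more than `tolerance` below the recorded baseline."""
    return current < baseline - tolerance


# Hypothetical stand-in for the real tool; replace with an actual CLI or API call.
def fake_model(prompt: str) -> str:
    canned: Dict[str, str] = {
        "Reverse the string 'abc'.": "cba",
        "What is 2 + 2?": "4",
    }
    # Unknown prompts return an empty string, so their checks fail.
    return canned.get(prompt, "")


# Illustrative tasks only; real ones would be the jobs you actually care about.
TASKS = [
    Task("reverse", "Reverse the string 'abc'.", lambda out: out.strip() == "cba"),
    Task("add", "What is 2 + 2?", lambda out: "4" in out),
    Task("fizz", "Print FizzBuzz for 15.", lambda out: "FizzBuzz" in out),
]

rate = pass_rate(TASKS, fake_model)
print(f"pass rate: {rate:.2f}")
print("regressed vs. baseline 1.0:", regressed(rate, baseline=1.0))
```

Since model output varies from run to run, a single pass over a handful of tasks is still close to an anecdote. Running each task several times and logging the pass rate per day would give a trend you can actually compare against.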