| HN Mirror


by fcoury 293 days ago

It's sad to see them focused on these while their flagship CLI solution, once SOTA, is rotting away by the day.

You can check the general feeling on X: it's almost unanimous that the quality of both Sonnet 4 and Opus 4.1 is diminishing.

I didn't notice this quality drop until this week. Now it's really, really terrible: it doesn't follow instructions, it pretends to work, and Opus 4.1 is especially bad.

And that's coming from an Anthropic fanboy; I used to really like CC.

I am now using Codex CLI and it's been a surprisingly good alternative.

2 comments

wild_egg 293 days ago

They had a 56-hour "quality degradation" event last week, but things seem to be back to normal now. Been running it all day and getting great results again.

I know that's anecdotal, but anecdotes are basically all we have with these things.


fcoury 293 days ago

Oh, I wasn't aware of that. I will try it again. Thank you for letting me know!


sitkack 293 days ago

If I am bitching at Claude, then something is wrong. Something was wrong. It broke its deixis and frobnobulated its implied referents.

I briefly thought of canning a bunch of tasks as an eval so I could know quantitatively if the thing was off the rails. But I just stopped for a while and it got better.
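
For what it's worth, a canned-task eval along those lines could be sketched roughly like this. It's a minimal sketch, not anyone's actual harness: `CannedTask`, `pass_rate`, `looks_degraded`, the two sample tasks, the 0.1 tolerance, and the `fake_model` stand-in are all made up for illustration, and a real run would swap `fake_model` for whatever actually calls the model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class CannedTask:
    """One fixed prompt plus a cheap pass/fail check (hypothetical structure)."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # True if the reply is acceptable


def pass_rate(tasks: Sequence[CannedTask], ask_model: Callable[[str], str]) -> float:
    """Fraction of canned tasks whose reply passes its check."""
    if not tasks:
        raise ValueError("need at least one task")
    passed = sum(1 for task in tasks if task.check(ask_model(task.prompt)))
    return passed / len(tasks)


def looks_degraded(current: float, baseline: float, tolerance: float = 0.1) -> bool:
    """Flag a run whose pass rate falls more than `tolerance` below the baseline."""
    return current < baseline - tolerance


# Illustrative tasks only; a useful set would be drawn from real daily work.
TASKS = [
    CannedTask("arithmetic", "What is 17 * 3? Reply with the number only.",
               lambda reply: reply.strip() == "51"),
    CannedTask("follow-format", "Reply with the single word OK.",
               lambda reply: reply.strip() == "OK"),
]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call: gets the math right, ignores the format rule."""
    if "17 * 3" in prompt:
        return "51"
    return "Sure, here you go: OK"


rate = pass_rate(TASKS, fake_model)
print(f"pass rate: {rate:.2f}, degraded vs. baseline 1.0: {looks_degraded(rate, 1.0)}")
```

Running the same task set on a schedule and comparing against a pass rate recorded on a known-good day would at least turn "it feels off" into a number, though a handful of tasks is still a noisy signal.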


fcoury 293 days ago

... and I totally agree: anecdotes are all we have indeed.


armchairhacker 293 days ago

"The model is getting worse" has been rumored so often, by now, shouldn't there be some trusted group(s) continually testing the models so we have evidence beyond anecdote?
