Hacker News
by Balinares 77 days ago
Yo, just because you can't tell when Claude is wrong doesn't mean it's right.

I do agree that the Q1 2026 models in general have passed a threshold, but goodness almighty, Opus 4.6 still screws up a lot.