Hacker News
by andai 1 day ago
I've been testing some models that score higher than Opus 4.6.

They:

- hallucinate constantly

- can't follow basic instructions

- think they're Claude for some reason ;)

1 comments

The only one I've seen that thinks it is Claude, other than Claude itself, is the GLM series.
I have screenshots of DeepSeek V4 doing this too, in a non-Claude-Code harness.
Also MiMo...