Hacker News
by rstuart4133 3 hours ago
GLM 5.1 gets close to 4.6. It can happily run for hours and achieve a result. I've given it bugs like a race condition that led to a count being out by 1 after millions of operations, somewhere in a hundred thousand lines of C code littered with locks and atomic swaps, and it found them (as did Opus). Most other models can't.

I'm using Fable now and GLM 5.1 doesn't really compare. But GLM is literally 1/20 the price. I can't use Fable for coding - it's too expensive. So now we have three levels of models: lightweight ones you dispatch en masse to find things; ones capable of agentic coding tasks that can run for hours, like Opus and GLM (and possibly open source ones - I've only tried a few); and now Fable, which is a truly helpful "architecture buddy". Fable still makes many, many mistakes, so you have to review every word it writes.