Hacker News
by generalizations 3 days ago
Presumably a deepswe benchmark, which IIRC puts GLM 5.2 between opus 4.8 and fable.