Hacker News
by sankalpmukim 57 days ago
I think this kind of overthinking is an extremely common pattern in the Chinese models. The GLM models are also very much like this.