by SilverElfin 97 days ago
This has been a common issue with the Chinese open weight models. It appears most or all have been trained via distillation on OpenAI and Anthropic models.

They most likely weren't, despite the very dubious claims from Amodei and Altman, and despite a certain Twitter influencer running a pretty naive writing benchmark ("slop test") that is wrong in a very obvious manner. The only unambiguous cases of distillation were the Gemini 2.0 experimental models being trained on Claude outputs, and GLM-4.7 being trained on Gemini 3.0 Pro. The rest are pretty different from each other.
What makes these cases unambiguous?
GLM-4.7 (specifically this version) repeats the guardrail prompt injections from 3.0 Pro word for word and never follows them, which is consistent with training on a reward-hacked CoT. Gemini 3.0 only discusses snippets of this injection in its native CoT (hidden by default, trivial to uncover), but GLM-4.7 was able to reconstruct it in full during training. The only possible reason for this is direct training on a large number of examples of Gemini's CoT. The CoT structure and a lot of the replies were identical in GLM too.

Gemini 2.0 Exp 1206 was reported to be indirectly trained on Claude's outputs with humans in between [1], which was pretty consistent with its outputs at the time. Apart from the two experimental ones, no other Gemini versions were similar to Claude.

[1] https://techcrunch.com/2024/12/24/google-is-using-anthropics...

Interesting! Thanks for taking the time to write down the explanation, I appreciate it.