HN Mirror

by anon373839 513 days ago

This issue is raised and addressed ad nauseam on HN, but here goes:

It doesn't mean anything when a model tells you it is ChatGPT or Claude or Mickey Mouse. The model doesn't actually "know" anything about its identity. And the fact that most models default to saying ChatGPT is not evidence that they are distilled from ChatGPT: it's evidence that there are a lot of ChatGPT chat logs floating around on the web, which have ended up in pre-training datasets.

In this case, especially, distillation from o1 isn't possible because "Open"AI somewhat laughably hides the model's reasoning trace (even though you pay for it).

2 comments

int_19h 513 days ago

It's not distillation from o1 for the reasons that you have cited, but it's also no secret that ChatGPT (and Claude) are used to generate a lot of synthetic data to train other models, so it's reasonable to take this as evidence for the same wrt DeepSeek.

Of course it's also silly to assume that just because they did it that way, they don't have the know-how to do it from scratch if need be. But why would you do it from scratch when there is a readily available shortcut? Their goal is to get the best bang for the buck right now, not appease nerds on HN.

orbital-decay 512 days ago

> but it's also no secret that ChatGPT (and Claude) are used to generate a lot of synthetic data to train other models

Is it true? The main part of training any modern model is finetuning, and by sending prompts to your competitors en masse to generate your dataset you're essentially giving up your know-how. Anthropic themselves do it on early snapshots of their own models; I don't see a problem believing DeepSeek when they claim to have trained v3 on early R1's outputs.

luma 513 days ago

So how is it then that none of the other models behave in this way? Why is it just DeepSeek?

orbital-decay 512 days ago

Because they're being trained to answer this particular question. In other contexts that it wasn't prepared for, Sonnet v2 readily refers to "OpenAI policy" or "Reddit Anti-Evil Operations Team". That's just dataset contamination.
