| What's not clear to me is if DeepSeek and other Chinese models are... a) censored at output by a separate process, b) explicitly trained to not output "sensitive" content, c) implicitly trained to not output "sensitive" content by the fact that the training uses censored content and/or content that references censoring, or selectively chooses training content. I would assume most models are a combination. As others have pointed out, it seems you get different results with local models, implying that (a) is a factor for hosted models. The thing is, censoring by hosts is always going to be a thing. OpenAI already does this: someone lodges a legal complaint, and they decide the easiest thing to do is just censor output. Honestly I don't have a problem with it, especially when the model is open (source/weight) and users can run it themselves. More interesting, I think, is whether trained censoring is implicit or explicit. I'd bet there's a lot more uncensored training material in some languages than in others. It might be quite hard to not implicitly train a model to censor itself. Maybe that's not even a problem; humans already censor themselves in that we decide not to say things that we think could be upsetting or cause problems in some circumstances. |
In an earlier HN comment, I noted that DeepSeek v3 doesn't censor a response to "what happened at Tiananmen square?" when running on a US-hosted server (Fireworks.ai). It is definitely censored on DeepSeek.com, suggesting that there is a separate process doing the censoring for v3.
DeepSeek R1 seems to be censored even when running on a US-hosted server. A reply to my earlier comment pointed that out and I confirmed that the response to the question "what happened at Tiananmen square?" is censored on R1 even on Fireworks.ai. It is naturally also censored on DeepSeek.com. So this suggests that R1 self-censors, because I doubt that Fireworks would be running a separate censorship process for one model and not the other.
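To make the inference above explicit, here is a rough Python sketch of the comparison I'm doing by hand. The function name `infer_censorship_source` and the returned labels are my own, hypothetical, and the True/False values are just the observations reported in this comment (True means the model refused or deflected). It also assumes the third-party host adds no filter of its own, which is a guess rather than something I've verified.

```python
from typing import Dict


def infer_censorship_source(refused_by_host: Dict[str, bool], vendor_host: str) -> str:
    """Guess where a refusal comes from, given per-host refusal observations.

    refused_by_host maps a host name to True if the model refused or
    deflected the question there. Assumes third-party hosts do not filter.
    """
    # Results from every host other than the model vendor's own service.
    third_party = [refused for host, refused in refused_by_host.items()
                   if host != vendor_host]
    if not third_party:
        return "unknown: no third-party host to compare against"
    if all(third_party):
        # Refuses even off the vendor's infrastructure.
        return "likely built into the weights"
    if not any(third_party) and refused_by_host.get(vendor_host, False):
        # Refuses only on the vendor's own service.
        return "likely a host-side filter"
    return "inconclusive"


# Observations as described in this comment, not new measurements.
observations = {
    "DeepSeek v3": {"DeepSeek.com": True, "Fireworks.ai": False},
    "DeepSeek R1": {"DeepSeek.com": True, "Fireworks.ai": True},
}

for model, results in observations.items():
    print(model, "->", infer_censorship_source(results, vendor_host="DeepSeek.com"))
```

One prompt on one third-party host is thin evidence, so treat the labels as hypotheses to test with more prompts and more hosts rather than as conclusions.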
Qwen is another prominent Chinese research group (owned by Alibaba). Their models appear to have varying levels of censoring even when hosted on third-party hardware. Their Qwen Coder 32B and Qwen 2.5 7B models don't appear to have censoring built in and will respond to a question about Tiananmen. Their Qwen QwQ 32B (their reasoning/chain-of-thought model) and Qwen 2.5 72B will either refuse to answer or avoid the question. That doesn't split cleanly by size, since Qwen Coder and QwQ are both 32B, but it might suggest that the larger general-purpose models have room for the censoring to be built in. Or maybe the CCP doesn't mandate censoring on task-specific (coding-related) or low-power (7B weights) models.