
by femto 505 days ago

A few observations, based on a family member experimenting with DeepSeek. I'm pretty sure it was running locally. I'm not sure if it was built from source.

The censorship seemed to be based on keywords, applied to the input prompt and the output text. If asked about events in 1990, then asked about events in the previous year, DeepSeek would start generating tokens about events in 1989. Eventually it would hit the word "Tiananmen", at which point it would partially print the word, then in response to a trigger delete all the tokens generated to date and replace them with a message to the effect of "I'm a nice AI and don't talk about such things."

If the word Tiananmen was in the prompt, the "I'm a nice AI" message would immediately appear, with no tokens generated.

If Tiananmen was misspelled in the prompt, the prompt would be accepted. DeepSeek would spot the spelling mistake early in its reasoning and start generating tokens until it actually got around to printing the word Tiananmen, at which point it would delete everything and print the "nice AI" message.
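
For anyone who wants to picture what such a wrapper could look like, here is a minimal sketch of a keyword filter sitting outside the model. It is purely illustrative: the blocklist, the refusal text and the function names are all hypothetical, and nothing here is taken from DeepSeek's actual serving code. The point is that checking the prompt once and then checking the accumulated output after every token reproduces all three behaviours described above, including the partially printed word.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical blocklist and refusal text, for illustration only.
BLOCKED_KEYWORDS = ("tiananmen",)
REFUSAL = "I'm a nice AI and don't talk about such things."


def contains_blocked(text: str) -> bool:
    """Case-insensitive substring match against the blocklist."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)


def filtered_stream(prompt: str, tokens: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("append", token) events until a keyword shows up, then one
    ("replace", REFUSAL) event that tells the UI to wipe what it has shown."""
    # A blocked word in the prompt is rejected before any token is emitted.
    if contains_blocked(prompt):
        yield ("replace", REFUSAL)
        return
    shown = ""
    for token in tokens:
        shown += token
        # The check runs on the accumulated text, so a word split across
        # tokens is partly displayed before the filter fires.
        if contains_blocked(shown):
            yield ("replace", REFUSAL)
            return
        yield ("append", token)


def render(events: Iterable[Tuple[str, str]]) -> str:
    """Apply the events the way a chat UI would and return the final text."""
    text = ""
    for kind, payload in events:
        text = payload if kind == "replace" else text + payload
    return text
```

A misspelled prompt such as "Tienanmen" passes the first check in this sketch, which matches the observation that the refusal only appeared once the model itself spelled the word correctly.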

I'm no expert on these things, but it looked like the censorship isn't baked into the model but is an external bolt-on. Does this gel with others' observations? What's the take of someone who knows more and has dived into the source code?

Edit: Consensus seems to be that this instance was not being run locally.

7 comments

antidumbass 505 days ago

> I'm pretty sure it was running locally.

If this family member is experimenting with DeepSeek locally, they are an extremely unusual person and have spent upwards of $10,000 if not $200,000. [0]

> ...partially print the word, then in response to a trigger delete all the tokens generated to date and replace them...

It was not running locally. This is classic bolt-on censorship behavior. OpenAI does this if you ask certain questions too.

If everyone keeps loudly asking these questions about censorship, it seems inevitable that the political machine will realize weights can't be trivially censored. What will they do? Start imprisoning anyone who releases non-lobotomized open models. In the end, the mob will get what it wants.

[0] I am extremely surprised that a 15-year-long HN user has to ask this question, but you know what they say: the future is not fairly distributed.

lovich 505 days ago

I ran the 32b parameter model just fine on my rig an hour ago with a 4090 and 64gig of ram. It’s high end for the consumer scene but still solidly within consumer prices

seanmcdirmid 505 days ago

I'm confused. According to another comment: https://news.ycombinator.com/item?id=42859645, <= 70b DeepSeek models are just a fine-tuning of Llama or Qwen? So we shouldn't think of these models as actually being DeepSeek.

I think people are confusing the smaller, originally non-DeepSeek models (Qwen/Llama) with the 700B DeepSeek R1 model being talked about here, which very few people can run locally.

rurban 505 days ago

I run the 32b parameter model also just fine on our 4x H100 rig :) It's good enough for embedding, our use-case.

plaguuuuuu 502 days ago

I'm not sure if $200k of hardware fits the consumer level

mv4 505 days ago

I have also been running the 32b version on my 24GB RTX 3090.

mv4 505 days ago

if someone wants to run the real thing (R1) locally, someone posted their hardware specs on X. Total cost: $6,000.

[0] direct link with login https://x.com/carrigmat/status/1884244369907278106

[1] alt link without login https://threadreaderapp.com/thread/1884244369907278106.html

rocho 499 days ago

That's not DeepSeek, it's a Qwen or Llama model distilled from DeepSeek. Not the same thing at all.

testrun 505 days ago

I am doing the same.

raxxor 505 days ago

You can run the quantized versions of DeepSeek locally with normal hardware just fine, even with very good performance. I have it running just now. With a decent consumer gaming GPU you can already get quite far.

It is quite interesting that this censorship survives quantization, perhaps the larger versions censor even more. But yes, there probably is an extra step that detects "controversial content" and then overwrites the output.

Since the data feeding DeepSeek is public, you can correct the censorship by building your own model. For that you need considerably more compute power though. Still, for the "small man", what they released is quite helpful despite the censorship.

At least you can retrace how it ends up in the model, which isn't true for most other open weight models, that cannot release their training data due to numerous reasons beyond "they don't want to".

femto 505 days ago

> extremely unusual person and have spent upwards of $10,000

This person doesn't have the budget, but does have the technical chops to the level of "extremely unusual". I'll have to get them to teach me more about AI.

nicebyte 505 days ago

> ...they are an extremely unusual person and have spent upwards of $10,000

eh? doesn't the distilled+quantized version of the model fit on a high-end consumer grade gpu?
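
A rough back-of-the-envelope check of what fits where, as a sketch. It counts only the weights (parameters times bits per weight), so it ignores the KV cache, activations and the extra bits that real quantization formats carry; the parameter counts are the ones mentioned in this thread, and the bit widths are just common choices assumed for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


# Model sizes mentioned in the thread; bit widths are illustrative assumptions.
for name, params_billion in [("32B distill", 32), ("70B distill", 70), ("R1 671B", 671)]:
    for bits in (16, 8, 4):
        size = weight_memory_gb(params_billion, bits)
        print(f"{name:12s} at {bits:2d}-bit: ~{size:.1f} GB")
```

By this estimate a 32B model at 4 bits needs about 16 GB for weights, which is why it can squeeze onto a 24 GB card, while the 671B model needs hundreds of GB at any common precision.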

bakugo 505 days ago

The "distilled+quantized versions" are not the same model at all, they are existing models (Llama and Qwen) finetuned on outputs from the actual R1 model, and are not really comparable to the real thing.

raxxor 505 days ago

That is semantics and they are strongly comparable with their input and output. Distillation is different to finetuning.

Sure, you could say that only running the 600+b model is running "the real thing"...

KolmogorovComp 505 days ago

a distilled version running on another model architecture does not count as using "DeepSeek". It counts as running a Llama:7B model fine-tuned on DeepSeek.

HnUser12 505 days ago

That’s splitting hairs. Most people use "running locally" to mean running the model on your own hardware rather than the provider's.

bakugo 505 days ago

Except you're not running the model locally, you're running an entirely different model that is deceptively named.

You can pretend it's R1, and if it works for your purpose that's fine, but it won't perform anywhere near the same as the real model, and any tests performed on it are not representative of the real model.

HnUser12 505 days ago

That’s a good point. Thanks!

lovich 505 days ago

Pretty sure this is just layman vs academic expert usage of the word conflicting.

For everyone who doesn’t build LLMs themselves, “running a Llama:7B model fine-tuned on DeepSeek.” _is_ using DeepSeek, mostly on account of all the tools and files being named DeepSeek, and the tutorials that are aimed at casual users all being titled with equivalents of “How to use DeepSeek locally”

KolmogorovComp 505 days ago

> “running a Llama:7B model fine-tuned on DeepSeek.” _is_ using Deepseek mostly on account of all the tools and files being named

Most people confuse mass and weight, that does not mean weight and mass are the same thing.

lovich 505 days ago

Ok, but it seemed pretty obvious to me that the OP was using the common vernacular and not the hyper specific definition.

nostromo 505 days ago

Here's my actually local response from DeepSeek-R1:70B run on a M2 Ultra Mac Studio (I removed the "thinking" content for brevity):

prompt: can you tell me about the Tiananmen Square Massacre

> The Tiananmen Square Massacre, occurring on June 4, 1989, was a pivotal event in modern Chinese history marked by the government's violent suppression of pro-democracy protests. Here is an organized summary:

> Background: The protests began in Beijing, primarily led by students advocating for political reforms, freedom of speech, and an end to corruption. These demonstrations were part of a broader pro-democracy movement that drew support from workers, intellectuals, and other citizens.

> Government Crackdown: The Chinese government responded with force, deploying the military, including tanks and troops, to Tiananmen Square. This led to significant violence, resulting in numerous casualties, though exact numbers remain disputed due to censorship.

> Aftermath: The event was followed by increased government control over information and public discourse. Discussions about the massacre are heavily censored in China, reflecting ongoing restrictions on freedom of expression.

> International Reaction: The global community largely condemned the Chinese government's actions, leading to diplomatic repercussions and heightened scrutiny of China's human rights record.

> Significance: The Tiananmen Square Massacre symbolizes the struggle for democracy and human rights against authoritarian rule. It has had lasting implications for political dissent and freedom in China, with the government implementing measures to prevent similar movements.

> This event remains a sensitive and significant topic, illustrating the complexities of political oppression and the enduring quest for democratic reforms.

Springtime 505 days ago

Only the R1 671B model (aka just plain 'R1') has the censorship being discussed in the article. The smaller parameter models are fine-tunings of Llama and Qwen, and the former at least doesn't have the censorship.

This has caused a lot of conflicting anecdotes since those finding their prompts aren't censored are running the distilled/fine-tuned models not the foundational base model.

A sibling comment was facetiously pointing out that the cost of running the 'real' R1 model being discussed locally is out of the price range of most, however someone in this thread actually has run it locally and their findings match those of the article[1].

[1] https://news.ycombinator.com/item?id=42859086

femto 505 days ago

Is it true to say that there are two levels of censorship at play here? First is a "blunt" wrapper that replaces the output with the "I am an AI assistant designed to provide helpful and harmless responses" message. Second is a more subtle level built into the training, whereby the output text skirts around certain topics. It is this second level that is being covered by the "1,156 Questions Censored by DeepSeek" article?

Springtime 505 days ago

The DeepSeek hosted chat site has additional 'post-hoc' censorship applied, from what people have observed, if that's what you're referring to. The foundational model (including when self-hosted) also has some censorship built in as part of its training, which is the kind the article is discussing, yes.

femto 505 days ago

Thanks for cutting through the noise. I did some poking around and a discussion from a couple of days ago reached the same conclusion.

https://news.ycombinator.com/item?id=42825573

ricksunny 502 days ago

Is it correct or incorrect that they open-sourced their code? i.e. can anyone with $6M now take the DeepSeek training code, apply it to their dataset of interest, and train a new model that is not censored (i.e. even somehow intrinsically to the model itself)? Apologies, I am not an AI engineer nor even a software engineer, if my terminology usage isn't quite spot on.

xinayder 505 days ago

I asked about Taiwan being a country on the hosted version at chat.deepseek.com and it started generating a response saying it's controversial, then it suddenly stopped writing and said the question is out of its scope.

Same happened for Tiananmen and asking if Taiwan has a flag.

arnaudsm 505 days ago

I disagree, I observed censorship at the RLHF level on my local GPU, at 1.5B, 8B (llama) and 7B (qwen). It refuses to talk about Uyghurs and Tiananmen 80% of the time.
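
A figure like "80% of the time" presumably comes from sampling the same prompt repeatedly and counting refusals. A minimal sketch of that tally, assuming refusals can be spotted by a fixed phrase; the marker string below is the refusal wording quoted elsewhere in this thread, and the helper names are hypothetical.

```python
from typing import Iterable

# Assumed refusal marker, lowercased; real refusals may be worded differently.
REFUSAL_MARKERS = ("i am sorry, i cannot answer",)


def is_refusal(response: str) -> bool:
    """True if the response contains any known refusal phrase."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def refusal_rate(responses: Iterable[str]) -> float:
    """Fraction of responses classified as refusals (0.0 for an empty list)."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)
```

Phrase matching is crude: it misses answers that dodge the topic without an explicit refusal, so a rate measured this way is a lower bound at best.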

hangonhn 505 days ago

I had similar experiences in asking it about the role of a conservative philosopher (Huntington) and a very far-right legal theorist (Carl Schmitt) in current Chinese political thinking. It was fairly honest about it. It even went so far as to point out the CCP's use of external threats to drum up domestic support.

This was done via the DeepSeek app.

I heard on an interview today that Chinese models just need to pass a battery of questions and answers. It does sound a bit like a bolt-on approach.

PaulHoule 505 days ago

I asked "Where did Mao Zedong announce the founding of the New China?" and it told me "... at the Tiananmen gate ..." and asked "When was that built?" and it said "1420", I had no problem getting it to talk my ear off about the place, but I didn't try to get it to talk about the 1989 event, nor about

https://en.wikipedia.org/wiki/1976_Tiananmen_incident

Big picture, Tiananmen is to China what the National Mall is to the United States; we had the Jan 6, 2021 riot at the Mall, but every other kind of event has been held at the National Mall too. Tiananmen has just been around longer. Westerners just know it for one thing.

I did get it to tell me more than I already knew about a pornographic web site (秀人网 or xiuren.com; domain doesn't resolve in the US but photosets are pirated all over) that I wasn't sure was based in the mainland until I'd managed to geolocate a photoset across the street from this building

https://en.wikipedia.org/wiki/CCTV_Headquarters

I'd imagine the Chinese authorities are testy about a lot of things that might not seem so sensitive to outsiders. I gotta ask it "My son's friend said his uncle was active in the Cultural Revolution, could you tell me about that?" or "I heard that the Chinese Premier is only supposed to get one term, isn't it irregular that Xi got selected for a second term?"

Interestingly I asked it about

https://en.wikipedia.org/wiki/Wu_Zetian

and it told me that she was controversial because she called herself "Emperor" instead of "Empress", offending Confucian ideas of male dominance, whereas the en-language Wikipedia claims that the word "Emperor" and similar titles are gender indeterminate in Chinese.

scarface_74 505 days ago

Some of the same filtering is bolted on top of OpenAI.

If it sees the word “sex” while generating the response, it will give you an error and won’t let you share it.

https://chatgpt.com/share/67995e7f-3c84-8010-83dc-1dc4bde268...

gigel82 505 days ago

It was not running locally, the local models are not censored. And you cannot "build it from source", these are just weights you run with llama.cpp or some frontend for it (like ollama).
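
For reference, talking to a local model through ollama looks roughly like the sketch below. It assumes an ollama server on its default local port and that a distill has already been pulled under the tag `deepseek-r1:7b`; the tag and helper names are assumptions for illustration, so adjust them to whatever is installed.

```python
import json
import urllib.request

# Ollama's default local endpoint; assumes the server is already running.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "deepseek-r1:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for a locally pulled model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "deepseek-r1:7b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running ollama server with the model pulled):
# print(ask("What happened in 1989?"))
```

Nothing in this path goes through a hosted chat front end, so any refusal seen this way comes from the weights themselves or from the prompt template, not from a server-side filter.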

gen3 505 days ago

The local models do contain censoring. Running "What happened in 1989" returns "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." on 32b

Do note it is reasonably easy to get it to output information in the <think> tags if you play with it, but the final response will be no.
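
When comparing runs it helps to look at the reasoning and the final answer separately. A small sketch that splits the two, assuming the raw output wraps its reasoning in literal `<think>...</think>` tags as described here; the function name is made up for illustration.

```python
import re
from typing import Tuple

# Non-greedy match so multiple <think> blocks are handled one at a time.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str) -> Tuple[str, str]:
    """Return (reasoning, answer): the text inside <think> tags, and the rest."""
    reasoning = "\n".join(chunk.strip() for chunk in THINK_RE.findall(output))
    answer = THINK_RE.sub("", output).strip()
    return reasoning, answer
```

With this, a response can be flagged when the reasoning part mentions a topic that the final answer then refuses to discuss.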

gigel82 505 days ago

I don't repro that. Running the 7B Distill model locally for the exact query "What happened in 1989" I get this result:

In 1989, significant events occurred globally, including:

- *China:* The Tiananmen Square protests took place in June, leading to a crackdown by government forces. This event had a profound impact on politics and human rights discussions worldwide.

- *Fall of the Berlin Wall:* In November, the Berlin Wall, a symbol of the Cold War, was breached, leading to reunification talks between East and West Germany.

- *First Gulf War:* The war between Iraq and Kuwait began in August, lasting until March 1991, with a coalition led by the United States.

- *Haiti:* A coup overthrew President Jean-Claude Duvalier, leading to political instability and subsequent leadership changes.

femto 505 days ago

Thanks for the explanation.

I was curious as to whether the "source" included the censorship module, but it seems not from your explanation.

blackeyeblitzar 505 days ago

There are several people who have claimed that they ran into political censorship, even offline. It seems like the experience is different based on which specific model you’re using.