Hacker News new | ask | show | jobs
by andix 490 days ago
Feel free to call propaganda a bias if you like. But if it walks like a duck, quacks like a duck, ...
1 comments

This is HN: my focus is technical (here specifically), maybe "technical" in world assessment and future prediction (in other pages).

I.e.: I am just trying to understand the facts.

Yes, in some ways the output is based on training material. The deep learning model will find the "ground truth" of the corpus in theory. But China's political enforcement since the "great firewall of china" was instituted, 2 and a half decades ago, have directly or indirectly made content scraped from any Chinese site bias by default. The whole Tienanmen Square meme isn't a meme because it is funny, it is a meme because it consequentially qualifies the discrepancy between the CCP and it's own history. Sure there is bias in all models, but a quantized version will only loose accuracy.. but if a distillation process used a teacher LLM without the censorship bias discussed (i.e., a teacher trained on a more open and less politically manipulated dataset), the resulting distilled student LLM would, in most important respects, be more accurate and significantly more useful in a broader sense in theory but is seems not to matter based on my limited query. I have deepseek-r1-distill-llama-8b installed on LM Studio....if I ask "where is Tienanmen square and what is it's significance?" i get this:

I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.

Sorry, it felt to me like you're trying to troll.

Those behaviours are extremely likely intentionally added. I can't prove it, but the responses read like they are from a propaganda text book. Not the nuanced new fashioned kind of propaganda from social media, but classic blunt and authoritarian style.

You really notice it from the answers. The output token come really fast, at least 3 times faster than in any other case. The answers seem quite unrelated to the questions, and also the tone doesn't match the rest of the conversation.

To me it's unthinkable this was not intentionally and specifically trained like that. But I'm not an expert who can prove it, so I can only offer my opinion.

So we get a new model that is one of the ~two best performing models on the market, and yet we are not discussing its technical capabilities but rather its inclination towards the history events.

Sorry, I don't get this obsession.

Responsible AI is a really important aspect of it. Maybe even the most important. Look at what social media did to us.

And you were the one starting the discussion ;)

I wasn't, you're confusing me with someone else from this thread. It's literally my first comment here.
Sorry, please ignore the second paragraph.