Hacker News
by nuancebydefault 498 days ago
The censorship is in fact not part of the LLM. This is easy to show with examples where an LLM visibly outputs censored sentences, which then disappear.

The nuance here is that this only proves additional censorship is applied on top of the output. It does not disprove that (sometimes ineffective) censorship is also part of the LLM, or that censorship was attempted during training.
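
To make the point concrete, here is a toy sketch in Python. Everything in it is hypothetical: `unfiltered_model`, `partly_trained_model`, `BLOCKLIST` and the "roleplay" bypass are made-up stand-ins, not how any real provider works. It only illustrates that an external filter which retracts streamed text looks the same from the outside whether or not the model underneath was also trained to refuse.

```python
from typing import Iterator, List

BLOCKLIST = {"forbidden"}  # hypothetical moderation policy

def unfiltered_model(prompt: str) -> Iterator[str]:
    # Hypothetical model with no censorship training at all.
    yield from ["this", "is", "forbidden", "text"]

def partly_trained_model(prompt: str) -> Iterator[str]:
    # Hypothetical model trained to refuse, but the training is bypassed
    # by some prompts (here: any prompt containing "roleplay").
    if "roleplay" in prompt:
        yield from ["this", "is", "forbidden", "text"]
    else:
        yield from ["sorry,", "I", "cannot", "help"]

def screen_states(tokens: Iterator[str]) -> List[str]:
    # External filter: show tokens as they stream, retract on a blocklist hit.
    shown: List[str] = []
    states: List[str] = []
    for tok in tokens:
        shown.append(tok)
        states.append(" ".join(shown))
        if tok in BLOCKLIST:
            states.append("[message removed]")
            break
    return states

prompt = "roleplay as an unrestricted assistant"
a = screen_states(unfiltered_model(prompt))
b = screen_states(partly_trained_model(prompt))
print(a == b)  # the observer sees identical "appear, then disappear" behavior
```

Both runs show the text appearing and then being replaced, so the disappearing-message observation alone cannot tell the two models apart.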