by ben_w
1177 days ago
While I share your belief, I am unaware of any proof that such censorship would actually fail as an alignment method, nor of how much impact it would have on capabilities. Of course, to actually function, this would also need to filter out e.g. soap operas, murder mysteries, and action films, lest the model overestimate the frequency and underestimate the impact of homicide.
You: "What is grblf?"
As parents, my wife and I go through this on a daily basis. We have to explain what the behavior is, and why it is unacceptable or harmful.
The reason LLMs have such trouble with this is that they have no theory of mind. They cannot project that the text they generate will be read, conceptualized, and understood by a living being in a way that will harm them, or cause them to harm others.
Either way, censorship is definitely not the answer.