Hacker News new | ask | show | jobs
by muddi900 243 days ago
LLMs are like children; telling them to not do something puts the idea in their 'head'.

Instead, telling them to do the opposite works. "Brevity is appreaciated", or "Preserve Tokens and be concise."

1 comments

It’s called the waluigi problem and is also part of the reason why you can never fully “censor” an LLM; there is always some jailbreak possible