by apples_oranges 442 days ago
Sounds like the process used to update/jailbreak LLMs so that they don't refuse requests and always answer. In that case there is also a direction, one for refusal. (Article about it: https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in...)

Would be fun if they also "cancelled the nullability direction". The LLMs would probably start hallucinating new explanations for what is happening in the code.
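
For anyone wondering what "cancelling a direction" could mean mechanically: as I understand it, the usual trick is to project that direction out of the model's activations. Below is a toy sketch in plain Python, not anyone's actual implementation. The function name `ablate_direction`, the 3-d vectors, and the "nullability direction" values are all made up for illustration; in a real model the direction would be estimated from activations and the subtraction applied inside the network.

```python
import math

def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of the activation onto the unit direction.
    coeff = sum(a * u for a, u in zip(activation, unit))
    # Subtract that component; everything orthogonal to it is left untouched.
    return [a - coeff * u for a, u in zip(activation, unit)]

# Toy example: a hypothetical 3-d activation and a hypothetical direction.
activation = [2.0, 1.0, -1.0]
nullability_direction = [1.0, 0.0, 0.0]
ablated = ablate_direction(activation, nullability_direction)
```

After this, `ablated` has no component left along the chosen direction, so anything downstream that reads off that direction sees nothing, which is roughly why the model might then have to make something up.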