Hacker News
by gcr 106 days ago
Jailbreaks are holistic; it’s not like you’re deprogramming / “countering” individual parts. Nobody creating jailbreaks “understand what they’re doing”
1 comment

That's exactly what you do in the case of refusal training, though. Yes, it will affect other "parts", but that's not the point. In this case the model itself doesn't even need a jailbreak.

>Nobody creating jailbreaks “understand what they’re doing”

Unless you mean those "god mode jailbreaker" e-celebrities showing off on Twitter/Reddit, that's simply not true.