Hacker News new | ask | show | jobs
by golol 598 days ago
As I understand the whole point is that it is not so simple to tell the difference between the model forgetting information and the model just learning some guardrails which orevent it from revealing that information. And this paper suggests that since the information can be recovored from the desired forgetting does not really happen.