by cedws 76 days ago
We do know that Anthropic has the ability to detect when their models are being distilled, so there could be some backend mechanism that needs to be tripped to observe certain behaviour. Not possible to confirm though.

Who's we, and how do you know this?
"We" can be used to refer to people in general, and we know because Anthropic published a post called "Detecting and preventing distillation attacks" a month ago, while calling out 3 AI labs for large-scale distillation.

https://www.anthropic.com/news/detecting-and-preventing-dist...