| HN Mirror



	by cedws 76 days ago
	We do know that Anthropic has the ability to detect when their models are being distilled, so there could be some backend mechanism that needs to be tripped to observe certain behaviour. Not possible to confirm though.

mmaunder 76 days ago

Who's we, and how do you know this?

BoorishBears 76 days ago

"We" can refer to people in general, and we know because Anthropic published a post called "Detecting and preventing distillation attacks" a month ago, calling out 3 AI labs for large-scale distillation.

https://www.anthropic.com/news/detecting-and-preventing-dist...
