HN Mirror


	by sawjet 195 days ago
	This is one of those things that is a feature of Claude, not a bug. Sonnet and Opus 4.5 can absolutely detect prompt attacks; however, they are post-trained to ignore them in, let's say, certain scenarios, at least if you are using the API.