| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by SV_BubbleTime 388 days ago
	I am aware in relative terms you are correct about Anthropic. But I’m having a hard time describing and AI company “serious” when they’re shipping a product that can email real people on its own, and perform other real actions - while they are aware it’s still vulnerable to the most obvious and silly form of attack - the “pre-fill” where you just change the AI’s response and send it back in to pretend it had already agreed with your unethical or prohibited request and now to keep going.