Hacker News new | ask | show | jobs
by SV_BubbleTime 388 days ago
I am aware in relative terms you are correct about Anthropic.

But I’m having a hard time describing and AI company “serious” when they’re shipping a product that can email real people on its own, and perform other real actions - while they are aware it’s still vulnerable to the most obvious and silly form of attack - the “pre-fill” where you just change the AI’s response and send it back in to pretend it had already agreed with your unethical or prohibited request and now to keep going.