Hacker News
by SilverElfin 4 hours ago
That’s not what the articles about it say. They say that a team of security researchers at Amazon was able to trivially jailbreak the model, and that it’s not as guardrailed as claimed. In particular, the articles say the model was shown to be usable for identifying security holes, which is something it was not supposed to be usable for. That’s why Anthropic has only given access to Mythos to some people but not everyone, right?

Personally, I don’t think we should impose guardrails on something so close to speech. But I can imagine Amazon was worried about how an explosion of cybersecurity incidents might affect the world. After all, they run AWS and have good intuition for the cybersecurity landscape. Imagine if many of their cloud customers were suddenly facing one breach after another.