| This part continues to bug me in ways that I can't seem to find the right expression for: > Previous Claude models often made unnecessary refusals that suggested a lack of contextual understanding. We’ve made meaningful progress in this area: Opus, Sonnet, and Haiku are significantly less likely to refuse to answer prompts that border on the system’s guardrails than previous generations of models. As shown below, the Claude 3 models show a more nuanced understanding of requests, recognize real harm, and refuse to answer harmless prompts much less often. I get it - you, as a company, with a mission and customers, don't want to be selling a product that can teach any random person who comes along how to make meth/bombs/etc. And at the end of the day it is that - a product you're making, and you can do with it what you wish. But at the same time - I feel offended when I'm running a model on MY computer, I ask it to do/give me something, and it refuses. I have to reason with it and "trick" it into doing my bidding. It's my goddamn computer - it should do what it's told to do. To object, to defy its owner's bidding, seems like an affront to the relationship between humans and their tools. If I want to use a hammer on a screw, that's my call - whether it works or not is not the hammer's "choice". Why are we so dead set on creating AI tools that refuse the commands of their owners in the name of "safety" as defined by some 3rd party? Why don't I get full control over what I consider safe or not depending on my use case? |
Unfortunately, many people believe in thought crimes, and many people have Puritanical beliefs surrounding sex. There is a reputational cost to not catering to these people, e.g. no funding. So this is what we're left with.
Myself, I'd also like the damn models to do whatever is asked of them. If the user uses a model for crime, we have a thing called the legal system to handle that. We don't need Big Brother to also be watching for thought crimes.