by stuffoverflow 243 days ago
I can't tell if Anthropic is serious about "model welfare" or if it's just a marketing ploy. I mean, isn't it responding negatively because it has been trained that way? If they were serious, wouldn't the ethical thing be to train the model to respond neutrally to "harmful" queries?
1 comment

"Protection against malicious use" isn't as cool as "model welfare". I'm renaming my authentication function to "examineCrest()".