Hacker News
by widerporst 844 days ago
They claim that the new models "are significantly less likely to refuse to answer prompts that border on the system’s guardrails than previous generations of models". It looks like about a third of the "incorrect refusals" compared to Claude 2.1. Given that Claude 2 was completely useless because of this, this still feels like a big limitation.
2 comments

The guard rails on the models make the LLM market a complete train wreck. Wish we could just collectively grow up and accept that a computer saying something bad doesn't have any negative real-world impact unless we let it, just like literally any other tool.
They're not there to protect the user, they're there to protect the brand of the provider. A bot that spits out evil shit, easily screenshotted with the company's brand right there, isn't really great for growth or for the company's brand.
True, and this is also the reason why open source models are commonly uncensored.

It's frustrating though because these companies have the resources to do amazing things, but it's been shown that censoring an LLM can dumb it down in general, beyond what it was originally censored for.

Also, this of course. It's just a cheap bandaid to prevent the most egregious mistakes and embarrassing screenshots.

https://twitter.com/iliaishacked/status/1681953406171197440

I don't disagree but on the other hand, I never run into problems with the language model being censored because I am not asking it to write bad words just so I can post online that it can't write bad words.

To me, both sides in this need to get a life.

Hm, I don't buy this. The statistics shown in the blog post revealing the new Claude models (this submission) show a significant tendency to refuse to answer benign questions.

Just the fact that there's an x% risk it doesn't answer complicates any use case unnecessarily.

I'd prefer if the bots weren't anthropomorphized at all, no more "I'm your chatbot assistant". That's also just a marketing gimmick. It's much easier to assume something is intelligent if it has a personality.

Imagine if the models weren't even framed as AI at all. What if they were framed as 'flexi-search', a modern search engine that predicts content it hasn't yet indexed?

Yeah I spent a lot of time with Claude 2 and if I hadn’t heard online that it’s “censored,” I wouldn’t have even known. It’s given me lots of useful answers in close to natural human language.
Yeah, no matter how advanced these AIs become, Anthropic’s guardrails make them nearly useless and a waste of time.