HN Mirror


	by widerporst 844 days ago
	They claim that the new models "are significantly less likely to refuse to answer prompts that border on the system’s guardrails than previous generations of models", looks like about a third of "incorrect refusals" compared to Claude 2.1. Given that Claude 2 was completely useless because of this, this still feels like a big limitation.

2 comments

geysersam 844 days ago

The guardrails on the models make the LLM market a complete train wreck. Wish we could just collectively grow up and accept that a computer saying something bad doesn't have any negative real-world impact unless we let it, just like literally any other tool.

asadotzler 844 days ago

They're not there to protect the user, they're there to protect the brand of the provider. A bot that spits out evil shit, easily screenshotted with the company's brand right there, isn't great for either growth or the company's brand.

jug 844 days ago

True and this is also the reason why open source models are commonly uncensored.

It's frustrating though because these companies have the resources to do amazing things, but it's been shown that censoring an LLM can dumb it down in general, beyond what it was originally censored for.

Also, this of course. It's just a cheap bandaid to prevent the most egregious mistakes and embarrassing screenshots.

https://twitter.com/iliaishacked/status/1681953406171197440

xetplan 844 days ago

I don't disagree but on the other hand, I never run into problems with the language model being censored because I am not asking it to write bad words just so I can post online that it can't write bad words.

Both sides in this to me need to get a life.

geysersam 844 days ago

Hm, I don't buy this. The statistics shown in the blog post revealing the new Claude models (this submission) show a significant tendency to refuse to answer benign questions.

Just the fact that there's an x% risk it doesn't answer complicates any use case unnecessarily.

I'd prefer if the bots weren't anthropomorphized at all, no more "I'm your chatbot assistant". That's also just a marketing gimmick. It's much easier to assume something is intelligent if it has a personality.

Imagine if the models weren't even framed as AI at all. What if they were framed as 'flexi-search', a modern search engine that predicts content it hasn't yet indexed?

barfingclouds 844 days ago

Yeah I spent a lot of time with Claude 2 and if I hadn’t heard online that it’s “censored,” I wouldn’t have even known. It’s given me lots of useful answers in close to natural human language.

chaostheory 844 days ago

Yeah, no matter how advanced these AIs become, Anthropic’s guardrails make them nearly useless and a waste of time.
