| HN Mirror

	by jsw97 49 days ago
	Given the high rate of false positives people are reporting for the non-silent cybersecurity, biological, etc., safeguards, there is a strong likelihood that you will encounter silently nerfed behavior even if you are _not_ violating their TOS. Ultimately this will be evident in the way customers / external benchmarkers experience Fable. Hopefully competition will drive future models toward a lower false positive rate. Until that happens, Mythos and Fable users seem likely to have pretty divergent experiences.

5 comments

nsingh2 49 days ago

It's such an obviously bad policy, it's mind-boggling that they thought this was a good idea. It just breeds paranoia and mistrust, especially when people are already a bit paranoid about silent model quantification for cost cutting reasons.

SXX 49 days ago

It's not paranoia when the entity you are dealing with can't be trusted and will do everything to abuse your trust.

llelouch 49 days ago

What's the alternative? Not release the model at all?

"Make the guardrails better" is very hard and probably not worth the effort.

hagbarth 49 days ago

The alternative is to be explicit when you nerf, so users know what they are working with.

port11 49 days ago

I guess people would just game the system and find ways around these guardrails.

rootlocus 49 days ago

They have enough info on you and your sessions to eventually catch you, label you as a bad-faith actor and ban you automatically. I don't think many would risk it.

schnitzelstoat 49 days ago

That seems to be working well for Mythos. Just never release it and keep talking about how 'dangerous' it is to pump up the IPO price.

SamvitJ 49 days ago

Do you mean "quantization", not "quantification"?

nsingh2 49 days ago

Yup, I meant to write quantization there.

KennyBlanken 49 days ago

Another "knob" is reducing the thinking time...

azalemeth 49 days ago

I'm a medical physicist. I use the word nuclear a lot. Opus is fine (well, 99% of the time - I've certainly hit the CBRN filters a few times and even been invited to email Anthropic about the false positives).

Fable has literally refused to work on any of my problems (even those about fluid dynamics!) and just tells me that I'm violating Anthropic's AUP.

jsw97 49 days ago

This problem is compounded by the fact that you can be banned (really by any provider) based on an algorithm, and the methods for restoring your account seem like they do not function as well as might be desired. So be careful with your queries, basically, or you might get locked out.

imrehg 49 days ago

I encountered this when I was checking why my gluten-free bread came out of the bread machine the way it did. I guess it latched onto some yeast-related points and it fell back to Opus...

Having said that, on this query I've seen very little difference in the quality, there's nothing to be "2x as good on" for the "2x quota usage", so shrugs?

KennyBlanken 49 days ago

If a benchmark is affected, the model owner will almost certainly tune for it, so there will be a game of cat and mouse...

Honestly, it wouldn't surprise me if the AI companies try to detect benchmarking. Most hardware companies do...

supriyo-biswas 49 days ago

I mean, the other day I got blocked from Claude for asking about releasing genetically modified sterile mosquitoes; I'm sure everything will be totally fine as Anthropic's restrictions are completely reasonable, measured and appropriate.
