| HN Mirror



	by boc 49 days ago
	Yeah, same here. Fable on "high" is producing substantially better results than Opus 4.8 on xhigh for me on my actual real-world evals today. It "feels" smarter and doesn't use nearly as many tokens running in circles. As a result I've been able to run two large refactors today without hitting the context limit danger zones. It's more expensive but also more efficient. It's been able to find some bugs that Opus missed. Pretty impressive stuff.


garciasn 49 days ago

I keep getting this message:

> Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more

I'm working on an internal tool that does new business prospecting data collection, scoring, etc. This is ridiculous.

algoth1 49 days ago

It’s unusable for me due to the refusals. I’m using Claude to find patterns in health data.

yakz 49 days ago

I do some work in laboratory automation and it was quick to refuse the first thing I asked it to do. There wasn't anything spicy in the request, just basic liquid-handling protocol implementation. Their position seems to be that they're too stupid to classify requests safely, and that seems reasonable to me. I'd guess the classifier will improve rapidly.

5d41402abc4b 49 days ago

Have you tried locally running qwen?

mrbuttons454 48 days ago

Is there a Qwen that I can run locally that is anywhere near these frontier models?

Der_Einzige 48 days ago

No, and don't let anyone gaslight you into thinking the answer is yes.

dmd 49 days ago

Same. I'm working on a set of Python and MATLAB scripts that deal with segmenting MRI images into brain vs skull, and it thinks that's bioterrorism.

rvnx 49 days ago

Quite counterproductive to refuse to help on health issues too. If they detect health data, they can add a disclaimer, but not hide the information.

secult 49 days ago

You miss the point: by collecting and processing medical data they would fall into a thoroughly regulated industry. It's not because they may give you incorrect data, but because they are not allowed to process it.

fragmede 49 days ago

What custom prompt do you have set up? If you tell it your occupation, does it turn helpful? There was a study showing that if you told the models they tested that you're a patient, they would refuse, but if you told them you're a doctor, they suddenly turned helpful.

garciasn 49 days ago

According to the model, it’s not the model itself that’s doing this, it’s the harness.

Assuming the model is being “truthful”, CC is just being stupid in its detection mechanism.

UltraSane 49 days ago

Anthropic knows it refuses too much; they want to be very cautious to avoid any scandals. I think this is why they want to store all Fable and Mythos chats for 30 days, so they can use the data to improve.

hirako2000 49 days ago

They want to be very cautious to honour the important doctrine, at least until the IPO launches: we are so good we have to nerf our products.

fn-mote 49 days ago

I’m at a point where I expect everything I do will be retained indefinitely.

I’m having a really hard time believing some weak reason for a 30-day retention policy.

girafffe_i 49 days ago

There’s no way around it? Can’t you obfuscate as generic data and use keys to map to the real data?
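A minimal sketch of the key-mapping idea above, assuming the sensitive column or field names are known up front. The names `pseudonymize`, `restore`, and the `FIELD_nnn` key format are hypothetical, made up for illustration. Whether this changes how a request gets flagged is untested here; the sketch only shows keeping the real labels local and sending generic ones.

```python
import re

def pseudonymize(text, sensitive_terms):
    """Swap each sensitive term for an opaque key; return masked text and the key map."""
    mapping = {}
    # Longest terms first (ties broken alphabetically) so a short term
    # never rewrites part of a longer one, and key assignment is stable.
    ordered = sorted(set(sensitive_terms), key=lambda t: (-len(t), t))
    for i, term in enumerate(ordered):
        key = f"FIELD_{i:03d}"  # hypothetical key format
        mapping[key] = term
        # Case-insensitive match; original casing is not preserved on restore.
        text = re.sub(re.escape(term), key, text, flags=re.IGNORECASE)
    return text, mapping

def restore(text, mapping):
    """Map keys in the text back to the real terms; unknown keys are left alone."""
    return re.sub(r"FIELD_\d+", lambda m: mapping.get(m.group(0), m.group(0)), text)

original = "Patient glucose and heart rate vs glucose variability"
masked, mapping = pseudonymize(original, ["glucose", "heart rate"])
print(masked)
print(restore(masked, mapping))
```

The mapping stays on your machine; only the masked text would be sent, and any response is run back through `restore`. Terms that themselves look like a key (for example the word "field") would need a different key format.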

algoth1 49 days ago

I guess you could even turn everything into numbers, not a bad idea at all!

5d41402abc4b 49 days ago

what prompts do you use for this?

garciasn 49 days ago

I wonder if it sees Healthcare companies being targeted and that's why it's freaking out; clearly they have some pretty stupid regexes in the harness to detect this sort of shit.

e: I quit the session and went back in. Set it to Fable and told it to continue the last session. It's moving along as if none of that had happened.

How weird.

throwaway20222 49 days ago

I wonder if this letter has anything to do with why anything even remotely related to biology is getting flagged.

https://www.wired.com/story/openai-anthropic-letter-ai-biolo...

andy12_ 48 days ago

I don't know if you are aware, but some people reported on Twitter that Fable 5 may flag the message regardless of content if it knows (from either pretraining knowledge or memories) that you work in either of those fields. I don't know if that's your case.

https://x.com/i/status/2064449457869984035

iambateman 49 days ago

I asked a question for my son about how mosquitos carry malaria and Fable was like “ok now hold it right there”

piokoch 49 days ago

Obviously, soon, for anything valuable, you will have to buy a "special license for biology/security/finance advice" from Anthropic.

Question is if there will be any competition in this area...

LouisvilleGeek 49 days ago

Same here. It's been rushed for the IPO (in my opinion).

fragmede 49 days ago

Or people were quitting their subscription for codex-5.5 and it was beginning to show up in their metrics.

brookst 49 days ago

Or development had gotten to a point where they need real world usage to tune product and refusals.

Or Fable’s arch is different enough that they allocated clusters of compute targeting a date, and here we are, ready or not.

Or…

the__alchemist 48 days ago

Interesting! I have not used Fable, but so far have not hit trouble. I'm a hobby biologist with a home mol bio lab. It wouldn't answer my questions about LNPs, but so far has been fine for my recombinant DNA workflows, lab techniques, environmental DNA protocols etc. I suspect this may become more difficult!

fumar 49 days ago

Same. I am working on music firmware for an existing device. I can't proceed as it keeps switching to Opus.