| HN Mirror



	by numpad0 2 days ago
	I don't understand how businesses could trust cloud LLMs going forward with this ongoing "safety" paranoia. Building dependence on them doesn't feel like a sane strategic decision for users.

5 comments

forshaper 2 days ago

Looking better and better for people to go after local solutions.


mcmcmc 2 days ago

Tell that to the GPU market


hedora 2 days ago

I think it heard. A 128GB Strix Halo was $1400 at launch. Now they’re $3299.

That's 7 months of Claude -> 16.5 months of Claude.
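
A quick sketch of the arithmetic behind those figures. The $200/month subscription price is an assumption inferred from "$1400 = 7 months"; the comment does not state which plan it means, and the function name is illustrative only.

```python
# Assumption: a flat $200/month subscription, inferred from "$1400 = 7 months".
MONTHLY_SUBSCRIPTION_USD = 200.0


def months_of_subscription(hardware_price_usd: float,
                           monthly_usd: float = MONTHLY_SUBSCRIPTION_USD) -> float:
    """Return how many months of subscription the hardware price would buy."""
    if monthly_usd <= 0:
        raise ValueError("monthly_usd must be positive")
    return hardware_price_usd / monthly_usd


launch = months_of_subscription(1400)   # launch price quoted in the comment
current = months_of_subscription(3299)  # current price quoted in the comment
print(f"launch: {launch:.1f} months, current: {current:.1f} months")
```

Under that assumed price, the break-even point for buying the hardware instead of subscribing moves from about 7 months to roughly 16.5 months.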


forshaper 1 day ago

I should have said 'firms' instead of 'people'.


tarpitt 2 days ago

idk I just bought a 7900 XTX for $750 on eBay and it runs Gemma and Qwen pretty well


baq 1 day ago

It isn’t about trust or no trust, it’s about having a capability to do stuff vs not having it. If Fable is the only model doing the right thing in your use case, your only choice is to use it or not. If the efficiency gain is 2x, it’s a hit you can probably take. If it’s 100x you pay up and shut up.


stale2002 2 days ago

Of course you can trust them.

Just do benchmarks yourself on the new model and decide if it is valuable for your use case, even with the supposed nerfing.

Benchmarks are benchmarks. And you can ignore the data at your own risk.
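
For anyone wondering what "do benchmarks yourself" might look like in practice, here is a minimal sketch, not a real harness. `pass_rate`, `fake_model`, and the sample cases are all hypothetical; the stand-in model would be replaced by a call to whatever provider you actually use, and exact string matching is only one possible scoring rule.

```python
from typing import Callable, Iterable, Tuple


def pass_rate(model: Callable[[str], str],
              cases: Iterable[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's answer exactly matches the expected text."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one case")
    passed = sum(1 for prompt, expected in cases
                 if model(prompt).strip() == expected.strip())
    return passed / len(cases)


# Stand-in "model" for demonstration only; swap in a real API call here.
def fake_model(prompt: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "Sorry I can't help with that")


# Hypothetical cases; a real suite would come from your own workload.
cases = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
score = pass_rate(fake_model, cases)
print(f"pass rate: {score:.2f}")
```

Refusals simply count as failures here, which is one way to fold the "nerfing" question into the same number you use to decide whether the model is worth it.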


SXX 2 days ago

The problem is that corruption is silent and the service can be degraded at any moment or, well, randomly.
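
One way to at least notice silent degradation is to re-run a fixed eval on a schedule and flag drops against a stored baseline. A rough sketch under stated assumptions: the `degraded` function, the 0.05 tolerance, and the sample scores are all made up for illustration, and a real check would need enough runs to separate a genuine drop from ordinary sampling noise.

```python
from statistics import mean
from typing import Sequence


def degraded(baseline_scores: Sequence[float],
             recent_scores: Sequence[float],
             tolerance: float = 0.05) -> bool:
    """True if the recent mean score fell more than `tolerance` below the baseline mean."""
    if not baseline_scores or not recent_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(recent_scores) < mean(baseline_scores) - tolerance


# Hypothetical pass rates from repeated runs of the same eval suite.
baseline = [0.90, 0.92, 0.91]
steady = [0.89, 0.91, 0.90]
dropped = [0.80, 0.78, 0.82]

print(degraded(baseline, steady))   # small wobble, within tolerance
print(degraded(baseline, dropped))  # clear drop below baseline
```

This only detects degradation after the fact on the cases you thought to include; it does not make the service any more predictable.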


thinkingtoilet 2 days ago

Because this effectively hinders 0% of people. I understand why people don't like it, but day to day this is nothing. If you're using it for coding, it won't stop you. The pearl clutching and overreacting here is predictable and sad. If you are working for a large organization and going through the vendor procurement process, questions like "Can this produce pornography?" and "Can this tell my employees how to break the law?" are normal, and anyone with half a brain knows that this is the case. Before people jump on that, I understand people have access to the internet. Your question "how businesses could trust cloud LLMs going forward" is absurd and you know it. There is an extremely small set of edge cases that affect 0% of people day to day. You can trust them just fine.


gopher_space 2 days ago

This is software development, not sales. We rely on our tooling.

If I’m using a calculator to verify my math, I don’t want to use a second calculator to verify the first one.


stale2002 2 days ago

I am sorry to be the one to tell you but it was already the case that you cannot trust LLMs to solve all your problems 100% of the time.

It was always random. This is no different than any other randomness that already exists in LLMs.

If you are concerned, just do benchmarks and see if it is valuable for your use case regardless.


thinkingtoilet 1 day ago

Oh come on. All that happens is that it kicks the query to a model that was literally state of the art two days ago. Stop with the dramatics.


gopher_space 1 day ago

We're hovering around the point that differentiates software developers from software engineers. If you create tools that people use to e.g. make or receive an income, moral and legal standards require this level of focus and commitment.

Because of this there's a chain of trust between myself and the tools I rely on to do work. The people who create those tools see unpredictability as a problem, and that's the only reason I'm using them. I can't work on important systems with a vendor product like Claude Fable.

That being said there's plenty of work to do where it'd be amazing. This isn't an either/or situation.


jsw97 1 day ago

My very first prompt to Fable, which was a completely benign math problem, hit one of their visible triggers. Many tokens into the problem, frustratingly. The user experience (read peer comments) is that you run into these issues with high frequency.

I guess, given that, a pro tip would be to err toward sequential work rather than giving monster prompts. That constraint has got to degrade quality though.


cubefox 2 days ago

It's not paranoia. Cyber attacks have gone up massively in the past few months even with the weaker models we had so far. And Claude Mythos 5 scores even higher than the unreleased Mythos Preview on ExploitBench. If you made this capability publicly available you would see another acceleration of cyber attacks.


extr 2 days ago

This isn't even about cyber attacks. This is just LLM development which is increasingly just called software development. And at least for cyber it says "Sorry I can't help with that"!
